# RAID 1 unexplained mismatches

## bruda

Hello,

I have 4 machines using between 3 and 5 software RAID 1 volumes each.  I perform a data scrub on each volume once a week (Friday at 4 am; at that time the systems are as quiet disk-wise as they ever are).  The scrub keeps reporting mismatched blocks on some volumes on some machines.
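For the record, the scrub is driven through the kernel's sysfs interface, roughly like this (a sketch, not my exact cron job; the sysfs-root parameter of the second function exists only so the logic can be exercised without real arrays):

```shell
#!/bin/sh
# Trigger a scrub ("check") on one array; needs root on a real system.
start_check() {
    echo check > "/sys/block/$1/md/sync_action"
}

# Print the mismatch count for every md array found under a sysfs root.
# The root defaults to /sys/block; it is a parameter only so the logic
# can be tried out against a fake directory tree.
report_mismatches() {
    root=${1:-/sys/block}
    for dev in "$root"/md*; do
        [ -r "$dev/md/mismatch_cnt" ] || continue
        printf '%s: %s\n' "${dev##*/}" "$(cat "$dev/md/mismatch_cnt")"
    done
}
```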

I did my homework (multiple times) and so I know that RAID1 mismatches may occur because of swap files or because files change while the scrub is in progress.  Actually in the case of my systems the possible cause is just files changing, as there is no swap on any of the RAID volumes being discussed here.

The root volume typically (but not always) shows a mismatch count of 128 (rarely, 256).  The /var volume shows mismatch counts from 0 to 384.  Sometimes /home shows a mismatch count (up to 256) as well. This all varies from week to week and from machine to machine! I have one particular volume on one particular machine that almost always shows a rather high mismatch count (up to 3328).  This partition holds unchanging data but also the portage tree which is synced via a script in cron.daily (and thus at 3 am, right?).  The sync should finish by 4 am so I do not believe that it is the reason.  A detailed count is included below.

All the disks on all the systems are healthy as far as SMART data is concerned.

Overall, does anybody have any idea what might be the cause of the mismatches?  I kind of understand that /var might show mismatches from time to time (since it changes constantly), but why the other volumes?  Some disks might be going silently bad, but we are talking about too many disks for this to be plausible, right?  Any advice is appreciated.

Many thanks in advance.

P.S. To better show the extent of the problem, here is a detailed summary of mismatches for the past four weeks.

* Machine 1 (md0 and md1 on one set of disks, the rest on a second set of disks):
  * 24 Nov: 128 on /home (md1), 128 on / (md3), 384 on /var (md4), 2048 on /portage (md5)
  * 1 Dec: 128 on /home (md1), 384 on / (md3), 128 on /var (md4), 128 on /portage (md5)
  * 8 Dec: 128 on / (md3), 2816 on /portage (md5)
  * 15 Dec: 128 on /var (md4), 3328 on /portage (md5)
* Machine 2 (all volumes on the same set of disks):
  * 24 Nov: 256 on / (md2)
  * All zero all the other times
* Machine 3 (all volumes on the same set of disks):
  * 24 Nov: all zero
  * 1 Dec: 256 on /var (md3), 128 on /home (md4)
  * 8 Dec: 256 on /home (md4)
  * 15 Dec: 384 on /home (md4)
* Machine 4 (all volumes on the same set of disks):
  * 15 Dec: 128 on /
  * All zero all the other times

All the machines run mdadm 3.1.4 and gentoo-sources-3.5.7 (with the exception of Machine 2, which is still on gentoo-sources-3.4.9 because 8139cp is broken in 3.5.7).

----------

## frostschutz

 *bruda wrote:*   

> I perform a data scrub on each volume once a week [...] The scrub keeps reporting mismatched blocks on some volumes on some machines.
> 
> I did my homework (multiple times) and so I know that RAID1 mismatches may occur because of swap files or because files change while the scrub is in progress.

 

Uh - thanks for the link but if that's actually true, I think I'd like to see this documented elsewhere. Up until now I thought of mdadm and filesystems as mostly independent systems, not something that is allowed to go out of sync just because the filesystem decides that some changes aren't important anymore. That's something I'd believe possible with ZFS or btrfs, as they're packing their own, filesystem-aware RAID implementations, but for mdadm, now that's news to me...

I tried Google but all I found was posts which link back to the Gentoo wiki  :Laughing:  it's not mentioned in the Linux RAID wiki, nor in linux/Documentation as far as I can see.

Expectation: number of mismatched blocks should always be 0

Reality: ???

----------

## bruda

 *frostschutz wrote:*   

>  *bruda wrote:*   I perform a data scrub on each volume once a week [...] The scrub keeps reporting mismatched blocks on some volumes on some machines. 
> 
> Uh - thanks for the link but if that's actually true, I think I'd like to see this documented elsewhere. 

 

You mean for the data scrub?  I do not have any other reference for this, but note that I am using the standard (kernel) /sys interface.  Some Redhat/Fedora versions are apparently performing this as well, as questions similar to mine used to pop up relatively frequently on their lists.  Same goes for Debian.

Specifically I noticed a number of questions and bug reports on the matter, all of them relatively old.  See for example http://permalink.gmane.org/gmane.linux.raid/33582,  https://bugzilla.redhat.com/show_bug.cgi?id=543044, https://www.redhat.com/archives/rhelv5-list/2011-January/msg00065.html, http://bergs.biz/blog/2009/03/01/startled-by-component-device-mismatches-on-raid1-volumes/, http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=405919, http://www.centos.org/modules/newbb/viewtopic.php?topic_id=23164&forum=37.

All of them say that the mismatches are normal and due to swap and memory-mapped files.  Problem is, I do not have any swap files and at least some of my partitions have no reason to have memory-mapped files.  It would appear that my mismatches are not explained by the common scenarios.

 *frostschutz wrote:*   

> Up until now I thought of mdadm and filesystems as mostly independent systems, not something that is allowed to go out of sync just because the filesystem decides that some changes aren't important anymore. That's something I'd believe possible with ZFS or btrfs, as they're packing their own, filesystem-aware RAID implementations, but for mdadm, now that's news to me...

 

That's precisely my thought.  I believe that mismatches should not happen on my RAIDs, period.  Note too that this has nothing to do with mdadm; everything is done via the kernel RAID interface.  /usr/src/linux/Documentation does not seem to include anything about the RAID module, and I could not find any definite answer anywhere...

 *frostschutz wrote:*   

> I tried Google but all I found was posts which link back to the Gentoo wiki  it's not mentioned in the Linux RAID wiki, nor in linux/Documentation as far as I can see.
> 
> Expectation: number of mismatched blocks should always be 0
> 
> Reality: ???

 

Our departmental sysadmin says that I should definitely worry, or else I am using some tool that is different from other distributions' (he administers a collection of Linux boxes, none of them Gentoo).  The reality is that all that is involved in this issue is the kernel; I am not sure to what degree the Gentoo patches play any role, but they are the only difference as far as I can see.  I do not have any non-Gentoo machine myself to check whether things are different elsewhere, though...

----------

## frostschutz

Aha, man 4 md has a section on scrubbing and mismatches, which mentions the issue.

Basically such behaviour makes the check sync_action virtually useless for RAID1. Instead you should probably do a block by block comparison yourself, so you get the sector numbers of mismatching blocks, and then check if those sectors belong to any files in your filesystem. If you have mismatching sectors in files, reading from those files will yield mismatching results, depending on which drive was being used for the reading. In free space it does not matter.
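Something along these lines (a rough sketch; the two paths are whatever your mirror members happen to be, e.g. /dev/sda2 and /dev/sdb2, and the array should be idle while you run it, or the output will include in-flight writes):

```shell
#!/bin/sh
# Report the 512-byte sector numbers at which two mirror members differ.
# cmp -l prints the 1-based byte offset of every differing byte; fold
# those offsets into sector numbers and de-duplicate.
mismatched_sectors() {
    cmp -l "$1" "$2" | awk '{ print int(($1 - 1) / 512) }' | uniq
}
```

The resulting sector list could then be mapped back to files, e.g. with debugfs icheck/ncheck on ext filesystems (after converting sectors to filesystem block numbers).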

This is so weird. I think sometime I'll have to write a script for this which does mmap/truncate on files just to see if I can provoke such mismatch errors.
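Something like this, maybe (plain shell cannot mmap, so this crude sketch only rewrites and truncates a file in a loop; the idea would be to run it while a 'check' is in progress on the array holding the file):

```shell
#!/bin/sh
# Repeatedly overwrite and truncate a file so that a concurrent md
# 'check' pass might catch the two mirror legs mid-write.
churn_file() {
    f=$1
    iterations=${2:-100}
    i=0
    while [ "$i" -lt "$iterations" ]; do
        dd if=/dev/zero of="$f" bs=4096 count=4 2>/dev/null
        : > "$f"   # truncate back to zero length
        i=$((i + 1))
    done
}
```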

----------

## NeddySeagoon

frostschutz,

Hmm - I thought small numbers of mismatching sectors were a feature of kernel raid because all members of a raid set could not be written atomically.

As such, mismatches were a transient phenomenon unless something nasty happened, like loss of power, when you get a write hole.

The write hole problem is not limited to raid5 as the article suggests.

----------

## bruda

 *frostschutz wrote:*   

> Basically such behaviour makes the check sync_action virtually useless for RAID1. Instead you should probably do a block by block comparison yourself, so you get the sector numbers of mismatching blocks, and then check if those sectors belong to any files in your filesystem. If you have mismatching sectors in files, reading from those files will yield mismatching results, depending on which drive was being used for the reading. In free space it does not matter.

 

That's an interesting suggestion, many thanks.

 *frostschutz wrote:*   

> This is so weird. I think sometime I'll have to write a script for this which does mmap/truncate on files just to see if I can provoke such mismatch errors.

 

I also find it weird for it to happen on disks that do not see much activity at check time.  Maybe this happens on the /portage partition because of many previous changes, who knows.  In any event, I am going to do the block by block check as soon as I find the time.

Thank you so much.

----------

## frostschutz

 *NeddySeagoon wrote:*   

> Hmm - I thought small numbers of mismatching sectors was a feature of kernel raid because all members of a raid set could not be written atomicly.

 

The writes are done by the md system, and the check is done by the md system - it should be expected to be smart enough to not report mismatches for disk areas it's in the process of writing to.

What's the point of the check otherwise if you can not trust what it returns? You'd be better off writing your own check script then.

It should make a copy of the page before writing it out (if that is indeed the problem). At least there should be an option to enable it. Performance impact should be negligible.

----------

## bruda

 *frostschutz wrote:*   

>  *NeddySeagoon wrote:*   Hmm - I thought small numbers of mismatching sectors were a feature of kernel raid because all members of a raid set could not be written atomically. 
> 
> The writes are done by the md system, and the check is done by the md system - it should be expected to be smart enough to not report mismatches for disk areas it's in the process of writing to.
> 
> What's the point of the check otherwise if you can not trust what it returns? You'd be better off writing your own check script then.

 

Add to that the fact that any mismatch being reported for a RAID is actually scary...  Overall the mismatch_cnt interface seems to be simply broken for RAID 1, so at the very least it should be disabled!  Anyhoo, back to writing my own check script then...

Cheers!

----------

