# RAID 1 Problem: array keeps resyncing [solved]

## drtebi

I have browsed numerous threads and got my RAID 1 to work just fine. However, there is one strange problem I have that I couldn't get an answer to:

After booting, my /proc/mdstat looked like this:

```
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 ide/host0/bus0/target0/lun0/part1[0] ide/host0/bus1/target0/lun0/part1[1]
      120053632 blocks [2/2] [UU]
      [>....................]  resync =  1.3% (1601708/120053632) finish=164.9min speed=11969K/sec
unused devices: <none>
```

OK, so I figured the RAID is being built (synced), and waited until it was done. Then the same command showed everything was fine and running.
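For anyone who wants to check this without reading the whole file, here is a minimal sketch that prints the resync progress of each array. The function name `md_resync_progress` and its optional file argument are made up for illustration, and it assumes the `resync = N%` line format shown in the output above.

```shell
# Print one line per md array that is currently resyncing, e.g. "md0 1.3%".
# Reads /proc/mdstat by default; pass another file to inspect a saved copy.
# (Illustrative helper; assumes the "resync = N%" layout shown above.)
md_resync_progress() {
    awk '
        /^md[0-9]+ :/ { array = $1 }          # remember the current array name
        /resync =/ {
            for (i = 1; i <= NF; i++)         # find the percentage field
                if ($i ~ /%$/) print array, $i
        }
    ' "${1:-/proc/mdstat}"
}
```

If the function prints nothing, no array listed in the file is resyncing at that moment.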

However, the problem is that whenever I reboot, the resync starts all over again from 0, every time! Here is what I get from dmesg:

```
--- snip ---
md: raid1 personality registered as nr 3
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
 [events: 00000010]
 [events: 00000010]
md: autorun ...
md: considering ide/host0/bus1/target0/lun0/part1 ...
md:  adding ide/host0/bus1/target0/lun0/part1 ...
md:  adding ide/host0/bus0/target0/lun0/part1 ...
md: created md0
md: bind<ide/host0/bus0/target0/lun0/part1,1>
md: bind<ide/host0/bus1/target0/lun0/part1,2>
md: running: <ide/host0/bus1/target0/lun0/part1><ide/host0/bus0/target0/lun0/part1>
md: ide/host0/bus1/target0/lun0/part1's event counter: 00000010
md: ide/host0/bus0/target0/lun0/part1's event counter: 00000010
md: md0: raid array is not clean -- starting background reconstruction
md: RAID level 1 does not need chunksize! Continuing anyway.
md0: max total readahead window set to 124k
md0: 1 data-disks, max readahead per data-disk: 124k
raid1: device ide/host0/bus1/target0/lun0/part1 operational as mirror 1
raid1: device ide/host0/bus0/target0/lun0/part1 operational as mirror 0
raid1: raid set md0 not clean; reconstructing mirrors
raid1: raid set md0 active with 2 out of 2 mirrors
md: updating md0 RAID superblock on device
md: ide/host0/bus1/target0/lun0/part1 [events: 00000011]<6>(write) ide/host0/bus1/target0/lun0/part1's sb offset: 120053632
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc.
md: using maximum available idle IO bandwith (but not more than 100000 KB/sec) for reconstruction.
md: using 124k window, over a total of 120053632 blocks.
md: ide/host0/bus0/target0/lun0/part1 [events: 00000011]<6>(write) ide/host0/bus0/target0/lun0/part1's sb offset: 120053632
md: ... autorun DONE.
--- snip ---
```
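The telling line in that log is `raid array is not clean`. As a rough sketch, the filter below picks the affected array names out of kernel log text. The name `md_unclean_arrays` is hypothetical, and the pattern assumes the exact message wording shown above, which other kernel versions may not use.

```shell
# Read kernel log text on stdin and print the name of every array
# that md reported as "not clean" at startup.
# (Illustrative helper; matches only the message wording shown above.)
md_unclean_arrays() {
    sed -n 's/.*md: \(md[0-9][0-9]*\): raid array is not clean.*/\1/p'
}

# Example: dmesg | md_unclean_arrays
```

An empty result after a reboot would suggest the array was marked clean at the previous shutdown.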

This is my /etc/mdadm.conf:

```
DEVICE /dev/hda1 /dev/hdc1
ARRAY /dev/md0 devices=/dev/hda1,/dev/hdc1
```

and this is my /etc/fstab:

```
# <fs>               <mountpoint>   <type>      <opts>            <dump/pass>
/dev/sda1            /boot          ext2        noauto,noatime    1 1
/dev/sda5            /              xfs         noatime           0 0
/dev/sda2            none           swap        sw                0 0
/dev/md0             /raid          xfs         noatime           0 0
/dev/cdroms/cdrom0   /mnt/cdrom     iso9660     noauto,ro         0 0
proc                 /proc          proc        defaults          0 0
tmpfs                /dev/shm       tmpfs       defaults          0 0
```

I am using two identical Maxtor Diamond 9 120 GB disks.

To create the RAID, I used cfdisk to create one primary partition of about 114 GB on each drive. I set the partition type to FD.

I rebooted to see if the system read the partitions correctly. Then I created the RAID 1 with mdadm:

```
mdadm --create /dev/md0 --chunk=128 --level=1 --raid-devices=2 /dev/hd[ac]1
```

This command also starts the RAID, so all that was left to do was create a file system on the array and start using it. I chose XFS:

```
mkfs.xfs -d agcount=64 -l size=32m /dev/md0
```

Reading from and writing to md0 works just fine, and still does now, except that the resync starts again at every boot!

What is wrong, or what is it that I don't understand? Is it supposed to resync at every boot?

I checked my kernel messages, and there was nothing indicating that either of the drives is bad. I am not using the RAID as a boot drive, simply as storage.

My system:

400 MHz Pentium III

SuperMicro P6SBS

256MB SDRAM (Crucial)

Quantum Viking II 4.5 GB SCSI Disk (holds the Gentoo OS)

3COM NIC

I used Gentoo's LiveCD "x86-basic-1.4-20030911.iso" and installed everything from scratch, with RAID support:

```
[*] Multiple devices driver support (RAID and LVM)
<*>  RAID support
< >   Linear (append) mode
< >   RAID-0 (striping) mode
<*>   RAID-1 (mirroring) mode
< >   RAID-4/RAID-5 mode
< >   Multipath I/O support
< >  Logical volume manager (LVM) support
```

...please help/explain what the problem is.

Last edited by drtebi on Wed Jan 26, 2005 4:54 am; edited 1 time in total

----------

## BradN

It would seem that the RAID disks are somehow not being shut down and marked clean before the system reboots. Try manually stopping the device before you reboot and see if it still resyncs then.
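A minimal sketch of that test, assuming the array is /dev/md0 mounted at /raid as in the fstab above. The helpers `run` and `stop_array` and the `DRY_RUN` switch are invented for illustration; `umount` and `mdadm --stop` are the real commands and need root.

```shell
# Hypothetical helper: with DRY_RUN set, print the command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unmount the filesystem, then stop the array so that md gets the
# chance to write a clean superblock before the reboot.
# $1 = mount point, $2 = md device
stop_array() {
    run umount "$1" && run mdadm --stop "$2"
}

# Example (prints the commands only):
#   DRY_RUN=1; stop_array /raid /dev/md0
```

If the array comes up without a resync after stopping it by hand like this, the shutdown sequence is the likely culprit; if it still resyncs, the cause is probably elsewhere.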

----------

## Zviratko

Shut down your machine (don't power off - disable power management if you need to) and look for messages like

Stopping array md2

Stopping array md1

Stopping array md0

Array in use, /dev/md0 busy

(I'm making these up, but it's something like that; they are among the last messages you get on the local console.)

I've also run into this problem, and the solution was quite simple: just add "sync;sync" to your shutdown script. On my system this fixed it (I'm using RAID on the root partition). I suspect the cause is the hard disk write cache: the kernel remounts /dev/md0 read-only or unmounts it, but the system powers down or reboots before the write cache (on the disks, not in the kernel) gets flushed. If I am right, it's a hardware problem.

So just add

"sync;sync"

near the end of your shutdown script and pray... :)

You can also try

"hdparm -W0 /dev/hda ; hdparm -W0 /dev/hdc" (or whatever your drives are) to turn off the drives' write cache, but that cripples write speed for frequent access...

I wonder what would happen if my first/master hard drive crashed in the first minutes while syncing... would I lose some data? That essentially eliminates the benefit of having RAID 1...

I also think this was discussed on LKML or on kerneltrap.org somewhere (the write cache problem).

----------

## Zviratko

Looks like I just found a more correct solution:

*Quote:*

> The -h flag puts all harddisks in standby mode just before halt or poweroff. Right now this is only implemented for IDE drives. A side effect of putting the drive in standby mode is that the write cache on the disk is flushed. This is important for IDE drives, since the kernel doesn't flush the write-cache itself before poweroff.

So in /etc/init.d/reboot.sh and /etc/init.d/shutdown.sh, add -h to the parameter list of the reboot/halt binary. That should do the trick 100% (my original solution is not 100% reliable, as the cache is still flushed in the kernel only, and it just creates a small delay for the hard drives to settle down).

----------

## drtebi

Thanks for the replies.

However, I fixed this problem a long time ago. It turned out that one of the hard disks was broken (yes, I bought it new, what a shame).

After replacing the hard disk, everything worked perfectly and still does.

----------

