# Scrambled s/w raid setup

## mslinn

After I upgraded a s/w RAID 1 array from 120GB to 400GB, things got a bit out of control.  Right now the system boots from /dev/md1, but the root file system is on /dev/hda3 instead of /dev/md3.  I'd like to preserve the contents of /dev/hda3 and get them onto the RAID array.

The drives, as reported by fdisk:

 *Quote:*   

> Disk /dev/hda: 400.0 GB, 400088457216 bytes
> 
> 255 heads, 63 sectors/track, 48641 cylinders
> 
> Units = cylinders of 16065 * 512 = 8225280 bytes
> ...

 

The other info is:

```
$ cat /proc/mdstat

Personalities : [raid1]

md1 : active raid1 hdb1[1] hda1[0]

      104320 blocks [2/2] [UU]

md2 : active raid1 hdb2[1] hda2[0]

      1959808 blocks [2/2] [UU]

      bitmap: 0/240 pages [0KB], 4KB chunk

unused devices: <none>

[mslinn@egg ~]$ mount

/dev/hda3 on / type reiserfs (rw,noatime)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw,nosuid,nodev,noexec)

udev on /dev type tmpfs (rw,nosuid,size=10240k,mode=755)

devpts on /dev/pts type devpts (rw,nosuid,noexec,gid=5,mode=620)

none on /dev/shm type tmpfs (rw)

none on /tmp type tmpfs (rw)

nfsd on /proc/fs/nfsd type nfsd (rw)

sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85)

binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)

securityfs on /sys/kernel/security type securityfs (rw,noexec,nosuid,nodev)

$ cat /etc/fstab

# <fs>                  <mountpoint>    <type>          <opts>                  <dump/pass>

# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.

/dev/md1                /boot           ext2            noauto,noatime           1 1

/dev/md2               none            swap            sw                       0 0

/dev/hda3               /               reiserfs        noatime,exec             0 0

/dev/cdrom              /mnt/cdrom      iso9660         users,noauto,ro,unhide   0 0

/dev/cdroms/cdrom1      /mnt/cdrom1     iso9660         users,noauto,ro,unhide   0 0

/dev/fd0                /mnt/floppy     auto            user,auto                0 0

/dev/sda1               /mnt/usb        ext3,vfat       defaults,auto,users,sync 0 0

none                    /proc           proc            defaults                0 0

# glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for

# POSIX shared memory (shm_open, shm_unlink).

# (tmpfs is a dynamically expandable/shrinkable rahdaisk, and will

#  use almost no memory if not populated with files)

# Adding the following line to /etc/fstab should take care of this:

none                    /dev/shm        tmpfs           defaults                0 0

none                    /tmp            tmpfs           defaults                0 0

nfsd /proc/fs/nfsd nfsd auto,defaults 0 0

sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs auto,defaults 0 0

$ sudo cat /boot/grub/menu.lst

timeout 1

default 0

fallback 1

splashimage=(hd0,0)/grub/splash.xpm.gz

title=vmlinuz-2.6.25-gentoo-r8

root (hd0,0)

kernel /boot/vmlinuz-2.6.25-gentoo-r8 root=/dev/hda3

title=vmlinuz-2.6.25-gentoo-r7

root (hd0,0)

kernel /boot/vmlinuz-2.6.25-gentoo-r7 root=/dev/hda3

title=vmlinuz-2.6.18-gentoo-r6

root (hd0,0)

kernel /boot/vmlinuz-2.6.18-gentoo-r6 root=/dev/hda3

$ l /boot

total 16304

drwxr-xr-x  5 root root    2048 Nov  3 08:11 ./

drwxrwxrwx 26 root root     768 Nov  2 23:33 ../

lrwxrwxrwx  1 root root      27 Nov  3 05:23 System.map -> System.map-2.6.25-gentoo-r8

-rw-r--r--  1 root root  829936 Jan 31  2007 System.map-2.6.18-gentoo-r6

-rw-r--r--  1 root root 1036680 Jul 25 20:00 System.map-2.6.25-gentoo-r6

-rw-r--r--  1 root root  946099 Oct 25 08:57 System.map-2.6.25-gentoo-r7

-rw-r--r--  1 root root  946927 Nov  3 05:21 System.map-2.6.25-gentoo-r8

-rw-r--r--  1 root root  791274 Jan 30  2007 System.map-genkernel-x86-2.6.18-gentoo-r6

lrwxrwxrwx  1 root root       1 Oct 28  2004 boot -> ./

lrwxrwxrwx  1 root root      23 Nov  3 05:23 config -> config-2.6.25-gentoo-r8

-rw-r--r--  1 root root   39143 Jan 31  2007 config-2.6.18-gentoo-r6

-rw-r--r--  1 root root   46542 Jul 25 20:00 config-2.6.25-gentoo-r6

-rw-r--r--  1 root root   45267 Oct 25 08:40 config-2.6.25-gentoo-r7

-rw-r--r--  1 root root   45335 Nov  3 05:21 config-2.6.25-gentoo-r8

drwxr-xr-x  2 root root    1024 Oct 28  2004 dev/

drwxr-xr-x  2 root root    1024 Nov  3 08:04 grub/

-rw-r--r--  1 root root 1930936 Jan 30  2007 initramfs-genkernel-x86-2.6.18-gentoo-r6

-rw-r--r--  1 root root 1667917 Jan 30  2007 kernel-genkernel-x86-2.6.18-gentoo-r6

-rw-r--r--  1 root root 2020256 Jul 24 11:18 kernel-genkernel-x86-2.6.18-gentoo-r6-b

drwx------  2 root root   12288 Oct 28  2004 lost+found/

lrwxrwxrwx  1 root root      24 Nov  3 05:24 vmlinuz -> vmlinuz-2.6.25-gentoo-r8

-rw-r--r--  1 root root 1941887 Jan 31  2007 vmlinuz-2.6.18-gentoo-r6

-rw-r--r--  1 root root 2132780 Jul 25 20:00 vmlinuz-2.6.25-gentoo-r6

lrwxrwxrwx  1 root root      22 Oct 25 09:02 vmlinuz-2.6.25-gentoo-r7 -> ../../x86/boot/bzImage

-rw-r--r--  1 root root 2165212 Nov  3 05:21 vmlinuz-2.6.25-gentoo-r8

[mslinn@egg src]$ sudo uname -a

Linux egg 2.6.25-gentoo-r7 #1 PREEMPT Sat Jul 26 07:34:54 PDT 2008 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz GenuineIntel GNU/Linux

```

I realize it would be better to use /dev/hd{a,b}2 as swap instead of /dev/md2.  It also looks like /dev/md3 doesn't yet exist.  

How best should I trash /dev/md2, create /dev/md3 and move the files from /dev/hda3 to /dev/md3?

----------

## mslinn

I didn't get any response, but here is another clue:

```
$ sudo mount -t auto /dev/md1 /boot

mount: unknown filesystem type 'ext2'

$ sudo mount -t mdraid /dev/md1 /boot

mount: unknown filesystem type 'mdraid'

$ sudo fsck

fsck 1.41.2 (02-Oct-2008)

e2fsck 1.41.2 (02-Oct-2008)

The filesystem size (according to the superblock) is 104388 blocks

The physical size of the device is 104320 blocks

Either the superblock or the partition table is likely to be corrupt!

Abort<y>?
```

That's not good!
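
One possible reading of those two numbers, offered as an assumption rather than a diagnosis: if the ext2 filesystem was created directly on a 104388 KiB partition and that partition later became a member of an array using the old 0.90 metadata format, the array would be slightly smaller than the partition, because that format reserves a 64 KiB region near the end of the member device for its superblock. The sketch below only reproduces the arithmetic. `md090_usable_kib` is a hypothetical helper name, and 1 KiB blocks are assumed.

```shell
#!/bin/sh
# Usable size (in KiB) of an md member with 0.90 metadata:
# round the device size down to a 64 KiB boundary, then reserve 64 KiB.
# Assumption: sizes are given in 1 KiB blocks, as in /proc/mdstat.
md090_usable_kib() {
    dev_kib=$1
    reserved=64
    echo $(( dev_kib / reserved * reserved - reserved ))
}

md090_usable_kib 104388
```

For an input of 104388 this prints 104320, which matches the device size that e2fsck reports. If that is indeed the cause, the filesystem is 68 blocks larger than the array that holds it.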

Here is an excerpt from /var/log/messages:

```
Nov  8 00:24:44 egg md: Autodetecting RAID arrays.

Nov  8 00:24:44 egg md: Scanned 2 and added 2 devices.

Nov  8 00:24:44 egg md: autorun ...

Nov  8 00:24:44 egg md: considering hda2 ...

Nov  8 00:24:44 egg md:  adding hda2 ...

Nov  8 00:24:44 egg md: hda1 has different UUID to hda2

Nov  8 00:24:44 egg md: created md2

Nov  8 00:24:44 egg md: bind<hda2>

Nov  8 00:24:44 egg md: running: <hda2>

Nov  8 00:24:44 egg raid1: raid set md2 active with 1 out of 2 mirrors

Nov  8 00:24:44 egg md2: bitmap initialized from disk: read 16/16 pages, set 0 bits

Nov  8 00:24:44 egg created bitmap (240 pages) for device md2

Nov  8 00:24:44 egg md: considering hda1 ...

Nov  8 00:24:44 egg md:  adding hda1 ...

Nov  8 00:24:44 egg md: created md1

Nov  8 00:24:44 egg md: bind<hda1>

Nov  8 00:24:44 egg md: running: <hda1>

Nov  8 00:24:44 egg raid1: raid set md1 active with 1 out of 2 mirrors

Nov  8 00:24:44 egg md: ... autorun DONE.

Nov  8 00:24:44 egg ReiserFS: hda3: found reiserfs format "3.6" with standard journal

Nov  8 00:24:44 egg ReiserFS: hda3: using ordered data mode

Nov  8 00:24:44 egg ReiserFS: hda3: journal params: device hda3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30

Nov  8 00:24:44 egg ReiserFS: hda3: checking transaction log (hda3)

Nov  8 00:24:44 egg ReiserFS: hda3: replayed 47 transactions in 9 seconds

Nov  8 00:24:44 egg ReiserFS: hda3: Using r5 hash to sort names

Nov  8 00:24:44 egg VFS: Mounted root (reiserfs filesystem) readonly.

Nov  8 00:24:44 egg Freeing unused kernel memory: 184k freed

Nov  8 00:24:44 egg udev: renamed network interface eth0 to eth3

Nov  8 00:24:44 egg PCI: setting IRQ 9 as level-triggered

Nov  8 00:24:44 egg PCI: Found IRQ 9 for device 0000:00:1f.5

Nov  8 00:24:44 egg PCI: Sharing IRQ 9 with 0000:00:1f.3

Nov  8 00:24:44 egg PCI: Setting latency timer of device 0000:00:1f.5 to 64

Nov  8 00:24:44 egg intel8x0_measure_ac97_clock: measured 52909 usecs

Nov  8 00:24:44 egg intel8x0: clocking to 48000

Nov  8 00:24:44 egg ReiserFS: hda3: Removing [6325669 26916685 0x0 SD]..done

Nov  8 00:24:44 egg ReiserFS: hda3: There were 1 uncompleted unlinks/truncates. Completed

Nov  8 00:24:44 egg end_request: I/O error, dev fd0, sector 0

Nov  8 00:24:44 egg end_request: I/O error, dev fd0, sector 0

Nov  8 00:24:44 egg Adding 1959800k swap on /dev/md2.  Priority:-1 extents:1 across:1959800k

```

How do I get myself out of this mess?  I've got a total of 4 400GB drives on this box, of which only one is actually in use. hda and hdb are accessible; hdc & hdd would be accessible if I could install a new kernel with the necessary drivers.  Unfortunately, I cannot mount /boot.

----------

## cyrillic

 *mslinn wrote:*   

> How do I get myself out of this mess?  I've got a total of 4 400GB drives on this box, of which only one is actually in use. hda and hdb are accessible; hdc & hdd would be accessible if I could install a new kernel with the necessary drivers.  Unfortunately, I cannot mount /boot.

 

Lucky for you that grub does not need the kernel to be installed in order to boot it. :)

I would probably start by compiling a new kernel with whatever new drivers you need, and then install only the modules, not the kernel (make && make modules_install).

Then boot it manually (press "c" at the grub menu)

```
grub> root (hd0,2)

grub> kernel /usr/src/linux/arch/x86/boot/bzImage root=/dev/hda3

grub> boot 
```
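
For anyone wondering why `root (hd0,2)` corresponds to /dev/hda3: GRUB legacy counts both disks and partitions from zero. A small sketch of the mapping follows. `linux_to_grub` is a hypothetical helper, and it assumes the BIOS enumerates the disks in the same order as the kernel, which is not guaranteed on every machine.

```shell
#!/bin/sh
# Convert an IDE device name such as "hda3" into GRUB legacy notation.
# Assumption: BIOS disk order matches kernel order (hda is the first disk).
linux_to_grub() {
    name=${1#/dev/}                              # accept "hda3" or "/dev/hda3"
    letter=$(printf '%s' "$name" | cut -c3)      # drive letter: a, b, c, ...
    part=$(printf '%s' "$name" | cut -c4-)       # partition number, 1-based
    disk=$(( $(printf '%d' "'$letter") - $(printf '%d' "'a") ))
    echo "(hd${disk},$(( part - 1 )))"
}

linux_to_grub /dev/hda3
linux_to_grub hdb1
```

The two calls print `(hd0,2)` and `(hd1,0)`. If the BIOS order differs, the disk number has to be checked by hand at the grub prompt.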

Once you are in, try mounting the /boot partition, and if it still doesn't work, no big deal, just erase it and recreate the contents.

```
# mke2fs /dev/md1

# mount /boot   <--- this should work now.

# emerge grub

# grub

grub> root (hd0,0)

grub> setup (hd0)

grub> root (hd1,0)

grub> setup (hd1)

grub> quit

# cd /boot

# ln -s . boot

# nano -w grub/grub.conf

# cd /usr/src/linux

# make install

# umount /boot

# reboot 
```

----------

## mslinn

Your instructions on how to boot from the new kernel worked fine, and I didn't have to reformat /boot.  Thank you!

I have two remaining issues.

1) I still get a warning to run fsck on /dev/md1.  How can I fix that without trashing everything?

```
$ sudo fsck

fsck 1.41.2 (02-Oct-2008)

e2fsck 1.41.2 (02-Oct-2008)

The filesystem size (according to the superblock) is 104388 blocks

The physical size of the device is 104320 blocks

Either the superblock or the partition table is likely to be corrupt!

Abort<y>?
```

2) My drives are /dev/hd{a,c,f,g}; /dev/hdf and /dev/hda should mirror each other.

I created /dev/md3 and added /dev/hdf3, then mounted it on /mnt/md3.  I'm now moving the data from /dev/hda3 to /dev/md3 the only way I can think of while in single user mode:

```
rsync -azx / /mnt/md3
```

Is there a better way to do this?  I believe that simply adding /dev/hda3 to /dev/md3 will wipe it.  

'On a clear disk you can seek forever.'

----------

## cyrillic

 *fsck wrote:*   

> Either the superblock or the partition table is likely to be corrupt! 

 

Since /boot is a small partition, backup/format/restore would be the quickest way to fix the corruption.

 *mslinn wrote:*   

> Is there a better way to do this?  I believe that simply adding /dev/hda3 to /dev/md3 will wipe it. 

 

I have not tried this myself, but I think this is the correct way to create a RAID1 with existing data on 1 disk.

```
# mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/hda3 missing 
```

This should start the array in degraded mode (check /proc/mdstat).

If this works, change /dev/hda3 to partition type "fd", then adjust fstab and grub.conf to point to /dev/md3 instead of /dev/hda3.

Cross your fingers and reboot.

```
# mdadm /dev/md3 --add /dev/hdf3 
```

This should start to sync your existing data to the new device (check /proc/mdstat).

At this point you should be done.  Just make sure to change /dev/hdf3 to partition type "fd" before your next reboot.

----------

## mslinn

The boot partition was easily fixed:

```
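sudo mount /boot

sudo cp -a /boot /boot.bak      # back up while mounted

sudo umount /boot

sudo mke2fs /dev/md1            # mke2fs takes the device, not the mount point

sudo mount /boot

sudo cp -a /boot.bak/. /boot/   # restore the contents, not the directory itself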
```

The creation of /dev/md3 caused the error message "mdadm: Cannot open /dev/hda3: Device or resource busy".  I tried booting into single user mode, same problem.  I tried 'umount /dev/hda3' but that didn't change anything.

I then tried booting into LiveCD 2008.0-r1 but /dev/hda3 wasn't available.  Perhaps the new drivers renamed it to something like /dev/sda3 ... but I didn't want to guess incorrectly.
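
One way to avoid guessing when a different kernel renames the disks is to match partitions by size instead of by name. The sketch below assumes the block count of the partition has been noted beforehand from `/proc/partitions` on the installed system. `partitions_with_size` is a hypothetical helper, and the sample table is invented for illustration, not taken from this machine.

```shell
#!/bin/sh
# Print the names of partitions whose size in 1 KiB blocks equals $1.
# Reads /proc/partitions-formatted text (major minor #blocks name) on stdin.
partitions_with_size() {
    awk -v want="$1" '$3 == want && $4 != "" { print $4 }'
}

# Invented sample data in /proc/partitions layout.
partitions_with_size 388644480 <<'EOF'
major minor  #blocks  name

   8     0  390711384 sda
   8     1     104391 sda1
   8     2    1959930 sda2
   8     3  388644480 sda3
EOF
```

On a live system the input would be `partitions_with_size <blocks> < /proc/partitions`. Two identical disks with identical layouts will both match, so the result narrows the choice rather than settling it.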

----------

## mslinn

I backed up /dev/hda to /dev/hde and tried creating /dev/md3 from it.

```
$ sudo mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/hde3 missing

mdadm: /dev/hde3 appears to contain an ext2fs file system

    size=388644416K  mtime=Wed Dec 31 16:00:00 1969

mdadm: /dev/hde3 appears to be part of a raid array:

    level=raid1 devices=1 ctime=Sat Nov  8 11:33:48 2008

Continue creating array? y

mdadm: array /dev/md3 started.

[mslinn@egg md4]$ sudo mount /dev/md3 /mnt/md3

[mslinn@egg md4]$ l /mnt/md3

total 21

drwxr-xr-x  3 root root  4096 Nov  8 11:37 ./

drwxr-xr-x 28 root root   760 Nov  9 05:24 ../

drwx------  2 root root 16384 Nov  8 11:37 lost+found/
```

Now we know that creating an array, even with the 'missing' parameter, wipes the disk.

I'll back up from /dev/hda3 to /dev/md3, change /etc/fstab and /boot/grub/menu.lst, and reboot.  The backup takes many hours (400GB).
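
The edit to /etc/fstab and /boot/grub/menu.lst can be sketched as a filter that rewrites only whole device names. A blanket substitution is risky: the fstab comment earlier in this thread reads "rahdaisk", which looks like what replacing every "md" with "hda" would do to "ramdisk". `retarget_root` is a hypothetical helper. It only changes the device name, so the filesystem type column in fstab still has to be checked by hand.

```shell
#!/bin/sh
# Rewrite whole-device references only: /dev/hda3 becomes /dev/md3,
# while /dev/hda30 or an unrelated word containing "hda" is left alone.
# Reads a config file on stdin and writes the edited text to stdout.
retarget_root() {
    old=$1
    new=$2
    sed -e "s#${old}\$#${new}#" \
        -e "s#${old}\([^0-9]\)#${new}\1#g"
}

printf '%s\n' 'kernel /boot/vmlinuz-2.6.25-gentoo-r8 root=/dev/hda3' \
    | retarget_root /dev/hda3 /dev/md3
```

Writing the result to a new file and comparing it with the original before replacing anything keeps a bootable configuration around if the array does not come up.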

----------

## cyrillic

 *mslinn wrote:*   

> Now we know that creating an array, even with the 'missing' parameter, wipes the disk. 

 

I find it a little surprising that mdadm removed the files, but left the filesystem intact.

I would think if mdadm wiped anything (and it shouldn't), you would need to run mkfs again before mounting /dev/md3 would be possible.

I guess I should try something like this on my own machine, to figure out why it didn't work like I expected it to ...

----------

## mslinn

This is weird... /dev/md3 keeps going missing. I created it (again) this way:

```
sudo mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/hde3 missing
```

After rebooting, I got a message saying that /dev/md3 did not exist, and I had to boot from /dev/hda3 instead.

Here is what I see:

```
$ sudo mount /dev/md3

mount: special device /dev/md3 does not exist

$ sudo mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/hde3 missing

mdadm: /dev/hde3 appears to contain an ext2fs file system

    size=388644416K  mtime=Tue Nov 11 12:38:37 2008

mdadm: /dev/hde3 appears to be part of a raid array:

    level=raid1 devices=2 ctime=Tue Nov 11 12:38:17 2008

Continue creating array? n

mdadm: create aborted.

$ cat /proc/mdstat

Personalities : [raid1]

md1 : active raid1 hda1[0]

      104320 blocks [2/1] [U_]

md2 : active raid1 hda2[0]

      1959808 blocks [2/1] [U_]

      bitmap: 1/240 pages [4KB], 4KB chunk

md4 : active raid1 hdc1[0]

      390708736 blocks [2/1] [U_]

unused devices: <none>

```

Seems I need to do this on every boot:

```
$ sudo mdadm --add /dev/md1 /dev/hde1

mdadm: re-added /dev/hde1

$ cat /proc/mdstat

Personalities : [raid1]

md1 : active raid1 hde1[1] hda1[0]

      104320 blocks [2/2] [UU]

md2 : active raid1 hda2[0]

      1959808 blocks [2/1] [U_]

      bitmap: 1/240 pages [4KB], 4KB chunk

md4 : active raid1 hdc1[0]

      390708736 blocks [2/1] [U_]

unused devices: <none>

$ sudo mdadm --add /dev/md2 /dev/hde2

mdadm: re-added /dev/hde2

$ cat /proc/mdstat

Personalities : [raid1]

md1 : active raid1 hde1[1] hda1[0]

      104320 blocks [2/2] [UU]

md2 : active raid1 hde2[1] hda2[0]

      1959808 blocks [2/2] [UU]

      bitmap: 1/240 pages [4KB], 4KB chunk

md4 : active raid1 hdc1[0]

      390708736 blocks [2/1] [U_]

unused devices: <none>

$ sudo mdadm --add /dev/md3 /dev/hde3

mdadm: cannot get array info for /dev/md3

```
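
Checking by eye which arrays came up with a member missing gets tedious when it has to be repeated after every boot. A minimal sketch that reads mdstat-style text and lists the degraded arrays is below. `degraded_arrays` is a hypothetical helper, and it assumes the status field is the last word on the "blocks" line, as in the output above.

```shell
#!/bin/sh
# List md arrays whose status field (e.g. [U_]) shows a missing member.
# Reads /proc/mdstat-formatted text on stdin.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                       # remember current array
        /blocks/ && $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }
    '
}

# Sample input copied from the mdstat output in this post.
degraded_arrays <<'EOF'
md1 : active raid1 hde1[1] hda1[0]
      104320 blocks [2/2] [UU]
md2 : active raid1 hda2[0]
      1959808 blocks [2/1] [U_]
md4 : active raid1 hdc1[0]
      390708736 blocks [2/1] [U_]
EOF
```

With that sample it prints `md2` and `md4`. On the running system it would be called as `degraded_arrays < /proc/mdstat`. It only reports the state and does not explain why members are dropped at boot. The partition type "fd" mentioned earlier in the thread is one thing worth checking on the hde partitions.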

I have a poltergeist in my machine.  Perhaps a sacrifice to the Software God is in order?

----------

