# Partitions lost after power surge

## PietdeBoer

Hey guys,

yesterday my power got cut off for some reason i haven't figured out yet..

when booting the server, i now get serious errors about the filesystem of my raid0 array

the array contains four 300 GB SATA disks, which i formatted as ext3..

the array cant be mounted at boot.. when trying to mount it manually i get this error:

Failed to mount /ARRAY1 : 

mount: wrong fs type, bad option, bad superblock on /dev/md0,

       or too many mounted file systems

       (could this be the IDE device where you in fact use

       ide-scsi so that sr0 or sda or so is needed?)

though i see md0 does exist.... 

i have lots of data on the array which i do not wish to lose..

my question to you guys is.. what can i do best to save my data.. what checks do i have to perform?

oh yeah.. HAPPY NEWYEAR!! in advance  :Wink: 

----------

## NeddySeagoon

PietdeBoer,

run e2fsck on /dev/md0 in read only mode, see what it thinks is wrong.

If you have a corrupt superblock, it can be restored from one of the copies.

See man e2fsck

e2fsck is likely to convert your ext3 filesystem to ext2 but you can add the journal with tune2fs.

See man tune2fs

----------

## PietdeBoer

it keeps saying its an ext2 filesystem.. :S

if i do fsck -j xxx /dev/md0 

what should i give as an external journal ????

----------

## NeddySeagoon

PietdeBoer,

ext3 == ext2+journal.  e2fsck may well remove the journal, but not in read-only mode.

----------

## PietdeBoer

ok, but when i try to check the md0 device.. i still get the error its not a good ext2 filesys...

what commands do i have to use to check the array? i'm getting a little bit confused..

thx for the replies!

----------

## NeddySeagoon

PietdeBoer,

e2fsck -n /dev/....

-n means do the run read only and do not make any changes.

If you have a bad superblock try

e2fsck -n -b 32768 /dev/...

When you are happy that e2fsck will not make things worse than they already are, you can run it without the -n to interactively allow or deny changes, or with -y if you would rather not know what it's doing; that makes all the changes it feels the need to.

read man e2fsck.  See the bits about  -n -b and -y in particular.

These filesystem repair programs can actually make things worse - be aware of that before you allow any changes to the filesystem. Be sure you have considered your other options.
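
A minimal sketch of the read-only passes described above, trying several candidate superblocks in turn. The function name `check_with_backups` and the `E2FSCK` override variable are made up for illustration; the `-n` and `-b` flags are the ones from the post. It relies on e2fsck's documented exit codes, where a status of 8 or higher means the tool could not operate on the device at all, and a lower status means the superblock was at least readable.

```shell
#!/bin/sh
# Sketch: read-only e2fsck passes against candidate backup superblocks.
# E2FSCK is a hypothetical override hook; by default the real tool is used.
E2FSCK="${E2FSCK:-e2fsck}"

check_with_backups() {
  device=$1
  shift
  for sb in "$@"; do
    echo "== trying superblock $sb on $device"
    # -n answers "no" to every question, so nothing is written.
    "$E2FSCK" -n -b "$sb" "$device"
    status=$?
    # Exit status is a bitmask: 8 and above are operational/usage errors.
    if [ "$status" -lt 8 ]; then
      echo "superblock $sb was readable (e2fsck exit status $status)"
      return 0
    fi
  done
  echo "no listed superblock was accepted" >&2
  return 1
}

# Example (as root): check_with_backups /dev/md0 32768 98304 163840
```

An exit status of 4 here only means errors were found and left uncorrected, which is expected for a damaged filesystem checked with `-n`.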

----------

## PietdeBoer

localhost ~ # e2fsck -n -b 32768 /dev/md0

e2fsck 1.37 (21-Mar-2005)

e2fsck: Invalid argument while trying to open /dev/md0

The superblock could not be read or does not describe a correct ext2

filesystem.  If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

    e2fsck -b 8193 <device>

localhost ~ # e2fsck -n -b 32768 /dev/sda1

e2fsck 1.37 (21-Mar-2005)

e2fsck: No such file or directory while trying to open /dev/sda1

The superblock could not be read or does not describe a correct ext2

filesystem.  If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

    e2fsck -b 8193 <device>

localhost ~ # fsck -n -b 32768 /dev/sda1

fsck 1.37 (21-Mar-2005)

e2fsck 1.37 (21-Mar-2005)

fsck.ext2: No such file or directory while trying to open /dev/sda1

The superblock could not be read or does not describe a correct ext2

filesystem.  If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

    e2fsck -b 8193 <device>

these cmds do not work.. it keeps complaining about the ext2 filesystem :S

----------

## NeddySeagoon

PietdeBoer,

Use /dev/md0, or whatever the mdX is, not /dev/sda1; you said it was a raid array.

That looks bad. There are other superblocks scattered down the disk.

From man e2fsck

```
              Additional backup superblocks can be determined by using the
              mke2fs program using the -n option to print out where the
              superblocks were created. The -b option to mke2fs, which
              specifies blocksize of the filesystem must be specified in order
              for the superblock locations that are printed out to be accurate.
```

Try to find some more superblocks, provided you are sure it was an ext2 or ext3 filesystem.

----------

## PietdeBoer

mke2fs -n /dev/sda

Superblock backups stored on blocks:

        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,

        4096000, 7962624, 11239424, 20480000, 23887872, 71663616

so i could use one of those?
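
For reference, the block numbers in that listing follow a fixed pattern, so they can be reproduced by hand. This is a sketch under two assumptions that are not stated in the thread: a 4 KiB block size (which gives 32768 blocks per group) and the `sparse_super` feature, which keeps backups only in group 1 and in groups that are powers of 3, 5 and 7. The group count of 2236 is derived from the 300090728448-byte disk size; the function name `backup_superblocks` is made up for illustration.

```shell
#!/bin/sh
# Sketch: list sparse_super backup superblock locations.
# $1 = blocks per group, $2 = number of block groups
backup_superblocks() {
  blocks_per_group=$1
  group_count=$2
  {
    echo 1                      # group 1 always holds a backup
    for base in 3 5 7; do       # plus every power of 3, 5 and 7
      g=$base
      while [ "$g" -lt "$group_count" ]; do
        echo "$g"
        g=$((g * base))
      done
    done
  } | sort -n | while read -r g; do
    # A backup sits in the first block of its group.
    echo $((g * blocks_per_group))
  done
}

# 300090728448 bytes / 4096 = 73264338 blocks, i.e. 2236 groups of 32768
backup_superblocks 32768 2236
```

With those inputs the script prints the same fifteen block numbers as the listing, from 32768 up to 71663616. They describe a filesystem laid out on a single 300 GB disk, so they say nothing yet about where the backups sit on the striped array.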

----------

## NeddySeagoon

PietdeBoer,

mke2fs -n /dev/sda is the wrong command.

mke2fs -n /dev/md0 will read the raid device.

It's the right idea though. Find another superblock to use with the -b option to e2fsck.

----------

## PietdeBoer

localhost ~ # mke2fs -n /dev/md0

mke2fs 1.37 (21-Mar-2005)

mke2fs: Device size reported to be zero.  Invalid partition specified, or

        partition table wasn't reread after running fdisk, due to

        a modified partition being busy and in use.  You may need to reboot

        to re-read your partition table.

this is what i got out..

could it be that the disks got damaged? 

hdparm gives this:

localhost ~ # hdparm -tT /dev/sda

/dev/sda:

 Timing cached reads:   2928 MB in  2.00 seconds = 1463.91 MB/sec

HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device

 Timing buffered disk reads:  186 MB in  3.02 seconds =  61.59 MB/sec

HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device

localhost ~ # hdparm -tT /dev/sdb

/dev/sdb:

 Timing cached reads:   2932 MB in  2.00 seconds = 1465.91 MB/sec

HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device

 Timing buffered disk reads:  182 MB in  3.01 seconds =  60.42 MB/sec

HDIO_DRIVE_CMD(null) (wait for flush complete) failed: Inappropriate ioctl for device

so they should be alive.. right?

----------

## NeddySeagoon

PietdeBoer,

The disks look OK but something nasty has happened to your filesystem.

It's written in 'chunks' alternately to each drive.

Did the raid set form ?

Look in to see if all is well. If not, operations on /dev/md0 will fail.

You have to get the raid set to form before making further progress.

Also check dmesg for the raid set being started.

The output of mke2fs -n /dev/sda shows that something is there but unfortunately the numbers it provided are not useful.
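
A small sketch of the "did the raid set form?" check. The post does not name the file to look in; this assumes the usual kernel status file `/proc/mdstat`, where a running array appears as a line such as `md0 : active raid0 ...`. The function name `md_is_active` is made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether an md array shows up as active.
# $1 = array name (e.g. md0)
# $2 = status file; defaults to /proc/mdstat (an assumption, see above)
md_is_active() {
  grep -q "^$1 : active" "${2:-/proc/mdstat}"
}

# Example:
#   if md_is_active md0; then echo "md0 is running"; else echo "md0 is NOT running"; fi
#   dmesg | grep 'md:'     # kernel messages about array assembly
```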

----------

## PietdeBoer

powernow-k8: Found 1 AMD Athlon 64 / Opteron processors (version 1.50.4)

powernow-k8: BIOS error - no PSB or ACPI _PSS objects

md: Autodetecting RAID arrays.

md: autorun ...

md: ... autorun DONE.

kjournald starting.  Commit interval 5 seconds

EXT3-fs: mounted filesystem with ordered data mode.

VFS: Mounted root (ext3 filesystem) readonly.

Freeing unused kernel memory: 328k freed

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

Adding 500464k swap on /dev/hda2.  Priority:-1 extents:1 across:500464k

EXT3 FS on hda3, internal journal

md: raidstart(pid 303) used deprecated START_ARRAY ioctl. This will not be supported beyond 2.6

md: could not bd_claim sda.

md: autostart failed!

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

device-mapper: dm-linear: Device lookup failed

device-mapper: error adding target to table

EXT3-fs: unable to read superblock

raid array is NOT started ... 

perhaps this info helps

----------

## NeddySeagoon

PietdeBoer,

Yes and no. It explains why you do not get anything useful out of /dev/md0

I'm concerned that it mentions device-mapper. That implies BIOS raid not kernel raid.

It also appears to want to make a linear raid not an interleaved one.

Maybe I was assuming too much ?

I was expecting this bit 

```
md: Autodetecting RAID arrays.

md: autorun ...

md: ... autorun DONE. 
```

to detect the raid, a bit like this

```
[   48.125517] md: considering sdb1 ...

[   48.126423] md:  adding sdb1 ...

[   48.127346] md:  adding sda1 ...

[   48.128245] md: created md0

[   48.129121] md: bind<sda1>

[   48.129972] md: bind<sdb1>

[   48.130800] md: running: <sdb1><sda1>

[   48.131687] raid1: raid set md0 active with 2 out of 2 mirrors

[   48.132552] md: ... autorun DONE.
```

What sort of raid do you have?

----------

## PietdeBoer

a raid0 array.. 4 300gb's

software raid...

raiddev /dev/md0

        raid-level 0

        nr-raid-disks 4

        persistent-superblock 1

        chunk-size 4

        device /dev/sda

        raid-disk 0

        device /dev/sdb

        raid-disk 1

        device /dev/sdc

        raid-disk 2

        device /dev/sdd

        raid-disk 3

----------

## NeddySeagoon

PietdeBoer,

Between us we need to read up on manually starting the raid because the kernel didn't try.

It would also be worth reading the contents of the devices to /dev/null, to ensure they are all still readable. With 1.2 TB, it's going to take a while. Maybe let it run overnight?

```
dd if=/dev/sda of=/dev/null 
```

will read all of /dev/sda, or stop with an error.

if= means input file and of= means output file. Don't get them the wrong way round; there is no undo.

Are you using a liveCD, or is the system you are using for testing on another drive?
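
The same read test can be wrapped in a loop so that all four members are checked in one go. This is only a sketch: the function name `read_test` is made up, and the 1 MiB block size is an assumption chosen to make the read faster than dd's 512-byte default. Nothing is written to the devices, since the output always goes to /dev/null.

```shell
#!/bin/sh
# Sketch: read each given device end to end and report the result.
read_test() {
  for dev in "$@"; do
    # dd exits non-zero on a read error (or if the path cannot be opened).
    if dd if="$dev" of=/dev/null bs=1048576; then
      echo "$dev: readable end to end"
    else
      echo "$dev: read failed"
    fi
  done
}

# Example (as root): read_test /dev/sda /dev/sdb /dev/sdc /dev/sdd
```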

----------

## PietdeBoer

i don't fully understand what you are saying here...

what does reading the contents to /dev/null do?

----------

## PietdeBoer

are you on efnet atm? talks easier i guess  :Wink: 

----------

## PietdeBoer

localhost ~ # dd if=/dev/sdd of=/dev/null

586114704+0 records in

586114704+0 records out

300090728448 bytes (300 GB) copied, 5756.26 seconds, 52.1 MB/s

all 4 disks give this output.. no errors.. seem that the disks are fine  :Smile: 

it has to be the filesystem which has become corrupted

----------

## NeddySeagoon

PietdeBoer,

Good afternoon,

That's good to know. From previous posts, you are using kernel raid, but the raid autodetect no longer sees the disks as a raid set. Please explain how your system boots. Where is your root filesystem now? Was it ever on this raid system?

This part of dmesg 

```
md: Autodetecting RAID arrays.

md: autorun ...

md: ... autorun DONE.
```

should show more.

Using fdisk, examine the partition types for the partitions that contribute to the raid. They must be 0xfd (Linux raid autodetect).

fdisk -l /dev/sda  etc will show this.
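
To pick the partition type out of the listing, something like the sketch below can help. It assumes the older fdisk column layout that appears elsewhere in this thread (Device, Boot, Start, End, Blocks, Id, System); newer fdisk versions print different columns, so treat the field positions as an assumption. The function name `list_partition_ids` is made up for illustration.

```shell
#!/bin/sh
# Sketch: print "device id" pairs from old-style `fdisk -l` output on stdin.
list_partition_ids() {
  # Partition rows start with /dev/...; a "*" boot flag shifts the Id
  # column one field to the right.
  awk '$1 ~ /^\/dev\// { id = ($2 == "*") ? $6 : $5; print $1, id }'
}

# Example (as root): fdisk -l /dev/sda | list_partition_ids
# A raid autodetect member should show an id of fd.
```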

----------

## PietdeBoer

here is fdisk -l

localhost linux # fdisk -l

Disk /dev/hda: 122.9 GB, 122942324736 bytes

16 heads, 63 sectors/track, 238216 cylinders

Units = cylinders of 1008 * 512 = 516096 bytes

   Device Boot      Start         End      Blocks   Id  System

/dev/hda1   *           1          63       31720+  83  Linux

/dev/hda2              64        1056      500472   82  Linux swap / Solaris

/dev/hda3            1057      238216   119528640   83  Linux

Disk /dev/sda: 300.0 GB, 300090728448 bytes

255 heads, 63 sectors/track, 36483 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 300.0 GB, 300090728448 bytes

255 heads, 63 sectors/track, 36483 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 300.0 GB, 300090728448 bytes

255 heads, 63 sectors/track, 36483 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 300.0 GB, 300090728448 bytes

255 heads, 63 sectors/track, 36483 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdd doesn't contain a valid partition table

as you can see /dev/hda is used for my system..

----------

## NeddySeagoon

PietdeBoer,

With no partition tables on your RAID devices, the kernel cannot form the raid set.

If they were partitioned with one partition for the whole drive, it's easy: use fdisk to rewrite the partition table the same way again, remembering to set the partition type to fd.

If they had two or more partitions, it gets more complicated: the partition table must be recreated exactly as it was before.

If your motherboard has BIOS raid, be sure it's off, so you can only use kernel RAID. BIOS raid and kernel raid are not compatible.

You may want to keep copies of your Master Boot Records as they are now, in case you want to go back

```
dd if=/dev/sda of=/MBR_sda count=1 bs=512 
```

will read the MBR of /dev/sda to a file called /MBR_sda.

Repeat for all four drives, changing the sda bits.

I'm surprised you have lost all four MBRs - however, were you using BIOS RAID, I would not expect to find a valid partition table at all.
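
A sketch of that backup step for all four drives at once. The function name `backup_mbr` and the output directory argument are made up for illustration; the dd options are the ones from the command above. Each copy is exactly the first 512-byte sector of the device.

```shell
#!/bin/sh
# Sketch: save the first 512-byte sector (the MBR) of a device to a file.
# $1 = device, $2 = directory to hold the copy (named MBR_<device name>)
backup_mbr() {
  dev=$1
  outdir=$2
  name=$(basename "$dev")
  dd if="$dev" of="$outdir/MBR_$name" bs=512 count=1
}

# Example (as root), stopping at the first failure:
#   for d in sda sdb sdc sdd; do backup_mbr "/dev/$d" /root || break; done
# To put a saved sector back later, swap if= and of= with great care.
```

Keep the copies on a disk that is not part of the array, for example the /dev/hda system disk.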

----------

## PietdeBoer

would this mean a loss of data? i am using a kernel-based software raid0

i mean, after i rebuild the partitions, will the old ext3 filesystem still be there, with all my old data?

----------

## NeddySeagoon

PietdeBoer,

That depends on how much data got trashed - if you remake the partition tables, it may just all work again.

----------

## PietdeBoer

like 600GB's of stored data on the array...

so you suggest just using fdisk.. type n for new partition.. then hit p for prim. and partition nr 1.. and hit enter for first and last cylinder?

----------

## NeddySeagoon

PietdeBoer,

If that's what you did to fdisk the drives when you created the array, yes.

----------

