# Disk failing fast, Reiserfs disk errors - Please Help

## CloseYetFar

Hi. First off, I'm still not sure whether this is a hardware problem or my file system is just damaged. About a day ago my system crashed while I was loading a VMware image (heavy disk usage). This surprised me because my system never crashes; I even had an uptime of somewhere around 70 days. The system rebooted fine and everything looked good.

I used the system without a problem for about a day. Then, while I was running an md5 checksum program, metalog started using 50% CPU (this never happened before). I checked dmesg and it looks like it was continually trying to write to the logs but kept getting disk errors, over and over.

Here is the output from dmesg; sda7 is the /var partition:

```
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3
ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3
ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3
ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?
ReiserFS: sda7: warning: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [1495 2950 0x0 SD] stat data
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3
ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3
ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3
ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3
ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?
ReiserFS: sda7: warning: zam-7001: io error in reiserfs_find_entry
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3
ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?
ReiserFS: sda7: warning: zam-7001: io error in reiserfs_find_entry
```
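A log like this is easier to read as a tally. The following is only a rough sketch: the function name `summarize_reiserfs_warnings` and both regular expressions are my own assumptions, guessed from the lines quoted above, and are not part of any ReiserFS tool. It counts warnings per device and message code and collects the block numbers they mention.

```python
import re
from collections import Counter

# Assumed patterns, derived only from the dmesg lines quoted in this post.
WARNING_RE = re.compile(r"ReiserFS: (?P<dev>\w+): warning: (?P<code>[a-z]+-\d+):")
BLOCK_RE = re.compile(r"in block (?P<block>\d+)")


def summarize_reiserfs_warnings(lines):
    """Return (Counter of (device, code) pairs, set of block numbers)."""
    counts = Counter()
    blocks = set()
    for line in lines:
        match = WARNING_RE.search(line)
        if match is None:
            # Lines without a device and code (e.g. is_tree_node) are skipped.
            continue
        counts[(match.group("dev"), match.group("code"))] += 1
        block = BLOCK_RE.search(line)
        if block is not None:
            blocks.add(int(block.group("block")))
    return counts, blocks


if __name__ == "__main__":
    sample = [
        "ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 3",
        "ReiserFS: sda7: warning: vs-5150: search_by_key: invalid format found in block 32941. Fsck?",
        "ReiserFS: sda7: warning: zam-7001: io error in reiserfs_find_entry",
    ]
    counts, blocks = summarize_reiserfs_warnings(sample)
    for (dev, code), n in counts.most_common():
        print(dev, code, n)
    print("blocks:", sorted(blocks))
```

If every warning points at the same block, as it does in the output above, that suggests one damaged tree node rather than damage spread across the partition, though it does not say whether the cause is the disk or the file system.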

Right now I'm going to try to dd the whole sda drive to an external HD. That way I have a backup and my system is on a known good HD, but I have a feeling I'll just end up with disk errors that mess up the new image.

After that I will try to use fsck.reiserfs to fix my corrupt HD. I have never used it before, and I hear that you can actually damage the file system by running it. How should I do it? I'm pretty sure it should be run with the partitions unmounted, am I correct? I'm guessing the only way to do this is to run fsck off the Gentoo LiveCD.

Please post any info you have for me; I do not want to damage the system any more.

Thanks

PS: I just wrote this whole message on the problem computer and no errors are showing up in dmesg. I have rebooted a couple of times, and the problems only start under heavy disk usage.

----------

## eccerr0r

Looks like the metadata is screwed up somehow. Once you've dd'ed your root disk to the external backup disk, you can safely run fsck on that backup copy and see if it's stable (check that everything is still readable).

There's a lot of tracking information that reiserfsck might get confused by, and that's what the warning is about when you run it... but if all else is dead and you have no other option (or you're working on a backup disk), then what's the harm?  I suppose the warning is that there's not a whole bunch of redundant tracking information to work with.

I ran reiserfsck on my root partition a few times, and it seemed OK afterwards, having found a few errors...

(back to ext3fs now though...)

----------

## CloseYetFar

Hey, thanks for the reply.  I ran reiserfsck --check on all my partitions. The only partition that is messed up is /home. This is probably why I have no problem booting.

Right now I'm trying to dd it, but I keep seeing the same error over and over again. I'm afraid it's just stuck in a loop and not moving on. Do you know if dd continues after errors or just keeps retrying the same spot?

Thanks

----------

## eccerr0r

if your dd is bombing with errors, it sounds like you do have a sick hard drive...  :Sad: 

back up everything you can ASAP, first trying with just cp of all your critical stuff to another disk... if you can grab everything you want to keep, then that's good, you're done. Or if you have up-to-date backups you're done too  :Smile: 

if you can't, then you should do the dd and try to recover from that image.  You may need to specify `conv=noerror` to let it carry on past the errors it encounters (adding `sync`, i.e. `conv=noerror,sync`, pads unreadable blocks so the rest of the image stays aligned).  The first set of errors you see are kernel-reported: the kernel retries the read itself before giving up, so you should definitely let a plain dd have a go first.  Then use the 'noerror' option to get an inexact copy.
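To show what "continue past errors and get an inexact copy" means, here is a minimal Python sketch of the idea behind `conv=noerror,sync`. It is not how dd is implemented, and the names `copy_skipping_errors` and `copy_device` are made up for illustration. Each block is read on its own; a block that raises an I/O error is replaced with zeros so everything after it keeps its original offset.

```python
import os


def copy_skipping_errors(read_block, write_block, total_size, block_size=4096):
    """Copy total_size bytes block by block, zero-filling unreadable blocks.

    read_block(offset, size) returns bytes or raises OSError.
    write_block(offset, data) stores the bytes at that offset.
    Returns the list of offsets that could not be read.
    """
    bad_offsets = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            data = read_block(offset, want)
        except OSError:
            data = b""
            bad_offsets.append(offset)
        # Pad failed or short reads with zeros so later data stays aligned.
        write_block(offset, data.ljust(want, b"\0"))
        offset += want
    return bad_offsets


def copy_device(src_path, dst_path, total_size, block_size=4096):
    """Hypothetical wrapper around real file descriptors (paths are examples)."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        return copy_skipping_errors(
            lambda off, n: os.pread(src, n, off),
            lambda off, data: os.pwrite(dst, data, off),
            total_size,
            block_size,
        )
    finally:
        os.close(src)
        os.close(dst)
```

A smaller block size loses less data around each bad spot but makes the copy slower; the 4096 here is an arbitrary choice, not a recommendation.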

----------

## CloseYetFar

Even with all the dd errors, the system partitions all reiserfscked with 0 errors. Just like on the original drive, the /home dir was problematic. I had to use reiserfsck --rebuild-tree; after doing this the file system has been working perfectly so far. I did lose 150 files from my /home dir, which is less than 1%. This is fine because I back up my /home/me dir once a week. I will leave all the .files alone, delete all my data, and then restore it from a known good source. I'm so happy I did not have to rebuild the whole system, plus now I have a backup of the whole working system on another disk.
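To see exactly which files went missing or changed after a rebuild, the weekly backup can be compared against the repaired tree. This is just a sketch of that idea: `compare_trees` and `file_md5` are names I made up, and the two directory arguments are whatever paths apply on your system. It walks the backup and reports files that are absent from, or differ by md5 in, the repaired copy.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(backup_root, repaired_root):
    """Return (missing, changed) lists of paths relative to backup_root."""
    missing, changed = [], []
    for dirpath, _dirnames, filenames in os.walk(backup_root):
        for name in filenames:
            backup_file = os.path.join(dirpath, name)
            rel = os.path.relpath(backup_file, backup_root)
            repaired_file = os.path.join(repaired_root, rel)
            if not os.path.isfile(repaired_file):
                missing.append(rel)
            elif file_md5(backup_file) != file_md5(repaired_file):
                changed.append(rel)
    return sorted(missing), sorted(changed)
```

Files that were legitimately edited since the last backup will also show up as changed, so the result is a list to review rather than a list of corrupted files.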

Thanks for the help.

PS: I will add [SOLVED] to the title in a couple days.

----------

## HeissFuss

dd has nothing to do with the FS since it does a raw read of the device.  If you were getting errors during that, then you probably have bad blocks on that portion of your disk.  You should run badblocks on the device to see how extensive the damage is.  To dd off of a device with hard errors, try installing ddrescue.

----------

## CloseYetFar

dd was able to finish even though there were errors. I'm thinking I will just play it safe and use the new HD, which I dd'ed the system to, and send this HD back to Seagate.

Exactly what is the difference between dd and ddrescue? dd was able to finish. Is it that ddrescue will continue no matter how bad the disk is?

I hear that badblocks should not be used on a HD with important data on it. Is that true?

As I said, I think I will just get rid of the original HD, so I don't think it's worth the trouble.

Thanks

----------

## HeissFuss

ddrescue will basically fill in zeros for areas it couldn't read, skipping the long timeouts and retries, so a backup with it off of a bad disk doesn't take forever.  badblocks by default just reads all of the data sequentially off of the drive to check for read errors.  If you use it with write-mode testing, though, it'll destroy all of your data.  Also, don't run it on a mounted partition :p.
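The read-only check is simple enough to sketch. This is not badblocks itself: `find_unreadable_blocks` and `group_ranges` are hypothetical names, and the block size is an arbitrary choice. It reads every block in order, records the ones that raise an I/O error, and then groups them into ranges so you can see how extensive the damage is.

```python
def find_unreadable_blocks(read_block, total_blocks, block_size=4096):
    """Read each block once; return the block numbers that raise OSError.

    read_block(offset, size) is any callable that reads from the device,
    for example a wrapper around os.pread on a read-only descriptor.
    """
    bad = []
    for number in range(total_blocks):
        try:
            read_block(number * block_size, block_size)
        except OSError:
            bad.append(number)
    return bad


def group_ranges(block_numbers):
    """Collapse block numbers into inclusive (first, last) ranges."""
    ranges = []
    for number in sorted(block_numbers):
        if ranges and number == ranges[-1][1] + 1:
            # Extends the current run of consecutive bad blocks.
            ranges[-1] = (ranges[-1][0], number)
        else:
            ranges.append((number, number))
    return ranges
```

A few short ranges point to isolated bad sectors; long or numerous ranges suggest the drive is going downhill and should not be trusted.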

----------

## CloseYetFar

Ok so the file system is fixed and I just finished restoring my user data that I had on my backup. I will send the old HD back to Seagate and use the new one. So all is well. 

One thing that is strange: the HD I just put into the system is almost exactly the same as the old one, yet I can see with gkrellm that it is much faster. It was transferring data at 70 MB/s, whereas the old one would only get to 50 once in a while. The computer is a Dell, and I'm pretty sure this disk came busted right from Dell. I wonder if I was its first owner; no one really knows what Dell does at the factory. Both HDs are ST3320620AS 320GB; one is a 7200.9 (old), the other is a 7200.10 (new).

Thanks for all the help guys.

----------

## eccerr0r

Usually newer disks are faster  :Very Happy: 

Unfortunately I don't have many new disks; my newest Seagate is a 7200.7, which gets around 52-55MB/s on hdparm ... it is definitely slower than some of my other disks.  I think my fastest disk is an HGST 7k250 (gets around 61MB/s on hdparm), which ironically I'm using as a net backup disk...

Glad you got away with no data loss.  I wasn't so lucky with a couple of my disks that failed a decade or so ago.

----------

