# BAD HDD!! Please help [solved]

## redbeardmcg

For the last few days I kept forgetting to back up my web server / IMAP server / Portage cache / SQL server / everything-I-don't-want-to-lose server, and while I was working on it yesterday, I started getting "read-only filesystem" messages. I rebooted and tried to fsck the disk, and when it tried to start back up, the kernel panicked.

NOT GOOD. I've spent the last month working on a web site for my girlfriend, who is going in for back surgery, will be out for 2 months, and wants something to do (it's an online store). So everything else aside (all of my email, my personal blog / web sites, etc.), I NEED to recover the data on this disk.

Is there any way I can access the information and ignore the filesystem errors? I ran fsck -cp and after 2 hours our power went out, so I don't know if it finished. I just need access to this data.

Thanks for any advice on how to recover this information

Ryan

----------

## NeddySeagoon

redbeardmcg,

Get dd_rhelp and make an image of the drive in a file.

dd_rhelp was not in portage, last time I looked.

You should not need to babysit it unless the drive goes offline as a result of the error.

Avoid operating the failing drive at all; it's only going to get worse.

Edit: it is in Portage now: emerge sys-fs/dd-rhelp

----------

## forgotten1

First rule of thumb - Don't do anything else until you have a plan, you don't want to cause any more changes to disk than necessary.

Initial questions: When exactly did the kernel fail: at boot, or when you attempted to run fsck?

What type of partitions were in use, primary or extended, was logical volume management in use?

What filesystem types were in use on the disk?

Do you know where the data is you are looking for?

Without knowing any of the above, I'd attempt to use a LiveCD (Gentoo, Knoppix, SystemrescueCD, etc.) to recover the data, along with a USB hard drive to copy files over. The process would go something like this:

Insert LiveCD

Plug in USB hard drive

Boot box from LiveCD

Use dd to copy the entire disk to the USB hard drive. You will of course need a USB hard drive bigger than the one you are trying to clone.

```
dd if=/dev/<bad drive device> of=/dev/<USB drive device>
```

Then attempt to mount the 'faulty' hard drive.

Mount the USB hard drive.

Create a special directory on the USB hard drive for the valuable files.

Locate and copy all the valuable files over to the freshly created directory.

If the hard drive is truly broken, you may not be able to do the above, in which case your only alternative is to send it out to a specialist data recovery lab. That will be as expensive as it sounds.

If it's just corrupt, then the above will at least allow you to make a clone, and recover your working files.
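One caveat on the dd step above: plain dd gives up at the first read error, which is exactly what a failing disk will produce. A minimal sketch of a more forgiving invocation follows, assuming GNU dd; the function name `clone_disk` and the device names in the usage comment are made up for illustration. A small block size is used because, with these options, every failed read is replaced by a whole block of zeros.

```sh
# clone_disk SRC DST: copy SRC to DST block by block, continuing past read errors.
# conv=noerror keeps dd going after a failed read.
# conv=sync pads each short or failed block with zeros, so data that follows
# a bad spot stays at the same offset in the copy.
clone_disk() {
    dd if="$1" of="$2" bs=4096 conv=noerror,sync
}

# Hypothetical usage from a LiveCD (double-check the device names first,
# swapping them destroys the data you are trying to save):
#   clone_disk /dev/hdX /mnt/usb/disk.img
```

Writing to an image file on the USB drive rather than to the raw USB device keeps the copy easy to duplicate before any repair attempt. dd_rhelp, suggested earlier in the thread, handles bad areas more cleverly than this.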

Good luck.

----------

## redbeardmcg

Ok, I will pop the drive into a host machine and run dd_rhelp on my lunch break.

This will be the last Maxtor drive I run, and the LAST time I have ANYTHING important on a single-disk machine.

Thanks for the fast reply!

Ryan

----------

## redbeardmcg

 *Quote:*   

> When exactly did the kernel fail, at boot, or when you attempted to run fsck?
> 
> What type of partitions were in use, primary or extended, was logical volume management in use?
> 
> What filesystem types were in use on the disk?
> ...

 

The kernel fails at boot (it started panicking after the power loss while I was running fsck). Before the power loss, it would boot, ask me for the root password for maintenance, and refuse to mount the fs.

The disk has an ext2 boot partition, a swap partition, and an ext3 root partition. Nothing fancy: all primary, no LVM.

Do I know where the data is? In terms of the directory structure, yes. In terms of which blocks or sectors on the disk, no.

Thanks,

Ryan

----------

## NeddySeagoon

redbeardmcg,

If your disk is actually damaged, with real bad sectors, dd_rhelp cannot read it all but will get back what it can.

Read the link to see how it works.

You can then do data recovery on the image file, which you may want a backup of, so you have an undo function.

If the drive is damaged, you may not be able to read it again.

----------

## redbeardmcg

All seemed to be going well... 

It found some bad sectors (nothing big, just a few k here and there) and then this:

```
- EOF is not found, but between 20524002.0k and 41048004.0k.
=== BAR === [ 'x' dd_rescued, '*' next jump point, '|' '.' not dd_rescued ]
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|..*...... etc... 
Bar was drawn from 0 to hypothetic end : 41048004.0
=== launched via 'dd_rhelp' at 30786003.0k, 10262001k <<< ===
dd_rescue: (info): ipos:  27262988.0k, opos:  27262988.0k, xferd:   3523015.0k
             -  *  errs:      0, errxfer:         0.0k, succxfer:   3523015.0k
             +curr.rate:        4kB/s, avg.rate:     1665kB/s, avg.load:  1.0%
dd_rescue: (warning): /dev/hdd3 (27262988.0k): Input/output error!
dd_rescue: (info): ipos:  27262987.5k, opos:  27262987.5k, xferd:   3523015.5k
             -  *  errs:      1, errxfer:         0.5k, succxfer:   3523015.0k
             +curr.rate:        0kB/s, avg.rate:     1663kB/s, avg.load:  1.0%
dd_rescue: (warning): /dev/hdd3 (27262987.5k): Input/output error!
dd_rescue: (info): ipos:  27262987.0k, opos:  27262987.0k, xferd:   3523016.0k
             -  *  errs:      2, errxfer:         1.0k, succxfer:   3523015.0k
             +curr.rate:        0kB/s, avg.rate:     1662kB/s, avg.load:  1.0%
dd_rescue: (warning): /dev/hdd3 (27262987.0k): Input/output error!
dd_rescue: (info): ipos:  27262986.5k, opos:  27262986.5k, xferd:   3523016.5k
             -  *  errs:      3, errxfer:         1.5k, succxfer:   3523015.0k
             +curr.rate:        0kB/s, avg.rate:     1660kB/s, avg.load:  1.0%
dd_rescue: (warning): /dev/hdd3 (27262986.5k): Input/output error!
dd_rescue: (info): ipos:  27262986.0k, opos:  27262986.0k, xferd:   3523017.0k
             -  *  errs:      4, errxfer:         2.0k, succxfer:   3523015.0k
             +curr.rate:        0kB/s, avg.rate:     1659kB/s, avg.load:  1.0%
dd_rescue: (warning): /dev/hdd3 (27262986.0k): Input/output error!
dd_rescue: (fatal): maxerr reached!
Summary for /dev/hdd3 -> /media/disk/punditdump.img:
```

Am I hosed? 

Ryan

----------

## NeddySeagoon

redbeardmcg,

You can run dd_rhelp again and it will build on what it's already done.

Some sanity checking is in order before you try.

How big is the drive?

The output snippet suggests 41048004.0k, or about 40 GB, of which it's got 27262986.0k, or 27 GB.

So ... is the drive really 40 GB or is that wrong?

If it were a 30 GB drive, you are not going to read 40 GB from it. I know it sounds silly, but the drive size could have been incorrectly established at the start.

Provided it's a 40 GB drive, try again.

It's also worth allowing the drive to cool down and running it at odd angles: on either edge or even upside down.

The idea is to get gravity to align worn spindle bearings, which are the normal cause of sudden failures.

If you are really lucky, the data you want may have been read already. To check that, you need to get into the intricacies of the mount command with the -o loop,offset= options, where offset is the distance in bytes to the start of the partition you wish to mount.

```
mount -o loop,ro,offset=32256 /path/to/file /some/mountpoint
```

will mount the first partition in the image file, read-only, at /some/mountpoint. On disks partitioned the traditional way, the first partition starts at sector 63, which gives offset=32256 (63 × 512 bytes); the offsets for the rest depend on your partition table, so we need to read that. Anyway, the above is a good test that it will work when we get the rest of the offsets.
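For the other partitions, the offset is just the partition's start sector multiplied by the sector size. A small sketch of that arithmetic, assuming 512-byte sectors and a start sector read from the partition table (for example the Start column of fdisk -l run against the image); `partition_offset` is a made-up helper name:

```sh
# partition_offset START_SECTOR [SECTOR_SIZE]
# Print the byte offset to pass to mount -o offset=.
# SECTOR_SIZE defaults to 512 bytes, which is an assumption about the disk.
partition_offset() {
    echo $(( $1 * ${2:-512} ))
}

partition_offset 63    # prints 32256, the first-partition offset used above

# Hypothetical use with the mount command from this post:
#   mount -o loop,ro,offset=$(partition_offset 63) /path/to/file /some/mountpoint
```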

----------

## redbeardmcg

Yes, the drive is a 40 gig drive. I ran dd_rhelp on the third partition (hdd3). Did this really start it there, or is it dumping the whole drive? How do I determine the start of the 3rd partition?

I am no filesystem expert, but the most important information is the information that I recently wrote to the drive, so I am assuming I am more concerned with what is at the end of the drive? I am running it again, and it got 5 errors right around 27262986. Does this mean that the bad sectors are around the 27 gig point, or do I lack a basic understanding of how information is stored to disk in an ext3 fs? 

Thanks again for your help,

I will try putting the disk upside down when I get out of work

Ryan

----------

## NeddySeagoon

redbeardmcg,

I misread the dd_rhelp output. You will only have an image of the partition. No need to mess about with -o offset=.

dd_rhelp is not recovering the filesystem for you. It's making an image of the partition in a file. It cares nothing for the filesystem that may be there. It reads raw disk blocks. When it finds an error, it begins reading again in the middle of the largest unread area and does a binary search to recover all the 'good' sectors. Eventually, only bad sectors are left. Now it tries harder by stepping the head in from above and below, doing retries and so on.
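The "middle of the largest unread area" step can be illustrated with a few lines of shell. This is only a sketch of the idea as described here, not dd_rhelp's actual code; the function name `next_jump` and the START:END range format are invented for the example. Positions are in k, as in the log posted earlier.

```sh
# next_jump START:END [START:END ...]
# Given the unread ranges, print the midpoint of the largest one,
# i.e. where the next read attempt would start.
next_jump() {
    best_len=-1
    best_mid=0
    for range in "$@"; do
        start=${range%%:*}
        end=${range##*:}
        len=$((end - start))
        if [ "$len" -gt "$best_len" ]; then
            best_len=$len
            best_mid=$((start + len / 2))
        fi
    done
    echo "$best_mid"
}

next_jump 20524002:41048004    # prints 30786003
```

The numbers in the last line come from the log above: with the end of the device somewhere between 20524002k and 41048004k, the midpoint is 30786003k, which matches the "launched via 'dd_rhelp' at 30786003.0k" line.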

When you were writing data, ext3 found some space and saved the file. I'm not sure how it determines where on the partition to save a file, but it may not be close to the end. Sight of the output bar chart would be useful.

----------

## redbeardmcg

Thanks for the clarification. I have used dd in the past (but not dd_rhelp), and I know how it works. I just didn't know whether ext3 stores files in sequential order, and so whether the most recent modifications would be towards the end of the image.

Here is a screenshot:

http://mcguireusa.no-ip.org/dd_rhelp.jpg

It's still running, and my fingers are crossed

Thanks again!

----------

## NeddySeagoon

redbeardmcg,

The important line is the 

```
xferd(succ/err)
```

Your image says it's got 20G, with 2.0k of errors.

It's not given up on the errors. It failed to read them at the first attempt, so it may yet succeed.

----------

## Ast0r

 *redbeardmcg wrote:*   

> This will be the last maxtor drive I run

 Good! I recommend Seagate over Maxtor or Western Digital. They are excellent.

----------

## redbeardmcg

 *Quote:*   

> Good! I recommend Seagate over Maxtor or Western Digital. They are excellent.

 

I run both Seagate and Western Digital in all of my systems, and love them both. I have a Western Digital in my Linux router that I have had since I owned my IBM Aptiva 133, and it still runs great, if that says anything. I have had nothing but bad luck with Maxtor, and the only time I use them is when I come across them for free (as was the case with this drive).

I was able to retrieve 99% of my htdocs dir and my user dirs, and got all but 3 or 4 of my emails too. Unfortunately my entire /etc dir is no good, but I'm going to let this run overnight and see if it is a different story tomorrow.

Separate issue, but where are my MySQL databases stored on the HDD? Is there a way to back them up without being able to boot from this disk and run mysql?

Thanks again,

Ryan

----------

## NeddySeagoon

redbeardmcg,

You can treat the file image as if it were a real partition, so anything you do to a real partition you can do to the image.
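In practice that usually means working on a copy of the image, so the original dump stays as the undo. A minimal sketch under that assumption; `working_copy` is a made-up helper, /mnt/rescue is a placeholder mountpoint, and the image path in the comments is the one from the dd_rhelp log earlier in the thread:

```sh
# working_copy IMAGE
# Duplicate IMAGE as IMAGE.work and print the new path, so that fsck and
# other repairs never touch the original dump.
working_copy() {
    cp -- "$1" "$1.work" && echo "$1.work"
}

# Typical use (needs root and e2fsprogs, plus enough free space for a second copy):
#   copy=$(working_copy /media/disk/punditdump.img)
#   e2fsck -f -y "$copy"                  # repair the copy, not the original
#   mount -o loop,ro "$copy" /mnt/rescue  # then browse it read-only
```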

----------

## redbeardmcg

I do know that this file can be treated as a filesystem. I was more asking if you knew where MySQL stores its database files, so that I could copy them over to the server once the new HDD is in it.

I let it run overnight, and it's got 35 of 40 gigs (or at least that's what du says about the file; dd_rhelp still says it only has 20 gigs).

Thanks again for the help, you guys are lifesavers!

Now... Linux + RAID

----------

## forgotten1

 *mysql 5.1 reference manual wrote:*   

> You can also create a binary backup simply by copying all table files (*.frm, *.MYD, and *.MYI
> 
> files), as long as the server isn't updating anything. The mysqlhotcopy script uses this method.
> 
> (But note that these methods do not work if your database contains InnoDB tables. InnoDB does
> ...
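
To make the file-copy method from the manual concrete, here is a rough sketch. It assumes the tables are MyISAM and that the data directory sits under the mounted image at the common default of /var/lib/mysql (that path is an assumption, the datadir setting in my.cnf is the authority); `copy_myisam_tables` and the paths in the usage comment are made up for illustration.

```sh
# copy_myisam_tables DATADIR DEST
# Copy the MyISAM table files (*.frm, *.MYD, *.MYI) of every database under
# DATADIR into DEST, one subdirectory per database. Other files (logs,
# InnoDB data) are deliberately left alone.
copy_myisam_tables() {
    datadir=$1
    dest=$2
    for db in "$datadir"/*/; do
        [ -d "$db" ] || continue
        name=$(basename "$db")
        mkdir -p "$dest/$name"
        for f in "$db"*.frm "$db"*.MYD "$db"*.MYI; do
            if [ -f "$f" ]; then
                cp -p "$f" "$dest/$name/"
            fi
        done
    done
}

# Hypothetical usage with the image mounted read-only at /mnt/rescue:
#   copy_myisam_tables /mnt/rescue/var/lib/mysql /root/mysql-backup
```

Since the image is mounted read-only and no server is running against it, the manual's condition that nothing is updating the tables is met.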

 

----------

## redbeardmcg

Got 'em! Looks like all of my databases are intact! I guess I lucked out: /dev, /etc, and /usr seem to be all bad, but everything I needed was good.

Thanks again,

Ryan

----------

