# Should I replace my hard drive?

## Lyude

I'm hoping this is in the right forum. I'm not sure whether or not I should replace my hard drive. I've been having I/O issues on my server and dmesg has been giving me I/O errors, so I ran a SMART self-test, and at the bottom of the report it shows 3 read and write errors:

 *Quote:*   

> smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
> 
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
> 
> === START OF INFORMATION SECTION ===
> ...

 

Sad part is this hard drive is but a few months old.

----------

## NeddySeagoon

Lyude,

It's very difficult to read your post, as quote tags do not preserve the alignment you get with a fixed-width font. Please use code tags in future.

```
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      2153         293173677
# 2  Extended offline    Completed: read failure       90%      2153         293173677
# 3  Extended offline    Completed: read failure       90%      2153         293173677
```

This says you have run the extended offline test three times and it found the same error each time.

The problem is that the drive cannot read block 293173677.  If it could, it would reallocate it and all would be well again.
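To see roughly where that block sits on the disk, the LBA can be turned into a byte offset. This is only a minimal sketch: it assumes 512-byte logical sectors (worth checking on the actual drive), and `lba_to_bytes` is a made-up helper name.

```shell
#!/bin/sh
# Convert an LBA to a byte offset.
# SECTOR_SIZE=512 is an assumption; a drive that reports 4096-byte
# logical sectors needs a different value here.
SECTOR_SIZE=512

lba_to_bytes() {
    # $1 = LBA as shown in the LBA_of_first_error column
    echo $(( $1 * SECTOR_SIZE ))
}

lba_to_bytes 293173677   # prints 150104922624

# A read-only attempt at just that sector could then look like this
# (/dev/sdX is a placeholder for the real device):
#   dd if=/dev/sdX of=/dev/null bs=512 skip=293173677 count=1
```

If the `dd` read fails with an I/O error, the sector is still unreadable and has not been reallocated.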

I can't align the

```
 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 
```

with the headings to decode it, but as there are No Errors Logged, it's probably OK.
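For reference, smartctl's attribute table has the columns ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED and RAW_VALUE, in that order. A small sketch that labels the fields of one such line (`decode_attr` is a hypothetical helper, not part of smartmontools):

```shell
#!/bin/sh
# Label the whitespace-separated fields of one smartctl attribute line.
# Assumes the ten-column layout listed above and a single-word raw value.
decode_attr() {
    echo "$1" | awk '{
        printf "id=%s name=%s value=%s worst=%s thresh=%s when_failed=%s raw=%s\n",
               $1, $2, $4, $5, $6, $9, $10
    }'
}

decode_attr "196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0"
# prints: id=196 name=Reallocated_Event_Count value=200 worst=200 thresh=000 when_failed=- raw=0
```

Read that way, the normalized value (200) is well above the threshold (000), and the raw count of reallocation events is 0.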

Your drive uses partial-response maximum-likelihood (PRML) encoding. Crudely put, that means it writes bits on the platter surface so they overlap, which is good for packing the data in, then guesses what it wrote on read and fixes the guess with error correction. Every now and again, a sector wears out. Normally, the drive can anticipate this and relocate the sector.

There are several things to do. 

1. As the drive is almost new, get the drive vendor's test utility. If the drive fails the test, the utility will check your warranty status and offer to print you an RMA so you can return the drive. This test, and the subsequent return, will destroy your data.

2. You can write to the entire drive surface. This will force any sector sparing that is required. This will cost you the file(s) using any failing sectors.

You don't know how many you have - only the first one is reported.

3. You can try ddrescue with /dev/null as the output file.  If it works, your bad blocks will be read one more time and the drive will relocate them.

If your drive is really failing, you should use ddrescue to make an image of it. ddrescue needs a log to work so it can keep track of what it's read. The log cannot be on the source drive.
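A rough sketch of what imaging with ddrescue could look like, written as a dry run that only prints the commands. The device name, the `/mnt/backup` path, the file names and the `rescue_cmds` helper are all placeholders, and option behaviour differs a little between ddrescue versions, so check the manual for the one you have.

```shell
#!/bin/sh
# Dry run: print the ddrescue commands instead of executing them.
rescue_cmds() {
    src=$1      # the failing drive, e.g. /dev/sda
    outdir=$2   # a directory on a DIFFERENT, healthy drive
    # First pass: grab everything that reads easily and skip the slow
    # work on bad areas (-n).
    echo "ddrescue -n $src $outdir/disk.img $outdir/disk.log"
    # Second pass: direct access to the input (-d), retrying the bad
    # areas up to 3 times (-r3). The log lets it resume where it left off.
    echo "ddrescue -d -r3 $src $outdir/disk.img $outdir/disk.log"
}

rescue_cmds /dev/sda /mnt/backup
```

Both the image and the log sit under the same directory on the second drive, which keeps the log off the source disk.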

----------

## dE_logics

```
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 20 
```

WD is crap. They're far too MS-friendly anyway.

I'd suggest you back up your data (-n is meant to be a non-destructive read-write test, but on a failing disk you can still lose data) and run:

badblocks -n /dev/sda

If you can't back up, you can do read-only testing instead:

badblocks /dev/sda

Boot from a live CD in both cases. It'll take quite a while (a few hours).

Then post your SMART output. Also run an extended self-test:

smartctl -t long /dev/sda
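Once the extended test has finished, the self-test log (`smartctl -l selftest /dev/sda`) can be filtered for the failing LBAs. A minimal sketch, assuming the log layout shown earlier in this thread; `failed_lbas` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the distinct LBA_of_first_error values from a self-test log on stdin.
# Assumes the LBA is the last field on each "read failure" line.
failed_lbas() {
    awk '/^#/ && /read failure/ { print $NF }' | sort -u
}

# Real use (needs root):  smartctl -l selftest /dev/sda | failed_lbas
# Here, lines copied from this thread stand in for the live output:
failed_lbas <<'EOF'
# 1  Extended offline    Completed: read failure       90%      2153         293173677
# 2  Extended offline    Completed: read failure       90%      2153         293173677
# 3  Extended offline    Completed: read failure       90%      2153         293173677
EOF
```

A single LBA in the output means every logged test stopped at the same sector. Since each test only reports its first error, that says nothing about how many other bad sectors there are.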

----------

## Lyude

I did do all of that, and that SMART output is from an extended test. The bad blocks are always in the same area, and for some reason I can't get badblocks to add them to the badblocks file. Any idea how I'd fix that?

EDIT: Also, the disk refuses to reallocate the sectors using ddrescue, and I'm not sure how you'd go about writing over the entire surface of the disk, unless you mean just using dd to write over the entire thing.

----------

## dE_logics

I've seen this problem in forged Seagate HDDs. That might be your case, but something like this is expected from a Microsoft-friendly company. They only care about your money.

Get an Iomega next time, but do replace your disk fast.

----------

## Lyude

I just realized there's a chance this problem could have come from all the times I accidentally kicked the server, as I just did it hard enough that my server got a split second of air time (my server's way too close to my feet <<, I just pushed it back a bit).

----------

## dE_logics

Kicked?!

----------

## Aquous

 *Lyude wrote:*   

> I just realized there's a chance this problem could have come from all the times I accidentally kicked the server, as I just did it hard enough that my server got a split second of air time (my server's way too close to my feet <<, I just pushed it back a bit).

 That's quite likely - if you kick the box and the hard drive's read/write head jumps a bit, it could scratch the surface of the platter, damaging the drive.

----------

## Lyude

```
badblocks: Input/output error during ext2fs_sync_device

Pass completed, 331594614 bad blocks found.
```

Well, I'll send it back to WD then.

Should have lasted a bit longer than this <<

----------

