# Impending HD failure?

## Robert S

I'm starting to get these:

```
Apr 10 00:36:57 myserver smartd[21225]: Device: /dev/sda [SAT], 3 Currently unreadable (pending) sectors
Apr 10 00:36:57 myserver smartd[21225]: Device: /dev/sda [SAT], 3 Offline uncorrectable sectors
Apr 10 01:06:57 myserver smartd[21225]: Device: /dev/sda [SAT], 3 Currently unreadable (pending) sectors
Apr 10 01:06:57 myserver smartd[21225]: Device: /dev/sda [SAT], 3 Offline uncorrectable sectors
```

What is the best way of testing my HD?  I'm currently doing a backup of the entire disk with a view to restoring it on another HD.  I'll do a reboot with an automatic fsck when I've finished.  Any other suggestions?

----------

## Thistled

I have been seeing these errors on 2 of my disks since installing gentoo back in 2008.

I thought it might have something to do with dual booting with Windoze, as on one occasion I restarted my PC from Windoze straight into Gentoo and there was a lock on the ntfs disk I was trying to mount. The solution was to shut down Windoze, then boot into Gentoo, after which I could access the disk.

I tried defragmenting windoze to see if that would resolve it, but no joy.

The sector counts always seem to stay the same, with no increase over the years.

Just as long as you have made a backup of your important stuff, then I would not worry too much about this.

It sure as hell scared the ***t out of me when I first saw this info, but it has not escalated since the first warning, so I am not too worried.

----------

## Jaglover

You should run something like this

```
smartctl --all /dev/sda | grep -e "Reallocated_Sector_Ct" -e "Current_Pending_Sector" -e "Offline_Uncorrectable" -e "UDMA_CRC_Error_Count" -e "Hardware_ECC_Recovered"
```

to see if the drive is going bad. In my experience, once the error counts get out of hand the drive is going to die soon.
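If you only want the raw counts rather than the whole table rows, something like the following could work. It is a minimal sketch: `smart_sector_counts` is a made-up helper name, not part of smartmontools, and it assumes the usual 10-column attribute table where the second field is the attribute name and the tenth is the raw value.

```shell
# Sketch only: smart_sector_counts is a made-up helper, not a smartmontools command.
# Reads `smartctl --attributes` output on stdin and prints "NAME RAW_VALUE"
# for the three sector-related attributes. Assumes the usual 10-column table,
# where field 2 is the attribute name and field 10 is the raw value.
smart_sector_counts() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ { print $2, $10 }'
}

# Example (needs root):
#   smartctl --attributes /dev/sda | smart_sector_counts
```

Because the function reads from stdin, it can also be tried on output saved to a file earlier.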

----------

## srs5694

I strongly advise both of you to run a full SMART diagnostic on the disk. This can be done with tools like smartctl (text-mode), GSmartControl (GUI), or Palimpsest Disk Utility (SMART options are buried in a menu somewhere). IIRC, smartctl and Palimpsest are available in portage, but for some reason GSmartControl isn't. You might also be able to run a SMART test using a utility provided by the disk manufacturer, but that's likely to be written for Windows. This might be OK if you dual-boot, but on a Linux-only system, this could be problematic.

Unfortunately, SMART diagnostic results can be difficult to interpret. Some manufacturers put weird values in some fields that make things look worse than they are. Some fields are strangely named, and utilities often provide poor descriptions of what they mean. As a general rule, the GUI tools make the results easier to interpret than do the text-mode tools.

If the SMART tool gives you anything but "passed" for its overall assessment, you should probably replace the disk ASAP. Likewise if individual tests look troubling and you get confirmation from an expert that this reflects a real problem. The whole point of SMART is to detect disks that are just starting to flake out, so that you can replace the hardware before it fails entirely. It's possible to go for days, weeks, or even months with a disk that SMART says is problematic, but such disks are much more likely to go south very quickly than is a disk that gets a clean bill of health from a SMART test.

Edit: I posted just seconds after Jaglover. By "both of you" in my first paragraph, I'm referring to the first two posters.

----------

## BillWho

Robert S,

I've had similar errors on a disk for close to three years now. I have gentoo installed as test and break system so there's nothing important on it.

I saved the output of /usr/sbin/smartctl --log=error /dev/sdb and it still reports the exact same info today.

That disk could live another several years with no problems or it could crash and burn tomorrow. 

If you have any critical data on it then for sure back it up - don't take any chances.

Good luck ;)

----------

## Thistled

In my case all the disks which are reporting errors

```
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sda [SAT], 1 Offline uncorrectable sectors
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sda [SAT], 1 Currently unreadable (pending) sectors
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sdb [SAT], 1 Offline uncorrectable sectors
Apr 10 00:06:49 pig smartd[3636]: Device: /dev/sdc [SAT], 5 Currently unreadable (pending) sectors
```

are disks which were initially installed / utilised by Windoze. (i.e. they are ntfs)

These disks are not mounted at boot time. I mount them via nautilus, and they are shared between Windoze and Gentoo for documents, pictures, music, etc.

In my case, I think the unreadable sector errors are because they are not mounted.

I think asking palimpsest or other programs to repair them will bork my ntfs disks.

----------

## Mad Merlin

Those errors are exactly what they sound like: a sector on the hard drive is unreadable. That sector might be part of your swap file (probably won't matter) or it could be part of your /boot/grub/grub.conf (not so good). Reads from that sector will fail. The next write to that sector will cause the drive to transparently remap it to a spare sector, and everything will be normal again.

Now, hard drives have a limited pool of spare sectors (the exact size varies by model), and eventually they will run out. What happens after that is left as an exercise to the reader. Ideally, you will replace the drive before you are able to find out.

This might sound bad, but bad sectors are a fact of life, just like dead pixels on your monitor, and hard drives deal with them just fine in small quantities. In general, if you see a small number of offline uncorrectable sectors and that number is not rising over time, the drive is probably fine. If you see a number that's steadily (or quickly) rising over time, toss the drive; it's going to eat your data.
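One way to keep an eye on "rising over time" is to compare each new reading of a counter with the previous one. A minimal sketch, assuming you record the raw value yourself somewhere; `count_trend` and the file name in the comment are made up for illustration.

```shell
# Sketch only: count_trend is a made-up helper.
# Classifies how a SMART raw counter changed between two readings.
# $1: previous reading, $2: current reading (non-negative integers).
count_trend() {
    if [ "$2" -gt "$1" ]; then
        echo rising
    elif [ "$2" -lt "$1" ]; then
        echo falling
    else
        echo stable
    fi
}

# Example with a previously saved reading (the file name is hypothetical):
#   count_trend "$(cat /var/tmp/sda.pending)" 3
```

A single "rising" result is only a hint; the point is the trend over several readings.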

Of course, I would point out that I've seen plenty of drives die completely out of the blue (SMART had no complaints right up until the drive's block device disappeared). Consequently, it's always a good time to test your backups.

----------

## Robert S

Here's the output.

```
myserver robert # smartctl --all /dev/sda | grep -e "Reallocated_Sector_Ct" -e "Current_Pending_Sector" -e "Offline_Uncorrectable" -e "UDMA_CRC_Error_Count" -e "Hardware_ECC_Recovered"
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
195 Hardware_ECC_Recovered  0x001a   036   024   000    Old_age   Always       -       77071283
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       3
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

myserver robert # /usr/sbin/smartctl --log=error /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.2.12-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
```

My problem is that I'm going overseas for a few weeks soon and I can't afford to have this bomb.  It might be easier to bite the bullet and get another HD.

----------

## Thistled

```
195 Hardware_ECC_Recovered  0x001a   036   024   000    Old_age   Always       -       77071283
```

That particular line does give a little cause for concern.

Like you say, back up all the important stuff on /dev/sda; it would probably be a good idea to replace said disk.

I spent 4 hours last night going through all my windoze partitions, defragmenting and running scan disks. Windoze reported no problems with the disk / partitions, but as soon as I come back into Gentoo, smartd still throws out warnings.

My situation is more akin to BillWho's, as my errors are in the 1 - 5 range and have been since I installed the disk brand new, so I am not worrying too much.

----------

## BillWho

Thistled,

I don't believe that you can attribute the errors to winblows. I have a winblows installation on my disk and no errors are reported with smartctl. 

```
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1              63    20482874    10241406   27  Hidden NTFS WinRE
/dev/sda2   *    20484096   336990191   158253048    7  HPFS/NTFS/exFAT
```

```
root@gentoo-gateway bill # /usr/sbin/smartctl --log=error /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.3.0-rc7] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
```

This is the original installed hd with a vista installation along with a recovery partition and then later upgraded to win7.

----------

## srs5694

I agree with Mad Merlin: Back up your data and either replace the drive ASAP or be prepared to lose it suddenly.

One more point: SMART tools work with the disk hardware itself to detect problems. As such, SMART works at a much lower level than filesystem drivers. SMART can detect errors in parts of the disk that are unused -- unused parts of a filesystem or even gaps between partitions. Thus, you can spend all day running fsck in Linux or defragmenting files in Windows and there's no guarantee that you'll touch the affected sectors. Likewise if the bad sectors are in the middle of a big file that happens not to be adjusted by a defragment operation.
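To see whether a reported LBA lands inside a partition or in a gap, you can compare it against the Start/End columns of a partition table listed in sector units. A rough sketch: `locate_lba` is a made-up helper, the layout is BillWho's from above with the boot flag column dropped, and both LBAs are invented for illustration.

```shell
# Sketch only: locate_lba is a made-up helper.
# stdin: one "NAME START END" line per partition, in sectors.
# $1:    the LBA to look up.
# Prints the partition holding that LBA, or "unpartitioned" if none does.
locate_lba() {
    awk -v lba="$1" '
        lba + 0 >= $2 + 0 && lba + 0 <= $3 + 0 { print $1; found = 1; exit }
        END { if (!found) print "unpartitioned" }
    '
}

# BillWho's layout from above, boot flag column removed.
layout='/dev/sda1 63 20482874
/dev/sda2 20484096 336990191'

# Both LBAs below are invented for illustration.
printf '%s\n' "$layout" | locate_lba 100000000   # prints /dev/sda2
printf '%s\n' "$layout" | locate_lba 20483000    # prints unpartitioned
```

The second lookup falls in the small gap between the end of sda1 and the start of sda2, which no filesystem tool would ever touch.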

The best way to ensure that you do something with a sector that's going bad is to do a raw write operation to the whole disk, as in:

```
# WARNING: overwrites every sector of /dev/sdb with zeros, destroying all data on it
dd if=/dev/zero of=/dev/sdb bs=1M
```

This is, however, a destructive operation -- it zeroes out the entire disk! If your disk holds important data, you obviously don't want to do this. If you replace the disk, though, and you want to discover how bad it is and perhaps salvage some life from the disk in a non-critical capacity, you could do this and see what happens to the SMART test results. If the "pending sectors" count drops to 0, then it could be there were just a handful of bad sectors and the disk will be good for a while longer. If the values skyrocket, OTOH, then you'll know the disk was in bad shape and you replaced it just in time. (The latter happened to me recently, FWIW. Fortunately, the disk was still under warranty, so now I've got a replacement drive waiting to be used.)

----------

## Thistled

I am a little confused by all of this. This 1st disk is my Winblows disk, and is barely used by Linux, but I can mount it if I want to install any apps using Wine. 

```
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *          63    41945714    20972826    7  HPFS/NTFS/exFAT
/dev/sda2        41945715   265168889   111611587+   7  HPFS/NTFS/exFAT
/dev/sda3       265168890   488392064   111611587+   7  HPFS/NTFS/exFAT
```

and smartctl reports the following:

```
/usr/sbin/smartctl --log=error /dev/sda
smartctl 5.42 2011-10-20 r3458 [i686-linux-3.3.1-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 10 (device log contains only the most recent five errors)
   CR = Command Register [HEX]
   FR = Features Register [HEX]
   SC = Sector Count Register [HEX]
   SN = Sector Number Register [HEX]
   CL = Cylinder Low Register [HEX]
   CH = Cylinder High Register [HEX]
   DH = Device/Head Register [HEX]
   DC = Device Command Register [HEX]
   ER = Error register [HEX]
   ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 10 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 01 9b 4b 4c e2  Error: UNC at LBA = 0x024c4b9b = 38554523
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d8 01 9b 4b 4c e0 08      00:28:29.100  READ VERIFY SECTOR(S) EXT
  42 d8 02 9d 4b 4c e0 08      00:28:29.100  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:29.100  READ DMA EXT
  42 d8 02 9b 4b 4c e0 08      00:28:24.700  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:24.700  READ DMA EXT

Error 9 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 02 9b 4b 4c e2  Error: UNC at LBA = 0x024c4b9b = 38554523
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d8 02 9b 4b 4c e0 08      00:28:24.700  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:24.700  READ DMA EXT
  25 d8 01 00 00 00 e0 08      00:28:24.700  READ DMA EXT
  42 d8 04 9b 4b 4c e0 08      00:28:20.100  READ VERIFY SECTOR(S) EXT
  42 d8 04 97 4b 4c e0 08      00:28:20.100  READ VERIFY SECTOR(S) EXT

Error 8 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 04 9b 4b 4c e2  Error: UNC at LBA = 0x024c4b9b = 38554523
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d8 04 9b 4b 4c e0 08      00:28:20.100  READ VERIFY SECTOR(S) EXT
  42 d8 04 97 4b 4c e0 08      00:28:20.100  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:20.000  READ DMA EXT
  42 d8 08 97 4b 4c e0 08      00:28:15.700  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:15.700  READ DMA EXT

Error 7 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 04 9b 4b 4c e2  Error: UNC at LBA = 0x024c4b9b = 38554523
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d8 08 97 4b 4c e0 08      00:28:15.700  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:15.700  READ DMA EXT
  42 d8 08 8f 4b 4c e0 08      00:28:15.700  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:15.600  READ DMA EXT
  42 d8 10 8f 4b 4c e0 08      00:28:11.200  READ VERIFY SECTOR(S) EXT

Error 6 occurred at disk power-on lifetime: 6151 hours (256 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 04 9b 4b 4c e2  Error: UNC at LBA = 0x024c4b9b = 38554523
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d8 10 8f 4b 4c e0 08      00:28:11.200  READ VERIFY SECTOR(S) EXT
  42 d8 10 7f 4b 4c e0 08      00:28:11.200  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:11.200  READ DMA EXT
  42 d8 20 7f 4b 4c e0 08      00:28:06.700  READ VERIFY SECTOR(S) EXT
  25 d8 01 00 00 00 e0 08      00:28:06.700  READ DMA EXT
```

For my "main" Linux disk, i.e. boot, swap and root:

```
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *          63      417689      208813+  83  Linux
/dev/sdb2          417690     4401809     1992060   82  Linux swap / Solaris
/dev/sdb3         4401810   312576704   154087447+  83  Linux
```

smartctl reports:

```
/usr/sbin/smartctl --log=error /dev/sdb
smartctl 5.42 2011-10-20 r3458 [i686-linux-3.3.1-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 166 (device log contains only the most recent five errors)
   CR = Command Register [HEX]
   FR = Features Register [HEX]
   SC = Sector Count Register [HEX]
   SN = Sector Number Register [HEX]
   CL = Cylinder Low Register [HEX]
   CH = Cylinder High Register [HEX]
   DH = Device/Head Register [HEX]
   DC = Device Command Register [HEX]
   ER = Error register [HEX]
   ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 166 occurred at disk power-on lifetime: 20434 hours (851 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ed ed ab ea  Error: UNC at LBA = 0x0aabeded = 179039725
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 ea ed ab ea 00      03:00:24.681  READ DMA
  27 00 00 00 00 00 e0 00      03:00:24.681  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      03:00:24.623  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      03:00:24.622  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      03:00:21.710  READ NATIVE MAX ADDRESS EXT

Error 165 occurred at disk power-on lifetime: 20434 hours (851 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ed ed ab ea  Error: UNC at LBA = 0x0aabeded = 179039725
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 ea ed ab ea 00      03:00:18.565  READ DMA
  27 00 00 00 00 00 e0 00      03:00:15.546  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      03:00:15.546  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      03:00:15.546  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      03:00:21.710  READ NATIVE MAX ADDRESS EXT

Error 164 occurred at disk power-on lifetime: 20434 hours (851 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ed ed ab ea  Error: UNC at LBA = 0x0aabeded = 179039725
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 ea ed ab ea 00      03:00:18.565  READ DMA
  27 00 00 00 00 00 e0 00      03:00:15.546  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      03:00:15.546  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      03:00:15.546  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      03:00:15.546  READ NATIVE MAX ADDRESS EXT

Error 163 occurred at disk power-on lifetime: 20434 hours (851 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ed ed ab ea  Error: UNC at LBA = 0x0aabeded = 179039725
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 ea ed ab ea 00      03:00:15.545  READ DMA
  ca 00 10 ba 30 9d ea 00      03:00:15.546  WRITE DMA
  ca 00 08 82 4b 9e ea 00      03:00:15.546  WRITE DMA
  ca 00 08 d2 4b 9e ea 00      03:00:15.546  WRITE DMA
  ca 00 08 7a 44 9b ea 00      03:00:15.546  WRITE DMA

Error 162 occurred at disk power-on lifetime: 17827 hours (742 days + 19 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 bd 82 31 ed  Error: UNC at LBA = 0x0d3182bd = 221348541
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 ba 82 31 ed 00      02:17:54.884  READ DMA
  27 00 00 00 00 00 e0 00      02:17:54.828  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 02      02:17:54.825  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 02      02:17:51.922  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      02:17:51.854  READ NATIVE MAX ADDRESS EXT
```

and finally, the disk which is a combination of ntfs and ext3, which is my Linux /home partition (sdc2):

```
   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1              63   244187999   122093968+   7  HPFS/NTFS/exFAT
/dev/sdc2       244188000   349044254    52428127+  83  Linux
/dev/sdc3       349044255   418718159    34836952+   7  HPFS/NTFS/exFAT
/dev/sdc4       418718160   488392064    34836952+   7  HPFS/NTFS/exFAT
```

smartctl reports:

```
/usr/sbin/smartctl --log=error /dev/sdc
smartctl 5.42 2011-10-20 r3458 [i686-linux-3.3.1-gentoo] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
```

So what is with all of the Errors which

```
occurred at disk power-on lifetime
```

?

----------

## Jaglover

Alright, have you run a self-test on this drive? Did it finish? If the drive is bad, the test usually will not complete.

Below is a sample of a healthy drive.

```
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      4742         -
```
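A long self-test can be started with `smartctl --test=long /dev/sdX`, and the log read back later with `smartctl --log=selftest /dev/sdX`. To pick out only the entries that did not finish cleanly, a small filter like this could help. It is a sketch: `selftests_not_clean` is a made-up name, and it simply assumes log entries start with `#` followed by the entry number, as in the sample above.

```shell
# Sketch only: selftests_not_clean is a made-up helper.
# Reads `smartctl --log=selftest` output on stdin and prints every logged
# test whose status is anything other than "Completed without error"
# (read failures, but also aborted or still-running tests).
selftests_not_clean() {
    awk '/^#[ ]*[0-9]/ && !/Completed without error/'
}

# Example (needs root):
#   smartctl --log=selftest /dev/sda | selftests_not_clean
```

No output means every logged test completed without error.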

----------

## Thistled

Well palimpsest reports I have 12 bad sectors on both sdb and sdc

and

```
pig ~ # smartctl --attributes --log=selftest --quietmode=errorsonly /dev/sda

pig ~ # smartctl --attributes --log=selftest --quietmode=errorsonly /dev/sdb
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     33311         221348541
# 2  Extended offline    Completed: read failure       90%     33311         221348541
# 3  Short offline       Completed: read failure       80%     33310         221348541
# 4  Short offline       Completed: read failure       80%     24093         221348541
# 5  Short offline       Completed: read failure       80%     23407         221348541
# 6  Short offline       Completed: read failure       80%     22024         221348541
# 7  Short offline       Completed: read failure       80%     22024         221348541
# 8  Short offline       Completed: read failure       80%     22024         221348541
# 9  Short offline       Completed: read failure       80%     20657         221348541
#10  Short offline       Completed: read failure       80%     19723         221348541
#11  Short offline       Completed: read failure       80%     19070         221348541
#12  Short offline       Completed: read failure       80%     18120         221348541

pig ~ # smartctl --attributes --log=selftest --quietmode=errorsonly /dev/sdc
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     39657         6163165
# 2  Short offline       Completed: read failure       90%     31057         6163165
# 3  Extended offline    Completed: read failure       90%     31057         6163165
# 4  Short offline       Completed: read failure       90%     30648         6163165
# 5  Short offline       Completed: read failure       90%     30648         6163165
# 6  Short offline       Completed: read failure       90%     28018         6163165
# 7  Short offline       Completed: read failure       90%     12615         250373360
# 8  Extended offline    Completed: read failure       90%     12615         250373360
# 9  Short offline       Completed: read failure       90%     12615         250373360
#10  Extended offline    Completed: read failure       90%     12612         250373360
#11  Short offline       Completed: read failure       90%     12612         250373360
#12  Short offline       Completed: read failure       90%     12612         250373360
#13  Short offline       Completed: read failure       90%     11543         250373360
#14  Extended offline    Completed: read failure       90%     10086         250373360
#15  Short offline       Completed: read failure       90%     10079         250373360
#16  Short offline       Completed: read failure       90%     10079         250373360
```

----------

## Hu

 *Thistled wrote:*   

> So what is with all of the Errors which
> 
> ```
> occurred at disk power-on lifetime
> ```
> ...

 The drive failed to complete a command that was sent to it by the OS.  This is a bad sign.  The "disk power-on lifetime" bit is so you can determine whether the error was reported yesterday or last year.  The drive tells you how many power-on hours it has accumulated, so you can work out from that how recently an error occurred.
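As a worked example with the numbers posted above: the newest error in the sdb error log is stamped 20434 power-on hours, and the most recent self-test entry for the same drive is stamped 33311 hours. A minimal sketch of the subtraction, printed in the same "days + hours" style smartctl uses; `hours_since` is a made-up helper name.

```shell
# Sketch only: hours_since is a made-up helper.
# $1: a later power-on-hours reading, $2: the power-on hours of the error.
# Prints the powered-on time between the two, as hours and as days + hours.
hours_since() {
    elapsed=$(( $1 - $2 ))
    echo "$elapsed hours ($(( elapsed / 24 )) days + $(( elapsed % 24 )) hours)"
}

# sdb: latest self-test at 33311 hours, newest logged error at 20434 hours.
hours_since 33311 20434   # prints: 12877 hours (536 days + 13 hours)
```

Note that this is powered-on time, not calendar time, so on a machine that is switched off overnight the error is older than the figure suggests.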

----------

## Thistled

But this has been like this since the day I bought the disk.

Palimpsest has always reported the current pending sector error.

Like I said in earlier posts, all my important stuff is backed up on my server, so I am taking much the same approach as BillWho.

 *Quote:*   

> Robert S, 
> 
> I've had similar errors on a disk for close to three years now. I have gentoo installed as test and break system so there's nothing important on it. 
> 
> I saved the output of /usr/sbin/smartctl --log=error /dev/sdb and it still reports the exact same info today. 
> ...

 

I would not be surprised to discover this is because I am overclocking a 2.77 GHz CPU to 3.16 GHz, as I am fully aware overclocking can put stress on gear.

----------

