# bad disc, bad RAM or xfs error?

## Norick

Hello, I have been using my system for a long time without any serious problem, but now it has problems every day.

First, here is a photo of the memtest screen: http://garfield.kotchka.net/~norick/pict0197.jpg. I remember that about a week ago it showed one more failing location. I have two 128 MB modules. How can I tell which one is damaged?

Second, here is a piece of the log (/var/log/system or messages):

```
Dec 20 18:02:14 echelon kernel: ide: failed opcode was: unknown
Dec 20 18:02:14 echelon kernel: end_request: I/O error, dev hda, sector 41438686
Dec 20 18:02:18 echelon kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Dec 20 18:02:18 echelon kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=39853758, high=2, low=6299326, sector=39853758
Dec 20 18:02:18 echelon kernel: ide: failed opcode was: unknown
Dec 20 18:02:18 echelon kernel: end_request: I/O error, dev hda, sector 39853758
Dec 20 18:02:19 echelon kernel: I/O error in filesystem ("hda3") meta-data dev hda3 block 0x886b0       ("xfs_trans_read_buf") error 5 buf count 8192
Dec 20 18:02:22 echelon kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Dec 20 18:02:22 echelon kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=39853760, high=2, low=6299328, sector=39853758
Dec 20 18:02:22 echelon kernel: ide: failed opcode was: unknown
Dec 20 18:02:22 echelon kernel: end_request: I/O error, dev hda, sector 39853758
Dec 20 18:02:24 echelon kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Dec 20 18:02:24 echelon kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=39853766, high=2, low=6299334, sector=39853766
Dec 20 18:02:24 echelon kernel: ide: failed opcode was: unknown
Dec 20 18:02:24 echelon kernel: end_request: I/O error, dev hda, sector 39853766
Dec 20 18:02:24 echelon kernel: I/O error in filesystem ("hda3") meta-data dev hda3 block 0x886b0       ("xfs_trans_read_buf") error 5 buf count 8192
Dec 20 18:02:26 echelon kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Dec 20 18:02:26 echelon kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=49323318, high=2, low=15768886, sector=49323318
Dec 20 18:02:26 echelon kernel: ide: failed opcode was: unknown
Dec 20 18:02:26 echelon kernel: end_request: I/O error, dev hda, sector 49323318
Dec 20 18:02:26 echelon kernel: I/O error in filesystem ("hda3") meta-data dev hda3 block 0x990528       ("xfs_trans_read_buf") error 5 buf count 4096
Dec 20 18:02:36 echelon kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Dec 20 18:02:36 echelon kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=41694480, high=2, low=8140048, sector=41694462
```

When this happens, the computer is totally unusable. Music freezes, the mouse too, and the gkrellm2 disk status shows 0.

Sometimes, after rebooting the misbehaving system, I can't boot. SMART tells me that my first disk might be damaged (after some time the warning disappears and everything is OK). At other times it works perfectly, and I don't know of any serious data loss. Yesterday I ran xfs_check /dev/hda3 (it showed some errors; my other xfs partitions (2, 6, 7, 8) were OK) and then xfs_repair /dev/hda3 (it repaired some sectors). Today the problems occurred again, so I suppose it didn't help.

In addition, there is a problem with compiling. xorg-x11 fails with some ld error. It always looks the same, and yesterday the libquicktime build failed with a similar error. I will add the error message later; unfortunately I haven't saved it. I think this is due to bad RAM.

So what should I try first, and what could be damaged? I am going to buy a new RAM module, but it will take some time. If needed, I can add any information you request.

Thanks for any help.

Jan Vyhlidka

----------

## Norick

OK, here is the ld error I mentioned:

```
/bin/sh ../libtool --mode=link i686-pc-linux-gnu-gcc  -O3 -funroll-all-loops -fomit-frame-pointer -falign-loops=2 -falign-jumps=2 -falign-functions=2  -finline-functions -Wall -Wno-unused -Winline   -o lqtplay  lqtplay.o ../src/libquicktime.la  -L/usr/X11R6/lib -lXaw -lXt  -lSM -lICE -lXext -lXv -lGLU -lGL -lX11  -lm -lpthread -lz -ldl
mkdir .libs
i686-pc-linux-gnu-gcc -O3 -funroll-all-loops -fomit-frame-pointer -falign-loops=2 -falign-jumps=2 -falign-functions=2 -finline-functions -Wall -Wno-unused -Winline -o .libs/lqtplay lqtplay.o  ../src/.libs/libquicktime.so -L/usr/X11R6/lib -lXaw -lXv /usr/lib/libGLU.so -L/usr/lib -lSM -lICE -lXmu -lXt -lXi /usr/lib/opengl/nvidia/lib/libGL.so -lXext -lX11 -lm -lpthread -lz -ldl -Wl,--rpath -Wl,/usr/lib/opengl/nvidia/lib
/usr/lib/libGLU.so: undefined reference to `operator delete(void*)@GLIBCXX_3.4'
/usr/lib/libGLU.so: undefined reference to `vtable for __cxxabiv1::__vmi_class_type_info@CXXABI_1.3'
/usr/lib/libGLU.so: undefined reference to `operator delete[](void*)@GLIBCXX_3.4'
/usr/lib/libGLU.so: undefined reference to `operator new[](unsigned int)@GLIBCXX_3.4'
/usr/lib/libGLU.so: undefined reference to `operator new(unsigned int)@GLIBCXX_3.4'
/usr/lib/libGLU.so: undefined reference to `__cxa_pure_virtual@CXXABI_1.3'
/usr/lib/libGLU.so: undefined reference to `vtable for __cxxabiv1::__si_class_type_info@CXXABI_1.3'
/usr/lib/libGLU.so: undefined reference to `vtable for __cxxabiv1::__class_type_info@CXXABI_1.3'
/usr/lib/libGLU.so: undefined reference to `__gxx_personality_v0@CXXABI_1.3'
collect2: ld returned 1 exit status
make[3]: *** [lqtplay] Error 1
make[3]: Leaving directory `/var/tmp/portage/libquicktime-0.9.3-r1/work/libquicktime-0.9.3/utils'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/var/tmp/portage/libquicktime-0.9.3-r1/work/libquicktime-0.9.3/utils'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/tmp/portage/libquicktime-0.9.3-r1/work/libquicktime-0.9.3'
make: *** [all] Error 2
!!! ERROR: media-libs/libquicktime-0.9.3-r1 failed.
!!! Function src_compile, Line 61, Exitcode 2
!!! (no error message)
!!! If you need support, post the topmost build error, NOT this status message.
```

----------

## freak4u

Uh... that's your hard drive dying. The I/O errors in your syslog are sectors that can't be read. The compiling issues might be due to corruption in swap or your / filesystem. SMART (Self-Monitoring, Analysis and Reporting Technology) lets you know when your HD is dying, and yours is. You might want to back up ASAP, buy a new HD, and not use your computer until you have backed up!!!
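For anyone reading these logs later: the `status=` and `error=` values are bit fields, and the driver prints the names of the set bits in braces. A minimal sketch of that decoding follows. The function names are made up for illustration, and the bit-to-name table is my reading of the ATA status and error registers using the labels the old IDE driver prints, so check it against the kernel source or the ATA spec before relying on it.

```shell
#!/bin/sh
# Decode the status= and error= bytes from the IDE kernel messages.
# ASSUMPTION: the mask/name tables below are a reading of the ATA register
# layout with the labels seen in the logs; the helper names are illustrative.

decode_bits() {
    # $1 = byte value (e.g. 0x51); remaining args = "mask:name" pairs
    value=$(( $1 ))
    shift
    out=""
    for pair in "$@"; do
        mask=${pair%%:*}
        name=${pair#*:}
        # keep the name if the corresponding bit is set
        if [ $(( $value & $mask )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}

decode_status() {
    decode_bits "$1" 0x80:Busy 0x40:DriveReady 0x20:DeviceFault \
        0x10:SeekComplete 0x08:DataRequest 0x04:CorrectedError \
        0x02:Index 0x01:Error
}

decode_error() {
    # bits 0x20 and 0x08 are left out: the logs above never name them
    decode_bits "$1" 0x80:BadSector 0x40:UncorrectableError \
        0x10:SectorIdNotFound 0x04:DriveStatusError \
        0x02:TrackZeroNotFound 0x01:AddrMarkNotFound
}

decode_status 0x51   # -> DriveReady SeekComplete Error
decode_error 0x40    # -> UncorrectableError
decode_error 0x10    # -> SectorIdNotFound
```

`0x51` decodes to exactly the three flags shown in the log. `0x40` (UncorrectableError) is the drive itself reporting that it could not read the sector.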

----------

## evoweiss

 *freak4u wrote:*   

> Uh... that's your hard drive dying. The I/O errors in your syslog are sectors that can't be read. The compiling issues might be due to corruption in swap or your / filesystem. SMART (Self-Monitoring, Analysis and Reporting Technology) lets you know when your HD is dying, and yours is. You might want to back up ASAP, buy a new HD, and not use your computer until you have backed up!!!

 

I've got a related problem. Basically, it looks like a single sector of my HDD is bad, though I once did a deep scan of the HDD and nothing seemed wrong. Is there a way I can simply mark that sector as "bad" so that it isn't used and no data is stored on that sector, etc.? 

EDIT: I guess it would help if I posted some output. Looking at his output, I see that my problem is not as severe, i.e., mine doesn't say the error is uncorrectable. Still, any idea on how to solve this problem (whether it's marking the sector as bad or something else) would be appreciated. I've used e2fsck, but it doesn't seem to solve the problem permanently.

The errors are in my dmesg output:

```
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=78135484, sector=78135484
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=78135484, sector=78135484
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=78135484, sector=78135484
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=78135484, sector=78135484
ide0: reset: success
hda: read_intr: status=0x51 { DriveReady SeekComplete Error }
hda: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=78135484, sector=78135484
end_request: I/O error, dev hda, sector 78135484
kjournald starting.  Commit interval 5 seconds
EXT3 FS on hda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
End of output
```

Prior to the section I snipped out, the output is identical.
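One way to narrow this down is to work out which filesystem block the failing sector belongs to, since the kernel reports an absolute LBA on the whole disk. Below is a rough sketch of the arithmetic, assuming 512-byte sectors. The partition start (208845) and the 4096-byte block size are hypothetical example values, not taken from this system, and `lba_to_fs_block` is just an illustrative name. The real start sector comes from `fdisk -l -u /dev/hda`.

```shell
#!/bin/sh
# Map an absolute LBA from the kernel log to a block number inside the
# filesystem on the partition that contains it.
# ASSUMPTIONS: 512-byte sectors; the partition start and filesystem block
# size used in the example call are made-up values for illustration.

lba_to_fs_block() {
    lba=$1          # absolute sector from the log
    part_start=$2   # first sector of the partition (from fdisk -l -u)
    fs_block=$3     # filesystem block size in bytes
    sectors_per_block=$(( $fs_block / 512 ))
    echo $(( ($lba - $part_start) / $sectors_per_block ))
}

# Sector 78135484 from the dmesg output, hypothetical partition layout:
lba_to_fs_block 78135484 208845 4096   # -> 9740829
```

On ext2/ext3, `e2fsck -c` can then run a read-only `badblocks` scan and add whatever it finds to the bad-block list. That only hides the sector from the filesystem; it does not fix the drive.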

Best,

Alex

----------

## evoweiss

Hi again,

Quick thing I forgot to mention. The problem will arise sometimes when I emerge files (especially large ones). My way of dealing with it is usually to use e2fsck, delete the .bz2 file and .ebuild, resync, and re-emerge it. I imagine this works because the file gets sent to a different part of the HDD which is not problematic.

Best,

Alex

----------

## freak4u

 *evoweiss wrote:*   

> 
> 
> I've got a related problem. Basically, it looks like a single sector of my HDD is bad, though I once did a deep scan of the HDD and nothing seemed wrong. Is there a way I can simply mark that sector as "bad" so that it isn't used and no data is stored on that sector, etc.? 
> 
> 

 

I think so, but the problem is physical. It might be due to a head crash or poor design of the HD, but any way you look at it, your HD is dying. The longer this goes on, the more stuff is going to get damaged, until one day you can't read anything. My suggestion is to get a new HD. BTW, dd works wonders for cloning HDs. Just do

```
# use the whole drives, not partitions: hdX = source, hdY = target
dd if=/dev/hdX of=/dev/hdY
```

from a computer that boots independently of the two drives. It's a very fast clone.
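On a drive that already has unreadable sectors, a plain `dd` stops at the first read error. A sketch of a more forgiving variant is below. `clone_disk` and the device names are placeholders, and the block size is an arbitrary choice. `conv=noerror` tells `dd` to carry on after a read error, and `sync` pads failed or short reads with zeros so everything after the bad spot stays at the right offset.

```shell
#!/bin/sh
# Clone a failing disk while skipping over unreadable blocks.
# WARNING: dd overwrites the target; double-check both arguments.
# clone_disk is an illustrative helper name, not an existing tool.

clone_disk() {
    src=$1
    dst=$2
    # noerror: continue after read errors instead of aborting
    # sync:    zero-pad failed/short reads so later data keeps its offset
    dd if="$src" of="$dst" bs=4096 conv=noerror,sync
}

# Example with hypothetical device names, run from a rescue system:
# clone_disk /dev/hda /dev/hdb
```

Keep the block size small: with `sync`, each failed read zero-fills one whole block, so a large `bs` throws away good data around a bad sector.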

----------

## evoweiss

 *freak4u wrote:*   

> I think so, but the problem is physical. It might be due to a head crash or poor design of the HD, but any way you look at it, your HD is dying. The longer this goes on, the more stuff is going to get damaged, until one day you can't read anything. My suggestion is to get a new HD. BTW, dd works wonders for cloning HDs. Just do
> 
> ```
> dd if=/dev/hdX of=/dev/hdY
> ```
> ...

 

Thanks for getting back to me. I am beginning to waffle on whether it's the drive or an xfs issue. You see, the problems started immediately after I upgraded to the 2.6 kernel, which, to me at least, suggests some sort of incompatibility between the version of xfs that was present when I first installed my system and the 2.6 kernel. In addition to this, another problem was introduced back then which had to do with xmms, and I suspect it is related to the same stuff. I just checked the health of my boot partition (ext3) on that same drive, and it looks fine.

Now, as for how I plan to deal with what is happening: fortunately, I keep almost everything of value on a separate HDD and have already moved the important bits (mail file, /etc files, and so on) onto the other HDD. Sometime next week I am going to reinstall Gentoo, but I will use the ext3 file system for the root partition. If I am right, that should be the end of my troubles. If I am wrong, well, I'll just have to buy a new HDD and reinstall again :).

Best,

Alex

----------

## evoweiss

Hi Norick,

 *Norick wrote:*   

> Hello, I have been using my system for a long time without any serious problem, but now it does problems everyday.

 

I am almost certain that the problem had something to do with an older version of xfs and the 2.6 series kernel or it's just xfs. The reason I think this is the case is that my problems with those errors and playing music (distortion crept into songs) started right after I upgraded to 2.6 some time ago. Not sure what your situation was, but, from what I've read, xfs is apparently sensitive to stuff like power outages.

I wound up wiping and re-installing the root partition, choosing ext3 as my file system. So far, throughout the install, even with programs that gave me trouble before, I've had absolutely no problems at all.

Incidentally, if I run into problems again, I will know it's not related to xfs, but something else (hardware problems).

Best,

Alex

----------

## Norick

Thanks for your replies.

I have returned that disk to my dealer and I am waiting for a new one or an assurance that my disk works. I have installed Gentoo again on my second disk (ext3 + reiserfs), and I will not use xfs unless I have power backup. But I think it wasn't caused by xfs, because grub showed me 'error 17' even after I changed the /boot file system to ext2.

Jan Vyhlidka

----------

## RikBlankestijn

The errors also occur when you have installed udev and devfs together.

----------

