# ata messages in dmesg: It's not the cable

## Lebkoungcity

Hi,

I recently bought a new WD SATA HDD and moved my system to it. A few days ago I checked dmesg and found some repeating, slightly varying messages concerning ata. I don't really understand what they are trying to tell me, and with my limited knowledge I can't find anything useful. So my questions are (hopefully) short ones: Should I worry about those messages? What can I do to solve the problem?

Regards,

Andy

dmesg

 *Quote:*   

> 
> 
> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
> 
> ata1.00: BMDMA stat 0x5
> ...

 

(please note the differences: UDMA/133 and then UDMA/100)

fdisk -l

 *Quote:*   

> 
> 
> Disk /dev/hda: 82.3 GB, 82348277760 bytes
> 
> 255 heads, 63 sectors/track, 10011 cylinders
> ...

 

smartctl --all /dev/sda

 *Quote:*   

> 
> 
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> 
> Home page is http://smartmontools.sourceforge.net/
> ...

 

After activating SMART and running the short self-test:

smartctl --all /dev/sda

 *Quote:*   

> 
> 
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> 
> Home page is http://smartmontools.sourceforge.net/
> ...

 

hdparm -I /dev/sda

 *Quote:*   

> 
> 
> /dev/sda:
> 
> ATA device, with non-removable media
> ...

 

----------

## eccerr0r

The errors you are seeing look like interface problems, as indicated by the ICRC and UDMA CRC errors. Check the cable and connectors and look for sources of interference, or possibly replace the cable.

----------

## Lebkoungcity

Thanks a lot for your answer!

That matches the rising value in the 199 UDMA_CRC_Error_Count line (from 8144 in my last post to 8256 now):

smartctl -A /dev/sda

 *Quote:*   

> 
> 
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> 
> Home page is http://smartmontools.sourceforge.net/
> ...
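
One way to keep an eye on this counter is to pull the raw value of attribute 199 out of the `smartctl -A` output and compare two snapshots. Below is a minimal Python sketch, assuming the usual ten-column attribute table. The sample row is only illustrative (just the raw values 8144 and 8256 come from this thread), and `crc_error_count` is a made-up helper name.

```python
def crc_error_count(smartctl_output: str) -> int:
    """Return the raw value of SMART attribute 199 (UDMA_CRC_Error_Count)."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; the raw value is the tenth column.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    raise ValueError("attribute 199 not found")

# Illustrative row in the layout smartctl prints; not copied from the elided output above.
ROW = "199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       {}"

before = crc_error_count(ROW.format(8144))
after = crc_error_count(ROW.format(8256))
print(f"new CRC errors since last check: {after - before}")
```

In real use the text would come from running `smartctl -A /dev/sda` at two points in time. A counter that keeps climbing under load points at the link (cable, connectors, controller, drive electronics) rather than at the platters.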

 

----------

## Lebkoungcity

 *eccerr0r wrote:*   

> The errors you are seeing look like interface problems, as indicated by the ICRC and UDMA CRC errors. Check the cable and connectors and look for sources of interference, or possibly replace the cable.

 

Unfortunately it wasn't the cable. Today I bought a new one and it gives me the same errors...  :Sad: 

Is there anything else I can check?

Could it be a mis-configured kernel module?

I think I need the via_sata module? Or something else? Is there some way to configure it?

Please help me - My system goes down to UDMA/33...   :Evil or Very Mad: 

Thanks in advance!!!

```
lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge (rev 80)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge
00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
00:0b.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 02)
00:0b.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 02)
00:0c.0 Multimedia audio controller: ESS Technology ES1969 Solo-1 Audiodrive (rev 01)
00:0d.0 Communication controller: NetMos Technology PCI 1 port parallel adapter (rev 01)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
01:00.0 VGA compatible controller: nVidia Corporation NV43 [GeForce 6600] (rev a2)
```

----------

## NeddySeagoon

Lebkoungcity,

Google tells us that your drive is SATA 2.

```
WD Caviar Green WD5000AADS - hard drive - 500 GB - SATA-300
```

Your SATA controller is

```
 00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80) 
```

which is a SATA 1 device.

SATA 2 devices connected to SATA 1 controllers are supposed to fall back to SATA 1 speeds. Many don't and produce the errors you are seeing.

Most drives have a tiny jumper that you set to force SATA1 data rates. 

See the bottom of this page

----------

## krinn

The last time I plugged a SATA2 drive into a SATA1 controller without the jumper, the drive wasn't recognized at all.

As I already said in another thread about drives: if the kernel can't get the drive working at its proper standard, it will try lower standards until the drive works without errors.

That's why your kernel picks a lower standard: it sees the errors and steps down to try to correct the issue. (It's not a bad idea, in fact: a bad 80-conductor cable might work fine as a 40-conductor cable, so dropping to a mode that only needs 40 conductors can make it work.) Hence the fall from UDMA/133 to UDMA/100 and so on.
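
To see that step-down in a log, you can collect the transfer mode each time the kernel reports one. Here is a rough Python sketch, assuming libata's usual `configured for UDMA/<n>` wording. The sample lines are made up for illustration (apart from the exception line, which is from the first post), and `udma_history` and `downgraded` are hypothetical names.

```python
import re

# Optional "[timestamp]" prefix, then e.g. "ata1.00: configured for UDMA/133".
MODE_RE = re.compile(r"^(?:\[[^\]]*\]\s*)?(ata\d+\.\d+): configured for UDMA/(\d+)")

def udma_history(dmesg_text):
    """Map each ATA device to the UDMA modes it was configured for, in log order."""
    history = {}
    for line in dmesg_text.splitlines():
        match = MODE_RE.match(line.strip())
        if match:
            history.setdefault(match.group(1), []).append(int(match.group(2)))
    return history

def downgraded(modes):
    """True if the device was later configured for a slower mode than its first one."""
    return len(modes) > 1 and min(modes[1:]) < modes[0]

# Illustrative log excerpt, not the poster's actual (elided) dmesg output.
SAMPLE = """\
ata1.00: configured for UDMA/133
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
ata1.00: configured for UDMA/100
ata1.00: configured for UDMA/33
"""

for device, modes in udma_history(SAMPLE).items():
    print(device, modes, "downgraded" if downgraded(modes) else "stable")
```

Feeding it the real `dmesg` text shows at a glance which port fell back and how far.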

Anyway, your drive is simply dying. Even new hardware can die: sometimes it is your fault (running the drive too hot, or in extreme ambient humidity as you get in some countries), and sometimes the drive itself has a manufacturing fault.

So if it's not the cable, it might be the controller, an IRQ conflict or the drive.

Sadly for you, since the drive let you install your previous Gentoo on it, an IRQ conflict is not the answer. That leaves the controller or the drive at fault, and I would bet on the drive, as these things are more sensitive (more mechanical parts) than a controller.

As no one has suggested it yet, here is what a wise man has in his sig:

 *Quote:*   

> Computer users fall into two groups:-
> 
> those that do backups
> 
> those that have never had a hard drive fail.

 

(And I must admit this is purely my personal opinion: avoid green HDDs, more trouble than benefit.)

----------

## eccerr0r

Not that this is necessarily a useful comment, but I despise all VIA products and carefully scrutinize any VIA application.  I have a VT6420 SATA board and it has proven to be somewhat unreliable.  I'd chalk this one up to an incompatibility between the drive and the SATA host adapter.

But the question is, are you noticing any problems?  SATA should detect these UDMA CRC errors and retry them, so it should be fairly transparent other than speed hiccups when they occur.  Is this the case?

----------

## Lebkoungcity

NeddySeagoon:

Sorry that I did not include this information. You're absolutely right: I had to set the jumper to 1.5 Gb/s to have the HDD recognized by my system. The next step I took was to boot a live system, copy my Gentoo installation from my PATA drive to the SATA drive and make some changes, e.g. in /etc/fstab. Then I booted successfully from the SATA drive.

krinn:

Yes, I understand the mechanism that leads to the lower speed setting, and I also think it's a good idea to have such a thing. After reading your post I decided to contact WD support. So far I have no real answer from them: I included the outputs of lspci and smartctl --all /dev/sda and the errors from dmesg as a txt file, and in their reply they complained that they could not open this sort of file and asked me to send the information as a PDF. They also told me to use their diagnostic tool, which is provided for Window$. I sent them an answer and am still waiting for a response...

eccerr0r:

When the CRC errors occur, I notice that the system gets painfully slow when accessing the SATA HDD. I wouldn't call it a hiccup, as the system doesn't come back to normal speed but stays at e.g. UDMA/33 until reboot. Apart from this I don't notice any other problems (maybe they're there, but so far I haven't seen any). So if it recovered to UDMA/133, I think I could live with it...

Yes, I understand why you don't like VIA products. Unfortunately it is the only mainboard I have... If nothing helps: what kind of controller would you recommend? (I'm thinking of getting a not-too-expensive PCI controller card as a last resort. A new mainboard with a faster CPU, more and better RAM and a faster PCIe graphics card would be fancy, but I don't have the money and I also have no real need for a more powerful system...)

Thanks for all the support I get here!

Cheers,

Andy

----------

## Lebkoungcity

It's been a long time since I had time to think over my HDD-problem.

But a few days ago I re-read the manuals provided by WD and stumbled over the option to enable 'spread spectrum clocking' (SSC) with a jumper. I thought it would be worth trying, so I did, and since then I have not had a single CRC error. Today I wanted to be sure and disabled SSC again, and just after boot I had CRC errors again. They disappeared after halting the system, re-enabling SSC and booting again.

Maybe that is the solution - I hope so!   :Very Happy: 

Now I can try to reduce the increasing Load_Cycle_Count on this 'Caviar Green' HDD. The instructions given by WD ( http://wdc-de.custhelp.com/cgi-bin/wdc_de.cfg/php/enduser/std_adp.php?p_faqid=5357 ) are no real help, as I'm not using syslog but syslog-ng and would have to dive deep into its syntax, but that would be another thread.   :Wink: 

Thanks for all your assistance!

Andy

----------

## NeddySeagoon

Lebkoungcity,

Spread spectrum clocking is an EMI reduction technique, used to help meet EMC limits.

Essentially, it spreads the interference over a range of frequencies by jittering the clock frequency.

If this fixed your interface problems you probably have poor quality SATA cables, or they are run too close to something else with a similar clock frequency or both.  The important point is that your system has a noise margin problem and it will still be marginal now.
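
To make the "spreading" concrete, here is a toy Python sketch that compares the strongest spectral line of a fixed-frequency clock with that of a clock whose frequency is swept in a triangle. All numbers are invented for illustration and heavily exaggerated so the effect shows up in a short signal. The spread used for real SATA SSC is far smaller than the 5 % assumed here, and the function names are hypothetical.

```python
import cmath
import math

N = 1024            # number of samples (toy value)
F0 = 0.125          # nominal clock frequency in cycles per sample (toy value)
SPREAD = 0.05       # 5 % down-spread, exaggerated on purpose
MOD_PERIOD = 512    # samples per triangular modulation cycle (toy value)

def clock(spread):
    """Sine 'clock' whose frequency sweeps between F0*(1-spread) and F0 in a triangle."""
    samples, phase = [], 0.0
    for n in range(N):
        pos = (n % MOD_PERIOD) / MOD_PERIOD      # position 0..1 within one modulation cycle
        tri = 1.0 - abs(2.0 * pos - 1.0)         # triangle wave running 0 -> 1 -> 0
        freq = F0 * (1.0 - spread * tri)         # instantaneous frequency
        phase += 2.0 * math.pi * freq
        samples.append(math.sin(phase))
    return samples

def peak_magnitude(samples):
    """Largest DFT bin magnitude over the positive frequencies (plain O(N^2) DFT)."""
    peak = 0.0
    for k in range(1, N // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / N) for n, x in enumerate(samples))
        peak = max(peak, abs(acc))
    return peak

fixed_peak = peak_magnitude(clock(0.0))
spread_peak = peak_magnitude(clock(SPREAD))
print(f"fixed clock peak:  {fixed_peak:.1f}")
print(f"spread clock peak: {spread_peak:.1f}")
```

The total signal energy is about the same in both cases. The swept clock just distributes it over several neighbouring bins, so its tallest peak is lower. That lowers the emission measured at any single frequency, but it does nothing to improve the noise margin of the link itself.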

----------

## krinn

Oh, the drive is a green WD one. Check this thread (I love quoting myself, but I also love referring to myself!):

https://forums.gentoo.org/viewtopic-p-6354995-highlight-green.html#6354995

This is about the WD15EARS, but that is also a green HDD from WD.

If you weren't aware of the issue before, try aligning your partitions (say thanks to green shit) and good luck with the re-install  :Smile: 
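
A quick way to check alignment is to see whether a partition's start falls on a physical sector boundary. This is a minimal Python sketch, assuming a drive with 4096-byte physical sectors that reports 512-byte logical sectors (as on the WD15EARS). Whether that applies to your particular model is something to verify first, and `is_aligned` is a hypothetical helper name.

```python
LOGICAL_SECTOR = 512     # bytes per sector as reported to the OS (assumed)
PHYSICAL_SECTOR = 4096   # bytes per physical sector on a 4K-sector drive (assumed)

def is_aligned(start_sector, logical=LOGICAL_SECTOR, physical=PHYSICAL_SECTOR):
    """True if a partition starting at this logical sector begins on a physical sector boundary."""
    return (start_sector * logical) % physical == 0

# 63 is the classic DOS-style first-partition start; 2048 is the 1 MiB boundary.
for start in (63, 64, 2048):
    print(start, "aligned" if is_aligned(start) else "misaligned")
```

Take the start sectors from `fdisk -lu`, which lists partitions in sectors rather than cylinders.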

----------

## Lebkoungcity

Thanks for your answers!

NeddySeagoon:

Apparently, this was not the solution   :Evil or Very Mad:  The errors came back just at the moment I thought I had tracked down the cause...

Now I'm done. Tomorrow I'm sending the drive back to its maker for RMA...

WD support wrote (translated from German):

 *Quote:*   

> Your error clearly shows that there is a problem with the transmission of data from your HDD to the computer and vice versa. Since you have already replaced the cable, the fault lies with the HDD electronics. Unfortunately, we can offer you no help other than an exchange of the HDD.
> 
> The error can be caused either by a defective HDD circuit board or by corrupt firmware.
> ...

 

krinn:

Thanks for the hint. But speed wasn't that much of a problem - only when the CRC-errors occurred and the HDD was limited from UDMA/133 down to UDMA/100 and finally to UDMA/33.

----------

## eccerr0r

I wouldn't call being stuck at UDMA33 "painfully slow"... yes, it's slow, but still tolerable.  I'd much rather be stuck at UDMA33 speeds than have data corruption from running at UDMA133 speeds and missing clocks.

I would not call this a fault of the drive, and I feel for WD having to answer for problems VIA caused... I would still start by blaming VIA for your issues, not WD (though I can't rule out WD until it's proven to be one or the other).

For add-on boards, the brand of the board tends to matter more than the chipset; still, I think Silicon Logic-based boards tend to be of higher quality than VIA-based boards. YMMV, as I have not actually tried them myself.  One thing I -have- tried is JMicron, which is another cheap low-end option (and tends to end up in cheap, poor products), and I had fairly good success with it.  Note that the JMicron I tested is PCIe.

----------

## joecool

Figured I'd post a followup to this.  I have the same VIA VT6420 on an Asus A8V Deluxe, and it's 100% worthless with a WD20EARS even with the jumper set. Yes, it will finally pick the drive up, but it will shit those errors out everywhere.

This board has an onboard Promise controller, which works flawlessly with the drives, and also a PCI Silicon Image chip, which also works flawlessly.  The VIA SATA-I southbridge controllers just suck.  Do yourself a favor and buy a controller card if you want to use SATA-II drives on your motherboard.

----------

## Cyker

Seconded; VIA SATA chipsets are terrible and it's the first thing I'd blame.

They really won't work properly with SATA2 drives; you NEED a SATA1 drive.

Jumpering a recent SATA2 disk to 1.5 Gb/s does NOT make it a SATA1 drive: it will work better, but it will still have issues! I guess the VIA relies on some quirks of SATA1 that were fixed in SATA2.

If your mobo has PCIe, there are many, many options; Silicon Image and Marvell chipsets seem quite popular and common for PCIe expansion boards, and they work pretty well. They're also dead cheap.

If you only have PCI, you are shafted, as 99% of SATA controllers for PCI seem to use the hated VT6421 SATA controller chip, which is one of the buggiest pieces of crap I ever wasted money on.

I was able to (eventually!) locate a PCI card based on the Silicon Image Sil3512, which works pretty well, although the one I got was from eBay and didn't support firmware upgrading, as the flash chip had been replaced with a ROM!!!  :Evil or Very Mad: 

But think about getting a new mobo. Unlike with Windows, you can keep using your existing Gentoo install with it; just recompile the kernel to support the new hardware. I'm not sure who has the best SATA chipsets ATM: ATI/AMD had a terrible reputation, but I don't know if they've solved this yet. nVidia is pretty solid and boring. Intel had a good reputation, but apparently some of their recent SATA chipsets have had trouble in Linux unless the BIOS is configured correctly.

----------

