# WD Raptor 74GB (WD740ADFD-00NLR1) buggy NCQ implementation

## blubbi

A warning for everyone who wants to buy this drive:

Western Digital Raptor 74GB WD740ADFD (with firmware 00NLR1)

BE WARNED

The NCQ implementation in this drive (with this firmware) is really bad, so future kernels will blacklist it for NCQ.

```
ata1.00: exception Emask 0x2 SAct 0x1fe00 SErr 0x0 action 0x2 frozen
ata1.00: (spurious completions during NCQ issue=0x0 SAct=0x1fe00 FIS=004040a1:00040000)
ata1.00: cmd 61/18:48:d0:4e:6d/00:00:05:00:00/40 tag 9 cdb 0x0 data 12288 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/10:50:f0:4e:6d/00:00:05:00:00/40 tag 10 cdb 0x0 data 8192 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:58:48:9c:6d/00:00:05:00:00/40 tag 11 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:60:b0:9c:6d/00:00:05:00:00/40 tag 12 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/28:68:90:9d:6d/00:00:05:00:00/40 tag 13 cdb 0x0 data 20480 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:70:50:a1:6d/00:00:05:00:00/40 tag 14 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:78:a8:a1:6d/00:00:05:00:00/40 tag 15 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1.00: cmd 61/08:80:b0:a1:6d/00:00:05:00:00/40 tag 16 cdb 0x0 data 4096 out
         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/133
ata1: EH complete
SCSI device sda: 145226112 512-byte hdwr sectors (74356 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
```
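A side note for reading the trace above: as far as I understand libata, `SAct` is the SActive bitmask, with one bit set per NCQ tag that is still outstanding. Below is a minimal sketch to decode such a mask. The helper name `sact_tags` is made up for illustration, and the "bit N means tag N" reading is an assumption on my side.

```shell
# Hypothetical helper: list the NCQ tags whose bits are set in an SAct mask.
# Assumption: SAct is a 32-bit mask in which bit N stands for tag N.
sact_tags() {
    mask=$(( $1 ))          # accepts decimal or 0x-prefixed hex
    tag=0
    out=""
    while [ "$tag" -lt 32 ]; do
        if [ $(( (mask >> tag) & 1 )) -eq 1 ]; then
            out="$out $tag"
        fi
        tag=$(( tag + 1 ))
    done
    echo "${out# }"         # strip the leading space
}
```

Running `sact_tags 0x1fe00` lists tags 9 through 16, which are the eight tags shown in the `cmd` lines of the trace.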

To see traces and the bug report on the kernel Bugzilla, follow the link below:

http://bugzilla.kernel.org/show_bug.cgi?id=8627

I tried to contact WD and requested a new firmware, but no response so far.

If anyone has a 74GB Raptor with another firmware, please check whether you get HSM violations.

If not, then we should only blacklist the drives with firmware 00NLR1.

Regards

blubbi

Last edited by blubbi on Wed Jun 27, 2007 8:36 am; edited 1 time in total

----------

## tnt

 *blubbi wrote:*   

> The NCQ implementation is really bad, so in future Kernels it will be disabled.
> 
> The NCQ implementation in this drive causes "HSM violation"

 

Is NCQ broken in the kernel itself, so that it will be disabled for good for all drives, or are you only talking about Raptors?

----------

## blubbi

I am talking about the Raptor I mentioned in the topic.

http://en.wikipedia.org/wiki/WD_Raptor#The_WD740GD

regards

blubbi

----------

## blubbi

If anyone has a 74GB Raptor with another firmware, please check whether you get HSM violations.

If not, then we should only blacklist the drives with firmware 00NLR1.

----------

## oldnavy23

Do you know whether the new 150GB Raptor has the same issue?

----------

## blubbi

No idea... but buy one, try it... post your results here.

If it shows errors just return it and let us know about it.

post the output of

```
hdparm -I /dev/sd?
```
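The full `hdparm -I` output is long. For comparing drives, the model, firmware, and queue depth lines are the interesting ones. A minimal sketch of a filter for that (the function name `drive_id` is made up, and the field labels are the ones that show up in the hdparm output posted in this thread):

```shell
# Hypothetical filter: keep only the identification lines of `hdparm -I` output.
# Reads from stdin so it can be fed saved output as well as a live drive.
drive_id() {
    grep -E 'Model Number|Firmware Revision|Queue depth'
}

# Usage (needs root and a real device, so it is left commented out):
#   hdparm -I /dev/sda | drive_id
```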

thanks

blubbi

----------

## blubbi

maybe you should read here:

http://en.wikipedia.org/wiki/WD_Raptor#The_WD1500

----------

## oldnavy23

Well, with that it looks like NCQ is on the 150 too. I guess it will be trial and error to test it out. Also, do you think a Raptor would make my server a lot faster if I can get it to work? I have 2 GB RAM, a 1 TB HD and an AMD 3500.

----------

## blubbi

Depends on what kind of server it will be.

But that's OT in this thread. Better ask in a new thread or in your previous one.

And I am the wrong person to answer your question. I can't say how a 7200 RPM HDD with NCQ compares to a Raptor _without_ NCQ. A Raptor with NCQ would speed up your system, but by how much, and whether the gain is noticeable, I don't know.

Maybe a Raptor without NCQ would be faster than a 7200 RPM HDD with NCQ. I have no clue, sorry.

regards

blubbi

----------

## obrut<-

hmm...

i never encountered any errors with my raptor. ncq is enabled. 

```
[   41.405326] scsi 1:0:0:0: Direct-Access     ATA      WDC WD740ADFD-60 20.0 PQ: 0 ANSI: 5

[   41.413533] sd 1:0:0:0: [sdb] 145226112 512-byte hardware sectors (74356 MB)

[   41.421724] sd 1:0:0:0: [sdb] Write Protect is off

[   41.429740] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00

[   41.429771] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

[   41.438012] sd 1:0:0:0: [sdb] 145226112 512-byte hardware sectors (74356 MB)

[   41.446144] sd 1:0:0:0: [sdb] Write Protect is off

[   41.454229] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00

[   41.454261] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

[   41.462416]  sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 sdb7 sdb8 > sdb4

[   41.515319] sd 1:0:0:0: [sdb] Attached SCSI disk
```

```
[   40.226872] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

[   40.236471] ata1.00: ATA-7: WDC WD740ADFD-60NLR1, 20.07P20, max UDMA/133

[   40.244523] ata1.00: 145226112 sectors, multi 16: LBA48 NCQ (depth 0/32)

[   40.254988] ata1.00: configured for UDMA/133
```
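One thing worth checking in the boot log above: as far as I can tell from the libata source, the two numbers in `NCQ (depth 0/32)` are the queue depth the host side will use and the depth the drive reports. If that reading is right, a first number of 0 would mean no NCQ commands are actually being queued on that port. This is my assumption and not something confirmed in this thread. A small sketch to pull the two numbers out of a log (the name `ncq_depth` is made up):

```shell
# Hypothetical helper: extract the two depth values from libata's
# "NCQ (depth H/D)" boot message. Reads log text from stdin.
# Assumption: H is the host-side depth, D the depth the drive reports.
ncq_depth() {
    sed -n 's/.*NCQ (depth \([0-9]*\)\/\([0-9]*\)).*/host=\1 device=\2/p'
}

# Usage:
#   dmesg | ncq_depth
```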

```
# hdparm -I /dev/sdb

/dev/sdb:

ATA device, with non-removable media

        Model Number:       WDC WD740ADFD-60NLR1

        Serial Number:      WD-WMAN********

        Firmware Revision:  20.07P20

Standards:

        Used: ATA/ATAPI-7 published, ANSI INCITS 397-2005

        Supported: 7 6 5 4

Configuration:

        Logical         max     current

        cylinders       16383   16383

        heads           16      16

        sectors/track   63      63

        --

        CHS current addressable sectors:   16514064

        LBA    user addressable sectors:  145226112

        LBA48  user addressable sectors:  145226112

        device size with M = 1024*1024:       70911 MBytes

        device size with M = 1000*1000:       74355 MBytes (74 GB)

Capabilities:

        LBA, IORDY(can be disabled)

        Queue depth: 32

        Standby timer values: spec'd by Standard, with device specific minimum

        R/W multiple sector transfer: Max = 16  Current = 16

        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6

             Cycle time: min=120ns recommended=120ns

        PIO: pio0 pio1 pio2 pio3 pio4

             Cycle time: no flow control=120ns  IORDY flow control=120ns

Commands/features:

        Enabled Supported:

           *    SMART feature set

           *    Power Management feature set

           *    Write cache

           *    Look-ahead

           *    WRITE_BUFFER command

           *    READ_BUFFER command

           *    NOP cmd

           *    DOWNLOAD_MICROCODE

           *    48-bit Address feature set

           *    Device Configuration Overlay feature set

           *    Mandatory FLUSH_CACHE

           *    FLUSH_CACHE_EXT

           *    SMART error logging

           *    SMART self-test

           *    General Purpose Logging feature set

           *    SATA-I signaling speed (1.5Gb/s)

           *    Native Command Queueing (NCQ)

           *    Phy event counters

                DMA Setup Auto-Activate optimization

           *    Software settings preservation

           *    SMART Command Transport (SCT) feature set

           *    SCT Long Sector Access (AC1)

           *    SCT LBA Segment Access (AC2)

           *    SCT Error Recovery Control (AC3)

           *    SCT Features Control (AC4)

           *    SCT Data Tables (AC5)

                unknown 206[12]

Checksum: correct

```

----------

## blubbi

Okay, interesting.

I see you got this:

```
ATA device, with non-removable media 

        Model Number:       WDC WD740ADFD-60NLR1 

        Serial Number:      WD-WMAN******** 

        Firmware Revision:  20.07P20
```

And I got this:

```
ATA device, with non-removable media

        Model Number:       WDC WD740ADFD-00NLR1

        Serial Number:      WD-WMAN.........

        Firmware Revision:  20.07P20

```

Now it would be interesting to know the difference between

60NLR1 and 00NLR1 and what exactly that number expresses.

If you could turn on debugging in your libata driver and recompile the kernel, you could look for NCQ misbehavior.

It would be good to know, because for now the entire WD740ADFD series is on the NCQ blacklist in the 2.6.22 kernel.

Thanks

blubbi

----------

## obrut<-

when you tell me where to find those debug options i'll do that. 

```
# uname -a

Linux desktop 2.6.21-kamikaze6 #1 SMP PREEMPT Mon Jul 2 00:11:29 CEST 2007 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 3600+ AuthenticAMD GNU/Linux
```

 the corresponding 2.6.22 kernel has just been compiled but not yet booted.

btw: hi, fellow countryman! *g*

----------

## blubbi

grep your drivers/ata/libata-core.c

for the string "WDC WD740ADFD-00NLR1" to see if the patch from Tejun Heo has been included.
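A minimal sketch of that check, assuming the blacklist entries carry the `ATA_HORKAGE_NONCQ` flag as they do in the 2.6.21/2.6.22 sources quoted in this thread (the function name `ncq_blacklist` is made up):

```shell
# Hypothetical helper: print every line of a libata-core.c file that
# mentions the NCQ blacklist flag, with line numbers.
# Note: this also matches lines where the flag is tested in code,
# not only the table entries.
ncq_blacklist() {
    grep -n 'ATA_HORKAGE_NONCQ' "$1"
}

# Usage, from the top of the kernel source tree:
#   ncq_blacklist drivers/ata/libata-core.c | grep 'WD740'
```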

To enable debugging do the following:

Edit include/linux/libata.h and change these two lines from `#undef` to `#define`:

`#define ATA_DEBUG`

`#define ATA_VERBOSE_DEBUG`

- Rebuild the kernel, install it and reboot

- Look in /var/log/messages for the debug messages

Look out for

"spurious completions during NCQ"

and/or

"HSM violation"

regards

blubbi
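To avoid reading the whole log by eye, the two messages named above can be filtered out with grep. A minimal sketch (the function name `ncq_errors` is made up; the log path depends on your syslog setup):

```shell
# Hypothetical helper: show only the libata lines that point at NCQ trouble.
# With no arguments it reads stdin; otherwise it reads the given files.
ncq_errors() {
    grep -E 'spurious completions during NCQ|HSM violation' "$@"
}

# Usage:
#   dmesg | ncq_errors
#   ncq_errors /var/log/messages
```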

----------

## obrut<-

changed the 2 lines from #undef to #define, but:

```
# make && make install

  CHK     include/linux/version.h

  CHK     include/linux/utsrelease.h

  HOSTCC  scripts/basic/fixdep

  HOSTCC  scripts/basic/docproc

  HOSTCC  scripts/mod/file2alias.o

  HOSTCC  scripts/mod/sumversion.o

  HOSTLD  scripts/mod/modpost

  CHK     include/linux/compile.h

  HOSTCC  usr/gen_init_cpio

  GEN     usr/initramfs_data.cpio.gz

  AS      usr/initramfs_data.o

  LD      usr/built-in.o

  CC      drivers/ata/libata-core.o

In file included from drivers/ata/libata-core.c:55:

include/linux/libata.h:2: error: expected '=', ',', ';', 'asm' or '__attribute__' before numeric constant

include/linux/libata.h:6: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'can'

include/linux/libata.h:8: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'version'

include/linux/libata.h:12: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'even'

include/linux/libata.h:17: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'the'

In file included from drivers/ata/libata-core.c:55:

include/linux/libata.h:18:63: error: invalid digit "9" in octal constant

include/linux/libata.h:21:43: warning: character constant too long for its type

drivers/ata/libata-core.c: In function 'ata_dev_configure':

drivers/ata/libata-core.c:1788: warning: comparison of distinct pointer types lacks a cast

make[2]: *** [drivers/ata/libata-core.o] Error 1

make[1]: *** [drivers/ata] Error 2

make: *** [drivers] Error 2

```

changing back does not help.   :Shocked: 

p.s.:

```
# cat drivers/ata/libata-core.c | grep 740

        { "WDC WD740ADFD-00",   NULL,           ATA_HORKAGE_NONCQ },
```

in 2.6.22 there are 2 lines with Raptors

----------

## blubbi

I see you are German. Do you know about #gentoo.de on freenode?

If so, we could meet there and try to fix it. My nick is blubbi on freenode. Just send me a query.

I would suggest a "make clean && make distclean && make mrproper". Back up your .config first!

Remove all modules and start with this completely new kernel.

regards

blubbi

----------

## obrut<-

the same error again. 

btw i'm in gentoo.de atm

----------

## AndyRTR

any progress? i also have a 74GB Raptor drive with the critical firmware and these entries in dmesg:

with ICH9R controller in IDE mode:

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2

ata1.00: (BMDMA stat 0x26)

ata1.00: cmd ca/00:08:91:0d:45/00:00:00:00:00/e4 tag 0 cdb 0x0 data 4096 out

         res 51/84:08:91:0d:45/00:00:00:00:00/e4 Emask 0x30 (host bus error)

ata1: soft resetting port

ata1.00: configured for UDMA/133

ata1: EH complete

sd 0:0:0:0: [sda] 145226112 512-byte hardware sectors (74356 MB)

sd 0:0:0:0: [sda] Write Protect is off

sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00

sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

in AHCI mode:

Aug 12 03:02:21 workstation64 ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x2 frozen

Aug 12 03:02:21 workstation64 ata1.00: (irq_stat 0x08000000, interface fatal error)

Aug 12 03:02:21 workstation64 ata1.00: cmd 35/00:00:19:7e:3d/00:04:00:00:00/e0 tag 0 cdb 0x0 data 524288 out

Aug 12 03:02:21 workstation64 res 50/00:00:08:26:59/00:00:00:00:00/e5 Emask 0x10 (ATA bus error)

Aug 12 03:02:21 workstation64 ata1: soft resetting port

Aug 12 03:02:21 workstation64 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

Aug 12 03:02:21 workstation64 ata1.00: configured for UDMA/33

Aug 12 03:02:21 workstation64 ata1: EH complete

Aug 12 03:02:21 workstation64 sd 0:0:0:0: [sda] 145226112 512-byte hardware sectors (74356 MB)

Aug 12 03:02:21 workstation64 sd 0:0:0:0: [sda] Write Protect is off

Aug 12 03:02:21 workstation64 sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00

Aug 12 03:02:21 workstation64 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

so it should be more than just disabling NCQ. the problem still exists here with kernel 2.6.23rc3

any idea? 

btw: running ArchLinux x86_64 here.

----------

## blubbi

Hmm, the patch should be included as of vanilla 2.6.22.

You can check this by grepping your drivers/ata/libata-core.c for "WD740ADFD-00NLR1". You should find this string in the NCQ blacklist section.

You can follow the solved bug here: http://bugzilla.kernel.org/show_bug.cgi?id=8627

But I guess your bug is not caused by NCQ, because then you should see something like this:

```
ata1.00: exception Emask 0x2 SAct 0x1fe00 SErr 0x0 action 0x2 frozen

ata1.00: (spurious completions during NCQ issue=0x0 SAct=0x1fe00

FIS=004040a1:00040000)

ata1.00: cmd 61/18:48:d0:4e:6d/00:00:05:00:00/40 tag 9 cdb 0x0 data 12288 out

         res 40/00:90:28:a6:6c/00:00:05:00:00/40 Emask 0x2 (HSM violation)
```

I have found another bug in 2.6.22.2, and it may be present in 2.6.23. Have a look here:

http://bugzilla.kernel.org/show_bug.cgi?id=8791

(scroll down to the bottom)

But I still think you have a different problem, because this bug is IMHO related to MD or pata_hpt37x (not resolved yet).

I no longer have any issues with the Raptors since they got blacklisted for NCQ.

Take a chance and file a bug at http://bugzilla.kernel.org/ since the guys are really fast at fixing things!

But first read these two threads: http://www.mail-archive.com/linux-ide@vger.kernel.org/msg06991.html and http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/2c895e1ac0a8ccf9/d5e910d37451ca33?lnk=raot

regards

blubbi

----------

