# Problem with SATA

## menschmeier

Hi,

I checked my syslog (/var/log/messages) recently and saw the following lines every few minutes:

 *Quote:*   

> Dec 26 18:25:35 pluto ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> 
> Dec 26 18:25:35 pluto ata2.01: cmd a0/00:00:00:00:20/00:00:00:00:00/b0 tag 0 cdb 0x43 data 12 in
> 
> Dec 26 18:25:35 pluto res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
> ...

 

Sometimes the system is quite slow: running emerge --sync makes the whole system (Core2Duo, 2 GB RAM) very slow. So maybe this problem is responsible for the slowness.

Here is the hardware I am using: IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller (rev 02)

 *Quote:*   

> # lspci
> 
> 00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub (rev 03)
> 
> 00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)
> ...

 

Here are a few lines of my kernel configuration. I am using kernel 2.6.22.5:

 *Quote:*   

> #
> 
> # Block devices
> 
> #
> ...

 

Does anyone have an idea what to do?

menschmeier
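
For anyone wanting to gauge how often these resets happen, here is a rough sketch that counts the "exception" lines per ATA port in a log file. The function name count_ata_exceptions is made up for illustration, and it assumes the log format shown in the quote above (a field like `ata2.01:` immediately followed by the word `exception`).

```shell
# Count libata "exception" lines per port in a log file.
# count_ata_exceptions is a hypothetical helper name, not an existing tool.
count_ata_exceptions() {
    awk '{
        # Look for a field such as "ata2.01:" directly followed by "exception".
        for (i = 1; i < NF; i++) {
            if ($i ~ /^ata[0-9]+(\.[0-9]+)?:$/ && $(i + 1) == "exception") {
                port = $i
                sub(/:$/, "", port)   # drop the trailing colon
                count[port]++
            }
        }
    }
    END { for (port in count) print port, count[port] }' "$1" | sort
}
```

Calling it as `count_ata_exceptions /var/log/messages` would print one line per port with its count, which makes it easier to compare before and after trying a fix.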

----------

## twam

Hi,

I have similar issues. My board has an ICH7 SATA controller from Intel

```
00:1f.2 SATA controller: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA AHCI Controller (rev 02)
```

with 2 hard disks from Samsung

```
sd 0:0:0:0: [sda] 1465149168 512-byte hardware sectors (750156 MB)

sd 0:0:0:0: [sda] Write Protect is off

sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00

sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

sd 2:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)

sd 2:0:0:0: [sdb] Write Protect is off

sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00

sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
```

The kernel talks to the controller via the "ahci" driver. Copying large amounts of data to the large disk (sda), or running hdparm -tT /dev/sda, ends up with

```
ata1.00: exception Emask 0x0 SAct 0x27 SErr 0x0 action 0x2 frozen

ata1.00: cmd 60/08:00:08:be:05/00:00:00:00:00/40 tag 0 cdb 0x0 data 4096 in

         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

ata1.00: cmd 60/f8:08:10:be:05/00:00:00:00:00/40 tag 1 cdb 0x0 data 126976 in

         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

ata1.00: cmd 60/08:10:17:b8:7d/00:00:4f:00:00/40 tag 2 cdb 0x0 data 4096 in

         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

ata1.00: cmd 60/f8:28:10:bd:05/00:00:00:00:00/40 tag 5 cdb 0x0 data 126976 in

         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

ata1: soft resetting port

ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

ata1.00: configured for UDMA/133

ata1: EH complete

sd 0:0:0:0: [sda] 1465149168 512-byte hardware sectors (750156 MB)

sd 0:0:0:0: [sda] Write Protect is off

sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00

sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
```

The smaller one is working fine. The 750GB disk supports SATA-II, but disabling this with Samsung's tool didn't change anything.

Unfortunately, I don't have any hints to solve this :/
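
One detail that may help when reading these logs: SAct is the SATA SActive register, a bitmask with one bit per outstanding NCQ tag. A minimal sketch for decoding it (the helper name sact_tags is hypothetical):

```shell
# Decode a SAct bitmask into the list of outstanding NCQ tag numbers.
# sact_tags is a made-up helper name for illustration.
sact_tags() {
    mask=$(( $1 ))      # accepts hex such as 0x27
    tag=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="${out:+$out }$tag"   # append tag, space-separated
        fi
        mask=$(( mask >> 1 ))
        tag=$(( tag + 1 ))
    done
    printf '%s\n' "$out"
}

sact_tags 0x27   # prints: 0 1 2 5
```

For the log above, 0x27 decodes to tags 0, 1, 2 and 5, which are exactly the four commands listed as timed out. A non-zero SAct therefore means NCQ commands were in flight when the port froze.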

----------

## twam

Hmm.. I thought it could be the ICH7 and bought a Silicon Image controller card

```
01:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
```

but still the same problems :/

```
[ 4353.110753] ata3.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x2 frozen

[ 4353.110768] ata3.00: cmd 61/10:00:0f:04:10/00:00:3f:00:00/40 tag 0 ncq 8192 out

[ 4353.110771]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)

[ 4353.110778] ata3.00: status: { DRDY }

[ 4353.110789] ata3.00: cmd 61/40:08:f7:77:ad/00:00:2b:00:00/40 tag 1 ncq 32768 out

[ 4353.110792]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

[ 4353.110798] ata3.00: status: { DRDY }

[ 4353.110812] ata3.00: cmd 61/40:10:37:78:ad/00:00:2b:00:00/40 tag 2 ncq 32768 out

[ 4353.110815]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

[ 4353.110821] ata3.00: status: { DRDY }

[ 4353.110829] ata3.00: cmd 61/40:18:77:78:ad/00:00:2b:00:00/40 tag 3 ncq 32768 out

[ 4353.110831]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

[ 4353.110841] ata3.00: status: { DRDY }

[ 4353.110863] ata3: hard resetting link

[ 4355.324670] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 0)

[ 4355.325561] ata3.00: configured for UDMA/100

[ 4355.325582] ata3: EH complete

[ 4355.325632] sd 2:0:0:0: [sdb] 1465149168 512-byte hardware sectors (750156 MB)

[ 4355.325652] sd 2:0:0:0: [sdb] Write Protect is off

[ 4355.325656] sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00

[ 4355.325684] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
```

----------

## menschmeier

Hi,

It might be a problem with the optical drive.

Do you have a similar drive:

    # sdparm /dev/sr0

    /dev/sr0: TSSTcorp  CD/DVDW SN-S082D  SS02  [cd/dvd]

Doing a 

```
killall  hald-addon-storage
```

will probably help.

The problem is caused by the bad firmware of the TSST drive.  :Sad: 

menschmeier

----------

## twam

I don't have any optical drives installed. Just 3 hard disks, 2 via SATA and 1 via PATA.

----------

## twam

Does nobody else have any hints? Or is this worth reporting as a bug to the kernel developers?

----------

## twam

Disabling NCQ via 

```
echo 1 > /sys/block/sdb/device/queue_depth
```

seems to help.

No more errors for 2 days under heavy load.
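
A slightly more careful version of the command above, as a sketch: it reports the previous value before overwriting it, so the change can be reverted later. The function name set_queue_depth and the SYSFS_ROOT variable are assumptions for illustration (SYSFS_ROOT only exists so the function can be tried against a scratch directory). Writing to the real file needs root.

```shell
# Set the queue depth of a disk and report the old and new value.
# set_queue_depth and SYSFS_ROOT are hypothetical names; SYSFS_ROOT
# defaults to /sys and is only overridden for dry runs.
set_queue_depth() {
    dev=$1
    depth=$2
    file="${SYSFS_ROOT:-/sys}/block/$dev/device/queue_depth"
    if [ ! -w "$file" ]; then
        echo "cannot write $file (wrong device name, or not root?)" >&2
        return 1
    fi
    old=$(cat "$file")
    echo "$depth" > "$file"
    echo "$dev: queue_depth $old -> $(cat "$file")"
}
```

Running `set_queue_depth sdb 1` as root would then print the old and new depth. The setting does not survive a reboot, so it has to be reapplied at boot.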

----------

## rroden12

I had a very similar problem the other day, and it ended up being a power problem. I removed all non-essential devices from the system (extra fans, PCI cards, CD-ROM drives) and tested it, and the problem went away. So I bought a better power supply and it has been fine since.

----------

## twam

Hmm.. I have a 350W power supply and the system needs about 35W at idle and 50W under full load. This should be enough  :Wink: 

----------

## cal22cal

 *menschmeier wrote:*   

> I am using kernel 2.6.22.5

 Try upgrading to 2.6.24.

My SATA error messages are gone.

----------

## rlholgate

I'm seeing the same error with 2.6.22-gentoo-r5-a and now with 2.6.23-gentoo-r3-a. I was rather hoping someone here might know the 2-line summary of what's going on? I'm currently assuming it's a kernel bug, and not my broken hardware!

----------

## amc1804

I think I have the same disk as twam (mine is a 750 GB Samsung Spinpoint F1, HD753LJ), and I've observed the same symptom.  It's connected to a SATA-150 port.  lspci says:

```

00:1f.2 IDE interface: Intel Corporation 82801FBM (ICH6M) SATA Controller (rev 04)

```

Whenever I write lots of data to the disk, I see this in the logs:

```

kernel: ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x2 frozen

kernel: ata1.00: cmd 61/00:00:12:e1:1d/04:00:2d:00:00/40 tag 0 cdb 0x0 data 524288 out

...lots more like the previous one...

kernel: ata1: soft resetting port

kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

kernel: ata1.00: configured for UDMA/133

kernel: ata1: EH complete

kernel: SCSI device sda: 1465149168 512-byte hdwr sectors (750156 MB)

kernel: sda: Write Protect is off

kernel: sda: Mode Sense: 00 3a 00 00

kernel: SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

```

twam's suggested fix of writing 1 to /sys/block/sda/device/queue_depth fixed it.  I didn't see the symptom for more than a week, even though I wrote large amounts of data on two occasions.  I tried reverting back to the original value of 31, and within two minutes of heavy writing I saw the symptom again.

AMC
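
To check which depth each disk is currently using before and after such a test, something along these lines should work. show_queue_depths and SYSFS_ROOT are hypothetical names, and the sketch assumes the disks show up as sd* under /sys/block.

```shell
# Print the current queue depth of every sd* disk.
# show_queue_depths and SYSFS_ROOT are made-up names; SYSFS_ROOT
# defaults to /sys and is only overridden for dry runs.
show_queue_depths() {
    for f in "${SYSFS_ROOT:-/sys}"/block/sd*/device/queue_depth; do
        [ -r "$f" ] || continue        # skips the unexpanded glob too
        dev=${f%/device/queue_depth}   # strip the trailing path
        dev=${dev##*/}                 # keep only the device name
        echo "$dev $(cat "$f")"
    done
}
```

A depth of 1 means only one command is queued at a time, so NCQ is effectively off for that disk, while 31 is the value reported above as the original setting.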

----------

## c3l5o

I'm confused... Because...

I have a 420W PSU (which I think is weak)

I have a TSST sata drive  :Sad: 

I have all those fixes you guys talk about here...

I still have problems  :Sad: 

----------

## rlholgate

My case turned out to be a slowly failing power supply. The disk had been suffering momentary power outages, I believe. When I got the voltmeter out, the power supply was delivering only 4.4 volts on the 5 volt rail. I've since replaced it, and have not had any disk resets since.
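
For comparison, the ATX specification allows the +5 V rail a tolerance of ±5 %, that is 4.75 V to 5.25 V, so 4.4 V is well outside it. A small sketch of that arithmetic in integer millivolts (rail_in_tolerance is a made-up helper, and the 5 % figure is the assumption baked in):

```shell
# Check whether a measured rail voltage is within +/-5 % of nominal.
# Both arguments are in millivolts; rail_in_tolerance is a hypothetical helper.
rail_in_tolerance() {
    nominal_mv=$1
    measured_mv=$2
    low=$(( nominal_mv * 95 / 100 ))
    high=$(( nominal_mv * 105 / 100 ))
    [ "$measured_mv" -ge "$low" ] && [ "$measured_mv" -le "$high" ]
}

if rail_in_tolerance 5000 4400; then
    echo "5 V rail OK"
else
    echo "5 V rail out of tolerance"
fi
```

With the 4.4 V reading from above this reports the rail as out of tolerance, which fits the disk resets going away after the supply was replaced.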

----------

