# SATA link power management causes disk errors

## albright

My new ThinkPad T440s has this problem: if I switch to min_power for my SATA SSD (e.g. with `echo min_power > /sys/class/scsi_host/host0/link_power_management_policy`, via powertop, or automatically via laptop-mode) and then switch back to max_performance (by plugging in to AC, with the echo command, etc.), I will almost always, though not 100% of the time, get disk errors and the filesystem is remounted read-only.
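For anyone who wants to repeat the switch by hand, here is a minimal sketch of it as two shell helpers. The names `SYSFS_HOST_DIR`, `set_lpm_policy` and `get_lpm_policy` are made up for this sketch; only the sysfs path and the policy values come from this thread. Writing to the real sysfs file needs root.

```shell
#!/bin/sh
# SYSFS_HOST_DIR is a hypothetical override so the helpers can be pointed at a
# scratch directory; by default it is the real sysfs location.
SYSFS_HOST_DIR="${SYSFS_HOST_DIR:-/sys/class/scsi_host}"

# set_lpm_policy HOST POLICY
# Writes POLICY into HOST's link_power_management_policy file.
set_lpm_policy() {
    host="$1"
    policy="$2"
    # Only the three policy names mentioned in this thread are accepted here.
    case "$policy" in
        min_power|medium_power|max_performance) ;;
        *) echo "unknown policy: $policy" >&2; return 1 ;;
    esac
    file="$SYSFS_HOST_DIR/$host/link_power_management_policy"
    [ -w "$file" ] || { echo "cannot write $file" >&2; return 1; }
    printf '%s\n' "$policy" > "$file"
}

# get_lpm_policy HOST
# Prints the policy currently stored for HOST.
get_lpm_policy() {
    cat "$SYSFS_HOST_DIR/$1/link_power_management_policy"
}

# Example of the sequence that triggers the errors on this machine (as root):
#   set_lpm_policy host0 min_power
#   set_lpm_policy host0 max_performance
```

Keeping the directory in a variable makes it possible to try the helpers against a temporary directory before touching the real disk controller.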

I managed to get the log off during this problem; I wonder if anyone who knows more than I do can spot anything that might lead to a solution.

I should say that this does NOT happen in Windows (I dual boot), and Windows at least reports that it is using SATA power management.

SMART tools report no problems with the drive, no bad blocks, etc.

If I don't enable SATA power management, everything works perfectly (the notebook is a dream except for this issue).

Here's the relevant part of the log:

```

Jan 29 08:21:09 olwe kernel: ata1.00: failed to get NCQ Send/Recv Log Emask 0x1

Jan 29 08:21:09 olwe kernel: ata1.00: failed to get NCQ Send/Recv Log Emask 0x1

Jan 29 08:21:09 olwe kernel: ata1.00: configured for UDMA/133

Jan 29 08:21:09 olwe kernel: ata1: EH complete

Jan 29 08:21:09 olwe kernel: sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA

Jan 29 08:21:09 olwe kernel: EXT4-fs (sda8): re-mounted. Opts: discard,commit=0

Jan 29 08:21:09 olwe kernel: thinkpad_acpi: EC reports that Thermal Table has changed

Jan 29 08:21:09 olwe logger: ACPI event unhandled: ibm/hotkey LEN0068:00 00000080 00006030

Jan 29 08:21:09 olwe logger: ACPI event unhandled: thermal_zone LNXTHERM:00 00000081 00000000

Jan 29 08:21:09 olwe logger: ACPI event unhandled: battery PNP0C0A:01 00000080 00000001

Jan 29 08:21:28 olwe ifplugd(eth0)[1893]: Using detection mode: IFF_RUNNING

Jan 29 08:22:02 olwe logger: ACPI event unhandled: ac_adapter ACPI0003:00 00000080 00000001

Jan 29 08:22:02 olwe kernel: ata1.00: failed to get NCQ Send/Recv Log Emask 0x1

Jan 29 08:22:02 olwe kernel: ata1.00: failed to get NCQ Send/Recv Log Emask 0x1

Jan 29 08:22:02 olwe kernel: ata1.00: configured for UDMA/133

Jan 29 08:22:02 olwe kernel: ata1: EH complete

Jan 29 08:22:02 olwe kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Jan 29 08:22:02 olwe kernel: EXT4-fs (sda8): re-mounted. Opts: discard,commit=0

Jan 29 08:22:02 olwe kernel: thinkpad_acpi: EC reports that Thermal Table has changed

Jan 29 08:22:02 olwe logger: ACPI event unhandled: ibm/hotkey LEN0068:00 00000080 00006030

Jan 29 08:22:02 olwe logger: ACPI event unhandled: thermal_zone LNXTHERM:00 00000081 00000000

Jan 29 08:22:02 olwe logger: ACPI event unhandled: battery PNP0C0A:01 00000080 00000001

Jan 29 08:22:57 olwe kernel: ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x40000 action 0x6 frozen

Jan 29 08:22:57 olwe kernel: ata1: SError: { CommWake }

Jan 29 08:22:57 olwe kernel: ata1.00: failed command: WRITE FPDMA QUEUED

Jan 29 08:22:57 olwe kernel: ata1.00: cmd 61/48:00:80:df:40/00:00:22:00:00/40 tag 0 ncq 36864 out

Jan 29 08:22:57 olwe kernel: res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)

Jan 29 08:22:57 olwe kernel: ata1.00: status: { DRDY }

Jan 29 08:22:57 olwe kernel: ata1.00: failed command: READ FPDMA QUEUED

Jan 29 08:22:57 olwe kernel: ata1.00: cmd 60/08:08:80:d0:7c/00:00:0c:00:00/40 tag 1 ncq 4096 in

Jan 29 08:22:57 olwe kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

Jan 29 08:22:57 olwe kernel: ata1.00: status: { DRDY }

Jan 29 08:22:57 olwe kernel: ata1: hard resetting link

Jan 29 08:23:02 olwe kernel: ata1: link is slow to respond, please be patient (ready=0)

Jan 29 08:23:07 olwe kernel: ata1: COMRESET failed (errno=-16)

Jan 29 08:23:07 olwe kernel: ata1: hard resetting link

Jan 29 08:23:12 olwe kernel: ata1: link is slow to respond, please be patient (ready=0)

Jan 29 08:23:17 olwe kernel: ata1: COMRESET failed (errno=-16)

Jan 29 08:23:17 olwe kernel: ata1: hard resetting link

Jan 29 08:23:22 olwe kernel: ata1: link is slow to respond, please be patient (ready=0)

Jan 29 08:23:52 olwe kernel: ata1: COMRESET failed (errno=-16)

Jan 29 08:23:52 olwe kernel: ata1: limiting SATA link speed to 3.0 Gbps

Jan 29 08:23:52 olwe kernel: ata1: hard resetting link

Jan 29 08:23:57 olwe kernel: ata1: COMRESET failed (errno=-16)

Jan 29 08:23:57 olwe kernel: ata1: reset failed, giving up

Jan 29 08:23:57 olwe kernel: ata1.00: disabled

Jan 29 08:23:57 olwe kernel: ata1.00: device reported invalid CHS sector 0

Jan 29 08:23:57 olwe kernel: ata1.00: device reported invalid CHS sector 0

Jan 29 08:23:57 olwe kernel: ata1: EH complete

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] Unhandled error code

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda]  

Jan 29 08:23:57 olwe kernel: Result: hostbyte=0x04 driverbyte=0x00

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] CDB: 

Jan 29 08:23:57 olwe login[2023]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)

Jan 29 08:23:57 olwe kernel: cdb[0]=0x28: 28 00 0c 7c d0 80 00 00 08 00

Jan 29 08:23:57 olwe kernel: end_request: I/O error, dev sda, sector 209506432

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] Unhandled error code

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda]  

Jan 29 08:23:57 olwe kernel: Result: hostbyte=0x04 driverbyte=0x00

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] CDB: 

Jan 29 08:23:57 olwe kernel: cdb[0]=0x2a: 2a 00 22 40 df 80 00 00 48 00

Jan 29 08:23:57 olwe kernel: end_request: I/O error, dev sda, sector 574676864

Jan 29 08:23:57 olwe kernel: EXT4-fs error (device sda8): ext4_read_inode_bitmap:174: comm bash: Cannot read inode bitmap - block_group = 160, inode_bitmap = 5242896

Jan 29 08:23:57 olwe kernel: Aborting journal on device sda8-8.

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] Unhandled error code

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda]  

Jan 29 08:23:57 olwe kernel: Result: hostbyte=0x04 driverbyte=0x00

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] CDB: 

Jan 29 08:23:57 olwe kernel: cdb[0]=0x2a: 2a 00 09 fc d0 00 00 00 08 00

Jan 29 08:23:57 olwe kernel: end_request: I/O error, dev sda, sector 167563264

Jan 29 08:23:57 olwe kernel: Buffer I/O error on device sda8, logical block 0

Jan 29 08:23:57 olwe kernel: lost page write due to I/O error on sda8

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] Unhandled error code

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda]  

Jan 29 08:23:57 olwe kernel: Result: hostbyte=0x04 driverbyte=0x00

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] CDB: 

Jan 29 08:23:57 olwe kernel: cdb[0]=0x2a: 2a 00 22 40 d0 00 00 00 08 00

Jan 29 08:23:57 olwe kernel: end_request: I/O error, dev sda, sector 574672896

Jan 29 08:23:57 olwe kernel: Buffer I/O error on device sda8, logical block 50888704

Jan 29 08:23:57 olwe kernel: lost page write due to I/O error on sda8

Jan 29 08:23:57 olwe kernel: JBD2: Error -5 detected when updating journal superblock for sda8-8.

Jan 29 08:23:57 olwe kernel: EXT4-fs (sda8): previous I/O error to superblock detected

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] Unhandled error code

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda]  

Jan 29 08:23:57 olwe kernel: Result: hostbyte=0x04 driverbyte=0x00

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] CDB: 

Jan 29 08:23:57 olwe kernel: cdb[0]=0x93: 93 08 00 00 00 00 24 00 d1 48 00 00 00 08 00 00

Jan 29 08:23:57 olwe kernel: end_request: I/O error, dev sda, sector 604033352

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] Unhandled error code

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda]  

Jan 29 08:23:57 olwe kernel: Result: hostbyte=0x04 driverbyte=0x00

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] CDB: 

Jan 29 08:23:57 olwe kernel: cdb[0]=0x2a: 2a 00 09 fc d0 00 00 00 08 00

Jan 29 08:23:57 olwe kernel: end_request: I/O error, dev sda, sector 167563264

Jan 29 08:23:57 olwe kernel: Buffer I/O error on device sda8, logical block 0

Jan 29 08:23:57 olwe kernel: lost page write due to I/O error on sda8

Jan 29 08:23:57 olwe kernel: EXT4-fs (sda8): discard request in group:1665 block:41 count:1 failed with -5

Jan 29 08:23:57 olwe kernel: EXT4-fs error (device sda8): ext4_journal_check_start:56: Detected aborted journal

Jan 29 08:23:57 olwe kernel: EXT4-fs (sda8): Remounting filesystem read-only

Jan 29 08:23:57 olwe kernel: EXT4-fs (sda8): previous I/O error to superblock detected

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] Unhandled error code

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda]  

Jan 29 08:23:57 olwe kernel: Result: hostbyte=0x04 driverbyte=0x00

Jan 29 08:23:57 olwe kernel: sd 0:0:0:0: [sda] CDB: 

Jan 29 08:23:57 olwe kernel: cdb[0]=0x2a: 2a 00 09 fc d0 00 00 00 08 00

Jan 29 08:23:57 olwe kernel: end_request: I/O error, dev sda, sector 167563264

Jan 29 08:23:57 olwe kernel: Buffer I/O error on device sda8, logical block 0

Jan 29 08:23:57 olwe kernel: lost page write due to I/O error on sda8

Jan 29 08:23:57 olwe kernel: type=1006 audit(1391001837.551:6): pid=2023 uid=0 old auid=4294967295 new auid=0 old ses=4294967295 new ses=5 res=1

Jan 29 08:23:57 olwe login[2781]: ROOT LOGIN  on '/dev/tty1'
```
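To pick the key events out of a longer log, something like the following sketch may help. The function name `lpm_failure_lines` is made up here, and the log file path is left as an argument because it differs between distributions; the patterns are copied from the messages above.

```shell
#!/bin/sh
# lpm_failure_lines FILE
# Prints only the lines that mark the link failure and its consequences:
# the CommWake error, the failed resets, and the read-only remount.
lpm_failure_lines() {
    grep -E 'CommWake|hard resetting link|COMRESET failed|reset failed, giving up|Remounting filesystem read-only' "$1"
}

# Example (the path is an assumption, adjust it to where your syslog lives):
#   lpm_failure_lines /var/log/messages
```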

----------

## albright

Replying to myself:

I've found that if I use medium_power instead of min_power in /sys/class/scsi_host/hostN/link... then all is well (and the power use seems pretty much as good as with min_power, though that is not tested, just my impression).

tlp (unlike laptop-mode) has a configuration option that handles this ...

So ... not solved, but worked around.
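For reference, the workaround can be sketched as a loop over all hosts. `SYSFS_HOST_DIR` and `apply_medium_power` are names invented for this sketch, and it assumes every host exposes the same policy file as host0; run it as root on a real system.

```shell
#!/bin/sh
# SYSFS_HOST_DIR is a hypothetical override for trying this on a scratch
# directory; by default it is the real sysfs location.
SYSFS_HOST_DIR="${SYSFS_HOST_DIR:-/sys/class/scsi_host}"

# apply_medium_power
# Sets medium_power (instead of min_power) on every host that has a writable
# policy file, then prints the value now stored in each file.
apply_medium_power() {
    for file in "$SYSFS_HOST_DIR"/host*/link_power_management_policy; do
        # Skip an unexpanded glob pattern and files we are not allowed to write.
        [ -w "$file" ] || continue
        printf '%s\n' medium_power > "$file"
        printf '%s: %s\n' "$file" "$(cat "$file")"
    done
}

# Example (as root):
#   apply_medium_power
```

The setting does not survive a reboot when applied this way, so it would have to be re-applied at boot or whenever a power tool switches to battery mode.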

----------

