# SATA SSD freezes/link resets with mkfs

## dman777

My AMD Ryzen 5 system is about 6 months old. I have a Samsung SSD 860 EVO (also new, and its SMART status is good). When I try to format it, mkfs takes a long time but eventually finishes. In the logs I see, among other issues: exception Emask 0x0 SAct 0x4000000 SErr 0x0 action 0x6 frozen.

I don't believe the SSD was in sleep mode, because I used fdisk on it right before with no issues.

The motherboard is an ASRock AB350 Pro4 and the BIOS is up to date.

The kernel is 4.19.52-gentoo.

Does anyone know what could be causing this?

```
Jul  8 00:03:44 localhost kernel: [  381.930232] ata9.00: cmd 64/01:08:00:00:00/00:00:00:00:00/a0 tag 1 ncq dma 512 out
Jul  8 00:03:44 localhost kernel: [  381.930232]          res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jul  8 00:03:44 localhost kernel: [  381.930232] ata9.00: status: { DRDY }
Jul  8 00:03:44 localhost kernel: [  381.930235] ata9: hard resetting link
Jul  8 00:03:45 localhost kernel: [  382.401548] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul  8 00:03:45 localhost kernel: [  382.401879] ata9.00: supports DRM functions and may not be fully accessible
Jul  8 00:03:45 localhost kernel: [  382.405156] ata9.00: supports DRM functions and may not be fully accessible
Jul  8 00:03:45 localhost kernel: [  382.407998] ata9.00: configured for UDMA/133
Jul  8 00:03:45 localhost kernel: [  382.408002] ata9: EH complete
Jul  8 00:03:45 localhost kernel: [  382.408022] ata9.00: Enabling discard_zeroes_data
Jul  8 00:04:15 localhost kernel: [  412.639205] ata9.00: exception Emask 0x0 SAct 0x10 SErr 0x0 action 0x6 frozen
Jul  8 00:04:15 localhost kernel: [  412.639207] ata9.00: failed command: SEND FPDMA QUEUED
Jul  8 00:04:15 localhost kernel: [  412.639209] ata9.00: cmd 64/01:20:00:00:00/00:00:00:00:00/a0 tag 4 ncq dma 512 out
Jul  8 00:04:15 localhost kernel: [  412.639209]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jul  8 00:04:15 localhost kernel: [  412.639210] ata9.00: status: { DRDY }
Jul  8 00:04:15 localhost kernel: [  412.639212] ata9: hard resetting link
Jul  8 00:04:15 localhost kernel: [  413.113540] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul  8 00:04:15 localhost kernel: [  413.113852] ata9.00: supports DRM functions and may not be fully accessible
Jul  8 00:04:15 localhost kernel: [  413.116952] ata9.00: supports DRM functions and may not be fully accessible
Jul  8 00:04:15 localhost kernel: [  413.119654] ata9.00: configured for UDMA/133
Jul  8 00:04:15 localhost kernel: [  413.119656] ata9.00: device reported invalid CHS sector 0
Jul  8 00:04:15 localhost kernel: [  413.119659] ata9: EH complete
Jul  8 00:04:15 localhost kernel: [  413.119678] ata9.00: Enabling discard_zeroes_data
Jul  8 00:04:45 localhost kernel: [  443.359207] ata9.00: exception Emask 0x0 SAct 0x4000000 SErr 0x0 action 0x6 frozen
Jul  8 00:04:45 localhost kernel: [  443.359208] ata9.00: failed command: SEND FPDMA QUEUED
Jul  8 00:04:45 localhost kernel: [  443.359211] ata9.00: cmd 64/01:d0:00:00:00/00:00:00:00:00/a0 tag 26 ncq dma 512 out
Jul  8 00:04:45 localhost kernel: [  443.359211]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jul  8 00:04:45 localhost kernel: [  443.359211] ata9.00: status: { DRDY }
Jul  8 00:04:45 localhost kernel: [  443.359213] ata9: hard resetting link
Jul  8 00:04:46 localhost kernel: [  443.833555] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul  8 00:04:46 localhost kernel: [  443.833866] ata9.00: supports DRM functions and may not be fully accessible
Jul  8 00:04:46 localhost kernel: [  443.836974] ata9.00: supports DRM functions and may not be fully accessible
Jul  8 00:04:46 localhost kernel: [  443.839675] ata9.00: configured for UDMA/133
Jul  8 00:04:46 localhost kernel: [  443.839677] ata9.00: device reported invalid CHS sector 0
Jul  8 00:04:46 localhost kernel: [  443.839680] ata9: EH complete
Jul  8 00:04:46 localhost kernel: [  443.839700] ata9.00: Enabling discard_zeroes_data
Jul  8 00:05:16 localhost kernel: [  474.079215] ata9.00: NCQ disabled due to excessive errors
Jul  8 00:05:16 localhost kernel: [  474.079217] ata9.00: exception Emask 0x0 SAct 0x20000 SErr 0x0 action 0x6 frozen
Jul  8 00:05:16 localhost kernel: [  474.079218] ata9.00: failed command: SEND FPDMA QUEUED
Jul  8 00:05:16 localhost kernel: [  474.079221] ata9.00: cmd 64/01:88:00:00:00/00:00:00:00:00/a0 tag 17 ncq dma 512 out
Jul  8 00:05:16 localhost kernel: [  474.079221]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jul  8 00:05:16 localhost kernel: [  474.079222] ata9.00: status: { DRDY }
Jul  8 00:05:16 localhost kernel: [  474.079224] ata9: hard resetting link
Jul  8 00:05:17 localhost kernel: [  474.553544] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul  8 00:05:17 localhost kernel: [  474.553851] ata9.00: supports DRM functions and may not be fully accessible
Jul  8 00:05:17 localhost kernel: [  474.556960] ata9.00: supports DRM functions and may not be fully accessible
Jul  8 00:05:17 localhost kernel: [  474.559650] ata9.00: configured for UDMA/133
Jul  8 00:05:17 localhost kernel: [  474.559653] ata9.00: device reported invalid CHS sector 0
Jul  8 00:05:17 localhost kernel: [  474.559655] ata9: EH complete
Jul  8 00:05:17 localhost kernel: [  474.559674] ata9.00: Enabling discard_zeroes_data
```

Here are my SSD settings:

```
localhost /home/one # cat  /sys/block/sdb/queue/iosched/fifo_batch
16
localhost /home/one # cat /sys/block/sdb/queue/scheduler
[mq-deadline] kyber none
localhost /home/one # 
```
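
A side note on reading the log above: in each exception line, SAct appears to be a bitmask of the NCQ tags that were outstanding when the port froze. SAct 0x10 corresponds to tag 4 and SAct 0x4000000 to tag 26, which matches the tag printed on the cmd line that follows. A minimal sketch of that decoding; `sact_tags` is a name made up for this example.

```shell
# sact_tags MASK: print the NCQ tag numbers whose bits are set in MASK.
# MASK may be given in hex (0x...) as it appears in the kernel log.
# "sact_tags" is a hypothetical helper name, not an existing tool.
sact_tags() {
    mask=$(( $1 ))
    tag=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            # Append this tag, space-separated after the first one.
            out="${out:+$out }$tag"
        fi
        mask=$(( mask >> 1 ))
        tag=$(( tag + 1 ))
    done
    echo "$out"
}
```

For example, `sact_tags 0x20000` corresponds to the "tag 17" command in the last exception above. Each exception here has a single bit set, so only one queued command was outstanding each time.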

----------

## mike155

 *Quote:*   

> ```
> Jul  8 00:03:45 localhost kernel: [  382.408022] ata9.00: Enabling discard_zeroes_data
> ```
> ...

 

Disable queued discards.

----------

## dman777

Thanks! Just to confirm: is it NCQ that I need to disable, with the command below?

```
echo 1 > /sys/block/sda/device/queue_depth
```

EDIT: I am reading in some posts, like this one, https://strugglers.net/~andy/blog/2015/08/09/ssds-and-linux-native-command-queuing/, that disabling NCQ does in fact bring performance penalties on an SSD. Is there no way around this? Does NCQ have to be disabled?
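
For reference, a queue depth of 1 leaves no room for queuing, which is why writing 1 to queue_depth is commonly used to switch NCQ off at runtime. Before changing anything, it can help to look at the current value. A small sketch; `ncq_state` and the `SYSFS_ROOT` override are names invented for this example, not part of any tool.

```shell
# ncq_state DEV: report whether NCQ is effectively in use for DEV (e.g. sda),
# judging only by the queue depth exposed in sysfs.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# against a fake directory tree; it defaults to the real /sys.
ncq_state() {
    depth=$(cat "${SYSFS_ROOT:-/sys}/block/$1/device/queue_depth") || return 1
    if [ "$depth" -gt 1 ]; then
        echo "NCQ on (queue_depth=$depth)"
    else
        echo "NCQ off (queue_depth=$depth)"
    fi
}
```

Running `ncq_state sda` after the kernel has logged "NCQ disabled due to excessive errors" would be one way to see whether the depth was dropped.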

----------

## mike155

Please do NOT disable NCQ!  It would slow down access to your SSD.

There's a kernel patch that disables queued discards for the Samsung 860 EVO: https://bugzilla.kernel.org/show_bug.cgi?id=203475.

Take a look at the in-kernel blacklist table (to which the patch applies). You will see that the Samsung 840 EVO and 850 EVO (and many other drives) are already there: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c#n4559 :)

Do NOT mount partitions that reside on your SSD with the 'discard' option. Instead, run fstrim manually or via a cron job.
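
The advice above (no 'discard' mount option, periodic fstrim instead) could look roughly like this as a script dropped into a cron directory. This is only a sketch: the function name `trim_mounts`, the `DRY_RUN` switch, and the choice of mount points are assumptions for illustration; only `fstrim -v` itself comes from util-linux.

```shell
# trim_mounts MOUNTPOINT...: run fstrim once on each given mount point.
# With DRY_RUN=1 the commands are only printed, not executed.
# Both "trim_mounts" and "DRY_RUN" are hypothetical names for this sketch.
trim_mounts() {
    for mp in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "fstrim -v $mp"
        else
            # -v makes fstrim report how many bytes were trimmed.
            fstrim -v "$mp"
        fi
    done
}
```

Called as `trim_mounts / /boot` from, say, a weekly cron script, it trims each filesystem in turn; `DRY_RUN=1 trim_mounts / /boot` only prints what would be run.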

----------

## dman777

My SSD is an 860 EVO (rather than an 850). Is this why the issue happened despite the blacklist, whose entries only go up to model 850*?

Also, this seems to only happen with mkfs. Are we certain it was TRIM + NCQ that caused this?

EDIT: https://wiki.archlinux.org/index.php/Solid_state_drive describes the issue with NCQ but doesn't mention TRIM. Does this mean that simply disabling TRIM on the EVO is not enough and I have to disable queuing?

----------

## mike155

There are 2 proposed solutions:

1. Put the Samsung 860 EVO on the in-kernel blacklist and disable queued discards
2. Disable NCQ

The second one will slow down disk access. For that reason, I would start with the first one. I would try the second one only if the first one doesn't work.

Another solution could be to update the firmware of your Samsung 860 EVO. Have you tried that?

----------

## Anon-E-moose

I run an 860 EVO (regular SATA, not mSATA) and haven't seen a problem.

I'm wondering if the kernel patch was due to more than it being an 860; they did mention having an ASMedia controller.

Of course, I don't run with discard enabled. I run fstrim from cron once a night, and my filesystem is btrfs.

As for the original problem: you might make sure that the SATA cable is a good-quality SATA3 one. A cable that worked well with an HDD might not work so well with an SSD. (If you're using a cable provided by ASRock with the motherboard, then it should be OK.)

Edit to add:

Looking at the messages again, it looks more like a bad connection or cable than anything else.

----------

## dman777

I removed discard from fstab, re-enabled NCQ, and rebooted. But I still have the discard/freeze issue on the 860 EVO (sda) with mkfs. Why is this?

```
localhost /home/one # cat /etc/fstab 
# /etc/fstab: static file system information.
#
# noatime turns off atimes for increased performance (atimes normally aren't 
# needed); notail increases performance of ReiserFS (at the expense of storage 
# efficiency).  It's safe to drop the noatime options if you want and to 
# switch between notail / tail freely.
#
# The root filesystem should have a pass number of either 0 or 1.
# All other filesystems should have a pass number of 0 or greater than 1.
#
# See the manpage fstab(5) for more information.
#
# <fs>         <mountpoint>   <type>      <opts>      <dump/pass>
# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
#
# NOTE: Even though we list ext4 as the type here, it will work with ext2/ext3
#       filesystems.  This just tells the kernel to use the ext4 driver.
#
# NOTE: You can use full paths to devices like /dev/sda3, but it is often
#       more reliable to use filesystem labels or UUIDs. See your filesystem
#       documentation for details on setting a label. To obtain the UUID, use
#       the blkid(8) command.
UUID="7bd14620-6297-47e0-9721-010e449c1be6"      /boot      ext4   noatime   1 2
/dev/mapper/root   /      ext4      noatime 1 2
 UUID="61738bbb-d928-4bef-bce7-cd3d573a6b88"  /mnt/kvm-guests     ext4     noatime  1 2
#UUID=58e72203-57d1-4497-81ad-97655bd56494      /      ext4   noatime      0 1
#LABEL=swap      none      swap      sw      0 0
#/dev/cdrom      /mnt/cdrom   auto      noauto,ro   0 0
```

```
[  106.289499] ata9.00: Enabling discard_zeroes_data
[  106.289679]  sda: sda1 sda2
[  204.878822] ata9.00: Enabling discard_zeroes_data
[  204.878937]  sda: sda1
[  216.502401] ata9.00: Enabling discard_zeroes_data
[  216.502574]  sda: sda1
[  271.410456] ata9.00: Enabling discard_zeroes_data
[  271.410580]  sda: sda1 sda2
[  324.586200] ata9.00: exception Emask 0x0 SAct 0x400000 SErr 0x0 action 0x6 frozen
[  324.586203] ata9.00: failed command: SEND FPDMA QUEUED
[  324.586207] ata9.00: cmd 64/01:b0:00:00:00/00:00:00:00:00/a0 tag 22 ncq dma 512 out
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  324.586208] ata9.00: status: { DRDY }
[  324.586210] ata9: hard resetting link
[  325.057198] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  325.057506] ata9.00: supports DRM functions and may not be fully accessible
[  325.060599] ata9.00: supports DRM functions and may not be fully accessible
[  325.063291] ata9.00: configured for UDMA/133
[  325.063293] ata9.00: device reported invalid CHS sector 0
[  325.063296] ata9: EH complete
[  325.063315] ata9.00: Enabling discard_zeroes_data
[  355.295156] ata9.00: exception Emask 0x0 SAct 0x20000 SErr 0x0 action 0x6 frozen
[  355.295158] ata9.00: failed command: SEND FPDMA QUEUED
[  355.295160] ata9.00: cmd 64/01:88:00:00:00/00:00:00:00:00/a0 tag 17 ncq dma 512 out
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  355.295161] ata9.00: status: { DRDY }
[  355.295163] ata9: hard resetting link
[  355.761508] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  355.761822] ata9.00: supports DRM functions and may not be fully accessible
[  355.764921] ata9.00: supports DRM functions and may not be fully accessible
[  355.767619] ata9.00: configured for UDMA/133
[  355.767621] ata9.00: device reported invalid CHS sector 0
[  355.767624] ata9: EH complete
[  355.767643] ata9.00: Enabling discard_zeroes_data
[  386.017159] ata9.00: exception Emask 0x0 SAct 0x200000 SErr 0x0 action 0x6 frozen
[  386.017161] ata9.00: failed command: SEND FPDMA QUEUED
[  386.017163] ata9.00: cmd 64/01:a8:00:00:00/00:00:00:00:00/a0 tag 21 ncq dma 512 out
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  386.017164] ata9.00: status: { DRDY }
[  386.017166] ata9: hard resetting link
[  386.481187] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  386.481563] ata9.00: supports DRM functions and may not be fully accessible
[  386.484666] ata9.00: supports DRM functions and may not be fully accessible
[  386.487355] ata9.00: configured for UDMA/133
[  386.487357] ata9.00: device reported invalid CHS sector 0
[  386.487360] ata9: EH complete
[  386.487379] ata9.00: Enabling discard_zeroes_data
[  416.735153] ata9.00: NCQ disabled due to excessive errors
[  416.735155] ata9.00: exception Emask 0x0 SAct 0x1000 SErr 0x0 action 0x6 frozen
[  416.735157] ata9.00: failed command: SEND FPDMA QUEUED
[  416.735159] ata9.00: cmd 64/01:60:00:00:00/00:00:00:00:00/a0 tag 12 ncq dma 512 out
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  416.735160] ata9.00: status: { DRDY }
[  416.735162] ata9: hard resetting link
[  417.201498] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  417.201809] ata9.00: supports DRM functions and may not be fully accessible
[  417.204919] ata9.00: supports DRM functions and may not be fully accessible
[  417.207618] ata9.00: configured for UDMA/133
[  417.207620] ata9.00: device reported invalid CHS sector 0
[  417.207622] ata9: EH complete
[  417.207642] ata9.00: Enabling discard_zeroes_data
```

```
localhost /home/one # cat /proc/mounts
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,nosuid,relatime,size=10240k,nr_inodes=2043347,mode=755 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
/dev/mapper/root / ext4 rw,noatime 0 0
tmpfs /run tmpfs rw,nodev,relatime,size=1635532k,mode=755 0 0
debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
cgroup_root /sys/fs/cgroup tmpfs rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755 0 0
openrc /sys/fs/cgroup/openrc cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc 0 0
none /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0
cpuset /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cpu /sys/fs/cgroup/cpu cgroup rw,nosuid,nodev,noexec,relatime,cpu 0 0
cpuacct /sys/fs/cgroup/cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct 0 0
freezer /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
shm /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec,relatime 0 0
/dev/sdb1 /boot ext4 rw,noatime 0 0
```

----------

## mike155

 *Quote:*   

> I removed discard from fstab, re-enabled NCQ, and rebooted. But I still have the discard/freeze issue on the 860 EVO (sda) with mkfs. Why is this?

 

Did you apply the kernel patch?
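
One way to answer that without guessing is to search the kernel source tree that was actually built. A minimal sketch, assuming the tree lives under /usr/src/linux as is usual on Gentoo; `has_860_quirk` is a name made up for this example.

```shell
# has_860_quirk FILE: succeed if FILE (a copy of drivers/ata/libata-core.c)
# already contains a blacklist entry for the Samsung SSD 860 series.
# "has_860_quirk" is a hypothetical helper name.
has_860_quirk() {
    # \* matches the literal asterisk in the model pattern "Samsung SSD 860*".
    grep -q '"Samsung SSD 860\*"' "$1"
}
```

For example: `has_860_quirk /usr/src/linux/drivers/ata/libata-core.c && echo "patch present"`. The path is an assumption about where the kernel sources are installed.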

----------

## dman777

No, I figured it had already been applied after all these years. I should have bought a Pro instead of an EVO. But would that patch even help, since this is a model 860? I don't see 860* in that patch.

----------

## mike155

Well, the patch adds the lines below to drivers/ata/libata-core.c, directly below the entry for the Samsung SSD 850: 

```
   { "Samsung SSD 860*",      NULL,   ATA_HORKAGE_NO_NCQ_TRIM |
                                      ATA_HORKAGE_ZERO_AFTER_TRIM, },
```

That's your SSD, isn't it?
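
For anyone who would rather not rebuild the kernel: libata also has a `libata.force` boot parameter, and the kernel parameter documentation lists a `noncqtrim` value that turns off queued TRIM for a given port or device. The following is a sketch only, assuming GRUB2 with /etc/default/grub as the boot configuration and assuming the drive still enumerates as ata9.00 as in the logs above; nobody in this thread has tested it.

```shell
# /etc/default/grub (assumption: GRUB2 is the boot loader)
# 9.00 is the libata ID taken from the "ata9.00" log lines; it changes if
# the drive is moved to another port, so check dmesg before relying on it.
GRUB_CMDLINE_LINUX="libata.force=9.00:noncqtrim"
```

After regenerating the GRUB configuration and rebooting, `cat /proc/cmdline` should show the parameter if it was picked up.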

----------

## Anon-E-moose

What's the firmware version of your 860? (smartctl -i /dev/sda |grep -i firmware)

Edit to add: it seems someone else has a similar problem, with interesting data (read the thread): https://bbs.archlinux.org/viewtopic.php?id=245201

ETA 2: I'm also curious: if you boot the Gentoo live CD or SystemRescueCd, do the errors still show?

----------

## dman777

Oh, I guess I would have to patch it manually by adding my own lines. I will probably skip that.

Here is my firmware version. The Samsung website does not list a firmware upgrade for the 860 line: https://www.samsung.com/semiconductor/minisite/ssd/download/tools/

```
localhost /home/one # smartctl -i /dev/sda |grep -i firmware
Firmware Version: RVT01B6Q
```

I agree, I bet it has to do with the ASMedia:

24:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 02)

But the motherboard doesn't offer an IDE option. Also, using IDE mode seems like a bad idea even if it were offered (disabling NCQ would probably cost less performance than IDE mode).

EDIT: Forgot to mention, it is a brand new StarTech cable.

----------

## mike155

The manual of your mainboard says:

4 x SATA3 6.0 Gb/s Connectors, support RAID (RAID 0, RAID 1 and RAID 10), NCQ, AHCI and Hot Plug

2 x SATA3 6.0 Gb/s Connectors by ASMedia ASM1061, support NCQ, AHCI and Hot Plug

Do I understand correctly that your SSD is connected to one of the ASMedia ASM1061 ports? You could try to connect it to one of the other 4 connectors.
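
If the port labelling on the board is unclear, the controller behind a disk can usually be read from sysfs: the resolved path of /sys/block/sda contains the PCI address of the controller just before the ataN component. A rough sketch; the helper name `pci_addr_of_path` and the example path in the usage note are made up for illustration.

```shell
# pci_addr_of_path PATH: print the PCI address that directly precedes the
# "ataN" component of a resolved sysfs block-device path.
# "pci_addr_of_path" is a hypothetical helper name.
pci_addr_of_path() {
    printf '%s\n' "$1" |
        sed -n 's#.*/\([0-9a-fA-F:.]*\)/ata[0-9]*/.*#\1#p'
}
```

Usage would be along the lines of `lspci -s "$(pci_addr_of_path "$(readlink -f /sys/block/sda)")"`. For a path such as /sys/devices/pci0000:00/0000:00:01.3/0000:24:00.0/ata9/host8/target8:0:0/8:0:0:0/block/sda (an invented example), the function extracts 0000:24:00.0, and lspci then names the controller at that address.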

----------

## dman777

Wow! That did it!

So, this motherboard is very confusing. In the SATA layout, they placed the 2 ASMedia SATA ports in the middle of the 4 normal SATA ports. The EVO was plugged into one of them. I moved the EVO to a normal SATA port and now mkfs works with no issues.

Thank you for bringing that to my attention! And thank you both for the help!

Edit: I still see discard_zeroes_data during boot up...but no freezes. I guess trim will have to persist even though it is not causing issues any longer?

----------

## Anon-E-moose

That firmware is the same as the one on one of my boxes, which has never had a problem, and I don't use the patch for 860s.

That box is an old Lenovo M92 that uses Intel's SATA controller.

Good catch, mike. I didn't think to look at the motherboard manual.

As for TRIM, you should be able to turn it off. I don't get the zeroes message, but I don't have discard turned on.

----------

## mike155

 *Quote:*   

> Edit: I still see discard_zeroes_data during boot up...but no freezes

 

The message is probably harmless. On my file server, I see it, too:

```
# dmesg | grep discard
[    0.922457] ata1.00: Enabling discard_zeroes_data          # 1st Samsung 840 Pro SSD
[    0.926423] ata2.00: Enabling discard_zeroes_data          # 2nd Samsung 840 Pro SSD
```

My first post probably caused confusion about this message. Sorry!

----------

