# [SOLVED] NVMe drive fails on resume

## xgivolari

Hello all,

Recently, I bought a Kioxia EXCERIA PLUS G2 M.2 NVMe drive for my desktop system (MB: GIGABYTE B450M S2H, latest firmware version F62b, with a Ryzen 5 2600). Cloning my installation from an old hard drive onto the new SSD and booting from it went smoothly. The only problem is that the system always freezes when resuming from suspend. My first suspect was the nvidia driver, but removing it makes no difference. When suspending and resuming from a terminal, I can at least quickly run dmesg afterwards. The system then seemingly does nothing for roughly a minute, only to reveal a kernel log with several errors. After this, any terminal input fails. Unfortunately, I cannot save the original dmesg log, so I have to type out its content manually from a picture I took with my phone:

```
nvme nvme0: Device not ready: aborting reset, CSTS=0x1
nvme nvme0: Removing after probe failure status: -19
nvme nvme0: Device not ready: aborting reset, CSTS=0x1
nvme0n1: detected capacity change from 9776773168 to 0
EXT4-fs error (device nvme0n1p2): ext4_get_inode_loc_noinmem:4444: inode #3063042: block 12061680: comm dmesg: unable to read itable block
EXT4-fs warning (device nvme0n1p2): ext4_end_bio:342: I/O error 10 writing to inode 3015764 starting block 8227588)
EXT4-fs warning (device nvme0n1p2): ext4_end_bio:342: I/O error 10 writing to inode 2097156 starting block 9052450)
Aborting journal on device nvme0n1p2-8.
Buffer I/O error on device nvme0n1p2, logical block 8921122
Buffer I/O error on device nvme0n1p2, logical block 8096260
Buffer I/O error on dev nvme0n1p2, logical block 6324224, lost sync page write
EXT4-fs (nvme0n1p2): Delayed block allocation failed for inode 1967449 at logical offset 14 with max blocks 2 with error 30
JBD2: Error -5 detected when updating journal superblock for nvme0n1p2-8.
<<insert several errors relating to I/O failure / journal abort, filesystem is remounted read-only>>
```

Kernel is latest 5.14.9, everything else is up to date as well.

Steps I have taken so far to solve this issue:

- removing proprietary nvidia drivers
- disabling PCIe ASPM in BIOS
- booting with nvme_core.default_ps_max_latency_us=0 to disable APST
- checking SMART status (no issues), updating drive firmware, running fsck and e2fsck unmounted, regenerating journal
- disabling power management for the drive
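Two of the steps above can be verified from a running system. The following is a minimal sketch, not something taken from the original post: it assumes the controller is `nvme0`, that "disabling power management" refers to PCI runtime power management, and the function names `apst_status` and `runtime_pm_status` are made up for illustration.

```shell
#!/bin/sh
# Sketch: report the APST latency limit and the runtime PM setting.
# Paths can be passed as arguments; the defaults assume controller nvme0.

# Prints whether APST is disabled (latency limit 0) or enabled.
apst_status() {
    param="${1:-/sys/module/nvme_core/parameters/default_ps_max_latency_us}"
    [ -r "$param" ] || { echo "unknown"; return 1; }
    value=$(cat "$param")
    if [ "$value" = "0" ]; then
        echo "APST disabled"
    else
        echo "APST enabled, max latency ${value} us"
    fi
}

# Prints the runtime PM control value: "on" keeps the device
# powered, "auto" lets the kernel runtime-suspend it.
runtime_pm_status() {
    control="${1:-/sys/class/nvme/nvme0/device/power/control}"
    [ -r "$control" ] || { echo "unknown"; return 1; }
    cat "$control"
}

# Example (run on the affected machine):
#   apst_status
#   runtime_pm_status
```

If `apst_status` reports "APST disabled", the kernel parameter took effect and APST can be ruled out as the cause.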

So far, none of these things has made any difference. Do you know how I can get suspend and resume to work correctly? Thanks!

Last edited by xgivolari on Mon Oct 04, 2021 2:35 pm; edited 1 time in total

----------

## mike155

Seems to be similar to this: https://bbs.archlinux.org/viewtopic.php?id=258883

 *Quote:*   

> I tried to move the drive from the front nvme slot to the back nvme slot, and sleeping worked. I think this is a fine workaround, but I'll have to test this setup to see if it's stable.

 

Does that help in your case? Anything else in that thread that could help?

You could also ask Google

```
nvme "not ready" after suspend
```

You'll find many results...

----------

## DaggyStyle

It might be worthwhile to check whether the BIOS has any suspend/resume setting for PCI cards.

----------

## xgivolari

@mike155 @DaggyStyle Thank you for your advice. After extensive digging through my BIOS settings, I was not able to find anything of help. At least I have discovered settings for AMD SVE and PCIe ASPM. From what I could gather by reading through other threads about this issue, it looks like this is an AM4 platform-specific thing. In my case at least, APST seems not to be at fault, and disabling it makes no difference.

----------

## DaggyStyle

can you check previous kernels? to see if it is a regression?

----------

## xgivolari

 *DaggyStyle wrote:*   

> can you check previous kernels? to see if it is a regression?

 

I have tried with the most recent stable version of gentoo-sources, v5.10.68. It makes no difference.

----------

## xgivolari

UPDATE: After experimenting with manually writing into /sys/power/state, I have discovered that suspend-to-idle (s2idle) actually works! "echo freeze > /sys/power/state" or alternatively "echo s2idle > /sys/power/mem_sleep && echo mem > /sys/power/state" puts my system into a suspend state it can actually resume from. However, suspending via loginctl suspend or the XFCE menu still does not work. I edited my /etc/elogind/elogind.conf to this:

SuspendState=mem

SuspendMode=s2idle

But it seems to make no difference.
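Before changing anything, it can help to see which sleep mode the kernel currently selects. This is a small sketch, assuming the usual format of /sys/power/mem_sleep, where the active mode is shown in square brackets (for example `s2idle [deep]`). The helper names `active_mem_sleep` and `supports_s2idle` are hypothetical.

```shell
#!/bin/sh
# Sketch: inspect the suspend mode selected in /sys/power/mem_sleep.
# The file path can be passed as an argument for testing.

# Prints the active mode, i.e. the entry enclosed in [brackets].
active_mem_sleep() {
    file="${1:-/sys/power/mem_sleep}"
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$file"
}

# Succeeds if the kernel offers s2idle at all.
supports_s2idle() {
    file="${1:-/sys/power/mem_sleep}"
    grep -qw 's2idle' "$file"
}

# Example, as root, using the commands from this post:
#   supports_s2idle && echo s2idle > /sys/power/mem_sleep
#   active_mem_sleep
#   echo mem > /sys/power/state
```

If `active_mem_sleep` prints `deep`, a plain "echo mem > /sys/power/state" requests suspend-to-RAM, which is the state that fails on this machine.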

UPDATE 2: It looks like loginctl can only suspend/resume properly from the command line. Entering loginctl suspend from XFCE results in failure as usual. Playing with the elogind.conf NvidiaSleep option seems to have no effect either.

UPDATE 3: Loginctl suspend failing turns out to be a known issue as discussed in this thread. I fixed it by placing this script in /lib64/elogind/system/sleep/20-nvidia:

```
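#!/bin/sh
# elogind sleep hook: $1 is 'pre' before suspend and 'post' after resume.
case "${1-}" in
    'pre')
        /usr/bin/nvidia-sleep.sh suspend
        ;;
    'post')
        # Run in the background so resume is not blocked.
        /usr/bin/nvidia-sleep.sh resume &
        ;;
    *)
        exit 1
        ;;
esac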
```
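A hook like this is easy to install incorrectly, so a quick sanity check may be useful. The sketch below assumes that elogind, like systemd, only runs hook files that are marked executable. I have not verified this against the elogind source, and `check_sleep_hook` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check that a sleep hook exists and is executable.
# Assumption: elogind skips hook files without the executable bit.
check_sleep_hook() {
    hook="$1"
    [ -f "$hook" ] || { echo "missing"; return 1; }
    [ -x "$hook" ] || { echo "not executable"; return 1; }
    echo "ok"
}

# Example:
#   check_sleep_hook /lib64/elogind/system/sleep/20-nvidia
```

If it reports "not executable", running chmod +x on the hook file should fix that.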

Even though working suspend-to-RAM would still be nice, this issue is now solved for all other intents and purposes.

----------

