# Kernel sleep debugging?

## Zucca

Before I start:

I run systemd on ~amd64 Linux 4.11.0. However, the systemd package itself has ~amd64 masked (for reasons).

And yes, the problem has been there on earlier kernels too.

Sometimes systemd does not "see" that I've pressed the power button (this is also random). Nothing appears in the journal. However, initiating suspend/hibernate from the command line always works (meaning systemd starts the process of entering the sleep state), so this problem may not actually be systemd's fault.

```
Machine:   Device: desktop Mobo: ASRock model: 970M Pro3
           UEFI [Legacy]: American Megatrends v: P1.60 date: 06/17/2016
CPU:       Octa core AMD FX-8350 Eight-Core (-MCP-) cache: 16384 KB
           clock speeds: max: 4000 MHz 1: 1400 MHz 2: 1400 MHz 3: 2100 MHz
           4: 1400 MHz 5: 2100 MHz 6: 2100 MHz 7: 1400 MHz 8: 1400 MHz
Graphics:  Card: Advanced Micro Devices [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series]
           Display Server: X.Org 1.19.3 drivers: amdgpu (unloaded: radeon)
           Resolution: 1920x1200@59.95hz, 1920x1080@60.00hz
           GLX Renderer: Gallium 0.4 on AMD FIJI (DRM 3.10.0 / 4.11.0-gentoo-wren, LLVM 5.0.0)
           GLX Version: 3.0 Mesa 17.2.0-devel (git-5c92b1bf07)
Network:   Card-1: Mellanox MT25204 [InfiniHost III Lx HCA] driver: ib_mthca
           Card-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           driver: r8169
Partition: ID-1: / size: 1.1T used: 238G (25%) fs: btrfs dev: /dev/sda3
           ID-2: /var size: 1.1T used: 238G (25%) fs: btrfs dev: /dev/sda3
           ID-3: /home size: 1.1T used: 238G (25%) fs: btrfs dev: /dev/sda3
           ID-4: /boot size: 496M used: 452M (97%) fs: ext2 dev: /dev/md1
           ID-5: swap-1 size: 18.72GB used: 0.00GB (0%) fs: swap dev: /dev/md5
```

== The problem ==

I have about a 75% success rate when hibernating or suspending my desktop PC.

Now I want to try to find out why. I guess I need to raise kernel debugging verbosity.

I can do it by echoing 1 to /proc/sys/kernel/sysrq and then hitting [alt]+[SysRq]+[r] and then [alt]+[SysRq]+[9]. But I guess there's a way to set the verbosity on the kernel command line, which I would probably need in order to see all the messages from the beginning.
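For what it's worth, the kernel's boot-parameter documentation lists loglevel= (sets the console log level), ignore_loglevel (prints every message to the console regardless of level) and no_console_suspend (keeps the console alive while suspending). Whether those are enough on this machine is untested here. To check which level is actually in effect after boot, a sketch like the following could decode /proc/sys/kernel/printk. The function names are made up for illustration; the field names follow the kernel's sysctl documentation.

```python
from pathlib import Path

# Names of the four integers in /proc/sys/kernel/printk, in file order
# (taken from the kernel's sysctl documentation).
PRINTK_FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Map the whitespace-separated integers of the printk sysctl to names."""
    values = [int(v) for v in text.split()]
    if len(values) != len(PRINTK_FIELDS):
        raise ValueError(f"expected {len(PRINTK_FIELDS)} integers, got {len(values)}")
    return dict(zip(PRINTK_FIELDS, values))


def read_printk(path="/proc/sys/kernel/printk"):
    """Read and decode the live sysctl (hypothetical helper, Linux only)."""
    return parse_printk(Path(path).read_text())
```

Calling read_printk() before and after the SysRq key combination should show the console_loglevel field change, which is a cheap way to confirm the key presses were registered at all.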

Also I'm getting a plethora of errors like this in dmesg:

```
[  +0.000000] AMD-Vi: Event logged [
[  +0.000002] IO_PAGE_FAULT device=01:00.0 domain=0x000f address=0x000000f40029b6c0 flags=0x0010]
```

... and heaps of them: wc -l counted a total of 511 IO_PAGE_FAULT messages.
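To see whether all those faults come from one device or several, a quick tally is more telling than a plain line count. The sketch below is only an illustration: the regular expression is modelled on nothing more than the sample lines above, and summarize_faults is a made-up name, so other kernels may format these events differently.

```python
import re
from collections import Counter

# Pattern modelled on the sample dmesg line in this post (an assumption,
# not a format guaranteed by the kernel).
FAULT_RE = re.compile(
    r"IO_PAGE_FAULT\s+device=(?P<device>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r"domain=(?P<domain>0x[0-9a-f]+)\s+"
    r"address=(?P<address>0x[0-9a-f]+)\s+"
    r"flags=(?P<flags>0x[0-9a-f]+)",
    re.IGNORECASE,
)


def summarize_faults(lines):
    """Count IO_PAGE_FAULT events per (device, flags) pair."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[(match["device"], match["flags"])] += 1
    return counts
```

Feeding it the kernel log, for example by piping dmesg into a script that calls summarize_faults(sys.stdin), gives one count per device; lspci -s with the reported address then tells which card that is.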

== Symptoms ==

Usually the screens turn off but the fans keep spinning and the LEDs stay on. The keyboard (backlit, USB), on the other hand, has all its LEDs turned off.

When hibernating, there's a chance that if I hold the power button down at this point to forcibly turn the system off, it resumes from the image successfully on the next boot. This, of course, fails if suspend was used instead.

Now... What's the best way to start digging into this? The kernel command line parameter to enable verbose logging?

Any info I need to paste? Just tell me.

----------

## Zucca

It happened again. This time I was trying to hibernate my system, and just before writing the image the system froze/stopped. Both monitors were off, indicating the GPU had cut the signal.

```
May 15 03:01:16 wren backup-sync.sh[8381]: total size is 32.29G  speedup is 980.97
May 15 03:01:16 wren backup-sync.sh[8381]: Backup synced.
May 15 03:01:16 wren backup-sync.sh[8381]: No daily snapshotting needed.
May 15 03:01:16 wren backup-sync.sh[8381]: Currently storing 62 snaps.
May 15 03:01:16 wren systemd[1]: Started Backup script.
May 15 03:01:16 wren systemd[1]: Reached target Sleep.
May 15 03:01:16 wren systemd[1]: Starting Module un-load...
May 15 03:01:16 wren sh[8392]: modprobe: WARNING: Module ib_core is in use.
May 15 03:01:16 wren sh[8392]: rmmod ib_mthca
May 15 03:01:16 wren systemd-networkd[6716]: ib0: Lost carrier
May 15 03:01:16 wren sh[8392]: modprobe: WARNING: Module ib_cm is in use.
May 15 03:01:16 wren sh[8392]: rmmod ib_ipoib
May 15 03:01:16 wren sh[8392]: rmmod ib_umad
May 15 03:01:16 wren sh[8392]: rmmod rpcrdma
May 15 03:01:16 wren kernel: RPC: Unregistered rdma transport module.
May 15 03:01:16 wren kernel: RPC: Unregistered rdma backchannel transport module.
May 15 03:01:16 wren sh[8392]: rmmod rdma_cm
May 15 03:01:16 wren sh[8392]: rmmod ib_cm
May 15 03:01:16 wren sh[8392]: rmmod iw_cm
May 15 03:01:16 wren sh[8392]: rmmod ib_core
May 15 03:01:16 wren systemd[1]: Started Module un-load.
May 15 03:01:16 wren systemd[1]: Starting Preparing for hibernation. Dropping caches and syncing....
May 15 03:01:18 wren pre-hibernate.sh[8439]: Disk caches dropped.
May 15 03:01:18 wren kernel: pre-hibernate.s (8439): drop_caches: 3
May 15 03:01:18 wren pre-hibernate.sh[8439]: Cached writes synced to disks.
May 15 03:01:19 wren systemd[1]: Started Preparing for hibernation. Dropping caches and syncing..
May 15 03:01:19 wren systemd[1]: Starting Beeping just for fun...
May 15 03:01:19 wren systemd[1]: Started Beeping just for fun.
May 15 03:01:19 wren systemd[1]: Starting Hibernate...
May 15 03:01:19 wren kernel: PM: Hibernation mode set to 'shutdown'
```

As you can see, I also have a service that drops caches before hibernating. This is because I get a significant speedup when writing the hibernation image. systemd does not seem to drop caches automatically (why?).
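The script itself isn't shown here, so the following is only a sketch of what such a pre-hibernate step might look like, not the actual pre-hibernate.sh. It assumes the ordering suggested by the kernel's vm sysctl documentation: sync first, because drop_caches only frees clean (already written back) objects, then write the mode to /proc/sys/vm/drop_caches. The function name and the path parameter are hypothetical; the path parameter exists so the function can be tried against a scratch file without root.

```python
import os


def drop_caches(mode=3, path="/proc/sys/vm/drop_caches"):
    """Flush dirty data, then ask the kernel to drop clean caches.

    mode: 1 = page cache, 2 = dentries and inodes, 3 = both.
    Writing to the real sysctl path requires root.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    # Write dirty pages back first so they become clean and droppable.
    os.sync()
    with open(path, "w") as sysctl:
        sysctl.write(f"{mode}\n")
```

In the journal above the messages appear in the opposite order (caches dropped, then writes synced), so swapping the two steps might shrink the image a little further, though that is a guess rather than something measured here.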

After that I've made another service that makes the PC speaker beep. This way I know that all the services have run before the actual hibernation starts.

After the beeps the screens turn off. They usually turn back on (with a frozen view of the "last seen" content before sleeping) while the kernel writes the image, and then the system powers off.

I've used 'platform' mode before, but it had the exact same symptoms.

I also tried 'freeze' mode when suspending, but it failed at exactly the same spot: monitors off, fans spinning, LEDs lit.
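Since the different modes all stall at the same spot, one way to narrow things down might be the kernel's pm_test facility (described in the kernel's basic PM debugging document; it needs CONFIG_PM_DEBUG). Writing a stage name such as freezer, devices, platform, processors or core to /sys/power/pm_test makes the next suspend or hibernate attempt stop after that stage and resume a few seconds later. Both /sys/power/pm_test and /sys/power/disk mark the active value with square brackets, so a small helper can report what is currently selected. parse_bracketed is an illustrative name, not an existing tool.

```python
def parse_bracketed(text):
    """Return (selected, options) from a sysfs attribute like 'a [b] c'.

    The bracketed token is the active setting; selected is None when
    no token is bracketed.
    """
    selected = None
    options = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        options.append(token)
    return selected, options
```

Reading /sys/power/pm_test with this before each attempt makes it easy to log which stage was being tested when a freeze happens; if the freezer and devices stages pass reliably but a later one does not, that points at where to look.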

This kind of random behaviour is infuriating. Worse still, I don't see any error messages.

----------

