# raid5 hibernation swap -- fast wakeup

## Zucca

Hi.

I have a rather hacky thing on my PC. Or at least had.

I used Arch before and I was able to pull this trick off back then.

So, basically I put my swap partition on a software raid5 device. Yes, you read that right. It is NOT the recommended way to do it, BUT Linux cannot resume from multiple swap partitions, so I ended up with this. Reading a resume image from a 6xSSD raid array is quite fast. ;)

I compile my kernel with the help of genkernel-next (which has a small, in-progress script patch to support ext/isolinux).

I have added domdadm to the kernel command line.
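For reference, the relevant extlinux `APPEND` line looks roughly like this (a sketch, not my exact line; `resume=/dev/md5` matches the swap array shown later in this thread, and `init=` comes from the genkernel.conf below):

```
APPEND domdadm resume=/dev/md5 init=/usr/lib/systemd/systemd
```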

I have enabled mdadm in genkernel.conf (see below).

And (worst of all) I use systemd on this PC. I have had bad experiences with systemd and its suspend and hibernate features. But this time around I think it's just a race condition, or rather the order in which things start up.

```
OLDCONFIG="no"
MENUCONFIG="no"
NCONFIG="yes"
CLEAN="yes"
MRPROPER="no"
ARCH_OVERRIDE="x86_64"
MOUNTBOOT="yes"
SYMLINK="yes"
SAVE_CONFIG="yes"
USECOLOR="yes"
POSTCLEAR="1"
MAKEOPTS="-j8"
LVM="no"
LUKS="no"
GPG="yes"
DMRAID="no"
BUSYBOX="yes"
UDEV="yes"
MDADM="yes"
MDADM_CONFIG="/etc/mdadm.conf"
ISCSI="no"
E2FSPROGS="yes"
BTRFS="yes"
FIRMWARE="yes"
BOOTLOADER="extlinux"
GRUB_CMDLINE_LINUX="init=/usr/lib/systemd/systemd"
SPLASH="no"
SPLASH_THEME="gentoo"
GK_SHARE="${GK_SHARE:-/usr/share/genkernel}"
CACHE_DIR="/var/cache/genkernel"
DISTDIR="/var/lib/genkernel/src"
LOGFILE="/var/log/genkernel.log"
LOGLEVEL=1
DEFAULT_KERNEL_SOURCE="/usr/src/linux"
INTEGRATED_INITRAMFS="0"
COMPRESS_INITRD="yes"
COMPRESS_INITRD_TYPE="gzip"
```

I think I might want to customize the initramfs, but can it be done using genkernel?

Any suggestions?

*Last edited by Zucca on Sun Feb 19, 2017 10:17 pm; edited 5 times in total*

----------

## szatox

 *Quote:*   

> So, basically I put my swap partition on software raid5 device. Yes, you read right. It is NOT the way to do it, BUT Linux cannot resume from multiple swap partitions so I ended up with this. Reading a resume image from 6xSSD raid array is quite fast. 

 I wonder how well it would work if you disabled all devices but one with swapoff before hibernating.
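Something along these lines could do it (a hypothetical sketch; the parsing runs against a sample copy of /proc/swaps here, since the real swapoff needs root, and the device names in the sample are made up):

```shell
#!/bin/sh
# Keep only the first swap device active before hibernating.
# /proc/swaps has a header line, then one line per active swap device.
sample='Filename    Type       Size      Used  Priority
/dev/md5    partition  18219004  0     -2
/dev/zram0  partition  1048572   0     -3'

# Every device after the first one would be passed to swapoff:
extra=$(printf '%s\n' "$sample" | awk 'NR > 2 {print $1}')
printf '%s\n' "$extra"

# Real usage (needs root): for dev in $extra; do swapoff "$dev"; done
```

After resume, the remaining devices would need `swapon` again to get the full swap back.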

 *Quote:*   

> I think I might want to customize the initramfs, but can it be done using genkernel? 

 So... this option is the key:

 *Quote:*   

> MDADM="yes" 

 

What else do you want to customize there?

Anyway, yes, you can use genkernel's options to customize your initramfs, though I'd rather create a generic one manually and compile the necessary stuff as built-ins instead of modules than spend time learning the more advanced tricks genkernel can do. I just consider it too specific to be worth the effort... even though I do use it on my box, where it covers everything I want from it.

----------

## Zucca

It does write to the raid5 array when hibernating. All the LEDs flash rapidly.

I have:

```
CONFIG_MD_RAID456=y
```

So it's a built-in.

But:

```
[    4.904934] PM: Checking hibernation image partition /dev/md5
[    4.904959] PM: Hibernation image not present or could not be loaded.
```

And according to dmesg, the raid5 array assembly happens way too late. So I need to include mdadm.conf in the initramfs, and possibly a script that runs the assembly command. (But genkernel should already do that, right?)
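For reference, rebuilding just the initramfs with mdadm support baked in should be something like this (a sketch; I'm assuming the genkernel/genkernel-next CLI flags here, check `genkernel --help` for your version):

```
genkernel --mdadm --mdadm-config=/etc/mdadm.conf initramfs
```

With `MDADM="yes"` and `MDADM_CONFIG` already set in genkernel.conf (as above), a plain `genkernel initramfs` should do the same.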

Creating a custom initramfs image by hand every time I update the kernel does not sound optimal. :(

----------

## frostschutz

 *Zucca wrote:*   

> So, basically I put my swap partition on software raid5 device. Yes, you read right. It is NOT the way to do it

 

If you use RAID, of course the swap goes on RAID too. RAID is about availability: a disk dies but the system keeps running. That only works if swap is on RAID too. If swap goes MIA, the whole system goes ka-boom. There is nothing whatsoever wrong with this setup.

The only dangerous thing that comes to mind is hibernating in the middle of a rebuild/reshape of the RAID: the initramfs would already start continuing the rebuild, but the resume then turns back time and reshapes what was already reshaped... hopefully the RAID code is smart enough to handle this? Don't try it without a backup.

Unfortunately I use neither genkernel nor systemd (at least not on Gentoo), so it's not easy for me to point out where things go wrong.

 *Quote:*   

> (But genkernel should already do that. right?)

 

Certainly, but you also need things to happen in the right order ...   :Confused: 

You did not show what your mdadm.conf looks like.

----------

## Zucca

 *frostschutz wrote:*   

> You did not show what your mdadm.conf looks like

 

```
ARRAY /dev/md1 metadata=1.0 UUID=68e52b90:1550fb14:93850bc3:94a37400
ARRAY /dev/md5 metadata=1.2 UUID=ad01e865:52cd909d:747e8ef3:823e294a
```

md1 is raid1 for /boot

md5 is raid5 for swap

root and others reside on btrfs storage pool

Here's more:

```
NAME    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda       8:0    1 238.5G  0 disk  
├─sda1    8:1    1   512M  0 part  
│ └─md1   9:1    0   512M  0 raid1 /boot
├─sda2    8:2    1   3.5G  0 part  
│ └─md5   9:5    0  17.4G  0 raid5 [SWAP]
└─sda3    8:3    1 234.5G  0 part  /home/zucca
sdb       8:16   1 111.8G  0 disk  
├─sdb1    8:17   1   512M  0 part  
│ └─md1   9:1    0   512M  0 raid1 /boot
├─sdb2    8:18   1   3.5G  0 part  
│ └─md5   9:5    0  17.4G  0 raid5 [SWAP]
└─sdb3    8:19   1 107.8G  0 part  
sdc       8:32   1 111.8G  0 disk  
├─sdc1    8:33   1   512M  0 part  
│ └─md1   9:1    0   512M  0 raid1 /boot
├─sdc2    8:34   1   3.5G  0 part  
│ └─md5   9:5    0  17.4G  0 raid5 [SWAP]
└─sdc3    8:35   1 107.8G  0 part  
sdd       8:48   1 447.1G  0 disk  
├─sdd1    8:49   1   512M  0 part  
│ └─md1   9:1    0   512M  0 raid1 /boot
├─sdd2    8:50   1   3.5G  0 part  
│ └─md5   9:5    0  17.4G  0 raid5 [SWAP]
└─sdd3    8:51   1 443.1G  0 part  
sde       8:64   1 447.1G  0 disk  
├─sde1    8:65   1   512M  0 part  
│ └─md1   9:1    0   512M  0 raid1 /boot
├─sde2    8:66   1   3.5G  0 part  
│ └─md5   9:5    0  17.4G  0 raid5 [SWAP]
└─sde3    8:67   1 443.1G  0 part  
sdf       8:80   1 447.1G  0 disk  
├─sdf1    8:81   1   512M  0 part  
│ └─md1   9:1    0   512M  0 raid1 /boot
├─sdf2    8:82   1   3.5G  0 part  
│ └─md5   9:5    0  17.4G  0 raid5 [SWAP]
└─sdf3    8:83   1 443.1G  0 part  
sr0      11:0    1  1024M  0 rom   
```

btrfs is spread across /dev/sd[a-f]3.

----------

## Zucca

I did a little dive into /usr/share/genkernel/, and into /usr/share/genkernel/defaults/ to be more specific.

I looked inside linuxrc, and everything there looks OK. The start_volumes function, which runs start_md_volumes, runs before is_livecd || resume_init, which is what tries to resume from the resume device. So my setup should work out-of-the-box.

----------

## frostschutz

You could add a debug message there, or spawn a rescue shell just before the resume, just to see if the RAID is running at all.

You showed this message earlier:

```
[    4.904934] PM: Checking hibernation image partition /dev/md5
[    4.904959] PM: Hibernation image not present or could not be loaded.
```

Is the raid assembly in there too? Perhaps pastebin the dmesg as a whole.

I think you stated the assembly ran late... I don't know genkernel; it doesn't run the md assembly as a background task, does it?   :Confused: 

I don't see anything wrong with the mdadm side of things...

----------

## Zucca

 *frostschutz wrote:*   

> is the raid assembly in there too?

 No, it was not!

The UUID of the raid array changed at some point. Why? I have no clue. I edited mdadm.conf accordingly, rebuilt the initramfs image, and now I have blazing fast wakeups again. Thanks!
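For the record, this kind of drift can be caught before it bites. A sketch comparing the UUID recorded in mdadm.conf against what `mdadm --detail --scan` reports (both lines are hard-coded samples here, since the real scan needs root and an assembled array):

```shell
#!/bin/sh
# ARRAY line as recorded in /etc/mdadm.conf:
conf_line='ARRAY /dev/md5 metadata=1.2 UUID=ad01e865:52cd909d:747e8ef3:823e294a'
# ARRAY line as it would come from: mdadm --detail --scan (needs root)
scan_line='ARRAY /dev/md5 metadata=1.2 UUID=ad01e865:52cd909d:747e8ef3:823e294a'

# Strip everything up to and including "UUID=" to isolate the UUIDs:
conf_uuid=${conf_line##*UUID=}
scan_uuid=${scan_line##*UUID=}

if [ "$conf_uuid" = "$scan_uuid" ]; then
    echo "UUIDs match"
else
    echo "UUID changed: update mdadm.conf and rebuild the initramfs"
fi
```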

Marking as [SOLVED].

----------

## Zucca

Just an update here. I got about 410 MB/s from the raid5 swap when waking up, even though my motherboard has only SATA 2.0, which tops out at 300 MB/s per port (according to Wikipedia).

I think (based on the CPU power consumption) that Linux only uses one thread to read the raid5 stack, and that the CPU is actually bottlenecking the wakeup speed. Bottleneck or not, the wakeup speed is nonetheless amazing. I think a raid0 stack could give better performance, as it does not need to do all that parity work, but I'm not sure it's worth it considering that swap with parity is kind of cool. :P

----------

## Zucca

I may have found a little improvement here.

So. As far as I know, systemd-sleep does not flush caches before writing the hibernation image. I did seem to get faster sleep and wakeup times by running "echo 3 > /proc/sys/vm/drop_caches && sync" before putting my system into hibernation. I, however, don't know in which order I should run those commands.

Waking up might also be sped up by increasing read_ahead. I just haven't had the time (or motivation) to implement this yet. I may need to create a custom linuxrc that sets the read_ahead value before the hibernation image is read.

----------

## Zucca

So I finally did some improvements:

```
[Unit]
Description=Commands to prepare writing the hibernation image
Before=hibernate.target hybrid-sleep.target

[Service]
ExecStart=/usr/local/sbin/pre-hibernate.sh
Type=oneshot

[Install]
WantedBy=hibernate.target hybrid-sleep.target
```

```
#!/bin/sh
# Flush dirty pages first: drop_caches only frees clean cache pages.
sync || exit 1
echo "Cached writes synced to disks."
# 3 = free the page cache plus dentries and inodes.
echo 3 > /proc/sys/vm/drop_caches || exit 1
echo "Disk caches dropped."
```

... and the wakeup is now even faster.

I found it amusing that systemd does not drop disk caches before writing the image (by default). I still believe it syncs the writes to disk, but just to be sure I added a sync command there.

The next step would be to set a higher read_ahead, which might speed up waking up even more. That, however, needs to be done in the initramfs.
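A sketch of that read_ahead step (the device path and the value 8192 are assumptions, not tested settings; only the unit conversion actually runs here, since the real commands need root and the assembled md device):

```shell
#!/bin/sh
# sysfs read_ahead_kb is in KiB, while blockdev --setra takes 512-byte
# sectors, so the --setra value is twice the KiB value.
ra_kb=8192
ra_sectors=$((ra_kb * 2))
echo "$ra_sectors"

# Real commands (root, ideally run from the initramfs before resume):
#   echo "$ra_kb" > /sys/block/md5/queue/read_ahead_kb
#   blockdev --setra "$ra_sectors" /dev/md5
```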

----------

