# Frozen system when booting on f2fs

## mani001

Hi,

I recently purchased a new SSD for an old computer (10+ years). I copied the boot partition from the old mechanical drive to an f2fs-based partition on the new one. However, every time I restart the computer, instead of booting Gentoo I get the grub command prompt (as if something was missing). Now, the twist is that I can work around this by booting the system with "System Rescue", mounting the (f2fs) root partition (/dev/sdb2), unmounting it, and restarting the system normally (now without the System Rescue USB stick). It seems to me that Gentoo (systemd?) unmounts my / partition in some special way (setting/unsetting some flag), but System Rescue (Arch) doesn't. I read somewhere that if you set the "chkfsck bit" for an f2fs partition in /etc/fstab you can run into trouble, but that's not it (it's not set). Any ideas?

Cheers.

*Last edited by mani001 on Tue Jul 20, 2021 7:13 pm; edited 2 times in total*

----------

## Goverp

AFAIK Gentoo doesn't do anything special in this respect, but I use f2fs with OpenRc, not systemd.

There's a general problem with f2fs that it doesn't allow the boot-time fsck that ext4 etc. do, hence the /etc/fstab non-setting.  My system has a small initramfs that calls "fsck.f2fs -p", though it almost invariably finds nothing to do.

There's another problem if you have an old version of f2fs: it used to count a change of kernel/f2fs version as requiring a run of fsck.f2fs.  Not sure when they fixed that, something like the last six months.  I can't quite see that applying in this case though.

Another issue is that Grub2 can't handle an f2fs filesystem built with the extra_attr option; if you don't know what that means, you don't have it.  If you do have it, you need to put the kernel on some other filesystem, such as an ext4 /boot partition.

Check the dmesg/syslog/whatever for booting via System Rescue and without to see if there are any messages from f2fs.

----------

## mani001

Hi,

sorry for the delay...I haven't had physical access to this computer for a while.

The problem just worked itself out  :Smile:  I restarted the computer (and also turned it off and on again) without a glitch. I don't think I have updated any related package since I first posted this, so I have no idea what happened.

Thanks, anyway!!

----------

## mani001

I spoke too soon: it's still happening   :Confused:   I took your advice and grep'ed dmesg

```
[    2.428940] F2FS-fs (sdf1): Found nat_bits in checkpoint
[    2.524859] F2FS-fs (sdf1): recover fsync data on readonly fs
[    2.525106] F2FS-fs (sdf1): Mounted with checkpoint version = 20cc2673
[    2.525186] VFS: Mounted root (f2fs filesystem) readonly on device 8:81.
[    3.969591] F2FS-fs (sdf3): Found nat_bits in checkpoint
[    4.079099] F2FS-fs (sdf3): Mounted with checkpoint version = 3df21ed6
```

and journalctl

```
jun 04 19:07:33 totolaca kernel: F2FS-fs (sdf1): Found nat_bits in checkpoint
jun 04 19:07:33 totolaca kernel: F2FS-fs (sdf1): recover fsync data on readonly fs
jun 04 19:07:33 totolaca kernel: F2FS-fs (sdf1): Mounted with checkpoint version = 20cb7f1d
jun 04 19:07:33 totolaca kernel: VFS: Mounted root (f2fs filesystem) readonly on device 8:81.
jun 04 19:07:34 totolaca systemd[1]: Found device KINGSTON_SA400S37480G USRLOCALF2FS.
jun 04 19:07:34 totolaca systemd[1]: Found device KINGSTON_SA400S37480G SWAPF2FS.
jun 04 19:07:34 totolaca systemd[1]: Activating swap /dev/disk/by-label/SWAPF2FS...
jun 04 19:07:34 totolaca systemd[1]: Starting File System Check on /dev/disk/by-label/USRLOCALF2FS...
jun 04 19:07:34 totolaca systemd[1]: Activated swap /dev/disk/by-label/SWAPF2FS.
jun 04 19:07:34 totolaca systemd[1]: Finished File System Check on /dev/disk/by-label/USRLOCALF2FS.
jun 04 19:07:34 totolaca kernel: F2FS-fs (sdf3): Found nat_bits in checkpoint
jun 04 19:07:34 totolaca kernel: F2FS-fs (sdf3): Mounted with checkpoint version = 3df19b25
jul 07 13:33:10 totolaca systemd[1]: dev-disk-by\x2dlabel-SWAPF2FS.swap: Deactivated successfully.
jul 07 13:33:10 totolaca systemd[1]: Deactivated swap /dev/disk/by-label/SWAPF2FS.
jul 07 13:33:11 totolaca systemd[1]: systemd-fsck@dev-disk-by\x2dlabel-USRLOCALF2FS.service: Deactivated successfully.
jul 07 13:33:11 totolaca systemd[1]: Stopped File System Check on /dev/disk/by-label/USRLOCALF2FS.
```

I did the same after booting System Rescue, but nothing came out (no mention of f2fs in dmesg or journalctl   :Rolling Eyes:  )

Might it be the checkpoint thing (sdf1 is my new F2FS root)?

By the way, the exact error I get when trying to boot is

```
attempt to read or write outside of partition
```

----------

## Goverp

 *mani001 wrote:*   

> ...
> 
> ```
> attempt to read or write outside of partition
> ```
> ...

 

OK, so what's your partition layout on /dev/sdf ? Do you know what's issuing that message - grub, linux, systemd ?

----------

## mani001

The error message seems issued by grub (it's below "Welcome to grub", and after that I can see the prompt for grub's rescue mode). About the structure, fdisk -l returns (in Spanish...hopefully it's OK)

```
Disco /dev/sdf: 447,13 GiB, 480103981056 bytes, 937703088 sectores
Modelo de disco: KINGSTON SA400S3
Unidades: sectores de 1 * 512 = 512 bytes
Tamaño de sector (lógico/físico): 512 bytes / 512 bytes
Tamaño de E/S (mínimo/óptimo): 512 bytes / 512 bytes
Tipo de etiqueta de disco: gpt
Identificador del disco: 2FF7A440-45A8-294D-A766-E5FBC04AA147

Disposit.  Comienzo     Final  Sectores Tamaño Tipo
/dev/sdf1      2048  85729279  85727232  40,9G Sistema de ficheros de Linux
/dev/sdf2  85729280  87777279   2048000  1000M Linux swap
/dev/sdf3  87777280 937701375 849924096 405,3G Sistema de ficheros de Linux
```

----------

## Hu

 *mani001 wrote:*   

> fdisk -l returns (in Spanish...hopefully it's OK)

 That should be fine here.  In the future, if you need English, use LC_ALL=C fdisk -l to force the output not to be translated.

----------

## mani001

 *Quote:*   

> use LC_ALL=C fdisk -l to force the output not to be translated.

 

Nice trick!!   :Very Happy:  didn't know about this one. Thanks.

```
Disk /dev/sdf: 447.13 GiB, 480103981056 bytes, 937703088 sectors
Disk model: KINGSTON SA400S3
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 2FF7A440-45A8-294D-A766-E5FBC04AA147

Device        Start       End   Sectors   Size Type
/dev/sdf1      2048  85729279  85727232  40.9G Linux filesystem
/dev/sdf2  85729280  87777279   2048000  1000M Linux swap
/dev/sdf3  87777280 937701375 849924096 405.3G Linux filesystem
```

----------

## Goverp

 *mani001 wrote:*   

> The error message seems issued by grub (it's below "Welcome to grub", and after that I can see the prompt for grub's rescue mode). About the structure, fdisk -l returns (in Spanish...hopefully it's OK)...

 

That all looks OK.  If grub is issuing the message "attempt to read or write outside of partition" it must be something to do with either grub.cfg, or grub's trying to read something that's triggered by Gentoo boot but not System Rescue.
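The "looks OK" verdict can be double-checked with a little arithmetic on the numbers from the fdisk listing. This is only a sketch: it checks that each partition's sector count matches its start/end pair, that the partitions do not overlap, and that the last one ends before the disk does. It says nothing about what grub itself sees at boot time.

```shell
#!/bin/sh
# Sanity-check the partition table reported by `fdisk -l`.
# All numbers are copied from that listing; units are 512-byte sectors.
disk_sectors=937703088
status=ok
prev_end=2047   # the first partition starts at sector 2048

# Each input line: name start end sectors
while read -r name start end sectors; do
    # The sector count must agree with the start/end pair.
    [ $((end - start + 1)) -eq "$sectors" ] || status="bad size on $name"
    # A partition must start after the previous one ends ...
    [ "$start" -gt "$prev_end" ] || status="$name overlaps its predecessor"
    # ... and must end before the disk does.
    [ "$end" -lt "$disk_sectors" ] || status="$name runs past the end of the disk"
    prev_end=$end
done <<EOF
sdf1 2048 85729279 85727232
sdf2 85729280 87777279 2048000
sdf3 87777280 937701375 849924096
EOF

echo "partition table: $status"
```

With the values above this prints `partition table: ok`, so the table itself is consistent and the grub error has to come from somewhere else.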

AFAIK grub does something to save the details of the last menu selection - I presume in some  ENV block or similar that must get written somewhere.  I've no idea how this (a) should work or (b) how it could go wrong.

An alternative is that grub.cfg is doing something more complex than usual.  Did you write your own, or use grub-mkconfig?  Have you changed anything - Gentoo grub got updated to 2.06-rc1 and then 2.06 recently.  In theory you need to rerun grub-install to update the boot records.  I've not had the nerve yet.  There might be a mismatch, but I can't see why booting from a rescue disk would change it.

One possibility: what grub "root=" does your grub.cfg use?  If it's root=(hd0,gpt) or whatever the correct syntax is  :Smile:  perhaps booting via rescue is reordering your hard drives in the /dev tree.  Otherwise check the "find" records in grub.cfg, and check the hints are up to date.  In fact, this could be it - if you've not updated grub.cfg since moving to the SSD, it might be looking for an obsolete partition UUID, failing, and then fouling up trying to find which device should be root.

Hope that helps.

----------

## mani001

Thank you so much for all the hints!!

I used grub-mkconfig and did nothing "manually". The (I guess) important lines in grub.cfg are

```
set root='hd1,gpt1'

linux   /boot/kernel-5.10.52-gentoo root=/dev/sdf1 ro init=/usr/lib/systemd/systemd
```

About the find records,

```
grep -i find grub.cfg
```

returns nothing.

Also, I just ran

```
grub-install /dev/sda
```

(Linux is in /dev/sdf, but one should install grub in the first HD, right?) but right now I dare not restart until I have physical access to the computer    :Rolling Eyes: 

----------

## Goverp

 *Quote:*   

> ```
> set root='hd1,gpt1'
> ```
> ...

 

I'm surprised grub-mkconfig generated that; if the drives don't get enumerated in the same order, it breaks.  The normal cure is to change to one of:

root=PARTLABEL=<partition-label>

or

root=PARTUUID=<partuuid>.

The same is true of the "set root".  The more resilient version is

search --label=<fs-label> --set=root

or

search --fs-uuid=<fs-uuid> --set=root

Note: partition-label is not fs-label and partuuid is not fs-uuid.  Grub uses different algorithms to the linux kernel, to make things very confusing.
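To look up the partition UUID for a given device, `blkid` from util-linux can print a single tag. The helper below is a hypothetical sketch (the function name `root_arg_for` and the `BLKID` override variable are made up for illustration); it only formats the result as a kernel argument and changes nothing on disk.

```shell
# Build a root=PARTUUID=... kernel argument for a partition.
# BLKID can be set to another command name; it defaults to blkid.
# Querying a real device usually needs root privileges.
root_arg_for() {
    dev=$1
    # -s PARTUUID selects the tag, -o value prints only its value.
    partuuid=$("${BLKID:-blkid}" -s PARTUUID -o value "$dev") || return 1
    [ -n "$partuuid" ] || return 1
    printf 'root=PARTUUID=%s\n' "$partuuid"
}

# Usage (not run here, since it needs the real device):
#   root_arg_for /dev/sdf1
```

`lsblk -o NAME,FSTYPE,LABEL,UUID,PARTLABEL,PARTUUID` shows the filesystem and partition identifiers side by side, which helps keep the two kinds apart.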

 *Quote:*   

> About the find records,
> 
> ```
> grep -i find grub.cfg   
> ```
> ...

 

My bad.  I meant "search" not "find".  I try not to look at grub too often  :Smile: 

 *Quote:*   

> Also, I just ran
> 
> ```
> grub-install /dev/sda
> ```
> ...

 

You install grub wherever your BIOS will look for it.  The device should be whatever it _currently_ is.  I take it you're still using BIOS+MBR booting, not UEFI.

I guess BIOS should enumerate the drives in some static order, but I think the potential problems are why UEFI booting looks for the boot partition using data in the partition table.

----------

## mani001

The full relevant section is:

```
### BEGIN /etc/grub.d/10_linux ###
menuentry 'Gentoo GNU/Linux' --class gentoo --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-ea9ff68f-eff9-46db-b0ca-893034f29447' {
    load_video
    insmod gzio
    insmod part_gpt
    insmod f2fs
    set root='hd1,gpt1'
    if [ x$feature_platform_search_hint = xy ]; then
      search --no-floppy --fs-uuid --set=root --hint-bios=hd5,gpt1 --hint-efi=hd5,gpt1 --hint-baremetal=ahci5,gpt1 --hint='hd1,gpt1'  ea9ff68f-eff9-46db-b0ca-893034f29447
    else
      search --no-floppy --fs-uuid --set=root ea9ff68f-eff9-46db-b0ca-893034f29447
    fi
    echo    'Cargando Linux 5.10.52-gentoo...'
    linux   /boot/kernel-5.10.52-gentoo root=/dev/sdf1 ro init=/usr/lib/systemd/systemd
}
```

so some search is carried out (implying, I take it, that the first set root is useless, as it's overwritten by one of those in the if/else construction?).

About the root=/dev/sdf1 in 

```
linux   /boot/kernel-5.10.52-gentoo root=/dev/sdf1 ro init=/usr/lib/systemd/systemd 
```

I will try the root=PARTLABEL=<partition-label>  alternative when I get physical access to the PC, but I would think root=/dev/sdf1 is perfectly unambiguous   :Shocked:   but, who knows...

 *Quote:*   

> I take it you're still using BIOS+MBR booting, not UEFI

Yes, back in 2010 UEFI was not yet in fashion   :Very Happy:  ...unfortunately...  :Rolling Eyes: 

----------

## Goverp

That all makes sense.  The trouble with the likes of /dev/sdf1 is that the kernel's enumeration depends on (a) what other disks are plugged in (including USB), and (b) timing - stuff happens in parallel, so what's /dev/sdf might become /dev/sde or even /dev/sda depending on various imponderables.
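As a rough illustration of the difference, here is a small sketch that inspects a kernel command line and reports whether its `root=` argument depends on probe order. The function name `classify_root` is made up for this example, and the list of patterns is an assumption rather than an exhaustive catalogue.

```shell
# Classify the root= argument of a kernel command line.
# Prints "stable" for identifiers that do not depend on probe order,
# "enumeration-dependent" for kernel device names, "unknown" otherwise.
# Note: UUID= and LABEL= are normally resolved by an initramfs,
# while PARTUUID= and PARTLABEL= are understood by the kernel itself.
classify_root() {
    # $1 is left unquoted on purpose so the command line splits into words.
    for arg in $1; do
        case $arg in
            root=PARTUUID=*|root=PARTLABEL=*|root=UUID=*|root=LABEL=*)
                echo stable; return ;;
            root=/dev/sd*|root=/dev/hd*|root=/dev/nvme*)
                echo enumeration-dependent; return ;;
        esac
    done
    echo unknown
}

classify_root "root=/dev/sdf1 ro init=/usr/lib/systemd/systemd"
classify_root "root=PARTUUID=<partuuid> ro init=/usr/lib/systemd/systemd"
```

The first call prints `enumeration-dependent` and the second prints `stable`.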

----------

## Tony0945

Why f2fs?   I did the same as you but all my partitions are ext4.  Just add a line in crontab to periodically trim.

```
~ $ hdparm -i /dev/sda
/dev/sda:
 Model=CT1000MX500SSD1, FwRev=M3CR023, SerialNo=1949E22DBCA9
 Config={ Fixed }...
```

```
 ~ $ sudo crontab -l |grep fstrim
15 2 * * tue /sbin/fstrim -va |logger
```

----------

## mani001

It seems (three restarts in a row without a glitch)

```
root=PARTUUID=<partuuid>
```

did the trick   :Very Happy:  Thank you so much!!  Now, I was wondering...is there a way of making this permanent (by adding some file to /etc/grub.d/, I guess)? I had to manually edit grub.cfg, meaning next time I run grub-mkconfig after updating the kernel, the modification will go away. Not such a big deal, but anyway...
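One possible way to keep this across grub-mkconfig runs, as far as the GRUB manual describes it: GRUB 2.04 and later have a `GRUB_DISABLE_LINUX_PARTUUID` setting in /etc/default/grub, which defaults to `true`. Setting it to `false` should make the generated entries use `root=PARTUUID=...` when no initramfs is in use. This is a sketch under those assumptions and has not been tested on this machine; the grub.cfg path in the comment is the usual Gentoo location and is also an assumption.

```shell
# /etc/default/grub (fragment)
# Assumption: GRUB >= 2.04 and a kernel booted without an initramfs.
# Ask grub-mkconfig to pass root=PARTUUID=... instead of a /dev/sdXN name.
GRUB_DISABLE_LINUX_PARTUUID=false

# Afterwards, regenerate the configuration and check the linux line:
#   grub-mkconfig -o /boot/grub/grub.cfg
#   grep 'root=' /boot/grub/grub.cfg
```

This stays in a file that grub-mkconfig reads rather than one it overwrites, so no hand edit of grub.cfg is needed after a kernel update.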

 *Quote:*   

> Why f2fs?

 

Yes, in hindsight it seems choosing F2FS was a bad move. At the moment I probably checked benchmarks of ext4 vs F2FS and saw promising results, but I can't remember. Also, F2FS is supposed to be more state-of-the-art'ish and intended for flash drives. In any case, yes, if I had to do it all over, I would definitely go with ext4 (at least for the time being).

----------

## Goverp

f2fs has the advantage that it's a log-structured file system, so all changes (data as well as metadata) go through the log on their way to the flash device.  By default, ext4 only journals metadata changes.  I think it's possible to change that, but I've not tried.

----------

## mani001

Not solved yet: it has just happened again... I guess I'll have to live with this...or go back to ext4   :Rolling Eyes: 

----------

## Goverp

The message of note is

```
F2FS-fs (sdf1): Found nat_bits in checkpoint
```

which, if I've Googled correctly, (a) should be due to switching off the machine with the fs still mounted and (b) should have been fixed in some version of the kernel since 2017 (which probably means it might occur even if (a)): https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1339391.html

What versions of kernel and f2fs-tools are you using?

Also, are you using an initramfs?  If so, is the version of fsck.f2fs therein up to date?

----------

## mani001

I'm on stable Gentoo:

```
f2fs-tools-1.14.0

gentoo-sources-5.10.52
```

a) looks like it because, as I mentioned, my (crappy) workaround consists in mounting-and-unmounting the partition from SystemRescue

----------

## Goverp

OK, that eliminates out-of-date software.  It looks a bit like Gentoo's systemd setup isn't unmounting the drive properly; which AFAIR is right back to the beginning of this thread.  Sorry to have taken you on a long detour...  Also, that doesn't explain why the dirty bit (or at least nat_bits, whatever they are) is not being cleared by fsck after reboot.

----------

## mani001

I appreciate all the tips (worth a try!!), so thanks anyway

----------

