# Root processes fail to end

## Skinjob2707

I am having a problem and I'm not sure where to begin troubleshooting. After the system has been running for a while (anywhere from 45 minutes to overnight), processes run as root fail to exit. If I run emerge --newuse --update --ask @world, the first package in the list is installed, but emerge stalls at the end and never moves on to the next package. If I run gvim as root and click the x in the upper right-hand corner to close it, it sits for a little while before the "process will not end" dialog comes up. If I then click Terminate, everything in the gvim window is grayed out, but the window will not go away without a reboot.

The output from ps aux | grep gvim reads: 

```
root      9967  0.0  0.1 172616 24908 ?        Ds   13:57   0:00 gvim nvidia-cudnn-bin-8.0-r6.ebuild
```
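
The `Ds` in the STAT column above means the process is in uninterruptible sleep (state D), which is why it cannot be killed. As a minimal sketch for finding every such process, the helper below filters ps-style output. The function name `d_state_procs` and the column layout are assumptions made for this illustration, not something from the original post.

```shell
# List processes in uninterruptible sleep (state D) from ps-style input.
# Assumed columns: PID STAT WCHAN COMMAND, with a header on the first line.
d_state_procs() {
    # Skip the header row, then keep rows whose STAT field starts with "D".
    awk 'NR > 1 && $2 ~ /^D/ { print }'
}
```

On a live system this could be fed with `ps -eo pid,stat,wchan:32,comm | d_state_procs`. The WCHAN column shows the kernel function each process is sleeping in, which can hint at the subsystem it is stuck on.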

The only way to get rid of it is to reboot. When I reboot, the shutdown process often hangs because there is an open file on either the /boot or the root partition.

When I look through the output of journalctl --pager-end, I don't see any error messages that look relevant.
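
Scrolling the whole journal makes kernel problems easy to miss. One possible way to narrow it down, sketched here under assumptions, is to look only at kernel messages and pull out the source location of any "kernel BUG at" line. The function name `bug_locations` is hypothetical, and the pattern only matches the message format `kernel BUG at <file>:<line>!`.

```shell
# Print the source location from every "kernel BUG at <file>:<line>!" line
# read on stdin. Lines that do not match the pattern are dropped.
bug_locations() {
    sed -n 's/.*kernel BUG at \([^!]*\)!.*/\1/p'
}
```

With systemd's journal this could be used as `journalctl -k -b | bug_locations` for the current boot, or with `-b -1` for the previous boot if the machine has already been restarted.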

My current kernel config is at:

https://pastebin.com/VEMNFAvv

Thanks in advance for your assistance!

----------

## Skinjob2707

Looking through the logs, I see the following, but it isn't clear whether it is the cause of the problem:

```
Jun 22 13:14:51 bluemeanie kernel: ------------[ cut here ]------------
Jun 22 13:14:51 bluemeanie kernel: kernel BUG at fs/f2fs/gc.c:899!
Jun 22 13:14:51 bluemeanie kernel: invalid opcode: 0000 [#1] PREEMPT SMP
Jun 22 13:14:51 bluemeanie kernel: Modules linked in: nvidia_drm(PO) arc4 ath9k ath9k_common ath9k_hw mac80211 input_le
Jun 22 13:14:51 bluemeanie kernel: CPU: 0 PID: 1044 Comm: f2fs_gc-8:3 Tainted: P           O    4.11.6-gentoo #1
Jun 22 13:14:51 bluemeanie kernel: Hardware name: Micro-Star International Co., Ltd MS-7A34/B350 PC MATE(MS-7A34), BIOS
Jun 22 13:14:51 bluemeanie kernel: task: ffff88040de8cf40 task.stack: ffffc90000468000
Jun 22 13:14:51 bluemeanie kernel: RIP: 0010:do_garbage_collect+0x9e1/0xb00
Jun 22 13:14:51 bluemeanie kernel: RSP: 0018:ffffc9000046bcb0 EFLAGS: 00010297
Jun 22 13:14:51 bluemeanie kernel: RAX: ffff8801944aa000 RBX: 0000000000000000 RCX: 0000000000000000
Jun 22 13:14:51 bluemeanie kernel: RDX: ffff880000000000 RSI: 0000000000000003 RDI: ffffea0005870530
Jun 22 13:14:51 bluemeanie kernel: RBP: ffffc9000046bdb0 R08: ffff880207505b90 R09: ffffea000587054c
Jun 22 13:14:51 bluemeanie kernel: R10: ffffc9000046bc38 R11: 0000000000000040 R12: 0000000000000006
Jun 22 13:14:51 bluemeanie kernel: R13: ffff88040d033708 R14: ffff88040dc0e800 R15: ffffea0005870530
Jun 22 13:14:51 bluemeanie kernel: FS:  0000000000000000(0000) GS:ffff88041ec00000(0000) knlGS:0000000000000000
Jun 22 13:14:51 bluemeanie kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 22 13:14:51 bluemeanie kernel: CR2: 00007ffac2da9010 CR3: 00000001a9e2c000 CR4: 00000000003406f0
Jun 22 13:14:51 bluemeanie kernel: Call Trace:
Jun 22 13:14:51 bluemeanie kernel:  ? _raw_spin_lock+0x12/0x40
Jun 22 13:14:51 bluemeanie kernel:  ? pick_next_task_fair+0x585/0x9c0
Jun 22 13:14:51 bluemeanie kernel:  ? find_next_bit+0xb/0x10
Jun 22 13:14:51 bluemeanie kernel:  f2fs_gc+0x19f/0x470
Jun 22 13:14:51 bluemeanie kernel:  ? f2fs_gc+0x19f/0x470
Jun 22 13:14:51 bluemeanie kernel:  ? del_timer_sync+0x30/0x50
Jun 22 13:14:51 bluemeanie kernel:  ? preempt_count_add+0xa3/0xc0
Jun 22 13:14:51 bluemeanie kernel:  gc_thread_func+0x2eb/0x340
Jun 22 13:14:51 bluemeanie kernel:  ? gc_thread_func+0x2eb/0x340
Jun 22 13:14:51 bluemeanie kernel:  ? wake_atomic_t_function+0x50/0x50
Jun 22 13:14:51 bluemeanie kernel:  kthread+0xff/0x140
Jun 22 13:14:51 bluemeanie kernel:  ? f2fs_gc+0x470/0x470
Jun 22 13:14:51 bluemeanie kernel:  ? kthread_create_on_node+0x40/0x40
Jun 22 13:14:51 bluemeanie kernel:  ? umh_complete+0x40/0x40
Jun 22 13:14:51 bluemeanie kernel:  ? call_usermodehelper_exec_async+0x137/0x140
Jun 22 13:14:51 bluemeanie kernel:  ret_from_fork+0x29/0x40
Jun 22 13:14:51 bluemeanie kernel: Code: ff e9 d3 fd ff ff 8b 55 8c 44 89 f1 4c 89 ef e8 b6 ee ff ff e9 78 fe ff ff 8b 
Jun 22 13:14:51 bluemeanie kernel: RIP: do_garbage_collect+0x9e1/0xb00 RSP: ffffc9000046bcb0
Jun 22 13:14:51 bluemeanie kernel: ---[ end trace 71b180caf0c5dabb ]---
```

----------

## Hu

This is the assertion at fs/f2fs/gc.c:899, inside do_garbage_collect:

```
 860 static int do_garbage_collect(struct f2fs_sb_info *sbi,
 865   struct f2fs_summary_block *sum;
 870   unsigned char type = IS_DATASEG(get_seg_entry(sbi, segno)->type) ?
 871                  SUM_TYPE_DATA : SUM_TYPE_NODE;
 898      sum = page_address(sum_page);
 899      f2fs_bug_on(sbi, type != GET_SUM_TYPE((&sum->footer)));
```

Normally, I would suggest that you try to reproduce the problem with an untainted kernel.  I doubt that will matter here, but it's still good practice, if for no other reason than that you are unlikely to get much support upstream for a tainted kernel.
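
For reference, the `Tainted: P           O` string in the log corresponds to two bits of the kernel's taint mask: bit 0 (P, a proprietary module was loaded) and bit 12 (O, an out-of-tree module was loaded). A rough sketch for checking just those two bits follows. The function name `decode_taint` is made up for this example, and all other taint bits are deliberately ignored.

```shell
# Decode two bits of the kernel taint mask passed as $1:
#   bit 0  (value 1)    -> "P", proprietary module loaded
#   bit 12 (value 4096) -> "O", out-of-tree module loaded
# Every other taint bit is ignored in this sketch.
decode_taint() {
    mask=$1
    flags=""
    if [ $(( mask & 1 )) -ne 0 ]; then
        flags="P"
    fi
    if [ $(( mask & 4096 )) -ne 0 ]; then
        flags="${flags:+$flags }O"
    fi
    echo "${flags:-not tainted (P/O bits clear)}"
}
```

On a running system the mask can be read from /proc/sys/kernel/tainted, for example `decode_taint "$(cat /proc/sys/kernel/tainted)"`. A value of 0 there means the kernel is not tainted at all.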

----------

## Skinjob2707

You make a good point about the tainted kernel. Here is the same problem happening with an untainted kernel: 

```
[  494.543223] kernel BUG at fs/f2fs/gc.c:899!
[  494.543812] invalid opcode: 0000 [#1] PREEMPT SMP
[  494.544396] Modules linked in: arc4 ath9k ath9k_common ath9k_hw mac80211 input_leds ath cfg80211 nouveau snd_hda_codec_realtek snd_hda_codec_generic video led_class i2c_algo_bit hwmon drm_kms_helper syscopyarea sysfillrect snd_hda_intel sysimgblt fb_sys_fops snd_hda_codec ttm kvm snd_hwdep snd_hda_core irqbypass snd_pcm drm snd_timer pcspkr i2c_piix4 snd wmi 8250 8250_base serial_core button acpi_cpufreq r8169 mii efivarfs
[  494.545753] CPU: 0 PID: 1120 Comm: f2fs_gc-8:3 Not tainted 4.11.6-gentoo #1
[  494.546424] Hardware name: Micro-Star International Co., Ltd MS-7A34/B350 PC MATE(MS-7A34), BIOS A.30 04/19/2017
[  494.547107] task: ffff88040e48ae80 task.stack: ffffc90000418000
[  494.547798] RIP: 0010:do_garbage_collect+0x9e1/0xb00
[  494.548483] RSP: 0018:ffffc9000041bcb0 EFLAGS: 00010297
[  494.549162] RAX: ffff880255dc8000 RBX: 0000000000000000 RCX: 0000000000000000
[  494.549845] RDX: ffff880000000000 RSI: 0000000000000003 RDI: ffffea00082c83c0
[  494.550524] RBP: ffffc9000041bdb0 R08: ffff88040767b4d8 R09: ffffea00082c83dc
[  494.551201] R10: ffffc9000041bc38 R11: 0000000000000040 R12: 0000000000000007
[  494.551877] R13: ffff88040e3b04c8 R14: ffff88040d1d9800 R15: ffffea00082c83c0
[  494.552552] FS:  0000000000000000(0000) GS:ffff88041ec00000(0000) knlGS:0000000000000000
[  494.553230] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  494.553906] CR2: 00007f41d4caf038 CR3: 0000000407be6000 CR4: 00000000003406f0
[  494.554592] Call Trace:
[  494.555275]  ? find_next_bit+0xb/0x10
[  494.555953]  f2fs_gc+0x19f/0x470
[  494.556626]  ? f2fs_gc+0x19f/0x470
[  494.557296]  ? del_timer_sync+0x20/0x50
[  494.557960]  ? preempt_count_add+0xa3/0xc0
[  494.558620]  gc_thread_func+0x2eb/0x340
[  494.559278]  ? gc_thread_func+0x2eb/0x340
[  494.559939]  ? wake_atomic_t_function+0x50/0x50
[  494.560603]  kthread+0xff/0x140
[  494.561258]  ? f2fs_gc+0x470/0x470
[  494.561916]  ? kthread_create_on_node+0x40/0x40
[  494.562569]  ret_from_fork+0x29/0x40
[  494.563216] Code: ff e9 d3 fd ff ff 8b 55 8c 44 89 f1 4c 89 ef e8 b6 ee ff ff e9 78 fe ff ff 8b 75 94 4c 89 ff e8 56 7f 00 00 e9 06 fc ff ff 0f 0b <0f> 0b 44 89 fe 48 89 cf e8 52 8e 00 00 e9 db f8 ff ff 0f b6 b5 
[  494.563935] RIP: do_garbage_collect+0x9e1/0xb00 RSP: ffffc9000041bcb0
[  494.570544] ---[ end trace bb6a511b13617ddd ]---
```

A Google search turned up this mailing list entry, which involves at least the same source code line: https://www.mail-archive.com/linux-f2fs-devel@lists.sourceforge.net/msg06152.html

Is the f2fs-devel mailing list the right place to inquire about this issue? Or should I try to apply the patch using Portage's patch facility?
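
For the Portage route, user patches are conventionally picked up from a directory under /etc/portage/patches named after the package's category and name. The sketch below only stages a patch file in such a directory. The function name `install_user_patch` and the file name `f2fs-fix.patch` are hypothetical, and whether the installed kernel sources ebuild actually applies user patches is an assumption that should be checked before relying on this.

```shell
# Copy a patch into a Portage user-patch directory.
#   $1 = patch file to install
#   $2 = category/package directory, e.g. sys-kernel/gentoo-sources
#   $3 = patches root (optional, defaults to /etc/portage/patches)
install_user_patch() {
    patch_file=$1
    pkg=$2
    root=${3:-/etc/portage/patches}
    # Create the per-package directory, then copy the patch into it.
    mkdir -p "$root/$pkg" && cp "$patch_file" "$root/$pkg/"
}
```

A call might look like `install_user_patch f2fs-fix.patch sys-kernel/gentoo-sources`, run as root, followed by re-emerging the kernel sources and rebuilding the kernel so the patch takes effect.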

Thanks for your help!

----------

## Hu

I am not sufficiently familiar with f2fs to know whether that patch is useful.  I suggest contacting the f2fs mailing list, explaining your problem, and asking their opinion on the patch.

----------

## Skinjob2707

I upgraded on Sunday morning to kernel 4.11.7 and haven't had the problem since.  Unless it recurs,  I'm going to consider the problem solved.

Thanks for your help!

----------

