# NVIDIA drivers hanging udev, resulting in one core at 100%

## rafaelzigx

Hello guys,

I'm having an issue with the proprietary nvidia drivers.

Every time I install them, udev hangs and one of the cores sits at 100% all the time.

The process consuming the processor:

/sbin/udevd --daemon

I can't kill it.

I've already tried kernel versions from 4.9.x up to 4.19. That didn't solve it.

I ran the same tests with different nvidia driver versions.

I also upgraded eudev (even though I wasn't sure it was related). Nothing.

If I use nouveau, I don't have this problem. But as soon as I install the nvidia driver, it starts.

My kernel log:

```
[   65.832518] udevd[2723]: slow: 'lmt-udev auto' [2786]

[   66.873927] udevd[2690]: worker [2724] /module/nvidia is taking a long time

[   66.873931] udevd[2690]: worker [2754] /devices/pci0000:00/0000:00:01.0/0000:01:00.0 is taking a long time

[   66.873933] udevd[2690]: worker [2723] /devices/system/machinecheck/machinecheck3 is taking a long time

[  185.917368] udevd[2724]: timeout 'nvidia-udev.sh add'

[  185.917378] udevd[2724]: slow: 'nvidia-udev.sh add' [2868]

[  186.918427] udevd[2724]: timeout: killing 'nvidia-udev.sh add' [2868]

[  186.918443] udevd[2724]: slow: 'nvidia-udev.sh add' [2868]

[  186.918626] udevd[2724]: 'nvidia-udev.sh add' [2868] terminated by signal 9 (Killed)

[  186.928568] udevd[2723]: timeout: killing 'lmt-udev auto' [2786]

[  186.928577] udevd[2723]: slow: 'lmt-udev auto' [2786]

[  186.928714] udevd[2723]: 'lmt-udev auto' [2786] terminated by signal 9 (Killed)

[  189.931852] udevd[2690]: worker [2754] /devices/pci0000:00/0000:00:01.0/0000:01:00.0 timeout; kill it

[  189.931861] udevd[2690]: seq 1837 '/devices/pci0000:00/0000:00:01.0/0000:01:00.0' killed

[  246.983258] INFO: task laptop_mode:5703 blocked for more than 120 seconds.

[  246.983260]       Tainted: P           OE     4.19.1-gentoo-vulkan #1

[  246.983261] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

[  246.983262] laptop_mode     D    0  5703   3755 0x00000000

[  246.983264] Call Trace:

[  246.983269]  ? __schedule+0x250/0x800

[  246.983271]  schedule+0x28/0x80

[  246.983272]  schedule_preempt_disabled+0xa/0x10

[  246.983274]  __mutex_lock.isra.1+0x24d/0x490

[  246.983276]  ? wp_page_copy+0x318/0x640

[  246.983279]  ? control_store+0x20/0x80

[  246.983280]  control_store+0x20/0x80

[  246.983283]  kernfs_fop_write+0x105/0x180

[  246.983286]  __vfs_write+0x36/0x180

[  246.983288]  ? selinux_file_permission+0x11f/0x130

[  246.983289]  ? security_file_permission+0x2c/0xb0

[  246.983291]  vfs_write+0xb0/0x190

[  246.983293]  ksys_write+0x52/0xc0

[  246.983295]  do_syscall_64+0x5a/0x110

[  246.983297]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  246.983299] RIP: 0033:0x7f819f211da8

[  246.983303] Code: Bad RIP value.

[  246.983304] RSP: 002b:00007ffd95ab9370 EFLAGS: 00000246 ORIG_RAX: 0000000000000001

[  246.983305] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f819f211da8

[  246.983306] RDX: 0000000000000003 RSI: 0000563fd80fbab0 RDI: 0000000000000001

[  246.983307] RBP: 0000563fd80fbab0 R08: 000000000000000a R09: 0000563fd8130270

[  246.983308] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f819f4e1760

[  246.983309] R13: 0000000000000003 R14: 00007f819f4dc760 R15: 0000000000000003

[  369.863272] INFO: task laptop_mode:5703 blocked for more than 120 seconds.

[  369.863274]       Tainted: P           OE     4.19.1-gentoo-vulkan #1

[  369.863274] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

[  369.863275] laptop_mode     D    0  5703   3755 0x00000000

[  369.863277] Call Trace:

[  369.863281]  ? __schedule+0x250/0x800

[  369.863282]  schedule+0x28/0x80

[  369.863283]  schedule_preempt_disabled+0xa/0x10

[  369.863284]  __mutex_lock.isra.1+0x24d/0x490

[  369.863287]  ? wp_page_copy+0x318/0x640

[  369.863289]  ? control_store+0x20/0x80

[  369.863290]  control_store+0x20/0x80

[  369.863292]  kernfs_fop_write+0x105/0x180

[  369.863294]  __vfs_write+0x36/0x180

[  369.863296]  ? selinux_file_permission+0x11f/0x130

[  369.863297]  ? security_file_permission+0x2c/0xb0

[  369.863299]  vfs_write+0xb0/0x190

[  369.863300]  ksys_write+0x52/0xc0

[  369.863302]  do_syscall_64+0x5a/0x110

[  369.863303]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  369.863305] RIP: 0033:0x7f819f211da8

[  369.863308] Code: Bad RIP value.

[  369.863309] RSP: 002b:00007ffd95ab9370 EFLAGS: 00000246 ORIG_RAX: 0000000000000001

[  369.863310] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f819f211da8

[  369.863310] RDX: 0000000000000003 RSI: 0000563fd80fbab0 RDI: 0000000000000001

[  369.863311] RBP: 0000563fd80fbab0 R08: 000000000000000a R09: 0000563fd8130270

[  369.863311] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f819f4e1760

[  369.863312] R13: 0000000000000003 R14: 00007f819f4dc760 R15: 0000000000000003

[  492.743282] INFO: task laptop_mode:5703 blocked for more than 120 seconds.

[  492.743283]       Tainted: P           OE     4.19.1-gentoo-vulkan #1

[  492.743284] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

[  492.743285] laptop_mode     D    0  5703   3755 0x00000000

[  492.743286] Call Trace:

[  492.743291]  ? __schedule+0x250/0x800

[  492.743292]  schedule+0x28/0x80

[  492.743293]  schedule_preempt_disabled+0xa/0x10

[  492.743294]  __mutex_lock.isra.1+0x24d/0x490

[  492.743297]  ? wp_page_copy+0x318/0x640

[  492.743299]  ? control_store+0x20/0x80

[  492.743300]  control_store+0x20/0x80

[  492.743302]  kernfs_fop_write+0x105/0x180

[  492.743304]  __vfs_write+0x36/0x180

[  492.743306]  ? selinux_file_permission+0x11f/0x130

[  492.743307]  ? security_file_permission+0x2c/0xb0

[  492.743309]  vfs_write+0xb0/0x190

[  492.743310]  ksys_write+0x52/0xc0

[  492.743312]  do_syscall_64+0x5a/0x110

[  492.743326]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  492.743327] RIP: 0033:0x7f819f211da8

[  492.743330] Code: Bad RIP value.

[  492.743331] RSP: 002b:00007ffd95ab9370 EFLAGS: 00000246 ORIG_RAX: 0000000000000001

[  492.743332] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f819f211da8

[  492.743333] RDX: 0000000000000003 RSI: 0000563fd80fbab0 RDI: 0000000000000001

[  492.743333] RBP: 0000563fd80fbab0 R08: 000000000000000a R09: 0000563fd8130270

[  492.743334] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f819f4e1760

[  492.743335] R13: 0000000000000003 R14: 00007f819f4dc760 R15: 0000000000000003

[  615.623274] INFO: task laptop_mode:5703 blocked for more than 120 seconds.

[  615.623276]       Tainted: P           OE     4.19.1-gentoo-vulkan #1

[  615.623276] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

[  615.623277] laptop_mode     D    0  5703   3755 0x00000000

[  615.623278] Call Trace:

[  615.623282]  ? __schedule+0x250/0x800

[  615.623283]  schedule+0x28/0x80

[  615.623284]  schedule_preempt_disabled+0xa/0x10

[  615.623286]  __mutex_lock.isra.1+0x24d/0x490

[  615.623288]  ? wp_page_copy+0x318/0x640

[  615.623290]  ? control_store+0x20/0x80

[  615.623291]  control_store+0x20/0x80

[  615.623293]  kernfs_fop_write+0x105/0x180

[  615.623295]  __vfs_write+0x36/0x180

[  615.623297]  ? selinux_file_permission+0x11f/0x130

[  615.623298]  ? security_file_permission+0x2c/0xb0

[  615.623300]  vfs_write+0xb0/0x190

[  615.623301]  ksys_write+0x52/0xc0

[  615.623303]  do_syscall_64+0x5a/0x110

[  615.623304]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  615.623306] RIP: 0033:0x7f819f211da8

[  615.623309] Code: Bad RIP value.

[  615.623309] RSP: 002b:00007ffd95ab9370 EFLAGS: 00000246 ORIG_RAX: 0000000000000001

[  615.623311] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f819f211da8

[  615.623311] RDX: 0000000000000003 RSI: 0000563fd80fbab0 RDI: 0000000000000001

[  615.623312] RBP: 0000563fd80fbab0 R08: 000000000000000a R09: 0000563fd8130270

[  615.623312] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f819f4e1760

[  615.623313] R13: 0000000000000003 R14: 00007f819f4dc760 R15: 0000000000000003

[  738.503276] INFO: task laptop_mode:5703 blocked for more than 120 seconds.

[  738.503278]       Tainted: P           OE     4.19.1-gentoo-vulkan #1

[  738.503278] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

```

My hardware:

```
┌─[rafael][vulkan][~]

└─▪ inxi -v 2

System:    Host: vulkan Kernel: 4.19.1-gentoo-vulkan x86_64 bits: 64 Desktop: Xfce 4.12.4

           Distro: Gentoo Base System release 2.4.1

Machine:   Device: laptop System: Dell product: XPS 15 9560 serial: N/A

           Mobo: Dell model: 05FFDN v: A00 serial: N/A UEFI: Dell v: 1.12.1 date: 10/02/2018

Battery    BAT0: charge: 36.4 Wh 74.2% condition: 49.0/56.0 Wh (88%)

CPU:       Quad core Intel Core i7-7700HQ (-MT-MCP-) speed/max: 3510/3800 MHz

Graphics:  Card-1: Intel Device 591b

           Card-2: NVIDIA GP107M [GeForce GTX 1050 Mobile]

           Display Server: X.Org 1.20.3 driver: modesetting Resolution: 1920x1080@59.93hz

           OpenGL: renderer: Mesa DRI Intel HD Graphics 630 (Kaby Lake GT2) version: 4.5 Mesa 18.2.4

Network:   Card-1: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter driver: ath10k_pci

           Card-2: Qualcomm Atheros

Drives:    HDD Total Size: 3024.6GB (2.5% used)

           ID-1: model: Samsung_SSD_960_PRO_1TB

           ID-2: model: Ultra_Slim_PL

Info:      Processes: 217 Uptime: 33 min Memory: 1097.2/15930.2MB Client: Shell (bash) inxi: 2.3.56

```

Thanks in advance.

[Moderator edit: added [code] tags to preserve output layout. -Hu]

----------

## javeree

I have had the same issue for a few days.

Bug https://bugs.gentoo.org/show_bug.cgi?id=454740 describes the cause, but the solution there is a workaround, and I think what happens is a race condition in the script's workaround, since the error does not happen on every boot. Between the time the script checks for the existence of the nvidia module and the execution of nvidia-smi, the module is unloaded again.

I see that sometimes, after say one hour, it suddenly succeeds and gets the module loaded. I also see that manually loading nvidia-drm can break the loop.

So for now, as a poor workaround, I have added:

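```
# Create a local.d start script that loads nvidia-drm at boot.
# The shebang must be the very first line of the generated file.
cat > /etc/local.d/nvidia-break-udevd-lock.start <<'EOF'
#!/bin/sh
modprobe nvidia-drm
EOF

# Make the script executable and enable the local service.
chmod +x /etc/local.d/nvidia-break-udevd-lock.start
rc-update add local
```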

The next thing I will check: at one point I saw that the 'nvidia' module was loaded, but not nvidia-drm. So maybe the check in nvidia-udev.sh should not be lsmod | grep -iq nvidia, but rather lsmod | grep -iq nvidia_drm (lsmod lists module names with underscores, so a pattern with a dash would never match).
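A minimal sketch of what such a stricter check could look like, for illustration only. It is not the code shipped in nvidia-udev.sh. The helper name module_loaded and its optional second argument are hypothetical. It matches the module name exactly against the first column of /proc/modules (the same data lsmod prints), so a loaded 'nvidia' or 'nvidia_modeset' is not mistaken for nvidia_drm.

```shell
#!/bin/sh
# Hypothetical helper (not part of nvidia-udev.sh): succeed only if the exact
# module name appears in the first column of a /proc/modules-style listing.
#   $1 = module name; dashes are converted to underscores, which is how the
#        kernel reports module names
#   $2 = listing to read, defaults to /proc/modules
module_loaded() {
    name=$(printf '%s' "$1" | tr '-' '_')
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Example use: report whether nvidia_drm itself is loaded.
if module_loaded nvidia-drm; then
    echo "nvidia_drm is loaded"
else
    echo "nvidia_drm is not loaded"
fi
```

Note that an exact match only narrows what is being checked. If the module can still be unloaded between the check and the call to nvidia-smi, the race itself remains.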

----------

## javeree

There is also an additional bug with a proposed solution:

https://bugs.gentoo.org/504326

----------

## krinn

Another one to have a look at: https://forums.gentoo.org/viewtopic-p-8280144.html#8280144

----------

