# Random freeze

## Mika15

Hi everybody,

For the past 2-3 weeks, I have been getting random freezes of the graphical interface (SSH still works, but I can't do anything, not even reboot the computer; it's stuck).

I can go 2-3 days without any and then have 2 within 2-3 hours. It's very annoying and I would like to solve it.

I'm using Plasma, fully updated, with kernel 5.5.9. These freezes appear while using Firefox, whether watching movies or just browsing.

Looking in my /var/log/messages, I found that for every freeze I have:

 *Quote:*   

> Mar 19 07:35:39 mikap kernel: BUG: kernel NULL pointer dereference, address: 0000000000000004
> 
> Mar 19 07:35:39 mikap kernel: #PF: supervisor instruction fetch in kernel mode
> 
> Mar 19 07:35:39 mikap kernel: #PF: error_code(0x0010) - not-present page
> ...

 

Apparently it is a bug in the kernel and/or the graphics driver, but I cannot decipher it and, most importantly, I don't know how to solve it (if that's possible).

Can someone help me or give me more information?

Thank you.

Regards.

----------

## fturco

Please post your emerge --info output.

----------

## Mika15

 *Quote:*   

> mikap /home/mika # emerge --info
> 
> Portage 2.3.94 (python 3.7.7-final-0, default/linux/amd64/17.1/desktop/plasma, gcc-9.3.0, glibc-2.30-r5, 5.5.9-gentoo x86_64)
> 
> =================================================================
> ...

 

----------

## Hu

Is the problem reproducible on an untainted kernel?  The trace you showed is tainted with proprietary and out-of-tree modules.

----------

## Mika15

Hi and thanks for the response.

I looked up what an untainted kernel is, and it is more or less a kernel without proprietary drivers, which in my case means without the Nvidia driver, I think (stop me if I'm wrong).

I have Nvidia driver support built and loaded because I have a discrete graphics card with bumblebee, but I don't use it; it's just in case. I'm always using the Intel i915 graphics. My configuration has been the same for years and this problem is really new. I think it only happens when I'm using Firefox as the foreground app (nothing happens when it's in the background).

I don't really know how to diagnose this problem. Is it an i915 driver problem? Kernel? Firefox? How can I check and gather more information?

Do I have to compile my kernel without nvidia/bumblebee support to try?

Thanks.

----------

## Hu

This is definitely a problem with some kernel component.  Whether it is the nVidia proprietary driver or part of the Linux kernel is not known yet.  I would blacklist the nVidia driver, reboot to clear the taint flags, then repeat the test.
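To confirm after the reboot that the taint flags really are cleared, the value in /proc/sys/kernel/tainted can be decoded. Below is a minimal sketch, assuming a Linux system that exposes that file; the table only covers a subset of the kernel's taint bits, and the helper names `decode_taint` and `read_taint` are made up for illustration.

```python
# Subset of the kernel's taint bits (bit number -> letter and meaning).
# Bits not listed here are reported by number only.
TAINT_BITS = {
    0: "P: proprietary module was loaded",
    1: "F: module was force loaded",
    7: "D: kernel died recently (oops or BUG)",
    9: "W: kernel issued a warning",
    12: "O: out-of-tree module was loaded",
    13: "E: unsigned module was loaded",
}


def decode_taint(value):
    """Return one description per bit set in a numeric taint value."""
    flags = []
    bit = 0
    while value >> bit:
        if value & (1 << bit):
            flags.append(TAINT_BITS.get(bit, f"bit {bit} (see kernel docs)"))
        bit += 1
    return flags


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint value; 0 means the kernel is untainted."""
    with open(path) as f:
        return int(f.read().strip())

# Typical use on the affected machine: print(decode_taint(read_taint()))
```

An empty list means the running kernel is untainted. A value of 12289 (1 + 4096 + 8192) corresponds to the P, O and E letters.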

----------

## Mika15

OK, thanks! It's done. I will try with the nvidia module blacklisted and come back to let you know if the problem persists.

Have a good day!

----------

## Mika15

Hi again! 

My problem is still the same with Nvidia's module blacklisted, and again with Firefox in the foreground: my Plasma GUI freezes, but my computer responds to ping and SSH, and I cannot reboot cleanly!

```
Mar 23 23:57:24 mikap kernel: perf: interrupt took too long (4070 > 3971), lowering kernel.perf_event_max_sample_rate to 49000
Mar 23 23:57:40 mikap kernel: BUG: kernel NULL pointer dereference, address: 0000000000000004
Mar 23 23:57:40 mikap kernel: #PF: supervisor instruction fetch in kernel mode
Mar 23 23:57:40 mikap kernel: #PF: error_code(0x0010) - not-present page
Mar 23 23:57:40 mikap kernel: PGD 0 P4D 0
Mar 23 23:57:40 mikap kernel: Oops: 0010 [#1] SMP PTI
Mar 23 23:57:40 mikap kernel: CPU: 0 PID: 2076 Comm: X Tainted: P           OE     5.5.9-gentoo #1
Mar 23 23:57:40 mikap kernel: Hardware name: Dell Inc.          Inspiron 7537/07PF9F, BIOS A13 06/04/2015
Mar 23 23:57:40 mikap kernel: RIP: 0010:0x4
Mar 23 23:57:40 mikap kernel: Code: Bad RIP value.
Mar 23 23:57:40 mikap kernel: RSP: 0018:ffffc90001517a58 EFLAGS: 00010087
Mar 23 23:57:40 mikap kernel: RAX: 0000000000000004 RBX: ffff888215ccf088 RCX: 0000000080200011
Mar 23 23:57:40 mikap kernel: RDX: ffff888215ccf088 RSI: ffff8881847c8b50 RDI: ffff8881ef4373c0
Mar 23 23:57:40 mikap kernel: RBP: ffff8881ef4373c0 R08: 0000000000000000 R09: 0000000000000001
Mar 23 23:57:40 mikap kernel: R10: ffff888215ccfb80 R11: 0000000000000000 R12: ffffc90001517a60
Mar 23 23:57:40 mikap kernel: R13: ffff8881ef437400 R14: ffff88822b11c7c0 R15: ffff88822b030000
Mar 23 23:57:40 mikap kernel: FS:  00007f6303905d00(0000) GS:ffff888237200000(0000) knlGS:0000000000000000
Mar 23 23:57:40 mikap kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 23 23:57:40 mikap kernel: CR2: ffffffffffffffda CR3: 0000000222394001 CR4: 00000000001606f0
Mar 23 23:57:40 mikap kernel: Call Trace:
Mar 23 23:57:40 mikap kernel:  ? dma_fence_signal_locked+0x82/0x100
Mar 23 23:57:40 mikap kernel:  ? i915_request_retire+0x29e/0x2f0 [i915]
Mar 23 23:57:40 mikap kernel:  ? i915_request_create+0x44/0xc0 [i915]
Mar 23 23:57:40 mikap kernel:  ? i915_gem_do_execbuffer+0x9cb/0x18c0 [i915]
Mar 23 23:57:40 mikap kernel:  ? try_to_wake_up+0x213/0x5e0
Mar 23 23:57:40 mikap kernel:  ? __kmalloc_reserve.isra.0+0x2d/0x70
Mar 23 23:57:40 mikap kernel:  ? pollwake+0x74/0x90
Mar 23 23:57:40 mikap kernel:  ? __wake_up_common+0x7a/0x140
Mar 23 23:57:40 mikap kernel:  ? __wake_up_common_lock+0x8a/0xc0
Mar 23 23:57:40 mikap kernel:  ? unix_stream_sendmsg+0x397/0x3d0
Mar 23 23:57:40 mikap kernel:  ? __kmalloc_node+0x226/0x300
Mar 23 23:57:40 mikap kernel:  ? i915_gem_execbuffer2_ioctl+0x1cf/0x3b0 [i915]
Mar 23 23:57:40 mikap kernel:  ? i915_gem_execbuffer_ioctl+0x2c0/0x2c0 [i915]
Mar 23 23:57:40 mikap kernel:  ? drm_ioctl_kernel+0xaa/0xf0 [drm]
Mar 23 23:57:40 mikap kernel:  ? drm_ioctl+0x1e4/0x370 [drm]
Mar 23 23:57:40 mikap kernel:  ? i915_gem_execbuffer_ioctl+0x2c0/0x2c0 [i915]
Mar 23 23:57:40 mikap kernel:  ? do_vfs_ioctl+0x451/0x6c0
Mar 23 23:57:40 mikap kernel:  ? ksys_ioctl+0x5e/0x90
Mar 23 23:57:40 mikap kernel:  ? __x64_sys_ioctl+0x16/0x20
Mar 23 23:57:40 mikap kernel:  ? do_syscall_64+0x59/0x210
Mar 23 23:57:40 mikap kernel:  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar 23 23:57:40 mikap kernel: Modules linked in: rfcomm ctr ccm bbswitch(OE) cmac algif_hash algif_skcipher af_alg bnep binfmt_misc fuse nls_iso8859_1 vfat fat snd_hda_codec_realtek snd_hda_codec_hdmi uvcvideo snd_hda_codec_generic videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 snd_hda_intel iwlmvm videodev hid_multitouch snd_intel_dspcfg hid_generic mac80211 videobuf2_common usbhid snd_hda_codec x86_pkg_temp_thermal libarc4 intel_powerclamp snd_hwdep iwlwifi btusb snd_hda_core coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel btintel aesni_intel dell_laptop i915 bluetooth cfg80211 snd_pcm dell_wmi sparse_keymap ecdh_generic ecc rtsx_pci_sdmmc crypto_simd cryptd mousedev i2c_algo_bit ledtrig_audio nvidia(POE) rtsx_pci r8169 realtek wmi_bmof glue_helper drm_kms_helper libphy snd_timer psmouse dell_smbios dcdbas dell_wmi_descriptor dell_smm_hwmon syscopyarea sysfillrect sysimgblt ehci_pci fb_sys_fops rfkill snd xhci_pci xhci_hcd drm ehci_hcd video mei_me backlight soundcore wmi intel_smartconnect
Mar 23 23:57:40 mikap kernel:  i2c_i801 lpc_ich [last unloaded: nvidia_modeset]
Mar 23 23:57:40 mikap kernel: CR2: 0000000000000004
Mar 23 23:57:40 mikap kernel: ---[ end trace c33dd1ec1ecb32eb ]---
Mar 23 23:57:40 mikap kernel: RIP: 0010:0x4
Mar 23 23:57:40 mikap kernel: Code: Bad RIP value.
Mar 23 23:57:40 mikap kernel: RSP: 0018:ffffc90001517a58 EFLAGS: 00010087
Mar 23 23:57:40 mikap kernel: RAX: 0000000000000004 RBX: ffff888215ccf088 RCX: 0000000080200011
Mar 23 23:57:40 mikap kernel: RDX: ffff888215ccf088 RSI: ffff8881847c8b50 RDI: ffff8881ef4373c0
Mar 23 23:57:40 mikap kernel: RBP: ffff8881ef4373c0 R08: 0000000000000000 R09: 0000000000000001
Mar 23 23:57:40 mikap kernel: R10: ffff888215ccfb80 R11: 0000000000000000 R12: ffffc90001517a60
Mar 23 23:57:40 mikap kernel: R13: ffff8881ef437400 R14: ffff88822b11c7c0 R15: ffff88822b030000
Mar 23 23:57:40 mikap kernel: FS:  00007f6303905d00(0000) GS:ffff888237200000(0000) knlGS:0000000000000000
Mar 23 23:57:40 mikap kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 23 23:57:40 mikap kernel: CR2: ffffffffffffffda CR3: 0000000222394001 CR4: 00000000001606f0
Mar 23 23:57:44 mikap root[6737]: ACPI event unhandled: video/brightnessdown BRTDN 00000087 00000000
Mar 23 23:57:44 mikap root[6739]: ACPI event unhandled: video/brightnessdown BRTDN 00000087 00000000 K
Mar 23 23:57:44 mikap root[6741]: ACPI event unhandled: video/brightnessdown BRTDN 00000087 00000000 K
Mar 23 23:57:44 mikap root[6743]: ACPI event unhandled: video/brightnessdown BRTDN 00000087 00000000
```

I see that I still have taint flags, but the only proprietary driver I'm using is nvidia, which is blacklisted. When I run lsmod|grep nv, I see nv_queue; I have now blacklisted it as well.
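One way to see where the taint comes from is to look at the "Modules linked in:" line of the oops, where the kernel prints taint letters in parentheses after each tainting module. The following is a minimal sketch that pulls those entries out; the function name `flagged_modules` is made up for illustration, and the sample string is a shortened excerpt of the line in the log above.

```python
import re


def flagged_modules(modules_line):
    """Return {module: flags} for modules printed with taint flags, e.g. nvidia(POE)."""
    # Only look at the part after the "Modules linked in:" label.
    _, _, names = modules_line.partition("Modules linked in:")
    return dict(re.findall(r"(\w+)\(([A-Z]+)\)", names))


# Shortened excerpt of the "Modules linked in:" line from the log above.
SAMPLE_LINE = (
    "Mar 23 23:57:40 mikap kernel: Modules linked in: rfcomm ctr ccm "
    "bbswitch(OE) cmac i915 bluetooth nvidia(POE) rtsx_pci r8169"
)
```

Calling `flagged_modules(SAMPLE_LINE)` reports bbswitch with OE and nvidia with POE, which suggests both modules were still loaded when this oops was logged.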

Any suggestions are welcome.

Thank you and have a nice day.

----------

## mir3x

It looks like that trace is connected to i915.
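A rough way to check this is to count how many call-trace frames belong to each module. This is only a sketch: the function name `frames_by_module` is made up, the sample is a few lines copied from the trace above, and frames prefixed with `?` are ones the kernel could not verify, so the counts are a hint rather than proof.

```python
import re
from collections import Counter

# Matches frames such as "? i915_request_retire+0x29e/0x2f0 [i915]".
# The trailing "[module]" is absent for code built into the kernel itself.
FRAME_RE = re.compile(
    r"(?:\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def frames_by_module(log_text):
    """Count call-trace frames per module in a chunk of kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            counts[match.group(2) or "core kernel"] += 1
    return counts


# A few lines copied from the trace posted above.
SAMPLE_TRACE = """\
Mar 23 23:57:40 mikap kernel: Call Trace:
Mar 23 23:57:40 mikap kernel:  ? dma_fence_signal_locked+0x82/0x100
Mar 23 23:57:40 mikap kernel:  ? i915_request_retire+0x29e/0x2f0 [i915]
Mar 23 23:57:40 mikap kernel:  ? i915_request_create+0x44/0xc0 [i915]
Mar 23 23:57:40 mikap kernel:  ? i915_gem_do_execbuffer+0x9cb/0x18c0 [i915]
Mar 23 23:57:40 mikap kernel:  ? drm_ioctl+0x1e4/0x370 [drm]
Mar 23 23:57:40 mikap kernel:  ? do_vfs_ioctl+0x451/0x6c0
"""
```

Running `frames_by_module` on the whole saved log instead of the sample works the same way, since lines that are not frames are ignored.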

I quickly googled and found this:

https://linuxreviews.org/Linux_Kernel_5.5_Will_Not_Fix_The_Frequent_Intel_GPU_Hangs_In_Recent_Kernels

Have you upgraded the kernel lately?

Maybe downgrade to 5.0 or 4.x.

Or maybe try this (it's at the bottom of that link):

```
 Kernel 5.5 rc7 appears to be problem-free on low-powered Intel chips with the kernel parameters intel_idle.max_cstate=1 i915.enable_dc=0. You may not need both. Goldmount chips will only need i915.enable_dc=0, Baytrail chips need intel_idle.max_cstate=1. Some chips need both. 
```
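After adding the parameters to the bootloader configuration and rebooting, it is worth checking that they actually reached the running kernel. Here is a minimal sketch that checks a kernel command line (as found in /proc/cmdline) for the two parameters quoted above; the function names are made up, the example command line in the note below is hypothetical, and quoted values containing spaces are not handled.

```python
# The two parameters mentioned in the quoted article.
WANTED = {"intel_idle.max_cstate": "1", "i915.enable_dc": "0"}


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_params(cmdline, wanted=WANTED):
    """Return the wanted parameters that are absent or set to another value."""
    params = parse_cmdline(cmdline)
    return {k: v for k, v in wanted.items() if params.get(k) != v}


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel."""
    with open(path) as f:
        return f.read().strip()
```

For example, `missing_params("root=/dev/sda2 ro i915.enable_dc=0")` reports that only `intel_idle.max_cstate=1` is missing, and `missing_params(read_cmdline())` checks the running system.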

----------

## Mika15

Thank you very much for this link and for looking into my problem. I read it quickly and it seems to apply to me; I will check the fix before downgrading and let you know!

Thanks again!

----------

## asturm

5.0 is very EOL; if you want to downgrade, then go back to (currently) 4.19.112, which is an LTS kernel. It is what I am using on my laptop with an Intel GPU.

----------

## Mika15

Thank you. Based on the article, I decided to downgrade. I'm now on 4.9.217, the latest 4.9 version in Portage. I will see if it works, and maybe wait for 5.6!

----------

## Hu

asturm suggested you use 4.19, not 4.9.

----------

