# BUG: unable to handle kernel NULL pointer dereference SOLVED

## woZa

Hello all.

I keep getting this kernel oops and have no idea how to go about finding what is causing it. Sometimes the computer becomes totally unresponsive; other times it carries on OK.

```
[ 1470.726579] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
[ 1470.727003] IP: [<ffffffff8109c5da>] remove_mapping+0x1a/0x40
[ 1470.727003] PGD 21d0b5067 PUD 21ecef067 PMD 0
[ 1470.727003] Oops: 0000 [#1] PREEMPT SMP
[ 1470.727003] CPU 1
[ 1470.727003] Modules linked in: it87 hwmon_vid coretemp hwmon ir_lirc_codec lirc_dev ir_mce_kbd_decoder ir_sony_decoder ir_jvc_decoder ir_rc6_decoder ir_rc5_decoder ir_nec_decoder dvb_usb_dib0700 dvb_usb snd_hda_codec_realtek dvb_core dib3000mc dibx000_common mt2060 rc_core i2c_i801 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc
[ 1470.727003]
[ 1470.727003] Pid: 479, comm: kswapd0 Not tainted 3.3.0-gentoo #1 Gigabyte Technology Co., Ltd. G33M-DS2R/G33M-DS2R
[ 1470.727003] RIP: 0010:[<ffffffff8109c5da>]  [<ffffffff8109c5da>] remove_mapping+0x1a/0x40
[ 1470.727003] RSP: 0018:ffff880226b41ba0  EFLAGS: 00010282
[ 1470.727003] RAX: 0000000000000000 RBX: ffffea00001c3b80 RCX: 00000000ffffffe8
[ 1470.727003] RDX: 0000000000000000 RSI: 0000000000000009 RDI: ffff88021334ba38
[ 1470.727003] RBP: 0000000000004c40 R08: da00000000000000 R09: a8000070ed000000
[ 1470.727003] R10: 57ffe98f131c3b40 R11: 000000000000000d R12: ffffffffffffffff
[ 1470.727003] R13: ffff88021334ba20 R14: ffffea00001c3b80 R15: 0000000000000004
[ 1470.727003] FS:  0000000000000000(0000) GS:ffff88022fd00000(0000) knlGS:0000000000000000
[ 1470.727003] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1470.727003] CR2: 000000000000001c CR3: 00000002252ed000 CR4: 00000000000006e0
[ 1470.727003] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1470.727003] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1470.727003] Process kswapd0 (pid: 479, threadinfo ffff880226b40000, task ffff880226a655c0)
[ 1470.727003] Stack:
[ 1470.727003]  0000000000004c4a ffffffff8109bcab ffff880226b41c08 0000000000000001
[ 1470.727003]  000000000000000e 0000000000000000 ffffea00001c3400 ffffea00001c3ac0
[ 1470.727003]  ffffea00001c3b00 ffffea00001c3b40 ffffea00001c3b80 ffffea00001c3bc0
[ 1470.727003] Call Trace:
[ 1470.727003]  [<ffffffff8109bcab>] ? invalidate_mapping_pages+0xcb/0x150
[ 1470.727003]  [<ffffffff810e68aa>] ? prune_icache_sb+0x2ea/0x340
[ 1470.727003]  [<ffffffff810cf573>] ? prune_super+0x133/0x1a0
[ 1470.727003]  [<ffffffff8109c521>] ? shrink_slab+0x121/0x1c0
[ 1470.727003]  [<ffffffff8109ea9b>] ? balance_pgdat+0x49b/0x630
[ 1470.727003]  [<ffffffff8109eda5>] ? kswapd+0x175/0x2c0
[ 1470.727003]  [<ffffffff8104a230>] ? add_wait_queue+0x60/0x60
[ 1470.727003]  [<ffffffff8109ec30>] ? balance_pgdat+0x630/0x630
[ 1470.727003]  [<ffffffff81049955>] ? kthread+0x85/0x90
[ 1470.727003]  [<ffffffff816e8db4>] ? kernel_thread_helper+0x4/0x10
[ 1470.727003]  [<ffffffff810498d0>] ? kthread_freezable_should_stop+0x60/0x60
[ 1470.727003]  [<ffffffff816e8db0>] ? gs_change+0xb/0xb
[ 1470.727003] Code: 27 64 00 4c 89 f1 e9 16 ff ff ff 66 0f 1f 44 00 00 53 48 89 f3 e8 57 f8 ff ff 31 d2 85 c0 74 1a 48 8b 03 f6 c4 80 75 16 48 89 d8 <8b> 40 1c ba 01 00 00 00 c7 43 1c 01 00 00 00 89 d0 5b c3 48 8b
[ 1470.727003] RIP  [<ffffffff8109c5da>] remove_mapping+0x1a/0x40
[ 1470.727003]  RSP <ffff880226b41ba0>
[ 1470.727003] CR2: 000000000000001c
[ 1470.776887] ---[ end trace 315b7cbffa1f9e52 ]---
```

I don't know for sure, but I would guess that one of the linked-in modules is causing the problem. I've run memtest in a loop overnight and it passes, so it doesn't look like the memory...

Any help greatly appreciated.

Thanks

----------

## BillWho

woZa,

How is CONFIG_SMP set, and how many processors show up with

```
grep processor /proc/cpuinfo
```

Just guessing at this out of curiosity.

----------

## woZa

```
grep processor /proc/cpuinfo
processor   : 0
processor   : 1
```

```
cat /usr/src/*11*/.config | grep SMP
CONFIG_X86_64_SMP=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_SMP=y
CONFIG_PM_SLEEP_SMP=y
CONFIG_HAVE_TEXT_POKE_SMP=y
```

Running gentoo-sources-3.2.11 on a Core 2 Duo (amd64).

Thanks

----------

## BillWho

woZa,

I guess I guessed wrong :( This part of the log entry

> [ 1470.727003] Oops: 0000 [#1] PREEMPT SMP

which preceded

> [ 1470.727003] CPU 1

led me to speculate that there was only one processor, but that support for more than one CPU was enabled.

If you want to pursue this error further, /usr/src/linux-3.3-rc7/Documentation/oops-tracing.txt (substitute your kernel version) explains how to locate the cause of the Oops and where to send that info. It's quite involved though.
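As a rough sketch of the first step of that route (my own illustration, not taken from that document): pull the `symbol+offset` out of the `IP:` line, then hand it to gdb. The helper name `oops_symbol` is made up for this post, and the gdb step assumes you still have a `vmlinux` built with CONFIG_DEBUG_INFO in your kernel tree; the path in the comment is only a guess at where yours lives.

```shell
#!/bin/sh
# Hypothetical helper: print "symbol+offset" from the "IP:" line of an oops
# read on stdin (first match only).
oops_symbol() {
    sed -n 's#.*IP: *\[<[0-9a-f]*>] *\([A-Za-z0-9_.]*+0x[0-9a-f]*\)/0x[0-9a-f]*.*#\1#p' |
        head -n 1
}

# Sample line copied from the oops earlier in this thread.
sample='[ 1470.727003] IP: [<ffffffff8109c5da>] remove_mapping+0x1a/0x40'
sym=$(printf '%s\n' "$sample" | oops_symbol)
echo "$sym"

# With a vmlinux that has debug info, gdb can map that to a source line
# (path is an assumption, adjust to your tree):
#   gdb -batch -ex "list *($sym)" /usr/src/linux/vmlinux
```

On a live system you would feed it `dmesg` output instead of the sample line. Without debug info gdb can still disassemble around the address, but it won't give you a file and line number.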

I also found this site a while back that might help.

Good luck, and if I should think of anything, I'll let you know ;)

----------

## woZa

Thanks for your help. I'll have a look at those links...

A

----------

## woZa

I haven't looked into the bug trace yet, but I think I have narrowed it down to NFS. It only happens when copying data to the server from my Mac client. I discovered this after pulling all the extra hardware from the box and shutting down most services.

I'm now running the shares via netatalk (which I thought was supposed to be buggy, hence using NFS), so we'll see how that goes.

Any ideas why NFS might be causing this? Running v3 and nfs-utils 1.2.5...

----------

## BillWho

woZa, 

Glad to hear that you're making some progress. I'd be cautious about suspecting NFS though. I haven't heard of any problems with it, and I have the same version here and it's running fine.

> Only happens when copying data to the server from my mac client

The only difference is that there are no Mac clients connected to the server here, so I can't be of any help there, or even speculate on whether that could be the cause of your Oops.

The best approach is to carry on with what you're doing now: keep an eye on netatalk and reintroduce the hardware and services one step at a time.

Good luck ;)

----------

## woZa

Well, so much for that theory... Same again overnight. I'm beginning to think it might be hardware related. I might try leaving a prime torture test running tonight to see if the CPU is stable.

No trace in the logs this time, just on screen and computer frozen...

So just a coincidence with the nfs thing by the looks of it.

----------

## woZa

Ditched the gentoo-sources kernel and went with the sys-rescue one and the crash has gone... Marking this as solved.

----------

## kimmie

I'm a little late to this party, but in my experience CONFIG_PREEMPT=y has been one of the first things to suspect when you run into kernel issues. My guess is that sometimes driver code really isn't up to scratch, so you can run into issues with new or little-used devices. There's no reason to enable it unless you're trying to build a digital audio workstation or something real-timeish. CONFIG_PREEMPT_VOLUNTARY is what you really want for a desktop.
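If you want to check which preemption model a kernel was built with, something like the following should do. It's a minimal sketch: the helper name `preempt_model` is made up, and the config path is an assumption about where your source tree lives.

```shell
#!/bin/sh
# Hypothetical helper: print the preemption model selected in a kernel .config.
# Matches only the three model options, not CONFIG_PREEMPT_RCU and friends.
preempt_model() {
    grep -E '^CONFIG_PREEMPT(_NONE|_VOLUNTARY)?=y$' "$1"
}

# Assumed location of the kernel config; adjust to your source tree.
cfg=/usr/src/linux/.config
if [ -r "$cfg" ]; then
    preempt_model "$cfg" || echo "no preemption model option found in $cfg"
else
    echo "no readable config at $cfg" >&2
fi
```

On kernels of this era exactly one of CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_VOLUNTARY, or CONFIG_PREEMPT should come back as `=y`. The `PREEMPT SMP` in the oops header above already says the crashing kernel had CONFIG_PREEMPT enabled.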

----------

