# kernel null pointer

## curmudgeon

Does anyone have an idea what is causing (and how to fix) these?:

```

Jun  2 14:58:36 system BUG: unable to handle kernel NULL pointer dereference at 00000033 

Jun  2 14:58:36 system IP: [<00000033>] 0x33 

Jun  2 14:58:36 system *pde = 00000000 

Jun  2 14:58:36 system Oops: 0000 [#1] 

Jun  2 14:58:36 system last sysfs file: /sys/devices/virtual/backlight/thinkpad_screen/max_brightness 

Jun  2 14:58:36 system 

Jun  2 14:58:36 system Pid: 2429, comm: hald Not tainted (2.6.29-gentoo-r5 #1) 26472TA 

Jun  2 14:58:36 system EIP: 0060:[<00000033>] EFLAGS: 00010296 CPU: 0 

Jun  2 14:58:36 system EIP is at 0x33 

Jun  2 14:58:36 system EAX: efad583c EBX: fffffffb ECX: eed95000 EDX: c0487024 

Jun  2 14:58:36 system ESI: c0487024 EDI: 00000033 EBP: c048d200 ESP: eee1bf4c 

Jun  2 14:58:36 system DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 

Jun  2 14:58:36 system Process hald (pid: 2429, ti=eee1a000 task=efbd2330 task.ti=eee1a000) 

Jun  2 14:58:36 system Stack: 

Jun  2 14:58:36 system c023eae1 ffffffed efb71380 efad7f30 c01818da 00001000 b7dc4000 efb71394 

Jun  2 14:58:36 system efad5880 efb9b780 c018184f b7dc4000 00001000 c01522d4 eee1bfa0 efb9b780 

Jun  2 14:58:36 system fffffff7 08b8d970 eee1a000 c01523df eee1bfa0 00000000 00000000 00000000 

Jun  2 14:58:36 system Call Trace: 

Jun  2 14:58:36 system [<c023eae1>] dev_attr_show+0x16/0x32 

Jun  2 14:58:36 system [<c01818da>] sysfs_read_file+0x8b/0xea 

Jun  2 14:58:36 system [<c018184f>] sysfs_read_file+0x0/0xea 

Jun  2 14:58:36 system [<c01522d4>] vfs_read+0x81/0xf4 

Jun  2 14:58:36 system [<c01523df>] sys_read+0x3c/0x63 

Jun  2 14:58:36 system [<c0102b85>] sysenter_do_call+0x12/0x25 

Jun  2 14:58:36 system [<c0390000>] remove_monitor_info+0x19/0x48 

Jun  2 14:58:36 system Code:  Bad EIP value. 

Jun  2 14:58:36 system EIP: [<00000033>] 0x33 SS:ESP 0068:eee1bf4c 

Jun  2 14:58:36 system ---[ end trace 4b9c08ea6a3e9dd0 ]--- 

  

[...] 

  

Jun  2 15:01:29 system BUG: unable to handle kernel NULL pointer dereference at 00000033 

Jun  2 15:01:29 system IP: [<00000033>] 0x33 

Jun  2 15:01:29 system *pde = 00000000 

Jun  2 15:01:29 system Oops: 0000 [#2] 

Jun  2 15:01:29 system last sysfs file: /sys/devices/virtual/backlight/thinkpad_screen/max_brightness 

Jun  2 15:01:29 system 

Jun  2 15:01:29 system Pid: 3876, comm: hald Tainted: G      D    (2.6.29-gentoo-r5 #1) 26472TA 

Jun  2 15:01:29 system EIP: 0060:[<00000033>] EFLAGS: 00010296 CPU: 0 

Jun  2 15:01:29 system EIP is at 0x33 

Jun  2 15:01:29 system EAX: efad583c EBX: fffffffb ECX: eef42000 EDX: c0487024 

Jun  2 15:01:29 system ESI: c0487024 EDI: 00000033 EBP: c048d200 ESP: eef15f4c 

Jun  2 15:01:29 system DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 

Jun  2 15:01:29 system Process hald (pid: 3876, ti=eef14000 task=ef8a6cc0 task.ti=eef14000) 

Jun  2 15:01:29 system Stack: 

Jun  2 15:01:29 system c023eae1 ffffffed eeeb4ec0 efad7f30 c01818da 00001000 b7c84000 eeeb4ed4 

Jun  2 15:01:29 system efad5880 eef23000 c018184f b7c84000 00001000 c01522d4 eef15fa0 eef23000 

Jun  2 15:01:29 system fffffff7 08969970 eef14000 c01523df eef15fa0 00000000 00000000 00000000 

Jun  2 15:01:29 system Call Trace: 

Jun  2 15:01:29 system [<c023eae1>] dev_attr_show+0x16/0x32 

Jun  2 15:01:29 system [<c01818da>] sysfs_read_file+0x8b/0xea 

Jun  2 15:01:29 system [<c018184f>] sysfs_read_file+0x0/0xea 

Jun  2 15:01:29 system [<c01522d4>] vfs_read+0x81/0xf4 

Jun  2 15:01:29 system [<c01523df>] sys_read+0x3c/0x63 

Jun  2 15:01:29 system [<c0102b85>] sysenter_do_call+0x12/0x25 

Jun  2 15:01:29 system [<c0390000>] remove_monitor_info+0x19/0x48 

Jun  2 15:01:29 system Code:  Bad EIP value. 

Jun  2 15:01:29 system EIP: [<00000033>] 0x33 SS:ESP 0068:eef15f4c 

Jun  2 15:01:29 system ---[ end trace 4b9c08ea6a3e9dd1 ]---

```

I have no idea why the second one says "tainted." This is a non-modular kernel.

```

$ emerge -p --info 

Portage 2.1.6.11 (default/linux/x86/2008.0/desktop, gcc-4.3.2, glibc-2.8_p20080602-r1, 2.6.29-gentoo-r5 i686) 

================================================================= 

System uname: Linux-2.6.29-gentoo-r5-i686-Intel-R-_Pentium-R-_III_Mobile_CPU_1000MHz-with-glibc2.0 

Timestamp of tree: Mon, 01 Jun 2009 19:00:01 +0000 

app-shells/bash:     3.2_p39 

dev-java/java-config: 2.1.7 

dev-lang/python:     2.5.4-r2 

dev-util/cmake:      2.6.4 

sys-apps/baselayout: 1.12.11.1 

sys-apps/sandbox:    1.6-r2 

sys-devel/autoconf:  2.63 

sys-devel/automake:  1.7.9-r1, 1.9.6-r2, 1.10.2 

sys-devel/binutils:  2.18-r3 

sys-devel/gcc-config: 1.4.1 

sys-devel/libtool:   1.5.26 

virtual/os-headers:  2.6.27-r2 

ACCEPT_KEYWORDS="x86" 

CBUILD="i686-pc-linux-gnu" 

CFLAGS="-march=pentium3 -Os -pipe -fomit-frame-pointer -mfpmath=sse" 

CHOST="i686-pc-linux-gnu" 

CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config" 

CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/udev/rules.d" 

CXXFLAGS="-march=pentium3 -Os -pipe -fomit-frame-pointer -mfpmath=sse" 

DISTDIR="/usr/portage/distfiles" 

EMERGE_DEFAULT_OPTS="--with-bdeps y" 

FEATURES="distlocks fixpackages protect-owned sandbox sfperms strict unmerge-orphans userfetch" 

GENTOO_MIRRORS="http://distfiles.gentoo.org http://www.ibiblio.org/pub/Linux/distributions/gentoo" 

LDFLAGS="-Wl,-O1" 

PKGDIR="/usr/portage/packages" 

PORTAGE_CONFIGROOT="/" 

PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages" 

PORTAGE_TMPDIR="/var/tmp" 

PORTDIR="/usr/portage" 

SYNC="rsync://rsync.gentoo.org/gentoo-portage" 

USE="X a52 aac aalib acpi alsa arts audiofile berkdb bzip2 cairo caps cdparanoia cjk cracklib crypt css cups dbus dga dhcp directfb dri dvd dvdread encode exif expat fam fbcon ffmpeg flac gcj ggi gif glibc-omitfp gmp gphoto2 gpm gstreamer hal hardcoded-tables iconv idea imagemagick imap imlib ipv6 jabber javascript jbig joystick jpeg kde lcms libcaca libnotify libwww live lm_sensors mad matroska mbox memlimit midi mmx mmxext mng mp3 mpeg mudflap mysql nas ncurses network nls no-old-linux nodrm nptl nptlonly ogg opengl openmp oscar pcre pdf perl png quicktime readline rtc samba scanner sdl sensord silc smtp sndfile speex spell sse ssl svg sysfs tcpd theora threads threadsafe tiff timidity truetype unicode usb userlocales utf8 vcd vorbis wifi win32codecs x86 xinerama xml xorg xulrunner xv xvid yahoo zlib zrtp" ALSA_CARDS="intel8x0" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CAMERAS="ptp2" ELIBC="glibc" INPUT_DEVICES="keyboard mouse" KERNEL="linux" LINGUAS="en ru" USERLAND="GNU" VIDEO_CARDS="savage" 

Unset:  CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LANG, LC_ALL, LINGUAS, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY

```

----------

## pappy_mcfae

Is this an ongoing situation, or has it happened suddenly? Post your .config, the results of lspci -n and cat /proc/cpuinfo, and your /etc/fstab file, and I'll look for anything misconfigured that would cause that to happen and modify it as needed.

Blessed be!

Pappy

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> Is this an ongoing situation, or has it happened suddenly?

 

Ongoing, but it seems to have become MUCH worse after an upgrade to 2.6.29-r5

 *pappy_mcfae wrote:*   

> Post your .config, the results of lspci -n and cat /proc/cpuinfo, and your /etc/fstab file

 

http://pastebin.com/m3bdcd080

```

$ /usr/sbin/lspci -n

00:00.0 0600: 8086:3575 (rev 02)

00:01.0 0604: 8086:3576 (rev 02)

00:1d.0 0c03: 8086:2482 (rev 01)

00:1d.1 0c03: 8086:2484 (rev 01)

00:1d.2 0c03: 8086:2487 (rev 01)

00:1e.0 0604: 8086:2448 (rev 41)

00:1f.0 0601: 8086:248c (rev 01)

00:1f.1 0101: 8086:248a (rev 01)

00:1f.3 0c05: 8086:2483 (rev 01)

00:1f.5 0401: 8086:2485 (rev 01)

01:00.0 0300: 5333:8c2e (rev 05)

02:00.0 0607: 104c:ac51

02:00.1 0607: 104c:ac51

02:02.0 0780: 11c1:0449 (rev 01)

02:08.0 0200: 8086:1031 (rev 41)

$ dog /proc/cpuinfo

processor       : 0

vendor_id       : GenuineIntel

cpu family      : 6

model           : 11

model name      : Intel(R) Pentium(R) III Mobile CPU      1000MHz

stepping        : 1

cpu MHz         : 1000.000

cache size      : 512 KB

fdiv_bug        : no

hlt_bug         : no

f00f_bug        : no

coma_bug        : no

fpu             : yes

fpu_exception   : yes

cpuid level     : 2

wp              : yes

flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse

bogomips        : 1998.25

clflush size    : 32

power management:

$ dog /etc/fstab

/dev/sda1           none            swap        sw                      0 0

/dev/sda5           /boot           ext2        noatime,noauto          1 2

/dev/sda6           /               ext3        noatime                 1 1

/dev/sda7           /usr/local      ext3        noatime                 1 2

/dev/sda8           /var            ext3        noatime                 1 2

/dev/sda9           /tmp            ext3        noatime                 1 2

/dev/sda10          /home           ext3        noatime                 1 2

/dev/sda11          /usr/portage    ext3        noatime                 1 2

/dev/sda12          /mnt/docs       ext3        noatime                 1 2

/dev/cdroms/cdrom0  /mnt/cdrom      auto        noatime,noauto,user,ro  0 0

/dev/fd0            /mnt/floppy     auto        noatime,noauto,user     0 0

shm                 /dev/shm        tmpfs       nodev,nosuid,noexec     0 0

```

----------

## pappy_mcfae

I saw a few disconcerting issues with your kernel. I wanted to make sure that your kernel was as stable as possible, so I used a seed and set it up for your system. This kernel is as much a troubleshooting tool as anything else. The tainting of the kernel at one point and not at another speaks more to me of possible hardware issues.

Also, there has been a lot of work done on the .28 and .29 kernels. As far as I know, none of it has involved savage chips. If the kernel I configured doesn't work, move back to the last working kernel version, and send that .config file.

Click here for your new .config. Compile as is.

For the best results, please do the following:

1) Move your .config file out of your kernel source directory ( /usr/src/linux-2.6.28-gentoo-r5 ).

2) Issue the command make mrproper. This is a destructive step. It returns the source to pristine condition. Unmoved .config files will be deleted!

3) Copy my .config into your source directory.

4) Issue the command make && make modules_install.

5) Install the kernel as you normally would, and reboot.

6) Once it boots, please post /var/log/dmesg so I can see how things loaded.

Don't forget to reemerge your video driver.

Blessed be!

Pappy

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> This kernel is as much a troubleshooting tool as anything else.

 

Thank you very much for your efforts. I did compile a kernel from your config, and it did boot and shut down without any errors.

Unfortunately, it is not very usable for me in its current state (no wireless, wrong character set, and for some strange reason it didn't see my audio card), so it seems I still have a lot of troubleshooting to do (the configuration you posted is VERY different).

I'm not even sure how much finding the source of the problem will help me. On a 2.6.28-gentoo-r5 kernel, I found an option that when activated caused Oops - the CONFIG_AGP_INTEL option. When not set, I have no exception problems (though also inferior video performance and no framebuffer).

Setting this option to yes gives me (on shutdown):

```

* Remounting remaining filesystems readonly ...    [ok]

Oops: 0000 [#1]

last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A03:00/device:01:PnP0C09:00/PNP0C0A:00/power_supply/BAT0/energy_full

Pid:4070, comm: reboot Not tainted (2.6.28-gentoo-r5 #1) 26472TA

EIP: 0060:[<000000b7>] EFLAGS: 00010206 CPU:0

EIP is at 0xb7

EAX: ef89ec00 EBX: ef867ab0 ECX: c01dcb4f EDX: 000000b7

ESI: 28121969 EDI: b8004ff4 EBP: ef018000 ESP: ef019e90

 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068

Process reboot (pid: 4070, ti=ef018000 task=ef1eb0e0 task.ti=ef018000)

Stack:

 c01dcb62 c023fcc2 00000000 c0120389 c01203b5 01234567 c01204e5 ef019eb0

 ef019eb0 ef019eb0 00000000 c0512e58 00000024 00000000 c02f3040 bff3db84

 ef1eb234 00000800 bff3d384 c0513e8c ef1eb0e0 c04612e0 00000000 ef1eb0e0

Call Trace:

 [<c01dcb62>] pci_device_shutdown+0x13/0x14

 [<c023fcc2>] device_shutdown+0x37/0x69

 [<c0120389>] kernel_restart_prepare+0x20/0x25

 [<c01203b5>] kernel_restart+0x8/0x2e

 [<c01204e5>] sys_reboot+0x103/0x120

 [<c02f3040>] dev_ioctl+0x4f6/0x59c

 [<c03958fe>] schedule+0x246/0x271

 [<c03232a8>] udp_ioctl+0x0/0x55

 [<c032882d>] inet_ioctl+0x9f/0xa2

 [<c02e9e49>] lock_sock_nested+0x7e/0x85

 [<c0158756>] vfs_ioctl+0x16/0x4a

 [<c015a86b>] d_kill+0x3e/0x43

 [<c015ba80>] dput+0x21/0xf3

 [<c0150898>] __fput+0x12f/0x157

 [<c015ecb5>] mntput_no_expire+0x13/0x81

 [<c014e524>] filp_close+0x4d/0x53

 [<c014e576>] sys_close+0x4c/0x7a

 [<c0102bc5>] sysenter_do_call+0x12/0x25

Code:  Bad EIP value.

EIP: [<000000b7>] 0xb7 SS:ESP 0068:ef019e90

---[ end trace 6aa580aa26c88a0a ]---

/etc/init.d/reboot.sh: line 7:  4070 Segmentation fault      /sbin/reboot "%{opts}" 2> /dev/null

sd 0:0:0:0: [sda] Synchronizing SCSI cache

BUG: unable to handle kernel NULL pointer dereference at 000000b7

IP: [<000000b7>] 0xb7

*pde = 00000000

Oops: 0000 [#2]

last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A03:00/device:01:PnP0C09:00/PNP0C0A:00/power_supply/BAT0/energy_full

Pid:4071, comm: reboot Tainted: G      D    (2.6.28-gentoo-r5 #1) 26472TA

EIP: 0060:[<000000b7>] EFLAGS: 00010206 CPU:0

EIP is at 0xb7

EAX: ef89ec00 EBX: ef867ab0 ECX: c01dcb4f EDX: 000000b7

ESI: 28121969 EDI: b7ed0ff4 EBP: ef072000 ESP: ef073e90

 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068

Process reboot (pid: 4071, ti=ef072000 task=ef1eb440 task.ti=ef072000)

Stack:

 c01dcb62 c023fcc2 00000000 c0120389 c01203b5 01234567 c01204e5 ef1eb46c

 ef1eb440 00000000 ef1eb440 c011368f 00000000 ef8a77d4 ef8a77cc c04e51e4

 ef1eb594 c01135e3 ef8a77cc 00000000 ef1eb440 c04612e0 00000000 ef1eb440

Call Trace:

 [<c01dcb62>] pci_device_shutdown+0x13/0x14

 [<c023fcc2>] device_shutdown+0x37/0x69

 [<c0120389>] kernel_restart_prepare+0x20/0x25

 [<c01203b5>] kernel_restart+0x8/0x2e

 [<c01204e5>] sys_reboot+0x103/0x120

 [<c011368f>] dequeue_task_fair+0x1d/0x146

 [<c01135e3>] set_next_entity+0x29/0x4e

 [<c03958fe>] schedule+0x246/0x271

 [<c0125300>] hrtimer_cancel+0xa/0x14

 [<c03961a6>] do_nanosleep+0x57/0x86

 [<c0125733>] hrtimer_nanosleep+0xdd/0x143

 [<c0125258>] hrtimer_wakeup+0x0/0x18

 [<c0396185>] do_nanosleep+0x36/0x86

 [<c01257da>] sys_nanosleep+0x41/0x51

 [<c0102bc5>] sysenter_do_call+0x12/0x25

Code:  Bad EIP value.

EIP: [<000000b7>] 0xb7 SS:ESP 0068:ef073e90

---[ end trace 6aa580aa26c88a0a ]---

/etc/init.d/reboot.sh: line 11:  4071 Segmentation fault      /sbin/reboot -f

INIT: no more processes left in this runlevel

```

At least your configuration gives me something to start with that I can work toward what I want. Thanks again.

----------

## Hu

 *pappy_mcfae wrote:*   

> The tainting of the kernel at one point and not at another speaks more to me of possible hardware issues.

 

In this case, the taint on the second oops looks harmless.  Looking at print_tainted, you will get a G anytime tainted_mask is non-zero.  You get a D if the kernel has had a previous oops during this boot.  So, the first NULL pointer dereference causes an oops, which raises the D flag in the tainted mask.  The second oops then reports that D flag, and throws in a G since it had to report any taint at all.
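To make the reading of the flags concrete, here is a minimal Python sketch that decodes the taint letters on an oops `Pid:` line. The mapping is deliberately partial (only the letters discussed here), the description strings are informal wording rather than kernel text, and `decode_taint` is a hypothetical helper name.

```python
import re

# Partial mapping of taint letters; only the flags discussed in this thread.
# The descriptions are informal, not text taken from the kernel.
TAINT_FLAGS = {
    "P": "a proprietary module has been loaded",
    "G": "no proprietary module loaded (printed whenever any taint is set)",
    "D": "the kernel has already oopsed during this boot",
}

def decode_taint(pid_line):
    """Return descriptions of the taint letters found on an oops 'Pid:' line."""
    if "Not tainted" in pid_line:
        return []
    match = re.search(r"Tainted:\s*([A-Z ]+?)\s*\(", pid_line)
    if match is None:
        return []
    letters = match.group(1).replace(" ", "")
    return [TAINT_FLAGS.get(ch, "unknown flag " + ch) for ch in letters]

line = "Pid: 3876, comm: hald Tainted: G      D    (2.6.29-gentoo-r5 #1) 26472TA"
for description in decode_taint(line):
    print(description)
```

Run against the `Pid:` line of the first oops, the same function returns an empty list, which matches the "Not tainted" report there.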

curmudgeon: as a guess, I would say that hald is triggering a bug in sysfs handling of some device you have attached.  If you can run without hald, that might avoid accessing the bad sysfs file.  The oopses in your later post are more worrisome.
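A quick way to see which process and which sysfs file are involved is to pull those fields out of the oops text. The following is only a sketch: `summarize_oops` is a hypothetical helper, and the sample is an abbreviated copy of lines from the first oops in this thread with the syslog prefix removed.

```python
import re

def summarize_oops(text):
    """Pull the process name, last sysfs file, and call-trace symbols from oops text."""
    comm = re.search(r"comm: (\S+)", text)
    sysfs = re.search(r"last sysfs file: (\S+)", text)
    # Trace entries look like "[<c023eae1>] dev_attr_show+0x16/0x32";
    # newer kernels print "? " before entries they consider unreliable.
    trace = re.findall(r"\[<[0-9a-f]+>\] (?:\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+", text)
    return {
        "comm": comm.group(1) if comm else None,
        "last_sysfs_file": sysfs.group(1) if sysfs else None,
        "trace": trace,
    }

sample = """\
last sysfs file: /sys/devices/virtual/backlight/thinkpad_screen/max_brightness
Pid: 2429, comm: hald Not tainted (2.6.29-gentoo-r5 #1) 26472TA
Call Trace:
[<c023eae1>] dev_attr_show+0x16/0x32
[<c01818da>] sysfs_read_file+0x8b/0xea
[<c01522d4>] vfs_read+0x81/0xf4
"""
info = summarize_oops(sample)
print(info["comm"], "was reading", info["last_sysfs_file"])
print(" <- ".join(info["trace"]))
```

The trace is listed innermost call first, so the first symbol is the one closest to the fault.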

----------

## pappy_mcfae

If things worked error free, then clearly there was an issue with your kernel. The fact that you had no errors starting or shutting down with my kernel means there are issues with yours. 

As far as setting up the wireless, there was no indication of a wireless device in lspci -n. Is it a USB device? Is it a PCMCIA device that wasn't plugged in? If the former, emerge usbutils and do lsusb. If the latter, put the PCMCIA device in and do another lspci -n.

Now that we know you had kernel issues, it's just a matter of tweaking the new one until it works all the way. It's not that hard to accomplish assuming your hardware isn't bad.

Blessed be!

Pappy

----------

## curmudgeon

 *Hu wrote:*   

> curmudgeon: as a guess, I would say that hald is triggering a bug in sysfs handling of some device you have attached.  If you can run without hald, that might avoid accessing the bad sysfs file.  The oopses in your later post are more worrisome.

 

Unfortunately, the Oops references the battery, which is not an easy thing for a laptop to live without. :)

I could try doing without hal, but I would have to recompile about a dozen packages.

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> If things worked error free, then clearly there was an issue with your kernel. The fact that you had no errors starting or shutting down with my kernel means there are issues with yours.

 

You are absolutely right about that.

 *pappy_mcfae wrote:*   

> As far as setting up the wireless, there was no indication of a wireless device in lspci -n. Is it a USB device?

 

Yes, usb. I already had that in my .config.

 *pappy_mcfae wrote:*   

> If the former, emerge usbutils and do lsusb.

 

Sticking the device on another machine right now:

```

Bus 001 Device 003: ID 0ace:1215 ZyDAS WLA-54L WiFi

```

 *pappy_mcfae wrote:*   

> Now that we know you had kernel issues, it's just a matter of tweaking the new one until it works all the way. It's not that hard to accomplish assuming your hardware isn't bad.

 

Indeed. Not that hard - just time consuming, and unfortunately, I am out of time on this trip. Again, I thank you. I now believe I can eliminate this problem, and the next time I get back here, I will continue my attempts to do so.

----------

## pappy_mcfae

Considering you only have one device that didn't get loaded, you are doing quite well. If that is the only problem, all you need to do is look up the hex ID, and that will tell you which driver to use. And while I agree that fixing the kernel would be a time consuming affair, once that's done, your Linux experience will be more positive.
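As an illustration of the lookup, this Python sketch extracts the vendor and product IDs from `lsusb` output and builds the `usb:vVVVVpPPPP` prefix that USB module aliases begin with. The helper names are hypothetical, and the sketch does not name a driver: on a system with modular drivers the prefix can be searched for in `modules.alias`, while on a non-modular kernel like the one in this thread the ID has to be searched for in the driver sources or documentation instead.

```python
import re

def usb_ids(lsusb_output):
    """Return (vendor, product, description) tuples from lsusb output."""
    pattern = re.compile(r"ID ([0-9a-f]{4}):([0-9a-f]{4})\s*(.*)")
    found = []
    for line in lsusb_output.splitlines():
        match = pattern.search(line)
        if match:
            found.append((match.group(1), match.group(2), match.group(3).strip()))
    return found

def modalias_prefix(vendor, product):
    """Build the 'usb:vVVVVpPPPP' prefix that USB module aliases start with."""
    return "usb:v{}p{}".format(vendor.upper(), product.upper())

# Line taken from the lsusb output posted earlier in the thread.
output = "Bus 001 Device 003: ID 0ace:1215 ZyDAS WLA-54L WiFi"
for vendor, product, name in usb_ids(output):
    print(vendor, product, name, modalias_prefix(vendor, product))
```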

Blessed be!

Pappy

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> Considering you only have one device that didn't get loaded, you are doing quite well. If that is the only problem, all you need to do is look up the hex ID, and that will tell you which driver to use. And while I agree that fixing the kernel would be a time consuming affair, once that's done, your Linux experience will be more positive.

 

I am back (or rather this machine is back, and I have access to it for the next three months or so).

Let me update you on the situation. I am still getting kernel null pointers during boot, and as they seem to occur when I start hal or udev (which I now need to start xorg-server), I can't get anything done. :(

```

$ lspci -n
00:00.0 0600: 8086:3575 (rev 02)
00:01.0 0604: 8086:3576 (rev 02)
00:1d.0 0c03: 8086:2482 (rev 01)
00:1d.1 0c03: 8086:2484 (rev 01)
00:1d.2 0c03: 8086:2487 (rev 01)
00:1e.0 0604: 8086:2448 (rev 41)
00:1f.0 0601: 8086:248c (rev 01)
00:1f.1 0101: 8086:248a (rev 01)
00:1f.3 0c05: 8086:2483 (rev 01)
00:1f.5 0401: 8086:2485 (rev 01)
01:00.0 0300: 5333:8c2e (rev 05)
02:00.0 0607: 104c:ac51
02:00.1 0607: 104c:ac51
02:02.0 0780: 11c1:0449 (rev 01)
02:08.0 0200: 8086:1031 (rev 41)

$ lsusb
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 002: ID 0ace:1215 ZyDAS WLA-54L WiFi
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

```

I took a kernel seed from your site (2.6.31-r5 - the closest to the latest stable 2.6.31-r6) and added my hardware while going through all of the pages on your site.

This produced:

```

Dec 25 09:32:04 system kernel: [   21.017896] BUG: unable to handle kernel NULL pointer dereference at 0000004c

Dec 25 09:32:04 system kernel: [   21.018333] IP: [<c1183d9a>] strlen+0xa/0x20

Dec 25 09:32:04 system kernel: [   21.018682] *pde = 00000000

Dec 25 09:32:04 system kernel: [   21.018998] Oops: 0000 [#1]

Dec 25 09:32:04 system kernel: [   21.019288] last sysfs file: /sys/devices/virtual/misc/hpet/uevent

Dec 25 09:32:04 system kernel: [   21.019606] Modules linked in:

Dec 25 09:32:04 system kernel: [   21.019870]

Dec 25 09:32:04 system kernel: [   21.020013] Pid: 3220, comm: udevadm Not tainted (2.6.31-gentoo-r6 #4) 26472TA

Dec 25 09:32:04 system kernel: [   21.020013] EIP: 0060:[<c1183d9a>] EFLAGS: 00010246 CPU: 0

Dec 25 09:32:04 system kernel: [   21.020013] EIP is at strlen+0xa/0x20

Dec 25 09:32:04 system kernel: [   21.020013] EAX: 00000000 EBX: efb2b9c0 ECX: ffffffff EDX: 000000d0

Dec 25 09:32:04 system kernel: [   21.020013] ESI: 0000004c EDI: 0000004c EBP: ef1b1000 ESP: ee855ed0

Dec 25 09:32:04 system kernel: [   21.020013]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068

Dec 25 09:32:04 system kernel: [   21.020013] Process udevadm (pid: 3220, ti=ee854000 task=ee85c000 task.ti=ee854000)

Dec 25 09:32:04 system kernel: [   21.020013] Stack:

Dec 25 09:32:04 system kernel: [   21.020013]  000000d0 c106b89f efb2b9c0 ee855f00 ef1b1000 c1214275 efb2b9c0 ef1b1000

Dec 25 09:32:04 system kernel: [   21.020013] <0> c12143b8 ef1b1000 c14a3ef0 000000e4 00000000 c151a320 ef808280 c1214551

Dec 25 09:32:04 system kernel: [   21.020013] <0> 000b7768 00000000 ef10d1e0 ef343000 00001000 efb2b9c8 fffffffb c151a34c

Dec 25 09:32:04 system kernel: [   21.020013] Call Trace:

Dec 25 09:32:04 system kernel: [   21.020013]  [<c106b89f>] ? kstrdup+0x1f/0x60

Dec 25 09:32:04 system kernel: [   21.020013]  [<c1214275>] ? device_get_nodename+0x55/0xb0

Dec 25 09:32:04 system kernel: [   21.020013]  [<c12143b8>] ? dev_uevent+0xe8/0x120

Dec 25 09:32:04 system kernel: [   21.020013]  [<c1214551>] ? show_uevent+0xd1/0x140

Dec 25 09:32:04 system kernel: [   21.020013]  [<c1214480>] ? show_uevent+0x0/0x140

Dec 25 09:32:04 system kernel: [   21.020013]  [<c12140d1>] ? dev_attr_show+0x21/0x50

Dec 25 09:32:04 system kernel: [   21.020013]  [<c10ca137>] ? sysfs_read_file+0x87/0x120

Dec 25 09:32:04 system kernel: [   21.020013]  [<c10ca0b0>] ? sysfs_read_file+0x0/0x120

Dec 25 09:32:04 system kernel: [   21.020013]  [<c10882e5>] ? vfs_read+0xa5/0x160

Dec 25 09:32:04 system kernel: [   21.020013]  [<c1088471>] ? sys_read+0x41/0x80

Dec 25 09:32:04 system kernel: [   21.020013]  [<c1002da8>] ? sysenter_do_call+0x12/0x26

Dec 25 09:32:04 system kernel: [   21.020013] Code: 00 56 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e c3 8d b6 00 00 00 00 57 b9 ff ff ff ff 89 c7 31 c0 <f2> ae f7 d1 49 89 c8 5f c3 8d b6 00 00 00 00 8d bc 27 00 00 00

Dec 25 09:32:04 system kernel: [   21.020013] EIP: [<c1183d9a>] strlen+0xa/0x20 SS:ESP 0068:ee855ed0

Dec 25 09:32:04 system kernel: [   21.020013] CR2: 000000000000004c

Dec 25 09:32:04 system kernel: [   21.030553] ---[ end trace 788a61c699a64791 ]---

```

The actual .config that I used to compile this (other information remains the same as before in this thread):

http://pastebin.com/m71d892c5

Thank you again.

----------

## pappy_mcfae

Give this a try. If the issue remains, post /var/log/dmesg and /var/log/messages. There is the possibility you have a hardware issue. I'd need those other files to perhaps show a pattern, or show what fires up and causes the issue.

BB!

P

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> Give this a try. If the issue remains, post /var/log/dmesg and /var/log/messages.

 

Same result. :(

/var/log/dmesg

/var/log/messages

Thanks again.

----------

## pappy_mcfae

At this point, we'll have to go into guess mode. 

My first guess is either the wireless USB device is bad, the firmware isn't loading, or a bit of both. For the purposes of this test, turn off wireless support in the kernel, and recompile. If the issue remains, remove the network driver. Then remove the PCMCIA driver, and so on until you are down to just the video and hard drive controllers. 

Unfortunately, there are no magic wands for this kind of issue.

Eventually, you will hit upon the offending driver. Once you do that, you can pretty much guess that the hardware driven by said driver is toast. You may be able to work around the issue, but we need to know what the issue is, exactly, and which driver is causing it.
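The one-driver-at-a-time elimination can be done by hand in menuconfig, but it can also be scripted against the .config text. Here is a minimal Python sketch, assuming the usual .config line formats (`CONFIG_X=y` and `# CONFIG_X is not set`); `disable_option` is a hypothetical helper and the fragment is made up for illustration, it is not the poster's actual .config.

```python
import re

def disable_option(config_text, option):
    """Rewrite 'CONFIG_X=y' (or =m, or a value) as '# CONFIG_X is not set'."""
    pattern = re.compile(r"^{}=.*$".format(re.escape(option)), re.MULTILINE)
    return pattern.sub("# {} is not set".format(option), config_text)

# Illustrative fragment only, not taken from the pastebin links in this thread.
config = "CONFIG_AGP=y\nCONFIG_AGP_INTEL=y\nCONFIG_USB=y\n"
print(disable_option(config, "CONFIG_AGP_INTEL"))
```

After writing the result back to .config, running `make oldconfig` before rebuilding lets the kernel build system settle any options that depended on the one just disabled.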

Blessed be!

Pappy

----------

## curmudgeon

I have removed drivers for the wireless USB device, ethernet adapter, CardBus support, and the sound card driver. None of those helped.

I did/do not have this problem with the previous gentoo kernel (2.6.28-r5). Any idea what to try next?

----------

## pappy_mcfae

Send the functional .config so I can analyze it. You may have found an incompatibility with your machine and any kernel greater than the one that works. That is not unheard of.
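Comparing a working .config against a failing one is easier with the differences listed side by side. The Python sketch below does that for two config texts. It treats an option missing from one file as unset, which is a simplification when the two files come from different kernel versions; the helper names and the sample fragments are invented for illustration.

```python
def parse_config(text):
    """Map option name -> value; '# CONFIG_X is not set' becomes 'n'."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options

def diff_configs(old_text, new_text):
    """Return {option: (old_value, new_value)} for options that differ."""
    old, new = parse_config(old_text), parse_config(new_text)
    changed = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name, "n"), new.get(name, "n")
        if before != after:
            changed[name] = (before, after)
    return changed

# Made-up fragments for illustration only.
working = "CONFIG_AGP=y\n# CONFIG_AGP_INTEL is not set\nCONFIG_HPET=y\n"
failing = "CONFIG_AGP=y\nCONFIG_AGP_INTEL=y\nCONFIG_HPET=y\n"
for name, (before, after) in diff_configs(working, failing).items():
    print(name, before, "->", after)
```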

BB!

P

----------

## curmudgeon

This .config works perfectly with the 2.6.28-r5 kernel. Looking back, it seems the problems started with 2.6.29 (though in that case, I didn't get the OOPS until I tried to shut down):

config-2.6.28-gentoo-r5

----------

## pappy_mcfae

Well, there is nothing in there that jumps out and says, "I'm the issue!" However, I do have another thought. Have you recompiled udev against your newest kernel version? Which version of udev are you using? Post the results of emerge --info as well.

BB!

P

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> Have you recompiled udev against your newest kernel version?

 

I hadn't before. I just recompiled the entire system (for glibc and gcc updates, plus the migration to kde 4) before trying to update the kernel. However, I just tried that now (recompiling both udev and hal) running with a 2.6.31 kernel. Same result. :(

 *pappy_mcfae wrote:*   

> Which version of udev are you using?

 

The latest stable, which at the moment is 146-r1.

 *pappy_mcfae wrote:*   

> Post the results of emerge --info as well.

 

```

$ emerge --info

Portage 2.1.6.13 (default/linux/x86/10.0/desktop, gcc-4.3.4, glibc-2.9_p20081201-r2, 2.6.31-gentoo-r6 i686)

=================================================================

System uname: Linux-2.6.31-gentoo-r6-i686-Intel-R-_Pentium-R-_III_Mobile_CPU_1000MHz-with-gentoo-1.12.13

Timestamp of tree: Thu, 24 Dec 2009 20:30:02 +0000

app-shells/bash:     4.0_p35

dev-java/java-config: 2.1.9-r2

dev-lang/python:     2.6.4

dev-util/cmake:      2.6.4-r3

sys-apps/baselayout: 1.12.13

sys-apps/sandbox:    1.6-r2

sys-devel/autoconf:  2.63-r1

sys-devel/automake:  1.9.6-r2, 1.10.2

sys-devel/binutils:  2.18-r3

sys-devel/gcc-config: 1.4.1

sys-devel/libtool:   2.2.6b

virtual/os-headers:  2.6.27-r2

ACCEPT_KEYWORDS="x86"

CBUILD="i686-pc-linux-gnu"

CFLAGS="-march=pentium3 -Os -pipe -fomit-frame-pointer -mfpmath=sse"

CHOST="i686-pc-linux-gnu"

CONFIG_PROTECT="/etc /usr/share/X11/xkb /usr/share/config"

CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/udev/rules.d"

CXXFLAGS="-march=pentium3 -Os -pipe -fomit-frame-pointer -mfpmath=sse"

DISTDIR="/usr/portage/distfiles"

EMERGE_DEFAULT_OPTS="--with-bdeps y"

FEATURES="distlocks fixpackages protect-owned sandbox sfperms strict unmerge-orphans userfetch"

GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"

LDFLAGS="-Wl,-O1"

PKGDIR="/usr/portage/packages"

PORTAGE_CONFIGROOT="/"

PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"

PORTAGE_TMPDIR="/var/tmp"

PORTDIR="/usr/portage"

SYNC="rsync://rsync.gentoo.org/gentoo-portage"

USE="X a52 aac aalib acpi alsa ass audiofile berkdb bwscheduler bzip2 cairo caps cdio cjk consolekit cracklib crypt css cups cxx dbus dga dhcp directfb downloadorder dri dvd dvdnav encode exif expat fam fbcon ffmpeg flac gcj ggi gif glibc-omitfp gmp gphoto2 gpm gstreamer hal handbook hardcoded-tables iconv idea imagemagick imap imlib infowidget ipfilter ipv6 jabber javascript jbig joystick jpeg kde lcms libcaca libnotify libwww live lm_sensors logviewer mad matroska mbox memlimit mmx mmxext mng mp3 mpeg mudflap mysql nas ncurses network nls no-old-linux nodrm nptl nptlonly ntp ogg opengl openmp oscar osdmenu pcre pdf perl pm-utils png qt3support quicktime readline rss rtc samba scanfolder scanner sdl search sensord shm silc smtp sndfile speex spell sse ssl stats svg sysfs tcpd theora threads threadsafe tiff timidity tremor truetype unicode upnp usb userlocales vcd vorbis webinterface wifi x86 xcb xinerama xml xorg xv xvid yahoo zlib zrtp" ALSA_CARDS="intel8x0" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CAMERAS="ptp2" ELIBC="glibc" INPUT_DEVICES="evdev" KERNEL="linux" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="savage"

Unset:  CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LANG, LC_ALL, LINGUAS, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY

```

----------

## pappy_mcfae

Once again, nothing leaps out, but more ideas have come.

Post the contents of /etc/udev/rules.d (all files and contents thereof). Also, change /etc/udev/udev.conf to the following:

```
# The initial syslog(3) priority: "err", "info", "debug" or its
# numerical equivalent. For runtime debugging, the daemons internal
# state can be changed with: "udevadm control --log-priority=<value>".
udev_log="debug"

# If you need to change mount-options, do it in /etc/fstab
```

Then retry. You may have to drop -fomit-frame-pointer from the CFLAGS in make.conf and recompile udev for this to work, but try without doing that first. The results should end up in either /var/log/messages or /var/log/dmesg (or perhaps both). Post both once you've done this.

Blessed be!

Pappy

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> Post the contents of /etc/udev/rules.d (all files and contents thereof).

 

Let me take care of this one first.

```

$ ls -al /etc/udev/rules.d/
total 172
drwxr-xr-x 2 root root  4096 2009-12-28 12:16:31 ./
drwxr-xr-x 3 root root  4096 2009-12-28 12:03:03 ../
-rw-r--r-- 1 root root     0 2009-12-28 12:02:42 .keep_sys-fs_udev-0
-rw-r--r-- 1 root root   185 2009-12-23 08:08:23 55-ltmodem.rules
-rw-r--r-- 1 root root  1093 2009-12-23 05:02:51 60-pcmcia.rules
-rw-r--r-- 1 root root  1104 2009-12-27 21:56:33 64-device-mapper.rules
-rw-r--r-- 1 root root 43446 2009-12-27 13:45:20 70-libgphoto2.rules
-rw-r--r-- 1 root root 93795 2009-12-27 12:29:08 70-libsane.rules
-rw-r--r-- 1 root root   507 2007-10-30 23:32:03 70-persistent-cd.rules
-rw-r--r-- 1 root root   710 2009-12-28 12:03:08 70-persistent-net.rules
-rw-r--r-- 1 root root    83 2009-12-28 12:13:56 90-hal.rules

```

I don't know if those two big files provide anything of use in debugging this, so rather than post them, I will just move them out of that directory for now (I certainly won't attempt to use a camera or scanner in the meantime).
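Setting the two large rules files aside could look roughly like the sketch below. The `stash_rules` helper and the holding directory are hypothetical; only the two file names come from the listing above, and running it against `/etc/udev/rules.d` would need root.

```shell
#!/bin/sh
# Hypothetical helper: move selected rules files out of a rules directory
# into a holding directory so they can be put back later.
# Usage: stash_rules RULES_DIR HOLD_DIR FILE...
stash_rules() {
    rules_dir=$1
    hold_dir=$2
    shift 2
    mkdir -p "$hold_dir" || return 1
    for name in "$@"; do
        # Skip names that are not present rather than failing.
        if [ -f "$rules_dir/$name" ]; then
            mv "$rules_dir/$name" "$hold_dir/" || return 1
        fi
    done
}

# Example (as root); the holding directory name is an assumption:
# stash_rules /etc/udev/rules.d /root/udev-rules-disabled \
#     70-libgphoto2.rules 70-libsane.rules
```

Moving the files back into the rules directory restores the camera and scanner rules.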

For the others:

```

$ dog /etc/udev/rules.d/55-ltmodem.rules

#

# UDEV rule for ltmodem

#  creates symlink /dev/modem to /dev/ttySLT?, and takes care of permissions

KERNEL=="ttySLTM[0-9]", NAME="%k", MODE="0660", GROUP="dialout", SYMLINK="modem"

$ dog /etc/udev/rules.d/60-pcmcia.rules

# PCMCIA devices:

#

# modprobe $modalias loads all possibly appropriate modules

ACTION=="add", SUBSYSTEM=="pcmcia", ENV{MODALIAS}=="?*", \

                RUN+="/sbin/modprobe $env{MODALIAS}"

# Very few CIS firmware entries (which we use for matching)

# are so broken that we need to read out random bytes of it

# instead of the manufactor, card or product ID. Then the

# matching is done in userspace.

ACTION=="add", SUBSYSTEM=="pcmcia", ENV{MODALIAS}=="?*", \

                RUN+="/sbin/pcmcia-check-broken-cis"

# However, the "weak" matching by func_id is only allowed _after_ modprobe

# returns, so that "strong" matches have a higher priority.

ACTION=="add", SUBSYSTEM=="pcmcia", ENV{MODALIAS}=="?*", \

                RUN+="/bin/sh -c 'echo 1 > /sys/$devpath/allow_func_id_match'"

# PCMCIA sockets:

#

# modprobe the pcmcia bus module so that 16-bit PCMCIA devices work

ACTION=="add", SUBSYSTEM=="pcmcia_socket", \

                RUN+="/sbin/modprobe pcmcia"

# if this is a PCMCIA socket which needs a resource database,

# pcmcia-socket-startup sets it up

ACTION=="add", SUBSYSTEM=="pcmcia_socket", \

                RUN+="/sbin/pcmcia-socket-startup"

$ dog /etc/udev/rules.d/64-device-mapper.rules

# do not edit this file, it will be overwritten on update

KERNEL=="device-mapper", SYMLINK+="mapper/control"

KERNEL!="dm-*", GOTO="device_mapper_end"

ACTION!="add|change", GOTO="device_mapper_end"

IMPORT{program}="/sbin/dmsetup info --export -j%M -m%m"

ENV{DM_NAME}!="?*", GOTO="device_mapper_end"

NAME="mapper/$env{DM_NAME}", SYMLINK+="%k"

SYMLINK+="disk/by-id/dm-name-$env{DM_NAME}", OPTIONS+="string_escape=replace"

ENV{DM_UUID}=="?*", SYMLINK+="disk/by-id/dm-uuid-$env{DM_UUID}", OPTIONS+="string_escape=replace"

ENV{DM_SUSPENDED}=="1", GOTO="device_mapper_end"

ENV{DM_EXISTS}=="0", GOTO="device_mapper_end"

ENV{DM_TARGET_TYPES}=="|*error*", GOTO="device_mapper_end"

IMPORT{program}="/sbin/blkid -o udev -p $tempnode"

OPTIONS+="link_priority=-100"

OPTIONS+="watch"

ENV{DM_TARGET_TYPES}=="*snapshot-origin*", OPTIONS+="link_priority=-90"

ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"

ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"

LABEL="device_mapper_end"

$ dog /etc/udev/rules.d/70-persistent-cd.rules

# This file was automatically generated by the /lib/udev/write_cd_rules

# program, probably run by the cd-aliases-generator.rules rules file.

#

# You can modify it, as long as you keep each rule on a single line

# and set the $GENERATED variable.

# DVD-ROM_GDR8081N (pci-0000:00:1f.1-scsi-1:0:0:0)

ENV{ID_CDROM}=="?*", ENV{ID_PATH}=="pci-0000:00:1f.1-scsi-1:0:0:0", SYMLINK+="cdrom", ENV{GENERATED}="1"

ENV{ID_CDROM}=="?*", ENV{ID_PATH}=="pci-0000:00:1f.1-scsi-1:0:0:0", SYMLINK+="dvd", ENV{GENERATED}="1"

$ dog /etc/udev/rules.d/70-persistent-net.rules

# This file was automatically generated by the /lib/udev/write_net_rules

# program run by the persistent-net-generator.rules rules file.

#

# You can modify it, as long as you keep each rule on a single line.

# PCI device 0x8086:0x1031 (e100)

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:d0:59:bf:e5:98", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# USB device 0x0ace:0x1215 (zd1211rw)

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:02:72:5a:47:59", ATTR{type}=="1", KERNEL=="wlan*", NAME="wlan0"

$ dog /etc/udev/rules.d/90-hal.rules

# pass all events to the HAL daemon

RUN+="socket:@/org/freedesktop/hal/udev_event"

```

I will work on the other part, and post the results when I get them.

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> Also, change /etc/udev/udev.conf to...  and retry.

 

That didn't take long. Interestingly (to me), I saw all of the messages on the screen before the Oops, but in /var/log/messages they all show up after it.

I don't know if you want the complete boot logs or just the debug part, but I will post the complete ones just to be on the safe side.

/var/log/messages

/var/log/dmesg - which doesn't seem to contain any new information

----------

## pappy_mcfae

Fire up the machine using the kernel that works, and resend the files. I need something to compare.
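One way to make such a comparison easier is to strip the parts of each line that always differ between boots before diffing. This is only a sketch: the `normalize_log` and `compare_boots` helpers are hypothetical, and the timestamp pattern assumes the classic syslog prefix seen in the logs earlier in this thread (`Jun  2 14:58:36 ...`).

```shell
#!/bin/sh
# Hypothetical helper: drop the leading syslog timestamp and any "[pid]"
# that directly precedes a colon, so two boots can be compared by content.
normalize_log() {
    sed -E -e 's/^[A-Z][a-z]{2} +[0-9]+ [0-9]{2}:[0-9]{2}:[0-9]{2} //' \
           -e 's/\[[0-9]+\]:/:/' "$1"
}

# Show a unified diff between a log from the working kernel and one from
# the failing kernel. Returns diff's exit status (0 means no differences).
compare_boots() {
    good=$(mktemp) || return 1
    bad=$(mktemp) || return 1
    normalize_log "$1" > "$good"
    normalize_log "$2" > "$bad"
    diff -u "$good" "$bad"
    status=$?
    rm -f "$good" "$bad"
    return $status
}

# Example; both file names are placeholders:
# compare_boots messages-2.6.28 messages-2.6.29
```

Boot logs from two different kernels will still differ in many harmless lines, so the diff is a starting point for reading, not a verdict.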

BB!

P

----------

## toralf

BTW to help/inform kernel devs please emerge the package sys-kernel/kerneloops.

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> Fire up the machine using the kernel that works, and resend the files. I need something to compare.

 

Here you go. :)

/var/log/messages - a heck of a lot bigger than before

/var/log/dmesg - again, with no extra information

 *toralf wrote:*   

> BTW to help/inform kernel devs please emerge the package sys-kernel/kerneloops.

 

A great idea. I installed it right away. I think I submitted this set multiple times while trying to figure out how to get it to work (I compiled it without the GUI), but I eventually added the most recent twenty or so to their database.

Their web page is very interesting. It shows that 2.6.28 (the one that works for me) has far fewer of these problems. That ignores possible differences in use by distributions (which could make some kernel versions much more widely used than others) and how long 2.6.28 remained the latest stable release before 2.6.29 came out.

----------

## cwr

Last time I met this it was a failing memory; at least, it went away when I reseated the DIMM.  Apparently that particular kernel used a piece of memory that was unreliable.

Will

----------

## curmudgeon

 *cwr wrote:*   

> Last time I met this it was a failing memory; at least, it went away when I reseated the DIMM.  Apparently that particular kernel used a piece of memory that was unreliable.

 

It was about six months ago, but I ran memtest overnight (no problems), tried each piece of memory individually, swapped them, reseated them, and did everything else I could think of, none of which made any difference.

In fact, we already discussed this (same machine, too) here:

https://forums.gentoo.org/viewtopic-t-744926.html

----------

## pappy_mcfae

You may have to accept that this install is stuck at that point. You could try a full re-install to kill any bugs that have crept in after you back everything up and see how that works.

You can also try this .config. I gave it a few tweaks regarding some of the errors I saw. It may or may not help. The hope is it will.

If not, I'd next do a full stage4 backup, copy that off, and do a full reinstall with a fresh tarball and portage snapshot. If that doesn't fix it, then all you have left is the .28 functional kernel.

As far as I can tell, those are the only options that remain unless kerneloops can tell you/me more.

BB!

P

----------

## curmudgeon

 *pappy_mcfae wrote:*   

> You can also try this .config.

 

No luck with that.

 *pappy_mcfae wrote:*   

> You could try a full re-install to kill any bugs that have crept in after you back everything up and see how that works.

 

I was skeptical of that. I just recompiled the entire system from scratch (which took well over three days on this machine).

Many months ago, I kept getting oopses with the vanilla sources, too, but I decided I should give them another try now. And guess what? The (latest stable) 2.6.31.6 compiled and booted without any problems (with my original .config).

 *pappy_mcfae wrote:*   

> As far as I can tell, those are the only options that remain unless kerneloops can tell you/me more.

 

I don't think kerneloops does much more than add the crash to a database (helpful for the kernel developers, but not for you or me).

I guess the first thing to do is file a bug about this.
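For a bug report it helps to attach only the oops itself instead of a whole log. The sketch below assumes the oops format shown at the top of this thread, where each report starts with a `BUG: unable to handle kernel` line and ends with an `---[ end trace` line; the `extract_oops` helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every oops report found in a syslog file,
# from the "BUG: unable to handle kernel" line through the matching
# "---[ end trace" line, and nothing else.
extract_oops() {
    awk '
        /BUG: unable to handle kernel/ { printing = 1 }
        printing                       { print }
        /---\[ end trace /             { printing = 0 }
    ' "$1"
}

# Example; the output file name is a placeholder:
# extract_oops /var/log/messages > oops-for-bug-report.txt
```

Oopses that begin with a different first line would need an extra start pattern.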

----------

## pappy_mcfae

Most definitely. I wonder what it is in gentoo-sources that would do that. As for getting the new vanilla sources to run: awesome. I'm glad you've reached a place where things are working.

Blessed be!

Pappy

----------

