# I get Kernel Panic or Oops during emerge --sync :(

## Etal

When I do an emerge --sync on my new computer, I get either a Kernel Panic or an Oops.  :Sad: 

I have my portage tree on a reiserfs partition, and the setup has always worked properly on two other computers. Yesterday I had Kernel Panics, so I unmounted the reiserfs (3.6) partition and synced (so it writes onto ext3) and it was fine. Then I reformatted the portage partition, and I copied all the files there. No problem. Emerge worked fine. Today I decided to sync, and I got an oops instead of a panic:

```
perl-core/

profiles/

profiles/ChangeLog

profiles/package.mask

profiles/thirdpartymirrors

profiles/use.local.desc

Oops: 0000 [#1]

PREEMPT SMP

Modules linked in: ipw3945 nvidia(P) snd_hda_intel snd_hda_codec snd_pcm snd_timer snd snd_page_alloc

CPU:    0

EIP:    0060:[<c01b30ac>]    Tainted: P      VLI

EFLAGS: 00210202   (2.6.20-gentoo-r8 #19)

EIP is at check_balance+0x97c/0x1340

eax: 00000004   ebx: 00000003   ecx: f7337a76   edx: 0000018d

esi: f7337b30   edi: 00000000   ebp: ffffffff   esp: f73379dc

ds: 007b   es: 007b   ss: 0068

Process rsync (pid: 8849, ti=f7336000 task=f75dfa90 task.ti=f7336000)

Stack: 00000000 ffffffff 00000000 ffffffff f7337a70 00000001 0000001c 00000010

       00000000 00000600 013986ec f43986ec 00000418 f65098f8 00000000 c0187b93

       c018605a 00000000 00000000 f43986ec f688f480 f7337ca0 00000004 00000000

Call Trace:

 [<c0187b93>] bio_put+0x23/0x30

 [<c018605a>] end_bio_bh_io_sync+0x2a/0x40

 [<c0185757>] ll_rw_block+0x37/0xb0

 [<c01bce0b>] search_by_key+0x14b/0xf90

 [<c01b3c13>] fix_nodes+0x1a3/0x750

 [<c01bfd6f>] reiserfs_insert_item+0x15f/0x310

 [<c01afe0f>] reiserfs_file_write+0x169f/0x1dc0

 [<f900ef9a>] ipw_handle_reply_rx+0x87a/0x1f60 [ipw3945]

 [<c0119381>] try_to_wake_up+0x41/0x3c0

 [<c0164461>] vfs_write+0xc1/0x180

 [<c01ae770>] reiserfs_file_write+0x0/0x1dc0

 [<c0164bc1>] sys_write+0x41/0x70

 [<c01030d2>] sysenter_past_esp+0x5f/0x85

 [<c0470033>] ieee80211_rx+0x773/0xd30

 =======================

Code: 24 48 31 d2 89 f0 c7 04 24 00 00 00 00 e8 3d e3 ff ff e9 44 f7 ff ff 89 f0 e8 b1 e3 ff ff 85 c0 0f 84 66 f8 ff ff 

e9 30 f7 ff ff <0f> bf 43 08 8d 94 24 9e 00 00 00 b9 ff ff ff ff 89 54 24 10 31

EIP: [<c01b30ac>] check_balance+0x97c/0x1340 SS:ESP 0068:f73379dc

 BUG: at kernel/exit.c:860 do_exit()

 [<c0121bb4>] do_exit+0x694/0x7f0

 [<c0104726>] die+0x246/0x260

 [<c0115c0c>] do_page_fault+0x2cc/0x5d0

 [<c0115940>] do_page_fault+0x0/0x5d0

 [<c047832c>] error_code+0x7c/0x84

 [<c01b30ac>] check_balance+0x97c/0x1340

 [<c0187b93>] bio_put+0x23/0x30

 [<c018605a>] end_bio_bh_io_sync+0x2a/0x40

 [<c0185757>] ll_rw_block+0x37/0xb0

 [<c01bce0b>] search_by_key+0x14b/0xf90

 [<c01b3c13>] fix_nodes+0x1a3/0x750

 [<c01bfd6f>] reiserfs_insert_item+0x15f/0x310

 [<c01afe0f>] reiserfs_file_write+0x169f/0x1dc0

 [<f900ef9a>] ipw_handle_reply_rx+0x87a/0x1f60 [ipw3945]

 [<c0119381>] try_to_wake_up+0x41/0x3c0

 [<c0164461>] vfs_write+0xc1/0x180

 [<c01ae770>] reiserfs_file_write+0x0/0x1dc0

 [<c0164bc1>] sys_write+0x41/0x70

 [<c01030d2>] sysenter_past_esp+0x5f/0x85

 [<c0470033>] ieee80211_rx+0x773/0xd30

 =======================

rsync: connection unexpectedly closed (3029865 bytes received so far) [generator]

rsync error: error in rsync protocol data stream (code 12) at io.c(458) [generator=2.6.9]

>>> Retrying...

Exiting on signal 2
```

Anyone know what that could be?  :Shocked: 

Edit: I run the default gentoo-sources kernel

----------

## didymos

What kernel version? And have you tried re-emerging rsync?

----------

## Etal

 *didymos wrote:*   

> What kernel version?

 

uname -a 

Linux no-name-yet 2.6.20-gentoo-r8 #19 SMP PREEMPT Thu Jun 28 21:18:54 EDT 2007 i686 Intel(R) Core(TM)2 CPU T5300 @ 1.73GHz GenuineIntel GNU/Linux

 *didymos wrote:*   

> And have you tried re-emerging rsync?

 

Did that, got another oops   :Sad: 

----------

## didymos

OK, just noticed this in the oops:

```

[<c0470033>] ieee80211_rx+0x773/0xd30

 =======================

rsync: connection unexpectedly closed (

```

So it's the wireless driver's fault probably.  I'm guessing the level of activity during rsync is what's triggering this bug.  What drivers are you using for the wifi card: in-kernel vs. external package, version, etc.?

----------

## Etal

I'm using an Intel PRO/Wireless 3945ABG, with the ipw3945 driver (external)

----------

## didymos

Well, you could try ipw3945 1.2.1, which is masked right now.  I thought they'd added an in-kernel driver for the 3945, but not yet I guess. Not until the new wireless stack is in place. As to the masked version, the only explanation in package.mask is this:

 *Quote:*   

> 
> 
> # Masked because 1.2.1 because it is broken
> 
> 

 

which is really informative.  Looking around, it seems to be the case that it works great for some, and poorly for others.  No one seems to know why.

----------

## Etal

Hmmm... this is curious, though: After the first oops, the system would not halt. Then, after the second oops, I decided to unmount the reiserfs partition. I typed in "umount /dev/sda6" and it just hung and did nothing (ignoring ^C). After that, I typed in "reboot", and like the last time it just said "The system is going down for reboot NOW!" and nothing else. I can still go to other tty's, run programs, etc. At the same time, the wireless card works fine!

Now, to further test that... I formatted the partition with ext3, and did "emerge --sync" - meaning I downloaded the whole portage tree. No kernel panic, no oops - which proves that the card can handle the load.

So, it might seem like there is something wrong with reiserfs... which would be odd, since it has always worked for me without a problem on two other computers, and it's a stable version that shouldn't get corrupted twice in one day!  :Shocked: 

So, the question still remains. What causes the problem? But most importantly, what should be done to prevent further problems that might be caused by it?  :Mad: 

Edit: If you need any extra information, I'll be happy to post it!

----------

## defenderBG

if u use hdparm

/etc/conf.d/hdparm

rc-update show

hdparm <portage device>

It still might be a problem with using wireless and reiserfs at the same time...

To test that, copy the whole portage tree onto the reiserfs partition, not over wlan but from disk to disk...

----------

## Etal

 *defenderBG wrote:*   

> if u use hdparm
> 
> /etc/conf.d/hdparm
> 
> rc-update show
> ...

 

No, I don't use hdparm. On Monday, I'll plug my laptop into a wired connection, format the portage partition with reiserfs, shut off all unnecessary processes, and see if it still crashes. So far, it seems like this computer does not like rsyncing on reiserfs, since on ext3 it has run fine. We'll see what happens...

----------

## didymos

It's probably some weird little bug way down in the kernel guts, and it just so happens that the reiserfs/ipw3945 + heavy FS activity from rsync triggers it.  Maybe a race condition with a resource lock, something like that.  Someone with some decent kernel coding experience would be able to give a much more educated guess.  It may even have been fixed already in .21 or the .22 release candidates.

----------

## Etal

I decided to try reiserfs again, after having ext3 running flawlessly for almost a month.

So, one month and two kernel versions later (I now use linux-2.6.22-gentoo-r1):

I started the rsync, and it worked until at some point it got interrupted and it went to retry:

```
receiving file list ... io timeout after 180 seconds -- exiting

rsync error: timeout in data send/receive (code 30) at io.c(165) [receiver=2.6.9]
```

Another Oops...   :Sad:  Here's the relevant part of /var/log/messages:

```
Jul 26 16:51:37 tux ReiserFS: sda6: found reiserfs format "3.6" with standard journal

Jul 26 16:51:37 tux ReiserFS: sda6: using ordered data mode

Jul 26 16:51:37 tux ReiserFS: sda6: journal params: device sda6, size 3965, journal first block 130, max trans len 128, max batch 112, max commit age 30, max trans age 30

Jul 26 16:51:37 tux ReiserFS: sda6: checking transaction log (sda6)

Jul 26 16:51:37 tux ReiserFS: sda6: Using r5 hash to sort names

Jul 26 16:51:37 tux ReiserFS: sda6: warning: Created .reiserfs_priv on sda6 - reserved for xattr storage.

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 55359 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 162501. Fsck?

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [44703 50726 0x0 SD] stat data

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 55359 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 162501. Fsck?

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [44703 50726 0x0 SD] stat data

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 55359 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 162501. Fsck?

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [44703 50726 0x0 SD] stat data

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 55359 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 162501. Fsck?

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [44703 50726 0x0 SD] stat data

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 43706 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 186478. Fsck?

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [44703 50727 0x0 SD] stat data

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 43706 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 186478. Fsck?

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [44703 50727 0x0 SD] stat data

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 43706 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 186478. Fsck?

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 43706 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 186478. Fsck?

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [44703 50727 0x0 SD] stat data

Jul 26 16:57:44 tux ReiserFS: warning: is_tree_node: node level 43706 does not match to the expected one 1

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5150: search_by_key: invalid format found in block 186478. Fsck?

Jul 26 16:57:44 tux ReiserFS: sda6: warning: vs-5657: reiserfs_do_truncate: i/o failure occurred trying to truncate [44703 50727 0xfffffffffffffff DIRECT]

Jul 26 16:57:59 tux BUG: unable to handle kernel NULL pointer dereference at virtual address 00000005

Jul 26 16:57:59 tux printing eip:

Jul 26 16:57:59 tux c018aaa9

Jul 26 16:57:59 tux *pde = 00000000

Jul 26 16:57:59 tux Oops: 0000 [#1]

Jul 26 16:57:59 tux Oops: 0000 [#1]

Jul 26 16:57:59 tux PREEMPT SMP

Jul 26 16:57:59 tux Modules linked in: ipw3945 nvidia(P)

Jul 26 16:57:59 tux CPU:    0

Jul 26 16:57:59 tux EIP:    0060:[<c018aaa9>]    Tainted: P       VLI

Jul 26 16:57:59 tux EFLAGS: 00010293   (2.6.22-gentoo-r1 #5)

Jul 26 16:57:59 tux EIP is at __block_write_full_page+0xb9/0x310

Jul 26 16:57:59 tux eax: 0000005c   ebx: 00000005   ecx: 00000003   edx: 000012cd

Jul 26 16:57:59 tux esi: 00009669   edi: c21276b8   ebp: e93e2e3c   esp: f7e13df0

Jul 26 16:57:59 tux ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068

Jul 26 16:57:59 tux Process pdflush (pid: 254, ti=f7e12000 task=c2188580 task.ti=f7e12000)

Jul 26 16:57:59 tux Stack: f1808e10 0003ffff 0000000e f580acb0 c018ed10 c1527ca0 00000000 000f713f

Jul 26 16:57:59 tux 00000200 00000000 c21276b8 c1527ca0 00000000 c018adfa f7e13f74 f7e13ec4

Jul 26 16:57:59 tux f7e13f74 c018ed10 c1527ca0 c2127768 00000001 f7e13f74 0000000e c014ed18

Jul 26 16:57:59 tux Call Trace:

Jul 26 16:57:59 tux [<c018ed10>] blkdev_get_block+0x0/0x60

Jul 26 16:57:59 tux [<c018adfa>] block_write_full_page+0xfa/0x110

Jul 26 16:57:59 tux [<c018ed10>] blkdev_get_block+0x0/0x60

Jul 26 16:57:59 tux [<c014ed18>] __writepage+0x8/0x30

Jul 26 16:57:59 tux [<c014f1b5>] write_cache_pages+0x225/0x310

Jul 26 16:57:59 tux [<c014ed10>] __writepage+0x0/0x30

Jul 26 16:57:59 tux [<c014f2c0>] generic_writepages+0x20/0x30

Jul 26 16:57:59 tux [<c014f2fb>] do_writepages+0x2b/0x50

Jul 26 16:57:59 tux [<c01857d4>] __writeback_single_inode+0x94/0x3c0

Jul 26 16:57:59 tux [<c0454989>] __sched_text_start+0x2e9/0x970

Jul 26 16:57:59 tux [<c04549b0>] __sched_text_start+0x310/0x970

Jul 26 16:57:59 tux [<c0185ed2>] sync_sb_inodes+0x192/0x270

Jul 26 16:57:59 tux [<c018639b>] writeback_inodes+0x9b/0xd0

Jul 26 16:57:59 tux [<c014f9d5>] wb_kupdate+0x85/0xf0

Jul 26 16:57:59 tux [<c014fd80>] pdflush+0x0/0x1d0

Jul 26 16:57:59 tux [<c014fe7e>] pdflush+0xfe/0x1d0

Jul 26 16:57:59 tux [<c014f950>] wb_kupdate+0x0/0xf0

Jul 26 16:57:59 tux [<c01316d2>] kthread+0x42/0x70

Jul 26 16:57:59 tux [<c0131690>] kthread+0x0/0x70

Jul 26 16:57:59 tux [<c0103703>] kernel_thread_helper+0x7/0x14

Jul 26 16:57:59 tux =======================

Jul 26 16:57:59 tux Code: 14 b9 0c 00 00 00 89 d6 29 e9 d3 e6 8b 68 0c 89 eb eb 12 f0 0f ba 33 01 f0 0f ba 2b 00 8b 5b 04 39 dd 74 5a 46 3b 74 24 1c 77 e8 <8b> 03 a8 20 75 ec

8b 03 a8 02 74 e6 8b 54 24 20 3b 53 10 0f 85

Jul 26 16:57:59 tux EIP: [<c018aaa9>] __block_write_full_page+0xb9/0x310 SS:ESP 0068:f7e13df0

Jul 26 16:59:34 tux ------------[ cut here ]------------

Jul 26 16:59:34 tux kernel BUG at fs/reiserfs/journal.c:1679!

Jul 26 16:59:34 tux invalid opcode: 0000 [#2]

Jul 26 16:59:34 tux PREEMPT SMP

Jul 26 16:59:34 tux Modules linked in: ipw3945 nvidia(P)

Jul 26 16:59:34 tux CPU:    0

Jul 26 16:59:34 tux EIP:    0060:[<c01c9849>]    Tainted: P       VLI

Jul 26 16:59:34 tux EFLAGS: 00210206   (2.6.22-gentoo-r1 #5)

Jul 26 16:59:34 tux EIP is at flush_used_journal_lists+0x2d9/0x2e0

Jul 26 16:59:34 tux eax: ff9daabe   ebx: f9569fa0   ecx: 00000000   edx: f6418640

Jul 26 16:59:34 tux esi: e93e6da0   edi: f5c35e40   ebp: f8e83000   esp: e914fc58

Jul 26 16:59:34 tux ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068

Jul 26 16:59:34 tux Process rsync (pid: 17076, ti=e914e000 task=c224d030 task.ti=e914e000)

Jul 26 16:59:34 tux Stack: c01c7d30 00001b3f f7acec00 00000104 00000000 0000000e 0000000b 00000042

Jul 26 16:59:34 tux 0168f4c0 f7199bc0 e93d3d04 e93d3d38 f60671a4 eca5f5e8 eca5f240 e9c0c928

Jul 26 16:59:34 tux eca5f274 eca5fc68 eca5fa94 eca54310 e9c0c95c e90bdda0 eca54c34 eca54c00

Jul 26 16:59:34 tux Call Trace:

Jul 26 16:59:34 tux [<c01c7d30>] write_chunk+0x0/0x50

Jul 26 16:59:34 tux [<c01cb41e>] do_journal_end+0x99e/0xcc0

Jul 26 16:59:34 tux [<c01cba08>] do_journal_begin_r+0x188/0x2d0

Jul 26 16:59:34 tux [<c01cbbba>] journal_begin+0x6a/0xf0

Jul 26 16:59:34 tux [<c01bb69b>] reiserfs_dirty_inode+0x5b/0xa0

Jul 26 16:59:34 tux [<c01722fa>] __link_path_walk+0xaba/0xe30

Jul 26 16:59:34 tux [<c0186174>] __mark_inode_dirty+0x34/0x1c0

Jul 26 16:59:34 tux [<c017f013>] mntput_no_expire+0x13/0x70

Jul 26 16:59:34 tux [<c01726d5>] link_path_walk+0x65/0xc0

Jul 26 16:59:34 tux [<c017d5e2>] inode_setattr+0xb2/0x190

Jul 26 16:59:34 tux [<c01b36b9>] reiserfs_setattr+0x199/0x230

Jul 26 16:59:34 tux [<c012369d>] current_fs_time+0x4d/0x60

Jul 26 16:59:34 tux [<c017d955>] notify_change+0x295/0x320

Jul 26 16:59:34 tux [<c0188c6e>] do_utimes+0x10e/0x220

Jul 26 16:59:34 tux [<c0188da1>] sys_futimesat+0x21/0x90

Jul 26 16:59:34 tux [<c0188e2f>] sys_utimes+0x1f/0x30

Jul 26 16:59:34 tux [<c0102ae2>] sysenter_past_esp+0x5f/0x85

Jul 26 16:59:34 tux [<c0450000>] ieee80211_wx_get_scan+0xad0/0xb20

Jul 26 16:59:34 tux =======================

Jul 26 16:59:34 tux Code: 0a fc ff e9 4f ff ff ff 89 f0 e8 23 0c fc ff 8d 76 00 e9 eb fe ff ff 89 fa e9 67 ff ff ff 8b 44 24 08 89 fa e8 c9 de ff ff eb 8b <0f> 0b eb fe 8d 76

00 56 89 c6 a1 3c c0 52 c0 53 ba 50 08 00 00

Jul 26 16:59:34 tux EIP: [<c01c9849>] flush_used_journal_lists+0x2d9/0x2e0 SS:ESP 0068:e914fc58

Jul 26 17:00:01 tux cron[17082]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )

```

I wish I could say that it's just reiserfs, but "grep Oops /var/log/messages" returns two Oopses which happened when I wasn't using reiserfs:  :Mad: 

```
Jun 29 11:29:55 tux Oops: 0000 [#1] <- Sync with reiserfs

Jun 29 15:11:09 tux Oops: 0000 [#1] <- Sync with reiserfs

Jul  7 12:37:34 tux Oops: 0000 [#1]

Jul  7 12:43:26 tux Oops: 0000 [#1]

Jul 16 18:09:57 tux Oops: 0002 [#1]

Jul 26 16:57:59 tux Oops: 0000 [#1] <- Sync with reiserfs
```
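
For reference, the four hex digits after `Oops:` are the i386 page-fault error code, so `0000` and `0002` are not the same kind of fault. Below is a minimal sketch of a decoder, assuming only the three low bits matter here (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The name `decode_oops_code` is made up for this example, and it applies only to page-fault lines, not to ones like `invalid opcode: 0000`.

```python
def decode_oops_code(code: str) -> str:
    """Decode the hex error code from an i386 page-fault 'Oops: NNNN' line.

    Only the three low bits are interpreted (an assumption of this sketch).
    """
    bits = int(code, 16)
    cause = "protection violation" if bits & 1 else "page not present"
    access = "write" if bits & 2 else "read"
    mode = "user mode" if bits & 4 else "kernel mode"
    return f"{access} in {mode}, {cause}"


# The two distinct codes that show up in the grep output above.
for code in ("0000", "0002"):
    print(code, "->", decode_oops_code(code))
```

Read this way, the Jul 16 entry would be a kernel-mode write to an unmapped address, while the others are kernel-mode reads.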

My other computer never had a kernel oops.

Next, I tried to attach my laptop to a wired connection. I removed the ipw3945d from the default runlevel, and disabled the wireless card. Did a sync and my computer froze (I assume it was a Kernel Panic)

What should I do?  :Sad: 

If you need any more information I'll post it

----------

## eccerr0r

It doesn't look like the same oops each time, check the logfile to make sure of the new ones.
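
One quick way to compare them is to pull the faulting function out of every `EIP is at ...` line and count how often each one shows up. Here is a rough Python sketch, assuming syslog-style lines like the ones posted in this thread; `summarize_oopses` is a made-up helper name, not part of any tool.

```python
import re
from collections import Counter

# Matches e.g. "EIP is at check_balance+0x97c/0x1340" and captures the symbol.
EIP_RE = re.compile(r"EIP is at (?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oopses(lines):
    """Count how often each function appears as the faulting EIP."""
    counts = Counter()
    for line in lines:
        match = EIP_RE.search(line)
        if match:
            counts[match.group("func")] += 1
    return counts


# Lines copied from the logs posted earlier in this thread.
sample = [
    "EIP is at check_balance+0x97c/0x1340",
    "Jul 26 16:57:59 tux EIP is at __block_write_full_page+0xb9/0x310",
    "Jul 26 16:59:34 tux EIP is at flush_used_journal_lists+0x2d9/0x2e0",
    "Jul 26 16:57:59 tux PREEMPT SMP",
]
for func, count in summarize_oopses(sample).most_common():
    print(count, func)
```

To run it on the real log, pass an open file instead of the sample list, for example `summarize_oopses(open("/var/log/messages", errors="replace"))`. If the same function keeps coming up, that points at one bug; if it is different every time, that looks more like hardware.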

Did you try running memtest86 to rule out memory issues?

I've heard the reiserfs driver is a bit more tough on the cpu/ram than ext2fs and it might be enough to expose an issue.

----------

## Etal

 *eccerr0r wrote:*   

> It doesn't look like the same oops each time, check the logfile to make sure of the new ones.

 

I looked it up in /var/log/messages and I remembered that on Jul 7 I configured the kernel incorrectly and I got an Oops at bootup. As for Jul 16, I don't know. Should I post it?

The Oopses I posted happened during rsync. Anytime I sync on a reiserfs partition, I either get an Oops or a Panic. Unpacking the snapshot or searching on reiserfs causes no problems.

 *eccerr0r wrote:*   

> Did you try running memtest86 to rule out memory issues?
> 
> I've heard the reiserfs driver is a bit more tough on the cpu/ram than ext2fs and it might be enough to expose an issue.

 

Yes, I tried that, and it worked fine. It's a new computer...

----------

## Etal

Anyone?  :Sad: 

----------

## eccerr0r

Your oopses are all over the place, it's still pointing towards a hardware issue.  New machines doesn't really mean much.

Can you underclock your machine to see if it makes a difference?

----------

## Etal

 *eccerr0r wrote:*   

> Your oopses are all over the place, it's still pointing towards a hardware issue.  New machines doesn't really mean much.
> 
> Can you underclock your machine to see if it makes a difference?

 How do I underclock?

----------

## didymos

Either throttling/scaling using the kernel features for that, or by setting the clock speed in the BIOS (if your BIOS will let you).
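
For the kernel route, a sketch of what that could look like through the cpufreq sysfs files, assuming cpufreq support and a driver for your CPU are built into the kernel and that the driver exposes `scaling_available_frequencies` (not all of them do). The function names are made up for this example, the frequency list in the demo is invented, and the write needs root.

```python
from pathlib import Path

# Standard cpufreq sysfs directory for the first core (assumes cpufreq is enabled).
CPUFREQ_CPU0 = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def lowest_frequency(available: str) -> int:
    """Return the lowest frequency (in kHz) from a space-separated list."""
    return min(int(freq) for freq in available.split())


def cap_to_lowest(base: Path = CPUFREQ_CPU0) -> int:
    """Cap one core's maximum frequency at its lowest step. Needs root."""
    available = (base / "scaling_available_frequencies").read_text()
    lowest = lowest_frequency(available)
    (base / "scaling_max_freq").write_text(str(lowest))
    return lowest


# Demo on an invented frequency list; nothing here touches sysfs.
print(lowest_frequency("1733000 1333000 1000000 800000"))
```

On a dual-core machine you would call `cap_to_lowest` once per core (`cpu0`, `cpu1`). The setting is lost on reboot, so it is easy to undo if it makes no difference.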

----------

