# Kernel panic after 2 weeks

## kilburna

Hi

My main server was installed 6 weeks ago, but every 2 weeks I get a hard kernel panic. I cannot capture the screen as it hard locks, and I have to reset via KVM. I have no idea how to get a log from a kernel panic as I have not had this before. After the server is rebooted the RAID goes into a re-sync. The FS is EXT4 and the server is running kernel 2.6.33-r2.

Could someone advise how I could go about getting a log from a kernel panic? There are no indications in /var/log/messages.

Thank you

----------

## madchaz

You will want to look at the monitor to see what it's giving you. Otherwise, a console (serial, usb or network) might help.
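
For the network option, the kernel's netconsole module can send log messages as UDP packets to another machine. Below is a rough sketch of a receiver for that second machine. It assumes netconsole is pointed at UDP port 6666; the file name `panic.log` and the function names are placeholders made up for illustration.

```python
import socket
import time


def format_record(data: bytes, sender: str, now: float) -> str:
    """Turn one UDP datagram into a timestamped log line."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    text = data.decode("utf-8", errors="replace").rstrip("\n")
    return f"{stamp} {sender} {text}\n"


def receive(sock: socket.socket, log_path: str, max_packets=None) -> int:
    """Append every datagram arriving on sock to log_path.

    Runs until interrupted unless max_packets is given.
    Returns the number of datagrams written.
    """
    count = 0
    with open(log_path, "a", encoding="utf-8") as log:
        while max_packets is None or count < max_packets:
            data, (host, _port) = sock.recvfrom(65535)
            log.write(format_record(data, host, time.time()))
            log.flush()  # write immediately; the sender may die at any moment
            count += 1
    return count


# Usage on the receiving machine (blocks until interrupted).
# Port 6666 is an assumption and must match the netconsole target port:
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   sock.bind(("0.0.0.0", 6666))
#   receive(sock, "panic.log")
```

Since the log file is flushed after every packet, whatever the kernel managed to send before locking up should still be on disk on the receiving side.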

----------

## idella4

kilburna, 

no, you will not get a log or dmesg output.  Logging tends to start only beyond the point where the kernel panics.

 *Quote:*   

> I have to reset via KVM.

What does KVM have to do with it?  Don't you press the computer's reset button?  Are you describing a host or a VM?

 *Quote:*   

> a console (serial, usb or network)

That requires, I believe, a separate computer.  Time for pen and paper; note down the screen output and post it.  This should suffice.

e.g. could not mount root with any file system; could not find init; could not mount on unknown block ...

 *Quote:*   

> but every 2 weeks

So, over 6 weeks, the server has hit a kernel panic, recovered and continued to boot until the next kernel panic?

Please elaborate on how it recovered and what changed.

----------

## madchaz

I think the issue is that it panics after 2 weeks of running, not at every boot.

----------

## idella4

madchaz,

Hmmm, well, I must be missing something.  The server runs for two weeks, then the kernel panics.

I am confused.  How can a kernel panic when it's already booted up and running?

The only kernel panic I know about occurs at bootup!!  

Discussing it amongst ourselves is interesting, but it would be more helpful to hear from kilburna, who can clarify.

Edit:

madchaz,

Yes, sure.  I may be in for some more learning.  I'm not replying again; the next to reply should be the poster.

----------

## madchaz

Lots of things can make the kernel panic while things run. Hardware issues come to mind. As he mentioned KVM, I guess it's a VM. But you are right, it's harder to figure out without the OP clarifying.

----------

## DirtyHairy

idella4: as madchaz already pointed out, a kernel panic can occur at _any_ point, not only during boot. If the panic happens reproducibly after two weeks of uptime, a memory leak might be the culprit (provided that the OOM killer has been deactivated). For logging the panic, you have to either copy it from the screen paper-and-pencil style, or redirect the kernel log output to a serial port.
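
One way to check the memory leak theory is to log a few `/proc/meminfo` fields at regular intervals (for example from cron) and see whether free memory shrinks or slab usage grows over the two weeks. This is only a minimal sketch: the selection of fields, the function names and the log path are assumptions for illustration.

```python
import time

# Fields to record, in kB as reported by /proc/meminfo (an arbitrary selection).
FIELDS = ("MemFree", "Buffers", "Cached", "Slab", "SwapFree")


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def snapshot_line(text: str, now: float) -> str:
    """Build one timestamped log line containing the selected fields."""
    info = parse_meminfo(text)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    fields = " ".join(f"{name}={info.get(name, 'n/a')}" for name in FIELDS)
    return f"{stamp} {fields}\n"


def append_snapshot(log_path: str, meminfo_path: str = "/proc/meminfo") -> None:
    """Read the current memory counters and append one line to log_path."""
    with open(meminfo_path, encoding="ascii") as src:
        text = src.read()
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(snapshot_line(text, time.time()))
```

Calling `append_snapshot("/var/log/meminfo-trend.log")` every few minutes (the path is hypothetical) leaves a trail that survives the reboot, so the last entries before a panic can be compared with those from a freshly booted system.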

----------

## kilburna

This is becoming weird. This has happened twice now. The server will panic, then it will reboot, the RAID will sync, then 6 hours later it will panic and reboot again, the RAID will sync, and then it stays up for another 2 weeks. Then the process starts again.

On 2 occasions I had to restart the server via iKVM (not KVM virtualisation), but on other occasions the kernel rebooted automatically, I think because of kexec. I did not record the screen dump the first time I saw this as there are a few thousand mail users accessing the server.

Without any dumps I am fishing in the dark. The iKVM is also logging no hardware events. This is a Supermicro 6016 with a 5620 CPU.

I have many cron jobs running, but this has been in use for many years. Still, I will do a complete emerge system & world.

----------

## Hu

 *kilburna wrote:*   

> I will do a complete emerge system & world.

 I recommend that you not do that at this time.  Barring a kernel bug, a kernel panic cannot be triggered by user code.  Therefore, rebuilding all the user code will not help with a kernel problem.  At best, you will waste your time.  At worst, you may be in the middle of merging something important when the kernel panics and leaves you with a half-installed package.  If the panic is due to a kernel bug allowing a user program to crash the kernel, rebuilding that user program is not likely to fix it.  Upgrading the offending user program might mask the bug, if you knew what program it was and the author had changed it to avoid the bug.

----------

## madchaz

More and more this sounds like some very obscure hardware issue. 

You might want to run memtest on the machine to see if it has any memory issues.

----------

