# Help me identify the instability in my system

## vrih

My computer recently became very unstable after I installed a new network card and CD writer. The problem is that it's still unstable with that hardware removed. It's stable enough for 90% of normal usage, but it totally locks up whenever something uses gcc, which makes the whole portage system unusable. It can last anything up to about 45 minutes before it locks when emerging something like gcc. I'm 90% sure that it's not a software problem, because I've chrooted and tried to bootstrap from a clean environment and it still locked.

The minimum hardware I've tried it with is:

Abit KT7A motherboard, Thunderbird 1.4GHz CPU, Hercules Prophet 4500 graphics card, 1 x 128 MB RAM, 1 x 256 MB RAM, Enermax PSU, 2 hard disks (can't remember sizes or brands).

I'm pretty sure the RAM's not at fault, because I've tried each stick on its own and it locked both times. The PSU should be able to cope fine with just that running. The motherboard sensors show the CPU temp to be lower than it was when it was working fine through the hot summer. The SYS temp is running at about 29 Celsius, which doesn't seem unreasonable. I've also run `fdisk -cf` on all my partitions from a Slackware boot CD. It did throw up a message about one of my partitions having unsupported features, but that one only mounts as /home/, so it shouldn't be a problem. The only time it locked up without gcc running was when it was loading my desktop with multiple instances of GIMP starting up. I was just using the 128 MB RAM stick, which makes me think that it might be some sort of memory problem, possibly a swap problem, although I don't think it should be swapping at that point. Please help!!!

----------

## Schmolch

I would:

- Remove one RAM stick and change all RAM settings in the BIOS to the lowest/slowest values. Run memtest86 to stress-test the RAM.

- Check the BIOS settings for the CPU and possibly underclock it a bit or raise the VCore (that's dangerous, underclocking would be safer).

- Check the BIOS settings for APM/ACPI and possibly disable them.

- Disable "Lock up at random places" in the BIOS settings (sorry, just kidding, it's the weekend and I'm so happy :Smile: )

Good Luck and have fun searching for the cause  :Smile: 

----------

## vrih

How long should I be running memtest86 for? It's completed one pass but then it started again. Should I be waiting for it to get to 10 passes or 100 passes, or is one enough?

I've underclocked my CPU, removed my 128 MB RAM stick and got an external fan blowing into my case. SYS temp is now about 24/25 Celsius but it's still locking. Is fdisk a good enough check of my hard disks? I'm starting to get worried that my mobo might have gone.

----------

## happypup

I had a problem with it locking up in Gentoo during install, sometimes yes, sometimes not. It turned out to be the posts holding the motherboard to the case. They were the square ones, and they shorted out the board as the humidity changed. I changed cases and it never froze up again, but it sure screwed up AM radio, so I ended up not using it any more :Mad:

I usually run memtest about 10 passes. I know some people let it go all day though  :Confused: 

----------

## vrih

I found a stability testing guide written by our leader Daniel Robbins for IBM. I've run CPUburn for a couple of hours and it ran hot but didn't crash. GCC still does. I read somewhere that it could be a bus problem. Does anyone have experience with that?

I'm going to run memtest most of the day tomorrow. I've needed my computer running various servers over the weekend, so I haven't had the chance yet.

I think I've narrowed the problem down to one of: a bus problem, some kinda fault in a hard disk, the RAM, or other general motherboard screwiness. I'm really hoping memtest finds out that my RAM is screwed tomorrow, just so I can put this problem to bed.

----------

## vrih

BurnK7 passed, memtest passed. This is getting really frustrating   :Twisted Evil:  Any suggestions at all from anyone would be greatly appreciated.

----------

## jammib

Have you tried it with a different PSU? I know the one you have should cope with that load, but I have read several reports on the Internet of KG7s locking up due to underspecced PSUs. I don't think that your PSU is underspecced, but maybe the extra load of the CDR and NIC was enough to trip something internally.

Hope it helps

Jammib

----------

## vrih

My my, I feel stupid. It was my kernel that was unstable. I had been using 2.4.20-ck sources and I upgraded to 2.4.22-ck sources, adding in SCSI support as well. Now I've got 2.4.22 vanilla sources without SCSI support and it works fine. Now to start re-enabling options to find the culprit.
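
Re-enabling options one at a time works, but halving the list each round gets there in fewer kernel rebuilds. A rough sketch of that idea in Python follows. It assumes exactly one option is to blame and that options don't interact, which may not hold on a real kernel. `find_culprit` and `is_stable` are made-up names: `is_stable` stands in for "rebuild the known-good kernel with just these extra options, boot it, and see whether emerging gcc survives", which you would do by hand.

```python
from typing import Callable, List, Sequence


def find_culprit(options: Sequence[str],
                 is_stable: Callable[[List[str]], bool]) -> str:
    """Narrow a list of suspect options down to the one that causes lockups.

    Assumption: exactly one option in `options` is the culprit, and the
    known-good config plus any set of the other options stays stable.
    """
    if not options:
        raise ValueError("need at least one suspect option")

    candidates = list(options)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        # Enable only the first half on top of the known-good config.
        if is_stable(first_half):
            # No lockup, so the culprit must be in the other half.
            candidates = candidates[middle:]
        else:
            candidates = first_half
    return candidates[0]
```

With five suspects this takes at most three test builds instead of up to five. The option list you feed in would be whatever differs between the broken and the working `.config`.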

----------

## Nermal

Disable APIC as well as ACPI.

Adding the network card etc. could have screwed up your IRQ routing tables, and the other devices may not have had their IRQs reassigned even after you removed the network card.
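
One way to see which devices ended up on the same IRQ is to look at `/proc/interrupts`. Below is a small Python sketch that picks out the shared lines. It assumes the 2.4-style layout (IRQ number, one counter per CPU, a single-word controller name such as `XT-PIC`, then a comma-separated device list); newer kernels format the file differently. `shared_irqs` is a made-up helper name and `SAMPLE` is invented data for illustration, not output from the machine in this thread.

```python
from typing import Dict, List

# Invented example in the 2.4-style layout, for illustration only.
SAMPLE = """\
           CPU0
  0:     123456          XT-PIC  timer
  1:       2345          XT-PIC  keyboard
 11:      98765          XT-PIC  eth0, usb-uhci
NMI:          0
"""


def shared_irqs(text: str) -> Dict[int, List[str]]:
    """Return {irq: [devices]} for every IRQ used by more than one device."""
    shared: Dict[int, List[str]] = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # header line, or NMI/ERR style rows
        fields = rest.split()
        # Drop the per-CPU interrupt counters at the front.
        while fields and fields[0].isdigit():
            fields.pop(0)
        if len(fields) < 2:
            continue  # no controller name plus device list on this line
        # fields[0] is the controller name (assumed to be one word).
        devices = [d.strip() for d in " ".join(fields[1:]).split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(head)] = devices
    return shared
```

On a live system you would pass in the real file contents, e.g. `shared_irqs(open("/proc/interrupts").read())`. Sharing an IRQ is not a fault in itself, so treat the result as a list of places to look rather than a diagnosis.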

----------

