# Hardware Instability...

## Woollhara

Hi,

I came across this article by DR (http://www-106.ibm.com/developerworks/library/l-hw1/), which was interesting and enlightening :Smile:

I have this custom built box (since March 2000):

```
dual P3 cpu (733MHz) slot-1
512MB (4x128MB SDRAM 100MHz)
Supermicro Mobo PIIIDME
SB Live! value
Promise Ultra100TX2
Adaptec SCSI 29140N
3 x IDE hard drives (1 on mobo IDE, 2 on Promise controller)
Zip 250 (IDE)
CD-RW (SCSI)
DVD (IDE)
```

It has always had the occasional lockup or sudden reboot, but since I moved to Gentoo (probably because I've never compiled so much stuff before!) the sudden reboots and lockups happen much more often, mostly when emerging huge packages like mozilla or koffice.

After reading this article I used memtest86 to test my memory. After almost 24h, it did not trigger any memory errors. So far so good.

Yesterday I tried the kernel compilation test and, horror, got a hard lockup every time (both times I ran the script). I can't tell how long the machine took to lock up because I wasn't in front of it, but I will check that tonight.
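Since the lockups happen while nobody is watching, one way to find out how long the box survives is to write a timestamp to disk before and after every build pass, so the last log entry is still there after a hard lock. Below is a minimal sketch of that idea in Python. It is not the script from the article, and the function names `log_line` and `stress_loop`, the log path, the kernel tree location and the make targets are all assumptions to adapt.

```python
import os
import subprocess
import time


def log_line(log_path, message):
    """Append one timestamped line and force it to disk so it survives a hard lock."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        log.write(f"{stamp} {message}\n")
        log.flush()
        os.fsync(log.fileno())


def stress_loop(commands, log_path, passes, cwd=None):
    """Run every command list in `commands`, in order, up to `passes` times.

    Stops at the first command that exits non-zero.
    Returns the number of passes that completed cleanly.
    """
    completed = 0
    for n in range(1, passes + 1):
        log_line(log_path, f"pass {n} start")
        for cmd in commands:
            result = subprocess.run(cmd, cwd=cwd)
            if result.returncode != 0:
                log_line(
                    log_path,
                    f"pass {n} failed: {' '.join(cmd)} exited {result.returncode}",
                )
                return completed
        completed += 1
        log_line(log_path, f"pass {n} done")
    return completed


# Example call (assumed paths and targets, adjust to your own setup):
# stress_loop(
#     [["make", "clean"], ["make", "-j3", "bzImage"]],
#     "/var/tmp/stress.log",
#     passes=1000,
#     cwd="/usr/src/linux",
# )
```

After a lockup and reboot, the gap between the first line and the last line in the log gives a rough time-to-failure. A "start" with no matching "done" marks the pass that was running when the machine died.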

Note: I started memtest86 again this morning just before leaving for work, just in case.

So at this stage I believe I've got a cooling problem on the 2 CPUs, and I'd like to know if any of you guys have experience improving the cooling for Slot-1 P3s in a dual CPU config, or any other tips that could help me.

I'll also try leaving the box open tonight.

Anyway thanks for any help.

----------

## Mimamau

I had this problem with my old PIII 600 Slot 1. Every time I ran a large compile session (kernel, KDE, ...) my system suddenly rebooted.

I fixed this with a BIOS update from Asus.

Yes, at first I also thought it could be a cooling problem, but my CPU was really cold all the time. You should also check your power supply, maybe it's just too weak. Does your mainboard officially support these CPUs? Maybe they need too much power, too?

----------

## linux4u1

Hmm, I was given a computer (900 MHz Athlon) that would freeze up randomly (that's why it was given to me). I tried testing the memory and processor cache and everything ran fine; I even ran benchmarking software. I took the PC apart many times and tried many different Linux distros: Mandrake would load but not boot, Red Hat would lock up during install, and Gentoo would sometimes lock up during install (due to compiling software). Well, I decided to pull the processor to look at the heatsink (at this point the processor would not even get hot). Once I pulled the heatsink off I found what looked like cement (grease that had burned up). It was acting as a heat wall, which also made the processor seem cool. Once I scraped that crap off and put fresh grease on, it worked great. You might want to check whether this is your problem.

----------

## Woollhara

OK thanks for the tips. I guess I have a long and painful investigation in sight  :Wink: 

----------

## Woollhara

Well, if I go into the BIOS settings and check the CPU temperatures (at that stage they are basically idle) I read CPU0 = 56 °C and CPU1 = 48 °C (room temperature is about 20 °C). Considering the room temperature and the fact that the CPUs are doing f... all, having these babies running about 30 °C above room temperature is not good!
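To put exact numbers on the "about 30 above room temperature" estimate, here is a quick sketch of the arithmetic in Python. The helper name `rise_above_ambient` is made up for illustration; the readings are the ones quoted above.

```python
def rise_above_ambient(cpu_temps_c, ambient_c):
    """Return each reading minus the room temperature, in degrees Celsius."""
    return {name: temp - ambient_c for name, temp in cpu_temps_c.items()}


# Idle readings from the BIOS screen and the approximate room temperature.
idle = {"CPU0": 56, "CPU1": 48}
print(rise_above_ambient(idle, ambient_c=20))
# {'CPU0': 36, 'CPU1': 28}
```

So CPU0 sits 36 degrees above ambient and CPU1 sits 28 above, at idle.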

Those Slot-1 fittings are a piece of crap: they're mounted so close to each other on the mobo that one of the fans blows hot air directly at the other CPU!

I need to check the thermal grease though, it may be all burned up as linux4u1 mentioned.

----------

## therootshell

I had a similar issue, but it was not thermal related at all. Check your BIOS for power management crap - my machine would die during long ebuilds because of APM/APIC garbage. A BIOS flash and settings check later and it is running smooth. :Wink:

As for the thermal issue - get two high CFM case fans (and some good thermal grease).  Check to make sure you have an air path through the case.  I've seen people have a ton of fans and still get thermal lock ups because they were just blowing the hot air around inside the case!  Position the fans to get air moving over the CPUs - one exhaust fan and one inlet fan makes a huge difference.

 :Exclamation:  One last thing to check: the power supply. A lot of times folks get memory errors and lockups when the power supply is overtaxed. Usually a supply can run a bit over its rating, but that is a REALLY bad idea. The minimum PS I use on a dual machine is 400W. Remember that the PS is the most important part of the machine - without it you've just got a really expensive paperweight.

Good luck and happy SMPing!

----------

## BonezTheGoon

I think you got pretty much all the good advice you need!! I second the motion on case fans that move air THROUGH the case rather than just stirring it around inside. I always try to add up my airflow so that the same number of CFM goes out as comes in. The other thing I wanted to mention, which no one else has in this thread, is that I personally see no reason to use plain grease over "Arctic Silver" -- just read a few reviews of Arctic Silver and I think you will be convinced too. One really nice thing about it is that it does not depend on moisture for its heat transfer (unlike the cemented, dried-out old grease), so even if your Arctic Silver gets really old and dries out some you will still have very good heat transfer. I think you could get away without using Arctic Silver, but I see much value in using it the FIRST time you put a HSF on a CPU. By the way, these P3 Coppermine CPUs should run nicely cold -- so it either HAS to be your compound not allowing good contact with the heatsink -- OR -- it has to be something other than heat ((read: PSU)).
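The "same CFM out as in" bookkeeping is just a sum over fan ratings. Here is a rough sketch in Python. The helper name `airflow_balance` and the fan figures are hypothetical, not numbers from this thread, and it assumes each fan really moves its rated CFM, which a cramped case will reduce.

```python
def airflow_balance(intake_cfm, exhaust_cfm):
    """Return total intake minus total exhaust in CFM (0 means balanced)."""
    return sum(intake_cfm) - sum(exhaust_cfm)


# Hypothetical ratings: one front intake fan, PSU fan plus one rear exhaust fan.
intake = [40.0]
exhaust = [25.0, 15.0]
print(airflow_balance(intake, exhaust))  # 0.0
```

A positive result means more air is being pushed in than pulled out, and a negative one means the opposite.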

Regards,

BonezTheGoon

----------

## Woollhara

Well, first thanks to all the posters  :Smile: 

Right now my box has been compiling kernel after kernel for the last 3 1/2 hours and it is still ticking along happily  :Smile:   :Smile:   :Smile: 

What I did was:

- removed all components and got rid of the tons of dust in the heatsinks and fans, plus the other filthy stuff on them and inside the box.

- got "Arctic Silver" from www.overclockers.co.uk and used it on both CPUs

I now have both CPUs running around 10 °C cooler (approx. 46 & 40).

The only thing I'm not happy about is the way the 2 CPUs are mounted on the mobo, basically CPU1 is blowing hot air at CPU2. In fact, CPU1 stays hotter than CPU2 because there is only a small space between the 2 CPUs and the hot air is not evacuated that well so I will probably try to add a fan that sucks air from above the CPUs... I'm pretty sure I can lose a few more degrees.

Anyway, at least it works now.

I've got a 300W power supply that seems OK, but I will keep an eye on it in case other problems show up.

Thanks again for all the tips. Now let's go back to serious business.... emerge --update world  :Smile: 

----------

## BonezTheGoon

Boy I love "Happy endings!!!!"

Thanks for the detailed results, glad they turned out to be for the better!!!  Gotta love that Arctic Silver II.  It's cool (har har, pun only slightly intended--booing and hissing permitted and encouraged accordingly)

Congrats on tackling the problem!!

Regards,

BonezTheGoon

----------

