# Errors during emerge (Hardware ?)

## markus_b

My Gentoo machine fails with various errors, most of the time during emerge runs. A typical example is today's upgrade to the latest gcc:

At first everything works fine, until bzip2 fails, claiming a corrupted file. Simply relaunching the emerge gets further, but then gcc fails with a segfault.

 *Quote:*   

> altair linux-2.6.15-gentoo-r1 # emerge -pu gcc
> 
> These are the packages that I would merge, in order:
> 
> Calculating dependencies ...done!
> ...

 

The PC is a homegrown system with a 2.4GHz P4 in an Asus P4PE motherboard and 1GB of Kingston memory.

It has a single IDE ST3200822A hard drive and an IDE NEC ND-3540 DVD burner.

Where and how should I track down this problem?

How can I test the hardware?

Markus

----------

## NeddySeagoon

markus_b,

The file /usr/portage/distfiles/bison-2.1.tar.bz2 is probably corrupt.

rm it and emerge will fetch it again.

If the problem persists, google for it and try different sources.

It's possible the mirror has a corrupt copy.
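
Before deleting the file, it can be checked directly. The following is a minimal sketch, assuming Python 3 is available on the box; the function name `bz2_is_intact` is made up for illustration and is not part of portage. It does roughly what `bzip2 -t` does: decompress the whole archive and discard the output, so that only the integrity check remains.

```python
import bz2


def bz2_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole file decompresses cleanly, False otherwise.

    Hypothetical helper: the data is decompressed chunk by chunk and thrown
    away, so only decoding and CRC errors are of interest.
    """
    with open(path, "rb") as raw:  # a missing file still raises here
        try:
            with bz2.BZ2File(raw) as stream:
                while stream.read(chunk_size):
                    pass  # discard the decompressed data
        except (OSError, EOFError):
            # OSError: invalid or damaged stream (or a read error),
            # EOFError: the file ends before the end-of-stream marker
            return False
    return True
```

Calling `bz2_is_intact("/usr/portage/distfiles/bison-2.1.tar.bz2")` several times in a row is the interesting part: if the answer changes between runs, the file itself is not the problem.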

----------

## SinoTech

This problem can be caused by an overclocked CPU, faulty RAM, ...

If you have overclocked your CPU, please undo that and test whether the errors are still there.

To check your memory you should install "memtest86+" (follow the instructions about configuring grub that are shown in the message when the emerge finishes).

Regards,

Sino

----------

## markus_b

 *NeddySeagoon wrote:*   

> markus_b,
> 
> The file /usr/portage/distfiles/bison-2.1.tar.bz2 is probably corrupt.
> 
> rm it and emerge will fetch it again.
> ...

 

No, it is not. I can test it (bzip2 -tvv /usr/portage/distfiles/bison-2.1.tar.bz2) and bzip2 tells me it is fine. When I repeat the emerge -u command it gets unpacked fine, but then gcc fails with a segfault.
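
A file that tests fine on one attempt and fails on another points at the machine rather than at the file. One rough way to check this is to hash the same file several times and compare the results. This is only a sketch under assumptions: the helper names `sha256_of` and `reads_are_stable` are invented here, and because repeated reads are normally served from the page cache, the test exercises RAM and CPU more than the disk.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def reads_are_stable(path, rounds=5):
    """Hash the same file `rounds` times; True if every digest is identical.

    On healthy hardware an unchanged file always hashes to the same value,
    so any disagreement hints at flaky memory, CPU, or I/O.
    """
    digests = {sha256_of(path) for _ in range(rounds)}
    return len(digests) == 1
```

A large distfile and a high `rounds` value give the best chance of catching an intermittent fault; a single clean result proves little.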

Markus

----------

## markus_b

 *SinoTech wrote:*   

> This problem can be caused by an overclocked CPU, faulty RAM, ...
> 
> If you have overclocked your CPU, undo that please and test if the errors are still there.
> 
> To check your memory you should install "memtest86+" (Follow the instructions, about configuring grub, shown in the message when emerge finishes).
> ...

 

That's the area I suspect too. The system is not overclocked. If anything I'd underclock it, as I'm looking for reliability and low power consumption. Speed is not important for this system.

At the moment I'm trying to get the new gcc compiled, but it always fails a couple of minutes into the emerge. A different failure every time!

I'm running 'memtester' in parallel, but it has found no errors so far.

Maybe I've got a bad power supply. How can I check that?

Markus

----------

## SinoTech

 *markus_b wrote:*   

>  *SinoTech wrote:*   This problem can be caused by an overclocked CPU, faulty RAM, ...
> 
> If you have overclocked your CPU, undo that please and test if the errors are still there.
> 
> To check your memory you should install "memtest86+" (Follow the instructions, about configuring grub, shown in the message when emerge finishes).
> ...

 

If gcc always fails at a different position, you can be sure your RAM is faulty. BTW, you shouldn't test your RAM on a running system, since then you can only test the portion of memory that is currently unused. So I recommend installing "memtest86+", which comes with its own small kernel. After that you can reboot your machine and boot the memtest86+ program (as stated in my last post, you have to update your grub.conf, but that is explained when emerging memtest86+ finishes).

BTW, you should let it run for several hours, even if the first or second pass completes all the tests without a failure. Let it run overnight and have a look at it the next day.
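
To illustrate the limitation of testing from inside a running system, here is a toy pattern test in Python. It is only a sketch with invented names (`count_mismatches`, `pattern_test`) and is no substitute for memtest86+: it can only touch the buffer that the operating system hands to the process, which is exactly the restriction described above.

```python
def count_mismatches(buf, value):
    """Return how many bytes in buf differ from the expected byte value."""
    return len(buf) - buf.count(value)


def pattern_test(size_bytes=16 * 1024 * 1024, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each pattern, read it back, and count bad bytes.

    Only this one buffer is exercised; memory used by the kernel and by
    other processes is never touched.
    """
    buf = bytearray(size_bytes)
    errors = 0
    for value in patterns:
        buf[:] = bytes([value]) * size_bytes  # write the pattern
        errors += count_mismatches(buf, value)  # read it back and compare
    return errors
```

A result of 0 only says that this particular buffer behaved during this particular run, which is why a boot-time tester left running for hours is the better tool.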

Regards,

Sino

----------

## markus_b

 *SinoTech wrote:*   

> If gcc always fails at a different position, you can be sure your RAM is faulty. BTW, you shouldn't test your RAM on a running system, since then you can only test the portion of memory that is currently unused. So I recommend installing "memtest86+", which comes with its own small kernel. After that you can reboot your machine and boot the memtest86+ program (as stated in my last post, you have to update your grub.conf, but that is explained when emerging memtest86+ finishes).
> 
> BTW, you should let it run for several hours, even if the first or second pass completes all the tests without a failure. Let it run overnight and have a look at it the next day.
> 
> Regards,
> 
> Sino

Sino,

memtest is running now and shows no error so far (15mins, 50%).

I fear the problem is that the power supply weakens under load (disk access) and the memory then fails.

Markus

----------

## SinoTech

Hmm .. didn't know. Never had problems with a too weak power supply before :(. BTW, which one do you have? I think you should have 300W or more.

Regards,

Sino

----------

## markus_b

 *SinoTech wrote:*   

> Hmm .. didn't know. Never had problems with a too weak power supply before :(. BTW, which one do you have? I think you should have 300W or more.
> 
> Regards,
> 
> Sino

 

The power supply claims 300W (180W on 5V). The brand is 'Power master', model JJ-300T.

Just found it on Microtech's site: http://www.microtechswiss.com/en/psu/psu.php?pm_jj-300t

It is suspect to me, as it came with the (cheap) case I'm using.

The first pass just went through without error. Memtest is churning on...

Markus

----------

## NeddySeagoon

markus_b,

Since it's gcc that fails, I suspect that the CPU or chipset is getting hot when it's working hard, as it does during a compile.

With a stock heatsink/fan, the CPU temperature can rise by 15 deg C between idle and working hard.

Try compiling with the case open; that will be worth at least 5 deg C. If that works, improve your airflow, but ensure that the inside of the case is always at a higher pressure than the outside. That arrangement ensures that clean air leaks out through the floppy and CD/DVD drives, rather than dirty air being sucked in.

----------

## BitJam

I agree with NeddySeagoon. I would have said something similar but he beat me to it. I had a similar problem and beefing up my cooling fixed it.

----------

## SinoTech

Hmm .. perhaps too much dust in the CPU cooler. Take a straw and blow the dust away. That can make a difference of 10 degrees or more (depending on how much dust is there).

Regards,

Sino

----------

## markus_b

Gentlemen,

I've done some temperature measurements while memtest was running:

- Northbridge (has no fan): 55 deg

- CPU heatsink: 45 deg

- Memory chip: 45 deg

Apart from the northbridge, the temps look reasonable. I've never seen any excessive temps on the built-in temperature sensors.

However, I went into the BIOS and changed the memory speed from 333MHz to 233MHz and changed the DRAM timing from the default to the slowest available. The compiles I've been running since then went well (new gcc, new kernel). I'll proceed to 'emerge -u world' before going to bed. We'll see where we are tomorrow.

It looks like the changed memory timing has helped. I'll have to look into the memory stuff somewhat closer. There are two DIMMs: one is single-sided, the other is double-sided. Maybe this difference is sufficient to induce the instability.

Markus

----------

## NeddySeagoon

markus_b,

memtest does not work the CPU very hard. I would say that's a little on the high side.

----------

## markus_b

 *NeddySeagoon wrote:*   

> memtest does not work the CPU very hard. I would say that's a little on the high side.

 

After some twiddling I've got 'sensors' working:

 *Quote:*   

> altair linux # sensors
> 
> asb100-i2c-0-2d
> 
> Adapter: SMBus I801 adapter at e800
> ...

 

The values look real to me, except for the chassis fan, which is running at a similar speed to the CPU fan.
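
To watch the temperatures while a compile is running, the readings can also be pulled straight from sysfs. This is a sketch under assumptions: it presumes the kernel exposes the sensor chip under `/sys/class/hwmon` with `temp*_input` files in millidegrees Celsius (older kernels may place them in a `device/` subdirectory, or not provide this class at all), and the function name `read_hwmon_temps` is made up for illustration.

```python
import glob
import os


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {path: degrees Celsius} for every readable temp*_input file."""
    temps = {}
    # Assumed layouts: directly under hwmonN, or under hwmonN/device
    patterns = ("hwmon*/temp*_input", "hwmon*/device/temp*_input")
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(root, pattern))):
            try:
                with open(path) as f:
                    millideg = int(f.read().strip())
            except (OSError, ValueError):
                continue  # sensor unreadable or not reporting a number
            temps[path] = millideg / 1000.0  # sysfs reports millidegrees
    return temps
```

Printing the result every few seconds during an emerge shows whether the CPU or board temperature climbs much above the idle and memtest figures.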

Markus

----------

