# Undesired shutdown - possibly overheating [solved]

## kwesadilo

I am running an amd64 build of gentoo-sources 2.6.34-r12 on a Thinkpad T61p. Sometimes, when I am using it, it will just shut down without warning. The shutdowns are about as orderly as could be expected. My window closes, X stops, I get logged out, and all of the running services stop. It switches to runlevel 0, unmounts the disks, and halts. As far as I know, this problem has existed since I first installed Gentoo, but the shutdowns would only occur when I was in the middle of a compilation while emerging something big, e.g. OpenOffice. Typically, I would start an update, walk away, and come back to find my computer turned off. I initially thought that there might be some sort of GCC- or Portage-triggered kernel bug. It wouldn't happen every time, and updates were a real crapshoot. Recently, I installed Quake 4, and playing it for a few minutes with the graphical settings maxed out pretty reliably generates the exact same shutdown behavior.

Because compilation and video games are both resource-intensive activities, it occurred to me that the computer could be overheating. I'm not that experienced with the logging facilities (I use metalog), but I poked around and found 

```
Dec 17 23:56:47 [kernel] [27482.150814] thinkpad_acpi: THERMAL EMERGENCY: a sensor reports something is extremely hot!
Dec 18 00:00:59 [kernel] [   10.320126] thinkpad_acpi: THERMAL EMERGENCY: a sensor reports something is extremely hot!
Dec 18 00:07:39 [kernel] [  596.441888] thinkpad_acpi: THERMAL EMERGENCY: a sensor reports something is extremely hot!
```

in /var/log/critical/current. I looked in /var/log/everything/* and saw that two of those events occurred right before one of the unexpected shutdowns. Both of them were immediately followed by thinkpad_acpi output listing the individual temperature readings, which showed that the CPU temperature was 100 degrees C. The middle thermal emergency occurred during boot and did not result in a shutdown. The CPU temperature was only 91 degrees C in /var/log/everything/* at that time. The two emergencies that preceded shutdowns were surrounded by output similar to 

```
Dec 20 03:02:48 [logger] ACPI event unhandled: processor LNXCPU:01 00000080 00000005
Dec 20 03:02:48 [logger] ACPI event unhandled: processor LNXCPU:00 00000080 00000000
Dec 20 03:02:48 [logger] ACPI event unhandled: processor LNXCPU:01 00000080 00000000
Dec 20 03:02:48 [logger] ACPI event unhandled: processor LNXCPU:00 00000080 00000005
```

 but with the same timestamps as the associated thermal emergencies. This output was repeated about fifteen times with most occurring before the emergency and some after but all within the same second or two.

Unfortunately, I cannot show you the log output depicting the unwanted shutdown, because when I tried to reproduce this problem and log output by playing Quake 4, my logs were inundated with "ACPI event unhandled" output exactly like that depicted above. I played for a few minutes, and the computer didn't shut down, so I cranked the quality as high as it would go. After I quit (you need to restart for graphics settings changes to take effect), I noticed from my tray applet that the CPU temperature was in the 90s C (might have been 99, but I can't remember with certainty). I restarted Quake and the computer shut down after I had played for probably about ten minutes. I didn't keep track of the times, but it felt like a little bit longer than it usually takes to cause a shutdown. When I started the computer and looked at /var/log/everything/*, I saw to my annoyance that the files were almost completely full of "ACPI event unhandled" output and only covered the last fifteen minutes or so of operation. It did show a shutdown at the time that the last shutdown occurred. It did not show any "THERMAL EMERGENCY" output. Nor did any files in /var/log/critical, so I guess there wasn't a thermal emergency that time. I created another log destination in my metalog configuration that filters out all of the "ACPI event unhandled" output, but it now occurs to me that I might be endangering my computer by repeatedly reproducing this problem, and I don't want to do that.

I have read other posts online about excessive "ACPI event unhandled" log output, but those people were talking about continuously repeated output every few seconds. I only get this output in the few minutes preceding one of the undesired shutdowns (or at least when I'm doing something resource-intensive), and from what I can tell, I got about 53,000 lines of it in about ten minutes. Also, none of the other posters ever came up with any solution for the problem other than telling acpid or their logger to ignore it. I notice that the "processor" event group is, in fact, unhandled in /etc/acpi/default.sh. Is there something useful that acpid can or should be doing with such events besides spamming my logs?
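For anyone wading through logs like these, here is a minimal sketch of separating the spam from the interesting lines. It assumes the metalog layout described above and that every spam line contains the literal text "ACPI event unhandled: processor". The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers. Both read log text on stdin, so they work on any file.

# Count the unhandled processor event lines (prints 0 if there are none).
count_acpi_spam() {
    grep -c 'ACPI event unhandled: processor' || true
}

# Print everything except the unhandled processor event lines.
drop_acpi_spam() {
    grep -v 'ACPI event unhandled: processor' || true
}
```

Something like `count_acpi_spam < /var/log/everything/current` would give the volume of spam, and `drop_acpi_spam < /var/log/everything/current | grep 'THERMAL EMERGENCY'` would show whether an emergency is buried underneath it.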

I searched for a while and couldn't find anybody else with a shutdown problem similar to mine. Do you think that my problem is related to overheating? Is it safe to try to reproduce it? If the computer is overheating dangerously, shouldn't it immediately halt and not perform an orderly shutdown? Shouldn't it be adjusting the fan speed and throttling the processor to avoid overheating?

I can provide various file contents and program output, but I'm not sure what people will want. Any attempts to help would be appreciated. I'm really stumped, and this is seriously interfering with my ability to update software and play video games (both mission-critical activities).

Last edited by kwesadilo on Tue Dec 28, 2010 1:33 am; edited 1 time in total

----------

## ASID

I would check the BIOS for anything unusual. 

-> Check CPU freq, fan speed, etc.

-> Check BIOS version and if an update is available.

Does the fan make any sound while stressing the machine?

-> Maybe your fan fails or it might just need cleaning.

Do you use any power/CPU management tool? 

-> Something like cpufreq. 

-> This could be useful : http://www.gentoo.org/doc/en/power-management-guide.xml

Do you have dual boot?

-> If you test this in Windows and see the same problem, then you probably have a hardware failure or a BIOS bug.

Also, some useful files are:

1) Kernel .config file

2) emerge --info

3) emerge -s acpi

Good luck!

----------

## Herring42

May I also suggest checking that the CPU fan is still running, and taking a can of compressed air to the heat sink.

Laptops are big attractors of dust.

----------

## kwesadilo

Thank you both for your help.

Edit: It turns out you can't post thousands of lines of output in a comment, but it works just fine in the preview. Let's try that again.

My kernel's .config is here.

emerge --info:

```
Portage 2.1.9.25 (default/linux/amd64/10.0/desktop/gnome, gcc-4.4.4, glibc-2.11.2-r3, 2.6.34-gentoo-r12 x86_64)

=================================================================

System uname: Linux-2.6.34-gentoo-r12-x86_64-Intel-R-_Core-TM-2_Duo_CPU_T9300_@_2.50GHz-with-gentoo-1.12.14

Timestamp of tree: Mon, 20 Dec 2010 08:15:03 +0000

app-shells/bash:     4.1_p7

dev-java/java-config: 2.1.11-r1

dev-lang/python:     2.6.5-r3, 3.1.2-r4

dev-util/cmake:      2.8.1-r2

sys-apps/baselayout: 1.12.14-r1

sys-apps/sandbox:    2.4

sys-devel/autoconf:  2.13, 2.65-r1

sys-devel/automake:  1.9.6-r3, 1.10.3, 1.11.1

sys-devel/binutils:  2.20.1-r1

sys-devel/gcc:       4.4.4-r2

sys-devel/gcc-config: 1.4.1

sys-devel/libtool:   2.2.10

sys-devel/make:      3.81-r2

virtual/os-headers:  2.6.30-r1 (sys-kernel/linux-headers)

ACCEPT_KEYWORDS="amd64"

ACCEPT_LICENSE="*"

CBUILD="x86_64-pc-linux-gnu"

CFLAGS="-O2 -march=core2 -pipe"

CHOST="x86_64-pc-linux-gnu"

CONFIG_PROTECT="/etc /usr/share/X11/xkb"

CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"

CXXFLAGS="-O2 -march=core2 -pipe"

DISTDIR="/usr/portage/distfiles"

FEATURES="assume-digests binpkg-logs distlocks fixlafiles fixpackages news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"

GENTOO_MIRRORS="http://gentoo.osuosl.org http://distfiles.gentoo.org    http://www.ibiblio.org/pub/Linux/distributions/gentoo"

LANG="C"

LDFLAGS="-Wl,-O1 -Wl,--as-needed"

MAKEOPTS="-j3"

PKGDIR="/usr/portage/packages"

PORTAGE_CONFIGROOT="/"

PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"

PORTAGE_TMPDIR="/var/tmp"

PORTDIR="/usr/portage"

PORTDIR_OVERLAY="/usr/local/portage"

SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"

USE="X a52 aac acl acpi afs alsa amd64 apache2 bash-completion berkdb bluetooth bzip2 cairo cdda cddb cdr cli consolekit cracklib crypt css cups cxx dbus dri dts dv dvd dvdr eds emboss encode evo exif fam fbcon ffmpeg firefox flac fortran gd gdbm gdu geoip gif gmp gnome gnome-keyring gpm gstreamer gtk gzip hal hardened iconv imap imlib ipv6 jpeg kerberos lame lcms ldap libnotify mad matroska mikmod mmx mng modules mp3 mp4 mpeg mudflap multilib nautilus ncurses networkmanager nls nptl nptlonly offensive ogg opengl openmp pam pango pcre pdf perl png policykit ppds pppd python qt3support quicktime readline samba sdl session spell sse sse2 ssl startup-notification svg sysfs syslog tcpd theora threads tiff truetype unicode usb vcd vim-syntax vorbis wmf x264 xcb xinerama xml xorg xscreensaver xulrunner xv xvid zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev synaptics" KERNEL="linux" LCD_DEVICES="bayrad 
cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" PHP_TARGETS="php5-2" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 

Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
```

emerge -s acpi:

```
Searching...    

[ Results for search key : acpi ]

[ Applications found : 12 ]

*  app-laptop/acpi4asus [ Masked ]

      Latest version available: 0.41

      Latest version installed: [ Not Installed ]

      Size of files: 29 kB

      Homepage:      http://sourceforge.net/projects/acpi4asus

      Description:   Acpi daemon and kernel module to control ASUS Laptop Hotkeys

      License:       GPL-2

*  mail-client/claws-mail-acpi-notifier

      Latest version available: 1.0.23

      Latest version installed: [ Not Installed ]

      Size of files: 379 kB

      Homepage:      http://www.claws-mail.org/

      Description:   This plugin enables mail notification via LEDs on some laptops.

      License:       GPL-3

*  sec-policy/selinux-acpi [ Masked ]

      Latest version available: 20080525

      Latest version installed: [ Not Installed ]

      Size of files: 328 kB

      Homepage:      http://www.gentoo.org/proj/en/hardened/selinux/

      Description:   SELinux policy for APM and ACPI

      License:       GPL-2

*  sys-libs/libacpi [ Masked ]

      Latest version available: 0.2

      Latest version installed: [ Not Installed ]

      Size of files: 102 kB

      Homepage:      http://www.ngolde.de/libacpi.html

      Description:   A general purpose library for ACPI

      License:       MIT

*  sys-power/acpi

      Latest version available: 1.5

      Latest version installed: [ Not Installed ]

      Size of files: 90 kB

      Homepage:      http://sourceforge.net/projects/acpiclient/

      Description:   Attempts to replicate the functionality of the 'old' apm command on ACPI systems.

      License:       GPL-2

*  sys-power/acpid

      Latest version available: 2.0.6

      Latest version installed: 2.0.6

      Size of files: 72 kB

      Homepage:      http://tedfelix.com/linux/acpid-netlink.html

      Description:   Daemon for Advanced Configuration and Power Interface

      License:       GPL-2

*  sys-power/acpitool

      Latest version available: 0.5.1

      Latest version installed: [ Not Installed ]

      Size of files: 107 kB

      Homepage:      http://freeunix.dyndns.org:8088/site2/acpitool.shtml

      Description:   A small command line application, intended to be a replacement for the apm tool

      License:       GPL-2

*  sys-power/yacpi [ Masked ]

      Latest version available: 3.0.1

      Latest version installed: [ Not Installed ]

      Size of files: 14 kB

      Homepage:      http://www.ngolde.de/yacpi.html

      Description:   Yet Another Configuration and Power Interface

      License:       GPL-2

*  x11-misc/bbacpi [ Masked ]

      Latest version available: 0.1.5-r1

      Latest version installed: [ Not Installed ]

      Size of files: 235 kB

      Homepage:      http://bbacpi.sourceforge.net

      Description:   ACPI monitor for X11

      License:       GPL-2

*  x11-plugins/wmacpi

      Latest version available: 2.2_rc1

      Latest version installed: [ Not Installed ]

      Size of files: 29 kB

      Homepage:      http://himi.org/wmacpi/

      Description:   WMaker DockApp: ACPI status monitor for laptops

      License:       GPL-2

*  x11-plugins/wmacpiload-ac [ Masked ]

      Latest version available: 0.2.0

      Latest version installed: [ Not Installed ]

      Size of files: 110 kB

      Homepage:      http://wmacpiload.tuxfamily.org/

      Description:   Hacked version of WMACPILoad, a dockapp to monitor CPU temp and battery time on ACPI kernels.

      License:       GPL-2

*  x11-plugins/wmacpimon [ Masked ]

      Latest version available: 0.2.1

      Latest version installed: [ Not Installed ]

      Size of files: 25 kB

      Homepage:      http://www.vrlteam.org/wmacpimon/

      Description:   WMaker DockApp that monitors the temperature and Speedstep features in new ACPI-based systems.

      License:       GPL-2
```

----------

## kwesadilo

I am using the GNOME Power Management tool and the CPU Frequency Scaling Monitor applet (from the gnome-applets package). Normally, I have this applet set to use the Performance governor. I think that it has been set this way during my most recent reproductions of this problem. I will see what happens when I use the Conservative governor.
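Independently of the applet, the governor actually in effect can be read from sysfs. A minimal sketch, assuming the kernel's cpufreq interface is enabled and exposes `scaling_governor` and `scaling_cur_freq` (in kHz) under `/sys/devices/system/cpu`. The function name and its optional root argument are illustrative only.

```shell
#!/bin/sh
# Print "cpuN governor MHz" for each CPU that exposes cpufreq.
# The optional argument (a sysfs root) is a hypothetical convenience so the
# function can be pointed at a copy of the tree; normally it is omitted.
show_governors() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # Skip CPUs without readable cpufreq files (also covers an unmatched glob).
        [ -r "$dir/scaling_governor" ] && [ -r "$dir/scaling_cur_freq" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        gov=$(cat "$dir/scaling_governor")
        khz=$(cat "$dir/scaling_cur_freq")
        echo "$cpu $gov $((khz / 1000)) MHz"
    done
}
```

Running `show_governors` before and during a test makes it easy to confirm which governor and clock speed were really in use when a shutdown happened.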

I do dual boot. I will post back later after looking at my BIOS and trying Windows XP. I might even open up the case. I hope it's not horribly dirty, because I don't have any compressed air on hand.

----------

## kwesadilo

I switched the tray applet to the Conservative governor, which seems to keep it at 800 MHz. The Performance governor keeps it at 2.5 GHz. Using the Conservative governor, Quake 4 was unplayable (1-2 fps) on ultra high quality, which made testing much more boring. I experienced a shutdown after about 11 minutes and 35 seconds of playing. Here are the contents of /var/log/everything/current surrounding that event. I have highlighted lines that are not unhandled processor event spam and that do not seem irrelevant to this problem. The last line is the first log entry after booting back up. I am pretty sure that the ACPI unhandled button press messages are from me trying to adjust the volume in-game. This is the first time that I have observed two thermal emergencies in one shutdown event. Except for in the few minutes prior to shutdown events, the unhandled processor events hardly occur at all in the logs and are never repeated in the volume that you see in the section I posted.

My fans seem to stay at a pretty constant, pretty quiet speed, regardless of demand. As far as I can tell from my sensor-monitoring tray applet, only one fan sensor is exported to ACPI, and it seems to stay between 2500 and 3200 rpm in low- and high-demand situations.

I took a look at my BIOS. There aren't any settings or data directly related to frequency or temperature. In the power settings section, Intel SpeedStep Technology was enabled, and under that category, the mode for AC power was Automatic, and the mode for battery power was Battery Optimized. Maximum Performance and Maximum Battery were also options. Under the Adaptive Thermal Management category, the scheme for AC was set to Maximize Performance, and the scheme for battery was set to Balanced. Those were the only two options for those two settings. CPU Power Management was set to automatic, as opposed to disabled. There were some other settings about the PCI bus and the CD drive speed, but they didn't seem relevant. After booting into Windows, I saw that an update for the BIOS was available online, and I installed it. None of the listed changes since my original version had anything to do with this problem. Later, when I looked in the BIOS again, the update had been applied, but none of the power settings had changed, and no settings had been added in the places I looked.

After updating the BIOS (and dealing with Windows' complaints related to not being booted for several months), I started playing Portal. I don't have the room on the Windows partition to install Quake, but the games are probably not that far off from each other in resource requirements. I played for about 37 minutes with no shutdowns. The computer did spontaneously go to sleep during that time. I woke it back up and kept playing. I don't know what to make of that. I hope that it's some Windows thing unrelated to this and that I never see it again.   :Confused: 

I tried again in Gentoo after updating the BIOS. There was no difference in behavior from my second-most recent reproduction in Gentoo, documented above.

Edit: Added record of latest shutdown occurrence while running Gentoo.

----------

## ASID

Since the fan is working, as you observed, there is probably a driver/kernel problem or tons of dust on it  :Very Happy: 

Did you have the same problem with previous kernels?

Is it possible to test one?

Also, can you please post the output of the following, while stressing your laptop a little:

```
cat /sys/class/thermal/thermal_zone0/temp
cat /sys/class/thermal/thermal_zone0/trip_point_0_temp
cat /sys/class/thermal/thermal_zone0/trip_point_0_type
cat /sys/class/thermal/thermal_zone0/mode
cat /sys/class/thermal/cooling_device?/*_state
```

Maybe turning on CONFIG_THINKPAD_ACPI_DEBUG would produce more output; haven't tried it though.

I would also test turning off the GNOME Power Management tool, or even disabling CONFIG_THINKPAD_ACPI. Maybe a bug there is preventing the fan from working as it should.

Good luck!

----------

## kwesadilo

I have not looked inside my case yet, so I don't know what the dust situation is. I have not been using Gentoo that long, so the only kernels I have used so far are revisions of 2.6.34. As far as I know, I have always had this problem. I noticed today that kernel 2.6.36-r5 is available, so I updated. As you may have noticed on the forum, it conflicts with a previously working version of nvidia-drivers, so I updated that to the latest unstable version. Everything appears to work exactly the way it used to, including still shutting down when I don't want it to. If you think it will help, I will try an older kernel. How far back do you think I should go?

Before switching to the new kernel, I ran some tests with kernel 2.6.34-r12 with CONFIG_THINKPAD_ACPI_DEBUG turned on. Here is the output you requested:

```
92000       // thermal_zone0/temp
127000      // thermal_zone0/trip_point_0_temp
critical    // thermal_zone0/trip_point_0_type
enabled     // thermal_zone0/mode
0           // cooling_device0/cur_state
10          // cooling_device0/max_state
0           // cooling_device1/cur_state
10          // cooling_device1/max_state
```

I tried to label the outputs correctly, but it is possible that I erred. I ran the exact command lines that you told me to in the order that you wrote them while stressing the system.
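To take the guesswork out of labeling, each file can be printed together with its path. The following is a minimal sketch, assuming the same sysfs files as in the commands above and that temperatures are reported in millidegrees C (which matches the 92000 and 127000 readings). Both function names are invented for this example.

```shell
#!/bin/sh
# Convert a sysfs millidegree reading (e.g. 92000) to whole degrees C.
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# Print each readable file given as an argument as "path: contents",
# so no reading can be attributed to the wrong file.
dump_labeled() {
    for f in "$@"; do
        [ -r "$f" ] && printf '%s: %s\n' "$f" "$(cat "$f")"
    done
    return 0
}
```

For instance, `dump_labeled /sys/class/thermal/thermal_zone0/temp /sys/class/thermal/thermal_zone0/trip_point_0_* /sys/class/thermal/thermal_zone0/mode /sys/class/thermal/cooling_device?/*_state` would reproduce the listing with unambiguous labels. `millideg_to_c 127000` prints 127, so the 92000 reading sits 35 degrees below the critical trip point.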

I tried disabling CONFIG_THINKPAD_ACPI, but it didn't make a difference. The output I generated via those same commands was identical, except that the 92000 was a 99000. The GNOME Power Management tool controls things like the backlight brightness and what happens when you close the lid. I don't think there's a way of turning it off short of uninstalling gnome-power-manager or stopping GNOME. I tried the latter, but it looks like Quake 4, my primary testing tool, needs X to run (or something like that more complicated than I want to get into right now). I'll see if I can run it with ctwm or some other lightweight window manager. Let me know if there is any other information that would help you figure out what's going on.

Update: I ran Quake from a failsafe XTerm session, and I still got the shutdown. To my knowledge, none of the GNOME software referenced above was running at that time.

----------

## DirtyHairy

I've been using Gentoo for about five years now on a T60 and, while I never had issues with actual thermal shutdowns, the fan system does attract a significant amount of dust over time, and cleaning it can easily make the difference between >90 degrees under full load before and ~70 degrees after. Also, I switched to the ondemand governor a long time ago (which clocks up the CPU only under load) without any noticeable speed penalty. My wife is using a Fujitsu something laptop (don't know the exact model off the top of my head), and there the "dust problem" is so bad that we get thermal shutdowns from time to time, forcing us to clean it periodically. So, cleaning the fan is definitely a possible remedy  :Wink: 

----------

## ASID

kwesadilo,

everything seems fine with your configuration. 

Getting 92 and 99 C with a little stress suggests that you don't have sufficient cooling.

I suggest you clean your laptop and/or even buy an external cooling device.

----------

## Raptor85

Sounds similar to an issue I had with one of my Intel boards. Is there any sort of "mode" setting for ACPI in the BIOS? It sounds like fan control is failing due to using the wrong module, or having an incompatible BIOS/BIOS setting.

----------

## kwesadilo

Raptor85, I don't think there is a way to set the ACPI "mode" from the BIOS. The available settings are as I described them above. Do you think the "Adaptive Thermal Management" setting could be what you are talking about?

After a tedious deconstruction, I removed the fan (and the heatsink) from the laptop. It's dirty, but I've seen desktops that were a lot worse. The fan blades and various places inside the case have a significant coating of dust, but the fan doesn't appear clogged. The heatsink had thermal paste between itself and the processor and what I assume is the graphics card. Residue of the same is on both sides of both points of thermal contact. A little bit of it rubbed off on my finger (oops). The hardware replacement manual says to apply thermal paste before installing a new fan. Presumably, new fans come with no thermal paste. Should I remove the existing thermal paste and put new thermal paste on, or can I safely leave it like it is?

If I have been brief, it is because I am posting this from my cellphone. I can't very well use my laptop with half of the guts lying on the table.

----------

## DirtyHairy

I'd clean out the old paste carefully (I usually use a bit of alcohol for this) and apply a new serving of thermal paste before reassembling it. Be careful with the part of heatsink attached to the GPU though; if the assembly is similar to that of the T60, this part already has a pillow of thermally conductive material on it and doesn't like being touched with alcohol (also, you should get away without thermal paste here). For me, it helps to blow out the complete assembly with a bit of compressed air; my T60 usually runs significantly colder after this "surgery".

----------

## kwesadilo

I blew out the fan assembly (and all of the other surfaces I encountered) with compressed air in what I think is the manner described by DirtyHairy. I also cleaned the blades individually with an alcohol swab. The fan initially looked like this. Now, it looks like this. Sorry if you get hit by pop-ups or something. The site seemed OK when I was there. The pictures aren't that great. There's still a little bit of dust on the blades, but there was a lot more. A bunch of dust came out when I blew compressed air through the grille.

Here is what the stuff below the fan assembly looks like. Here is what the bottom of the fan assembly looks like. I have indicated the parts that touch, using my enviable MS Paint and trackpoint skills. I believe that part A on the motherboard is the CPU, based on the hardware replacement manual. A fairly cursory traversal of that same manual does not provide clues as to what part B is.

I haven't done anything to put it back together yet, because I don't have any thermal paste, and there is nowhere that I can go on the Sunday after Christmas to get any. Does anything in any of my pictures look like the "pillow" that you mentioned, DirtyHairy? How frequently do you clean your computer?

----------

## DirtyHairy

Hi Kwesadilo! The cleaned fan looks fine to me. The assembly is similar to mine, but there are some differences: on mine, both the part you marked with "B" and the part south of it are covered with the kind of pillowy thing you find on your assembly south of "B". Anyway, that's the pillow I mentioned. As for the parts, on mine (which has a discrete ATI GPU), the northern chip is part of the chipset (the northbridge I guess) and the southern one is the discrete ATI GPU. If you have a discrete GPU, I'd wager that the assignment is the same for you, but of course, I cannot say for sure - you should be able to find out once you have cleaned out the old paste.

In the five years, I have cleaned out the fan assembly perhaps three or four times (and I have also replaced the assembly the last time as it had started to make unpleasant running noises). After reassembling the machine, you can check the success of your efforts by fully loading all cores ("cat /dev/urandom > /dev/null" for instance) and monitoring the thermal sensors http://www.thinkwiki.org/wiki/Thermal_sensors  :Wink: 
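The load-and-watch check described above can be sketched as a small script. This is only an illustration under stated assumptions: it reads `/sys/class/thermal/thermal_zone0/temp` (the sysfs file used earlier in this thread, in millidegrees C) rather than the ThinkWiki interfaces, the function name is invented, and the 90 C cutoff is an arbitrary safety margin, not a vendor figure.

```shell
#!/bin/sh
# Usage: stress_and_watch CORES SECONDS [TEMP_FILE] [LIMIT_C]
# Loads CORES cores with "cat /dev/urandom > /dev/null", prints the
# temperature once per second, and always stops the load at the end.
stress_and_watch() {
    cores=$1
    seconds=$2
    temp_file="${3:-/sys/class/thermal/thermal_zone0/temp}"
    limit_c="${4:-90}"   # assumed cutoff, chosen for illustration
    pids=""

    # Start one busy process per core.
    i=0
    while [ "$i" -lt "$cores" ]; do
        cat /dev/urandom > /dev/null &
        pids="$pids $!"
        i=$((i + 1))
    done
    # Make sure the load stops if the script is interrupted.
    trap 'kill $pids 2>/dev/null; exit 1' INT TERM

    # Poll once per second; stop early if the limit is reached.
    t=0
    while [ "$t" -lt "$seconds" ]; do
        temp_c=$(( $(cat "$temp_file") / 1000 ))
        echo "$temp_c C"
        if [ "$temp_c" -ge "$limit_c" ]; then
            echo "limit of $limit_c C reached, stopping"
            break
        fi
        sleep 1
        t=$((t + 1))
    done

    kill $pids 2>/dev/null
    wait $pids 2>/dev/null || true
    trap - INT TERM
}
```

On a dual-core machine, `stress_and_watch 2 300` would load both cores for up to five minutes. The early cutoff is there so that checking the result of a cleaning does not itself provoke another thermal shutdown.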

----------

## NeddySeagoon

kwesadilo,

The two chips in contact with the heatsink are the CPU and the GPU.

Your original dirty fan image looks ok but dirt inside the heatsink will make it much less effective.

From the photos of the heatsink underside and bare chips, it looks like too much thermal paste was applied during assembly.

This is a very bad thing as the thermal paste is only supposed to drive out any air from the microscopic pockets that are left because the contact surfaces are not perfectly flat.

Heatsink compound is not a very good conductor compared to metals but it's much better than air.

You do get overheating issues when you use too much.

Heatsink compound should never be reused - did you give it a good clean and replace it with new compound?

----------

## kwesadilo

This morning, I went down to RadioShack and got some thermal compound. I went ahead and got the most expensive kind they had (Arctic Silver 5), because I didn't want to have to redo the entire disassembly and reassembly just because I was too cheap to spend another few bucks. I put my computer back together, and I guess I managed to plug everything in correctly, because it booted up with no problems. Right now, while I browse this forum, my CPU temperature is 46 C, and my GPU temperature is 54 C. I think my CPU used to be hotter than my GPU, but they're both doing OK now. My fan is also several hundred RPM lower than it used to be.

I played Quake for over an hour without a shutdown. About half an hour in, the CPU temperature was 71 C, which is awesome. I'm cured!

All of you, thanks so much for your help.

----------

