# Wrong ACPI thermal reading causes shutdown

## Rinne

Hi all,

I'm currently having a problem with random shutdowns due to alleged overheating.

The following message is logged in /var/log/messages:

```
Feb 10 20:06:01 Raven kernel: thermal thermal_zone0: critical temperature reached(80 C),shutting down
Feb 10 20:06:05 Raven shutdown[6201]: shutting down for system halt
Feb 10 20:06:05 Raven root[6208]: ACPI event unhandled: thermal_zone LNXTHERM:00 000000f0 00000001
Feb 10 20:06:05 Raven init[1]: Switching to runlevel: 0
Feb 10 20:06:06 Raven sshd[1449]: Received signal 15; terminating.
...
```

thermal_zone0 is the only zone that `acpi -t` recognizes. I don't even know which sensor this zone represents:

```
rinne@Raven ~ $ acpi -t
Thermal 0: ok, 0.0 degrees C
```
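For anyone trying to work out what a zone like this corresponds to, here is a minimal sketch that dumps each zone's type, current reading and trip points. It assumes the kernel exposes the usual `/sys/class/thermal/thermal_zone*` sysfs files; the function name `list_thermal_zones` and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Print type, current temperature and trip points of every thermal zone.
# The optional argument (default: /sys/class/thermal) is a hypothetical
# convenience so the function can be pointed at another directory.
list_thermal_zones() {
    root="${1:-/sys/class/thermal}"
    for zone in "$root"/thermal_zone*; do
        [ -r "$zone/type" ] || continue      # glob matched nothing usable
        type=$(cat "$zone/type")
        # temp is reported in millidegrees Celsius; reading can fail on a broken zone
        millic=$(cat "$zone/temp" 2>/dev/null) || millic=""
        if [ -n "$millic" ]; then
            echo "${zone##*/}: type=$type temp=$((millic / 1000)) C"
        else
            echo "${zone##*/}: type=$type temp=unreadable"
        fi
        # Each trip point has a matching pair of *_type and *_temp files.
        for trip in "$zone"/trip_point_*_type; do
            [ -r "$trip" ] || continue
            trip_millic=$(cat "${trip%_type}_temp" 2>/dev/null) || continue
            echo "  trip $(cat "$trip"): $((trip_millic / 1000)) C"
        done
    done
}

list_thermal_zones "$@"
```

A zone of type `acpitz` with a `critical` trip at 80 C would be the one behind the log message above. The type only says the value comes from the ACPI tables, though, not which physical sensor the firmware reads.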

The reading sometimes jumps to 7°C at random and, apparently, occasionally to above 80°C, which triggers the shutdown.

I have a mainboard with an it87 chipset whose driver support is rather controversial and requires me to boot with `acpi_enforce_resources=lax`, so there might be some interference here.
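To confirm whether the running kernel was actually booted with that option, a small sketch like the following could be used. It assumes `/proc/cmdline` is available; the helper name `cmdline_param` is hypothetical.

```shell
#!/bin/sh
# Print the value of a kernel command-line parameter, if it is set.
# $1 = parameter name, $2 = command-line file (defaults to /proc/cmdline).
cmdline_param() {
    name="$1"
    file="${2:-/proc/cmdline}"
    # The command line is a single line of space-separated words.
    for word in $(cat "$file" 2>/dev/null); do
        case "$word" in
            "$name"=*) echo "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

if value=$(cmdline_param acpi_enforce_resources); then
    echo "acpi_enforce_resources=$value"
else
    echo "acpi_enforce_resources is not set (kernel default: strict)"
fi
```

With `lax`, a legacy driver such as it87 is allowed to bind to I/O ranges that ACPI has also claimed and the kernel only logs a warning, which is why interference between the two is plausible.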

I have lm_sensors installed and all my (real) temperatures are rather ok:

```
acpitz-virtual-0
Adapter: Virtual device
temp1:         +0.0°C  (crit = +80.0°C)

k10temp-pci-00c3
Adapter: PCI adapter
CPU:           +0.0°C  (high = +70.0°C)
                       (crit = +70.0°C, hyst = +69.0°C)

it8772-isa-0a30
Adapter: ISA adapter
in0:          +0.50 V  (min =  +0.00 V, max =  +0.10 V)  ALARM
in1:          +1.52 V  (min =  +0.00 V, max =  +3.06 V)
in2:          +2.02 V  (min =  +0.00 V, max =  +3.06 V)
in3:          +2.06 V  (min =  +0.00 V, max =  +3.06 V)
in4:          +1.13 V  (min =  +0.00 V, max =  +3.06 V)
in5:          +1.13 V  (min =  +0.00 V, max =  +3.06 V)
in6:          +2.22 V  (min =  +0.00 V, max =  +3.06 V)
3VSB:         +3.31 V  (min =  +0.00 V, max =  +6.12 V)
Vbat:         +3.34 V
fan1:           0 RPM  (min =   18 RPM)  ALARM
fan2:         384 RPM  (min =   10 RPM)
fan3:         740 RPM  (min =   13 RPM)
temp1:        +42.0°C  (low  = +75.0°C, high = +80.0°C)  sensor = thermistor
temp2:        +43.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp3:         +0.0°C  (low  = +127.0°C, high = +80.0°C)  sensor = thermal diode
intrusion0:  ALARM
```

The shutdown happens both when the machine is idle and when it is under load.

Can I somehow safely disable the shutdown function or, better, completely disable the broken sensor?

Or can anyone think of any other possible fix?

Best regards,

Rinne

----------

## eccerr0r

Does it do this if you don't set acpi_enforce_resources=lax? Unfortunately, that option can indeed cause issues. I stopped using lm_sensors on machines that need it, because there is a real potential for ACPI and the sensor driver stepping on one another's toes.

You could try compiling the kernel without CONFIG_ACPI_THERMAL ...
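As a rough way to check how that option is currently set, the sketch below reads a plain-text kernel config. The helper name `config_state` is hypothetical, and the default path `/usr/src/linux/.config` is an assumption based on the usual Gentoo layout.

```shell
#!/bin/sh
# Print how a kernel config option is set in a given config file:
# "y" (built in), "m" (module) or "n" (not set or absent).
# $1 = option name, $2 = path to a plain-text kernel config.
config_state() {
    line=$(grep -E "^$1=" "$2" 2>/dev/null | head -n 1)
    case "$line" in
        "$1=y") echo y ;;
        "$1=m") echo m ;;
        *)      echo n ;;
    esac
}

# Default path is an assumption; pass another config file as the first argument.
config_state CONFIG_ACPI_THERMAL "${1:-/usr/src/linux/.config}"
```

Note that this reflects the config in the source tree, which is not necessarily the kernel that is running. If the kernel was built with CONFIG_IKCONFIG_PROC, `zcat /proc/config.gz` gives the running kernel's config instead.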

----------

## Roman_Gruber

The GRUB kernel line accepts options such as acpi=off.

I suggest you boot a rescue live CD and check whether it happens there. If it doesn't, it is probably an issue with your kernel; if it happens there too, it is a hardware issue.

I assume you have already inspected your hardware cooling system.

I also expect that you are using, or have already tested, the latest kernel release from kernel.org, not the stable gentoo-sources.

----------

## Rinne

Hi, thanks a lot for your replies.

 *eccerr0r wrote:*   

> Does it do this if you don't set acpi_enforce_resources=lax? Unfortunately, that option can indeed cause issues. I stopped using lm_sensors on machines that need it, because there is a real potential for ACPI and the sensor driver stepping on one another's toes.
> 
> You could try compiling the kernel without CONFIG_ACPI_THERMAL ...

 

It has only happened since I set acpi_enforce_resources=lax. However, the ACPI readings were wrong before as well; they just didn't jump around as much.

I removed CONFIG_ACPI_THERMAL and it seems to have done the trick. I need lm_sensors to use fancontrol. Unfortunately, the motherboard's BIOS fan control doesn't work as expected (I'm very unsatisfied with the motherboard in general).

It seems to have issues recognizing the fans' PWM range. With fancontrol it works flawlessly for some reason.

 *tw04l124 wrote:*   

> The GRUB kernel line accepts options such as acpi=off.
> 
> I suggest you boot a rescue live CD and check whether it happens there. If it doesn't, it is probably an issue with your kernel; if it happens there too, it is a hardware issue.
> 
> I assume you have already inspected your hardware cooling system.
> ...

 

The hardware cooling is fine. I actually put quite some effort into it, and the real sensor readings are OK.

For now I'm going with disabling CONFIG_ACPI_THERMAL. Completely disabling ACPI doesn't make sense to me.
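For reference, there is a middle ground between removing CONFIG_ACPI_THERMAL and booting with acpi=off: the kernel parameter documentation lists options for the ACPI thermal driver itself. The fragment below is only a sketch. It assumes GRUB 2 with `/etc/default/grub`; on GRUB legacy the same parameters would go on the `kernel` line of grub.conf instead.

```shell
# /etc/default/grub (GRUB 2 assumed)
# thermal.nocrt=1  take no action on ACPI critical trip points
# thermal.off=1    would disable ACPI thermal control altogether
GRUB_CMDLINE_LINUX="acpi_enforce_resources=lax thermal.nocrt=1"
```

With GRUB 2 the configuration has to be regenerated afterwards, for example with `grub-mkconfig -o /boot/grub/grub.cfg`. Suppressing the critical action only makes sense when, as here, the ACPI reading is known to be bogus and the real temperatures are monitored some other way.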

I'll keep an eye on it87 driver news in new kernel releases. For now I'm not keen on leaving stable gentoo-sources.

Thanks a lot again.

----------

## eccerr0r

Indeed, outright disabling ACPI will cause serious issues with the system, including disabling the additional cores on the machine.

However, I do not have much hope for the I2C drivers when they conflict with ACPI. There's not really a way to safely make sure they do not conflict with each other, unless the I2C driver somehow finds the semaphore that ACPI uses and takes the same lock. Otherwise it pretty much comes down to one or the other, ACPI or lm_sensors; I ended up forced to use the former and go without the additional features of lm_sensors. (Or pray that they do not clash... and it really is praying, as it will very likely cause a conflict at some time or another when ACPI and lm_sensors try to use the I2C bus at the same time.)

----------

