# excessive battery drain during hibernate-ram [Unsolved]

## mikegpitt

I'm having an issue with one of my machines.  When I hibernate the machine to ram there is excessive battery drain.  When issuing the hibernate-ram command, the machine will run a fully charged battery to 0% in around 1 hour.  This is compared to maybe having it drain 25% in one hour when running normally off battery, so obviously something is out of whack.  The other thing I noticed is that after resuming from a suspend-to-ram, the desktop acts laggy for maybe 10 seconds.

To debug this issue, I have tried turning off many services (including X) and rmmodding all modules, however the issue remains.  The logs don't show anything out of the ordinary.

I'm really unsure on what might be happening and how to debug the issue.  It almost seems that the CPU continues to run and eat up battery when suspended, even though the display/internet/etc. are all off.

Ideas?

*Last edited by mikegpitt on Thu Sep 10, 2009 2:56 pm; edited 1 time in total*

----------

## eccerr0r

I don't understand, hibernate usually means S4 sleep, and involves writing context to nonvolatile storage...

Do you mean suspend to RAM (S3) or true hibernation (S4)?

If you are talking about true hibernate, the machine should actually appear to turn off.  CPU should be stone cold, fans should be off.  I assume it's not getting to that state, any idea what it's doing?  If context is saved and you turn it off manually, it should restore from the disk image?

S3 suspend-to-RAM requires both ACPI and your firmware to be working properly. What motherboard/firmware do you have, and is the firmware the latest version?

Have you tried suspend on other installs (Ubuntu, etc.)? Does it work there?

----------

## Hu

The hibernate script from TuxOnIce has an alternate invocation named hibernate-ram, which performs many of the same functions as the main hibernate script, but is used to enter S3 instead of hibernating.

----------

## mikegpitt

@eccerr0r:  Thanks for the comments.  As Hu mentioned, the hibernate-ram script does an s3 suspend-to-ram.  s4, suspend-to-disk appears to be working fine.  I'm actually not using the tuxonice sources on this particular laptop, but as far as I know the sources aren't a dependency for the scripts... and the scripts provide a nice interface for removing problem modules, starting/stopping services, etc.

I haven't tried another distro or installation on the machine, although that might be something worthwhile to try.  No firmware upgrades either.  For what it's worth, I have a clone of this machine that is installed on another laptop (different hardware) and suspend to ram appears to be working fine.

I did have one other issue with this laptop, where I needed to unload the uhci_hcd and ehci_hcd modules before suspending, or the machine would lock up.  I'm not sure if this could be related to the issue I'm having now...  I'm having the hibernate scripts rmmod these two modules before suspend.

Other suggestions welcome...

----------

## dmpogo

 *mikegpitt wrote:*   

> @eccerr0r:  Thanks for the comments.  As Hu mentioned, the hibernate-ram script does an s3 suspend-to-ram.  s4, suspend-to-disk appears to be working fine.  I'm actually not using the tuxonice sources on this particular laptop, but as far as I know the sources aren't a dependency for the scripts... and the scripts provide a nice interface for removing problem modules, starting/stopping services, etc.
> 
> I haven't tried another distro or installation on the machine, although that might be something worthwhile to try.  No firmware upgrades either.  For what it's worth, I have a clone of this machine that is installed on another laptop (different hardware) and suspend to ram appears to be working fine.
> 
> I did have one other issue with this laptop, where I needed to unload the uhci_hcd and ehci_hcd modules before suspending, or the machine would lock up.  I'm not sure if this could be related to the issue I'm having now...  I'm having the hibernate scripts rmmod these two modules before suspend.
> ...

 

Interesting, I use tuxonice patches, but suspend-to-ram is done straightforwardly.  3 weeks ago I suspended to ram (if I remember correctly) and a week later found my laptop had completely drained the battery.   It could have been that a week is too long anyway, but now I wonder if something similar to what you experience happened.

BTW I also unload ehci_hcd and uhci_hcd. Had lockups in the past, and left it like that since.

----------

## mikegpitt

@dmpogo:  IMHO, a week is likely too long.  I use tuxonice with my regular laptop, and if I leave it suspended to ram for 24 hours the battery is already down to 40% or so.  I would recommend suspend-to-disk or powering down if you need longer times without being able to recharge...  of course I could always be doing something wrong...

Back to the initial topic, I just tried an older version of Ubuntu on the machine giving me problems.  I left it suspended to ram (starting with a fully charged battery) for about 30 minutes, and upon resume the battery was at 39%!  There is definitely something weird going on with the hardware or some driver.

It seems that something could be done via software to work around the issue, but I'm not even sure how to start debugging it.

----------

## dmpogo

 *mikegpitt wrote:*   

> @dmpogo:  IMHO, a week is likely too long.  I use tuxonice with my regular laptop, and if I leave it suspended to ram for 24 hours the battery is already down to 40% or so.  I would recommend suspend-to-disk or powering down if you need longer times without being able to recharge...  of course I could always be doing something wrong...
> 
> Back to the initial topic, I just tried an older version of Ubuntu on the machine giving me problems.  I left it suspended to ram (starting with a fully charged battery) for about 30 minutes, and upon resume the battery was at 39%!  There is definitely something weird going on with the hardware or some driver.
> 
> It seems that something could be done via software to work around the issue, but I'm not even sure how to start debugging it.

 

I could swear that back in May, when I traveled extensively, I was leaving it suspended to RAM for 2-3 days and using something like 10% of the battery (but I have dual 5 hour batteries, so 10% is 30 min of work). I'll investigate it more.

----------

## mikegpitt

 *dmpogo wrote:*   

> I could swear that back in May, when I traveled extensively, I was leaving it suspended to RAM for 2-3 days and using something like 10% of the battery (but I have dual 5 hour batteries, so 10% is 30 min of work). I'll investigate it more.

 That's the type of performance I would love to get out of suspend-to-ram (re: 2-3 days w/ 10% power loss).  I only recently started using suspend-to-ram again on my laptop, since I had so many problems with it in the past.  I'm not sure if the battery loss issues are with newer kernels, misconfiguration, something else...???

Obviously the problem I wrote about in the original post is related to something else.  I think the only solution at the moment is to use suspend-to-disk.  It's so weird that it would use more power while suspended than when it is running normally.

----------

## eccerr0r

The machine must be 'on' for some reason or another, do you see any large increase in your /proc/interrupts before and after S3 sleep?  Can you tell if the CPU/chipset is actually still warm (finger test is OK)?  At most the RAM and only the RAM should be *slightly* warm.

The S3 sleep mode depends on the computer.  It should be less than being in S0 of course -- it *should* just be consuming whatever power to keep RAM contents from decaying.  It's definitely not right for it to use more power than if it were on.  But if it turns out to be a firmware issue, I don't know... BTW I'm not familiar with TuxOnIce scripts, is it actually invoking ACPI S3 or is it doing something else to the machine?  If it's the former, I'd have to say it's a firmware bug.  The latter, it's probably a Linux issue...

My eeePC uses quite a bit of power in S3 sleep.  I lose about 1% battery capacity per hour that it's in S3.  It will go flat over a few days in suspend to RAM.

----------

## dmpogo

 *eccerr0r wrote:*   

>  I lose about 1% battery capacity per hour that it's in S3.  It will go flat over a few days in suspend to RAM.

 

Well, even if you have a 10-hour (at idle) battery, that means the RAM uses 10% of the idle power (even less if your idle battery life is shorter), which is pretty reasonable.
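For anyone following the arithmetic, here is a minimal sketch of the comparison in Python. The function names are made up for illustration, the 10-hour idle figure is the hypothetical one from the post above, and the other numbers are the ones mikegpitt reported earlier in the thread (25% per hour running, 100% per hour suspended, and 100% down to 39% in about 30 minutes under Ubuntu).

```python
def drain_rate(percent_lost, hours):
    """Battery drain in percent of capacity per hour."""
    return percent_lost / hours


def relative_power(suspend_rate, running_rate):
    """Suspend drain as a fraction of the drain while running."""
    return suspend_rate / running_rate


# eeePC example: 1% per hour in S3, hypothetical 10-hour idle battery life.
eee_s3 = drain_rate(1, 1)
eee_idle = drain_rate(100, 10)

# Problem laptop, figures from the first post.
problem_s3 = drain_rate(100, 1)
problem_running = drain_rate(25, 1)

# Problem laptop under Ubuntu: 100% -> 39% in roughly half an hour.
ubuntu_test = drain_rate(100 - 39, 0.5)

print("eeePC S3 / idle:", relative_power(eee_s3, eee_idle))
print("problem laptop S3 / running:", relative_power(problem_s3, problem_running))
print("Ubuntu test drain (%/h):", ubuntu_test)
```

A healthy S3 should land well below 1.0 on the ratio; the problem laptop comes out several times above it, which is what makes the thread's case so odd.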

----------

## mikegpitt

 *eccerr0r wrote:*   

> The machine must be 'on' for some reason or another, do you see any large increase in your /proc/interrupts before and after S3 sleep?  Can you tell if the CPU/chipset is actually still warm (finger test is OK)?  At most the RAM and only the RAM should be *slightly* warm.
> 
> The S3 sleep mode depends on the computer.  It should be less than being in S0 of course -- it *should* just be consuming whatever power to keep RAM contents from decaying.  It's definitely not right for it to use more power than if it were on.  But if it turns out to be a firmware issue, I don't know... BTW I'm not familiar with TuxOnIce scripts, is it actually invoking ACPI S3 or is it doing something else to the machine?  If it's the former, I'd have to say it's a firmware bug.  The latter, it's probably a Linux issue...
> 
> My eeePC uses quite a bit of power in S3 sleep.  I lose about 1% battery capacity per hour that it's in S3.  It will go flat over a few days in suspend to RAM.

The hibernate-ram script does put the machine into s3 sleep... and I've also tried it manually, so the issue isn't with the script, it's with the suspend itself.

I just left the machine suspended for about a half hour and there wasn't any warmth to speak of (battery dropped to 67% though).

What is the proper way to monitor /proc/interrupts when suspending?  Should I just have a script that cats it out to a file over and over?

----------

## dmpogo

 *mikegpitt wrote:*   

> > *eccerr0r wrote:*
> >
> > The machine must be 'on' for some reason or another, do you see any large increase in your /proc/interrupts before and after S3 sleep?  Can you tell if the CPU/chipset is actually still warm (finger test is OK)?  At most the RAM and only the RAM should be *slightly* warm.
> >
> > The S3 sleep mode depends on the computer.  It should be less than being in S0 of course -- it *should* just be consuming whatever power to keep RAM contents from decaying.  It's definitely not right for it to use more power than if it were on.  But if it turns out to be a firmware issue, I don't know... BTW I'm not familiar with TuxOnIce scripts, is it actually invoking ACPI S3 or is it doing something else to the machine?  If it's the former, I'd have to say it's a firmware bug.  The latter, it's probably a Linux issue...
> >
> > My eeePC uses quite a bit of power in S3 sleep.  I lose about 1% battery capacity per hour that it's in S3.  It will go flat over a few days in suspend to RAM.
>
> The hibernate-ram script does put the machine into s3 sleep... and I've also tried it manually, so the issue isn't with the script it's with the suspend.
> ...

 

I would think you can write down what /proc/interrupts shows before suspend and after resume to see whether any interrupts were generated while the machine was suspended.
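If writing the numbers down by hand gets tedious, a small script can do the comparison. This is only a sketch in Python, assuming you have saved the output of `cat /proc/interrupts` to two text files, one before suspending and one after resuming; the function names `parse_interrupts` and `diff_interrupts` are made up for this example.

```python
def parse_interrupts(text):
    """Return {label: count summed over all CPUs} for a /proc/interrupts dump."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller/device description
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def diff_interrupts(before_text, after_text):
    """Return {label: increase} for every counter that changed."""
    before = parse_interrupts(before_text)
    after = parse_interrupts(after_text)
    return {
        label: count - before.get(label, 0)
        for label, count in after.items()
        if count != before.get(label, 0)
    }
```

Usage would be something like `diff_interrupts(open("before.txt").read(), open("after.txt").read())`, where the two file names are placeholders. Keep in mind that the diff also includes whatever happened while the machine was awake between the two snapshots.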

----------

## mikegpitt

Ok, I did a fresh reboot and this is what /proc/interrupts looks like before the suspend:

```
           CPU0       CPU1
  0:      69102          0   IO-APIC-edge      timer
  1:         55          0   IO-APIC-edge      i8042
  9:         38          0   IO-APIC-fasteoi   acpi
 12:       1421          0   IO-APIC-edge      i8042
 14:       3186          0   IO-APIC-edge      ide0
 16:        228          0   IO-APIC-fasteoi   HDA Intel, uhci_hcd:usb4, eth0, i915@pci:0000:00:02.0
 18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 23:       3148          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb5
NMI:          0          0   Non-maskable interrupts
LOC:      11832      81098   Local timer interrupts
SPU:          0          0   Spurious interrupts
RES:       9508       9348   Rescheduling interrupts
CAL:         46        130   Function call interrupts
TLB:       1297        892   TLB shootdowns
TRM:          0          0   Thermal event interrupts
ERR:          0
MIS:          0
```

I suspended it and left it alone for a few minutes.  This is what /proc/interrupts looks like after the resume:

```
           CPU0       CPU1
  0:     117045          0   IO-APIC-edge      timer
  1:        422          0   IO-APIC-edge      i8042
  9:         49          0   IO-APIC-fasteoi   acpi
 12:       1757          0   IO-APIC-edge      i8042
 14:       3592          0   IO-APIC-edge      ide0
 16:        259          0   IO-APIC-fasteoi   eth0, HDA Intel, uhci_hcd:usb5, i915@pci:0000:00:02.0
 18:          1          0   IO-APIC-fasteoi   uhci_hcd:usb4
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 23:       5841          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
NMI:          0          0   Non-maskable interrupts
LOC:      11832     128910   Local timer interrupts
SPU:          0          0   Spurious interrupts
RES:      11135      11324   Rescheduling interrupts
CAL:         50        138   Function call interrupts
TLB:       1344        939   TLB shootdowns
TRM:          0          0   Thermal event interrupts
ERR:          0
MIS:          0
```

I've never done any debugging by looking at such data before so I don't know what is normal and what isn't.  It looks like there was a large increase in 0: and RES:.
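One rough way to read the timer numbers: on a kernel with a periodic (non-tickless) tick, the timer fires about HZ times per second while the CPU is running, so the increase can be converted into seconds of awake time. The sketch below does that in Python for the IRQ 0 and CPU1 `LOC` counts from the two dumps above. The kernel's actual HZ setting is not given in this thread, so the values tried here (100, 250, 1000) are assumptions, and `awake_seconds` is just an illustrative helper.

```python
# Counts copied from the two /proc/interrupts dumps above.
timer_before, timer_after = 69102, 117045        # IRQ 0, CPU0
loc_cpu1_before, loc_cpu1_after = 81098, 128910  # LOC, CPU1

timer_delta = timer_after - timer_before
loc_cpu1_delta = loc_cpu1_after - loc_cpu1_before


def awake_seconds(delta_ticks, hz):
    """Seconds of running time implied by a tick count at a given HZ."""
    return delta_ticks / hz


print("timer delta:", timer_delta, "LOC (CPU1) delta:", loc_cpu1_delta)
for hz in (100, 250, 1000):  # assumed values, check your own kernel config
    print("HZ=%d -> %.0f s" % (hz, awake_seconds(timer_delta, hz)))
```

If the result for your real HZ is close to the wall-clock time the machine was visibly awake between the two snapshots, the ticks were probably not accumulated while suspended. If it is much larger, that would point at the CPU still running during the suspend.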

----------

## dmpogo

Well it obviously shows some activity (timer interrupts for example) but we don't know if they were accumulated during suspend/restart.

So you probably want a different (longer) baseline in the suspended state before drawing conclusions.  I would do a fresh reboot, a short suspend, and then immediately a long suspend (like 30 min), hoping that most of the activity between boot and the first resume is due to the suspend process itself.

Honestly, however, I'm not sure we would necessarily learn anything, but it does look like the computer remains active in suspended mode.

----------

## mikegpitt

Ok...  here are the interrupts after I rebooted, and then suspended for about a half hour:

```
           CPU0       CPU1
  0:     106076          0   IO-APIC-edge      timer
  1:        153          0   IO-APIC-edge      i8042
  9:         59          0   IO-APIC-fasteoi   acpi
 12:       1259          0   IO-APIC-edge      i8042
 14:       3404          0   IO-APIC-edge      ide0
 16:        275          0   IO-APIC-fasteoi   eth0, HDA Intel, uhci_hcd:usb5, i915@pci:0000:00:02.0
 18:          4          0   IO-APIC-fasteoi   uhci_hcd:usb4
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 23:       5093          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
NMI:          0          0   Non-maskable interrupts
LOC:      12211     117459   Local timer interrupts
SPU:          0          0   Spurious interrupts
RES:       9787       9722   Rescheduling interrupts
CAL:         42        149   Function call interrupts
TLB:       1041       1331   TLB shootdowns
TRM:          0          0   Thermal event interrupts
ERR:          0
MIS:          0
```

It looks like the numbers are actually lower than in the previous attempt where I suspended only for a few minutes.  They did increase before and after the suspend however, as they did last time.

----------

## eccerr0r

Okay. We have some non-sequitur information here.

Due to conservation of energy, battery energy loss should result in some other form of energy being generated, and usually that is heat (for the sake of computers, pretty much *all* energy is converted to heat eventually.)  If the computer is cold during suspend, where is the energy going?  Is the LCD still on full brightness and blackened?  Energy cannot be destroyed.

The timer increase is also suspicious, but as said it still could be due to the suspend/resume.  We'll ignore that for now but it is a somewhat huge increase (assuming you're using a tickless kernel?)

Question I have now: Is the battery truly dead after you let it sit suspended for an hour?  How long does the machine run when the battery meter is at 0%?

----------

## dmpogo

 *eccerr0r wrote:*   

> 
> 
> The timer increase is also suspicious, but as said it still could be due to the suspend/resume.  We'll ignore that for now but it is a somewhat huge increase (assuming you're using a tickless kernel?)
> 
> 

 

Yep, but given that the number of interrupts did not actually grow as the suspend time increased, I would tend to think they are generated during the sleep (and especially wake-up) process.

----------

## dmpogo

 *eccerr0r wrote:*   

>   Is the LCD still on full brightness and blackened?  Energy cannot be destroyed.
> 
> 

 

That's an interesting idea !

----------

## eccerr0r

The only other thing I thought of that fits with the cpu+chipset+gpu staying cold but the battery still being consumed is that the battery fuel gauge is really confused during suspend for whatever reason, hence my wondering: even though it says 0%, how much longer does it last?

The backlight-on-but-pixels-obscuring theory actually doesn't hold water in my book. To eat the same amount of battery power, the same amount of heat must be produced, so the display should have gotten as warm as the CPU would...  which does not seem to be the case.

----------

## mikegpitt

Thanks for the extra ideas guys!  Unfortunately the screen does power off, but it was a good thought.

The battery is also truly dead at 0%.  If I let it go that long the machine shuts down, and can't be powered back up without plugging it in.

I'm not using a tickless system...  is this option recommended these days?

----------

## dmpogo

 *mikegpitt wrote:*   

> 
> 
> I'm not using a tickless system...  is this option recommended these days?

 

Yes, especially for laptops, since it decreases the number of timer interrupts dramatically, allowing the CPU to spend more time in a low-power state.
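To check whether a kernel was built tickless, you can look for `CONFIG_NO_HZ` in its configuration. Below is a minimal sketch in Python; it assumes the running kernel exposes its config at `/proc/config.gz` (which needs `CONFIG_IKCONFIG_PROC`), otherwise point it at the `.config` in your kernel source tree. The helper names are invented for this example.

```python
import gzip


def kernel_option(config_text, name):
    """Return the value of CONFIG_<name> in kernel config text, or None if unset."""
    prefix = "CONFIG_%s=" % name
    for line in config_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):]
    return None  # absent or "# CONFIG_<name> is not set"


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config; only works if the kernel exposes it."""
    with gzip.open(path, "rt") as fh:
        return fh.read()
```

For example, `kernel_option(read_running_config(), "NO_HZ")` returning `"y"` would mean tickless is enabled, and `kernel_option(..., "HZ")` gives the tick rate of a periodic-tick kernel.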

----------

## alkan

how long does your battery last during normal use? maybe your battery is dead already.

----------

## mikegpitt

 *alkan wrote:*   

> how long does your battery last during normal use? maybe your battery is dead already.

 It will last around 3.5 hours with normal use, after charged to 100%.

----------

## alkan

I highly doubt it is a setup issue. Something is wrong with your hardware. It drains the battery in 3.5h in normal use, and in 1h in suspend-to-ram!!! Very unlikely.

Just to make sure, boot from a CD, USB stick... Gentoo minimal, Puppy, Ubuntu or whatever you are familiar with, and see if it goes away. Otherwise contact the manufacturer if you still have the warranty.

----------

## eccerr0r

Something is really wrong here... if it doesn't get warm during suspend and eating battery power, energy is being destroyed somehow.  In conservation of energy, if it gets warm in 3.5h, it should get REALLY hot in 1h as it's burning energy 3x as fast.

Unless this can be accounted for, there's no way we can solve the problem.

----------

## albright

 *Quote:*   

> Something is really wrong here... if it doesn't get warm during suspend and eating battery power, energy is being destroyed somehow. In conservation of energy, if it gets warm in 3.5h, it should get REALLY hot in 1h as it's burning energy 3x as fast. 
> 
>  Unless this can be accounted for, there's no way we can solve the problem.

 

Or else Mikegpitt is in line for the Nobel prize in physics :)

----------

## mikegpitt

 *albright wrote:*   

> > *Quote:*
> >
> > Something is really wrong here... if it doesn't get warm during suspend and eating battery power, energy is being destroyed somehow. In conservation of energy, if it gets warm in 3.5h, it should get REALLY hot in 1h as it's burning energy 3x as fast.
> >
> > Unless this can be accounted for, there's no way we can solve the problem.
>
> Or else Mikegpitt is in line for the Nobel prize in physics

Heh, that would be cool :)

Based on my observations and everyone's comments, I'm strongly leaning towards a hardware malfunction.  It really doesn't make sense, but it's happening.  I will double check again when I get a chance on the heat issue and see if anything seems on that shouldn't be.

----------

## dmpogo

 *eccerr0r wrote:*   

> Something is really wrong here... if it doesn't get warm during suspend and eating battery power, energy is being destroyed somehow.  In conservation of energy, if it gets warm in 3.5h, it should get REALLY hot in 1h as it's burning energy 3x as fast.
> 
> Unless this can be accounted for, there's no way we can solve the problem.

 

One can improve the experiment and try to capture the heat. Put the laptop under a blanket *) while it is suspended (to rule out efficient cooling by the room).

*) Check often whether it is already cooked :)

----------

