# Resume from suspend causes harddrive error

## nukem996

Whenever I resume from suspend or hibernate, my hard drive starts spinning and I can't do anything that requires access to the disk; I can only use things that are already in memory. After resuming, my dmesg is full of the following. The error only happens after a resume; if I don't resume, I never see it.

```

ide: failed opcode was: unknown
hda: task_out_intr: status=0x51 { DriveReady SeekComplete Error }
hda: task_out_intr: error=0x10 { SectorIdNotFound }, LBAsect=74853287, sector=74853287

```

Thinking my hard drive might be dying, I scanned it with the IBM hardware analysis tools (I have a ThinkPad T40), which said the drive was fine. I then scanned it with the Hitachi Drive Fitness Test (extended test), which also said the drive was fine. On closer inspection of the error, I discovered that my drive only has 72037362 sectors, while the error is on sector 74853287.
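The mismatch can be checked with a few lines of Python. This is only a sketch: the helper name `lba_in_range` is made up for illustration, and the two numbers are simply the ones quoted in this post.

```python
def lba_in_range(lba: int, total_sectors: int) -> bool:
    """Return True if a zero-based LBA addresses a sector the drive actually has."""
    return 0 <= lba < total_sectors


TOTAL_SECTORS = 72037362  # sector count quoted above for this drive
FAILED_LBA = 74853287     # LBAsect value from the dmesg error

# The failing request targets a sector past the end of the reported capacity.
print(lba_in_range(FAILED_LBA, TOTAL_SECTORS))
# How far past the end the request points, in sectors.
print(FAILED_LBA - TOTAL_SECTORS)
```

If the sector count is right, the request is out of range by a few million sectors, which would explain `SectorIdNotFound` without the disk surface itself being damaged.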

This happens with or without DMA, and I've tried disabling wireless, ALSA, and hdparm. I also tried the kernel option "Use multi-mode by default", which is supposed to fix this error, but it doesn't. I currently use gentoo-sources, but this also happens with suspend2-sources.

Laptop: IBM Thinkpad T40

IDE Controller: Intel Corporation 82801DBM (ICH4-M) IDE Controller (rev 01)

Harddrive: Hitachi IC25N040ATCS05-0

Kernel: Gentoo-sources 2.6.17-r2

Can someone please help me? 

Thanks,

nuke

----------

## davidgurvich

If you have a Hitachi hard drive, it's probably going bad.

----------

## nukem996

But I ran tests with two different hard drive testing programs, and both said the drive was fine.

----------

## devsk

Neither you nor your drive is going crazy. It's the IDE driver, which is not safe across a suspend-resume cycle. Head over to the suspend2-users/devel lists, where I have reported similar problems with IDE drivers. One thing to check is whether DMA is disabled after resume (for me it was *mdma1 whereas it should be *udma5 in 'hdparm -I /dev/hda'). If it is, then enable it using hdparm and see if it makes any difference. The whole suspend thing is just not safe yet, and it won't be until the drivers become fully resumable.
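To compare the active mode before and after a resume without eyeballing the output, something like the following Python sketch could help. It assumes the `DMA:` line of `hdparm -I` marks the active mode with an asterisk, as in the *udma5 example above; the function name and the sample line are illustrative, not taken from a real capture.

```python
import re
from typing import Optional


def active_dma_mode(hdparm_output: str) -> Optional[str]:
    """Return the mode marked with '*' on the 'DMA:' line, or None if absent."""
    for line in hdparm_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("DMA:"):
            match = re.search(r"\*(\w+)", stripped)
            return match.group(1) if match else None
    return None


# Illustrative sample of the relevant line (an assumption, not real output).
SAMPLE = "\tDMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5\n"
print(active_dma_mode(SAMPLE))
```

On a real system the text would come from running `hdparm -I /dev/hda` as root, once before suspend and once after resume, and comparing the two results.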

----------

## nukem996

On mine it stays at udma5. Disabling DMA before I go into sleep mode helps a little, and in fact I thought it had fixed the problem, but after a few minutes of usage the error was back. The weird part is that when I first got this laptop I ran Fedora, which was fine with suspend. I then turned it into a Gentoo machine to take around the house, and I really didn't have a need for sleep mode. I noticed it wasn't working the other day, which is how I discovered this error. Would you happen to know how to make the driver suspend-aware?

----------

## devsk

There are other problems (apart from DMA settings) with IDE drivers. It seems to be fixed in 2.6.18-rc1, but that kernel would be unreliable in other ways. I have almost given up on suspend for now.

----------

## nukem996

I used suspend on Fedora every day for about a year and a half and never had a problem. I just haven't used it recently since my laptop is always on. I'm going to need suspend again next September since I'm going to college. Do you think it will be ready by then?

----------

## devsk

I used it pretty reliably during the 2.6.9 days for about a year as well. It's just that upgrades break one thing or another. I hope it gets better with 2.6.18. There is also a concerted effort to get suspend2 into the mm and mainline kernels. Let's see how far that goes.

----------

## nukem996

While experimenting with all this stuff I tried suspend2-sources, and I really didn't see a difference between them and gentoo-sources. So if I may ask, what is the difference?

----------

## nukem996

I was just looking through the 2.6.18-rc1 patches, and it seems that there is a fix for power management going in. If you look at ide-io.c, there is a whole bunch of stuff for power management. I might try just that patch tomorrow, but it seems that pretty much every IDE driver has been updated (although according to the comments, that has nothing to do with power management).

----------

## devsk

 *nukem996 wrote:*   

> While experimenting with all this stuff I tried suspend2-sources, and I really didn't see a difference between them and gentoo-sources. So if I may ask, what is the difference?

The difference is exactly the suspend2 patch, i.e. patch gentoo-sources with the suspend2 patch from suspend2.net and you get suspend2-sources.

----------

## nukem996

 *devsk wrote:*   

>  *nukem996 wrote:*
>
> > While experimenting with all this stuff I tried suspend2-sources, and I really didn't see a difference between them and gentoo-sources. So if I may ask, what is the difference?
>
> The difference is exactly the suspend2 patch, i.e. patch gentoo-sources with the suspend2 patch from suspend2.net and you get suspend2-sources.

 

What I meant to ask is this: I have all the features suspend2 has while using gentoo-sources (suspend to RAM and suspend to disk both work), so what is suspend2 giving me?

----------

## devsk

Well, if you are not using suspend2-sources or gentoo-sources patched with the suspend2 patches, then you are using the in-kernel swsusp. swsusp is less reliable and has never worked for me. When I said I used suspend for a year, I meant suspend2. suspend2 works better, faster, and more reliably than the in-kernel swsusp.

If any of your experiments with 2.6.18-rc1 succeed, please don't forget to post the results here.

----------

## nukem996

I just tried 2.6.18-rc1 with the suspend2 patch and I get the same error. It seems this is not fixed (at least not yet). Is there a kernel bug report or a link about this that you know of?

----------

## devsk

Darn!

Here is one that is related to my problem:

http://bugzilla.kernel.org/show_bug.cgi?id=2039

Can you please file another bug report with your exact errors? It will be better to open a new issue because your errors are slightly different from mine.

----------

## pgolik

See my thread with similar problems. There is a patch in mm-sources that helps with SATA (see the thread above), but I still have ATA issues on resume (like the CD-ROM losing DMA). I'm glad to hear that there is an effort to resolve this in future kernels.

----------

## nukem996

I posted a kernel bug which can be found at http://bugzilla.kernel.org/show_bug.cgi?id=6840

----------

## pgolik

 *nukem996 wrote:*   

> I posted a kernel bug which can be found at http://bugzilla.kernel.org/show_bug.cgi?id=6840

 

I added my case to your bug, and I hope this will be dealt with. The inability to suspend to RAM on a modern system is a showstopper.

----------

## pgolik

 *devsk wrote:*   

> well if you are not using suspend2-sources or gentoo-sources patched with suspend2 patches, then you are using in-kernel swsusp. swsusp is more unreliable and has never worked for me

 

Does suspend2 do anything to S3 ACPI sleep (suspend to RAM)? I always thought it was for suspending to disk only. I saw absolutely no difference between regular and suspend2 kernel in behaviour on S3 sleep.

----------

## devsk

You are right about that. There are some subtle differences in the (shared) freezer code in the suspend2 patch, which might make suspend2 behave slightly differently. But I don't think it would be drastically different, as it is in the suspend-to-disk case.

----------

## pgolik

Just bringing this old thread back to life. This patch finally fixed my problem. It was the IDE chipset driver all right. Now I have S3 suspend to RAM fully working.

----------

## devsk

 *pgolik wrote:*   

> Just bringing this old thread back to life. This patch finally fixed my problem. It was the IDE chipset driver all right. Now I have S3 suspend to RAM fully working.

Thanks for posting. This patch applies cleanly to the 2.6.17 series as well. Now on to testing whether it resolves the intermittent DMA problems and the problem of processes working with IDE drives getting stuck in 'D' state.

----------

## pgolik

 *devsk wrote:*   

> This patch applies cleanly to 2.6.17 series as well.

 

I forgot to mention that I'm using this patch with regular gentoo-sources-2.6.17-r8 (newest stable on amd64) without any problem. I've been through about a dozen suspend-resume cycles over the last 24 hours and so far everything works fine. But I'm not sure this is going to do anything for you; in the first post you mention that you have an Intel IDE controller, while I have an nForce3 for AMD (amd74xx kernel driver). But the solution for your chipset might be similar.

----------

## devsk

 *Quote:*   

> But I'm not sure this is going to do anything for you; in the first post you mention that you have an Intel IDE controller

That's the OP. I have an nForce4 mobo, so the same driver (amd74xx) is used for the nForce IDE, and the patch is therefore totally relevant for me. So far, the patch seems to work.

----------

## pgolik

 *devsk wrote:*   

> That's the OP. I have an nForce4 mobo, so the same driver (amd74xx) is used for the nForce IDE, and the patch is therefore totally relevant for me. So far, the patch seems to work.

 

Good for you! Anyway, it seems that a lot of work is going into making ACPI work on Linux the way it should, so solutions for other hardware should appear soon. It probably has to do with the growing share of laptops on the market (although desktops benefit from it as well).

----------

## devsk

This patch causes some slowdowns on my system. Although suspend-to-RAM and resume work perfectly, I notice that performance drops to about half of what it was before the first suspend (i.e. on a freshly booted system), as measured with bonnie++. I ran the benchmark after I started getting a lagging mouse during intensive I/O operations. And the funny thing is that the affected drives are all SATA drives.

Since I have used the system without this patch for a while, I know that nothing else has changed. The drives are physically healthy because they are not really old, and smartctl/badblocks don't give me any errors. Moreover, if I freshly boot the system, performance returns to normal.

PS: that's probably the reason why it never got into the kernel (none of the 2.6.17, 2.6.18, 2.6.19, or 2.6.19-mm series has it).

----------

## devsk

OK, I dug around a bit, and it seems that the patch you linked to was rejected and a new patch to ide-io.c was proposed:

http://lkml.org/lkml/2006/7/28/136

It was accepted in the mm series and in the 2.6.19 series, but is not present in 2.6.17 or 2.6.18.

Edit: here is the same ide-io.c patch. It applies cleanly against 2.6.17.8 and works fine on my machine:

```

$ diff -u ./drivers/ide/ide-io.c.orig ./drivers/ide/ide-io.c
--- ./drivers/ide/ide-io.c.orig 2006-10-11 19:22:21.000000000 -0700
+++ ./drivers/ide/ide-io.c      2006-11-17 14:35:43.000000000 -0800
@@ -136,7 +136,8 @@
        ide_pm_flush_cache      = ide_pm_state_start_suspend,
        idedisk_pm_standby,
-       idedisk_pm_idle         = ide_pm_state_start_resume,
+       idedisk_pm_restore_pio  = ide_pm_state_start_resume,
+       idedisk_pm_idle,
        ide_pm_restore_dma,
 };
@@ -155,7 +156,10 @@
        case idedisk_pm_standby:        /* Suspend step 2 (standby) complete */
                rq->pm->pm_step = ide_pm_state_completed;
                break;
-       case idedisk_pm_idle:           /* Resume step 1 (idle) complete */
+        case idedisk_pm_restore_pio:    /* Resume step 1 complete */
+                rq->pm->pm_step = idedisk_pm_idle;
+                break;
+       case idedisk_pm_idle:           /* Resume step 2 (idle) complete */
                rq->pm->pm_step = ide_pm_restore_dma;
                break;
        }
@@ -168,8 +172,11 @@
        memset(args, 0, sizeof(*args));
        if (drive->media != ide_disk) {
-               /* skip idedisk_pm_idle for ATAPI devices */
-               if (rq->pm->pm_step == idedisk_pm_idle)
+               /*
+                 * skip idedisk_pm_restore_pio and idedisk_pm_idle for ATAPI
+                 * devices
+                 */
+               if (rq->pm->pm_step == idedisk_pm_restore_pio)
                        rq->pm->pm_step = ide_pm_restore_dma;
        }
@@ -196,13 +203,19 @@
                args->handler      = &task_no_data_intr;
                return do_rw_taskfile(drive, args);
-       case idedisk_pm_idle:           /* Resume step 1 (idle) */
+       case idedisk_pm_restore_pio:    /* Resume step 1 (restore PIO) */
+               if (drive->hwif->tuneproc != NULL)
+                       drive->hwif->tuneproc(drive, 255);
+               ide_complete_power_step(drive, rq, 0, 0);
+               return ide_stopped;
+
+       case idedisk_pm_idle:           /* Resume step 2 (idle) */
                args->tfRegister[IDE_COMMAND_OFFSET] = WIN_IDLEIMMEDIATE;
                args->command_type = IDE_DRIVE_TASK_NO_DATA;
                args->handler = task_no_data_intr;
                return do_rw_taskfile(drive, args);
-       case ide_pm_restore_dma:        /* Resume step 2 (restore DMA) */
+       case ide_pm_restore_dma:        /* Resume step 3 (restore DMA) */
                /*
                 * Right now, all we do is call hwif->ide_dma_check(drive),
                 * we could be smarter and check for current xfer_speed

```
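For anyone who wants to see what the patch changes without reading kernel code, here is a rough Python model of the resume sequence it sets up. It is a sketch, not the kernel logic: the `Step` names just mirror the enum values in the diff, `resume_steps` is a made-up helper, and the final transition from restore-DMA to completed is an assumption, since that part is outside the hunks shown.

```python
from enum import Enum, auto


class Step(Enum):
    RESTORE_PIO = auto()  # idedisk_pm_restore_pio: new resume step 1
    IDLE = auto()         # idedisk_pm_idle: now resume step 2
    RESTORE_DMA = auto()  # ide_pm_restore_dma: now resume step 3
    COMPLETED = auto()    # ide_pm_state_completed


# Step reached once the current one completes. The last entry is an
# assumption; the diff excerpt does not show it.
NEXT = {
    Step.RESTORE_PIO: Step.IDLE,
    Step.IDLE: Step.RESTORE_DMA,
    Step.RESTORE_DMA: Step.COMPLETED,
}


def resume_steps(is_disk: bool) -> list:
    """Return the resume steps that would run, in order, for one device."""
    step = Step.RESTORE_PIO  # resume always starts here after the patch
    executed = []
    while step is not Step.COMPLETED:
        # Non-disk (ATAPI) devices jump straight to restoring DMA.
        if not is_disk and step is Step.RESTORE_PIO:
            step = Step.RESTORE_DMA
        executed.append(step)
        step = NEXT[step]
    return executed


print([s.name for s in resume_steps(is_disk=True)])
print([s.name for s in resume_steps(is_disk=False)])
```

In this model a disk goes through PIO restore, idle, then DMA restore, while an ATAPI device such as a DVD drive only gets the DMA restore step.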

----------

## pgolik

I tried the new patch (against 2.6.18-r6), but it doesn't work as well as the old rejected one. With the new one I have no problems with PATA hard disks, but using the DVD-RW drive still locks up (every time I try to run k3b after resume), which doesn't happen with the old patch.

 I'll wait for 2.6.19 and the new libata PATA stuff before I dig deeper into the problem.

----------

## devsk

 *pgolik wrote:*   

> I'll wait for 2.6.19 and the new libata PATA stuff before I dig deeper into the problem.

gentoo-sources-2.6.19-r2 is in portage. What did you mean? I have been using gentoo-sources-2.6.19-r2 with the suspend2 and reiser4 patches applied manually, with no errors of any kind so far (uptime 15 days). I have been burning DVDs and playing movies throughout these 15 days without any errors, suspending and resuming almost twice a day.

----------

## pgolik

 *devsk wrote:*   

> gentoo-sources-2.6.19-r2 is in portage. What did you mean?

 

It's still marked as ~amd64, and I usually stick to stable versions for kernel. I'll try and unmask it though. Did you switch your PATA stuff to the new libata driver? Did it improve things?

----------

## devsk

 *pgolik wrote:*   

> Did you switch your PATA stuff to the new libata driver? Did it improve things?

Nope, I left it alone because the patch for power management was in ide-io.c, so my assumption was that it applied to IDE anyway. And since I don't have any problems and I get around 60 MB/s throughput from the PATA disk with the IDE drivers, I don't care much.

----------

## pgolik

I tried gentoo-sources 2.6.19-r2.

First: the new libata drivers for PATA (PATA_AMD). The disk is recognized correctly, but it still has the power management bug: accessing the drive after resume freezes the system. So, no progress compared to unpatched earlier kernels.

Second: the old IDE drivers. The HDD resumes correctly (DMA gets turned off, but a re-run of hdparm fixes it), but the DVD-RW doesn't work (it doesn't freeze the system, k3b just hangs silently). Dmesg:

```

hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: DMA timeout retry
hdc: timeout waiting for DMA
hdc: DMA timeout retry
hdc: timeout waiting for DMA

```

So, no progress compared to 2.6.18 patched with the ide-io.c patch.

2.6.18 patched with the amd74xx.c patch works best for me. I haven't noticed any slowdowns, but the IDE HDD is my secondary drive and isn't used heavily. There are no problems with the DVD-RW on resume with this old rejected patch.

[EDIT]

There is a new patch against the libata driver (PATA_AMD) that solves the problem. The patch, which can be found here, is by Alan Cox himself, and from what I gathered it will be included in later revisions of 2.6.19 (r2 still doesn't have it). With this patch my PATA HDD works fine with the new driver, all devices resume correctly, and it seems a bit faster. Also, having all disks (PATA and SATA) show up as /dev/sdX is neat and clean.

----------

