# Sporadic lag -- really annoying [Solved]

## grooveman

Hi.

I'm having a really irritating problem with my PC.  Every few minutes (or sometimes several times a minute), my system will become unresponsive.  Nothing registers, no mouse clicks, keystrokes -- nothing, then all of a sudden, the system will become responsive again, and all my clicks and keystrokes spit out all at once.

This is a new system, and it is no wuss.  It is a quad-core AMD with 8GB of RAM.  I'm using KDE 4 as my desktop, and I have an NVIDIA GeForce 9800 -- so the compositing should not be a burden at all.

I appreciate any help!

Thanks.

G

----------

## Rexilion

If I were you, I would:

- Try the nv driver for your card (completely disable nvidia (including opengl, kernel module etc...))

- Upgrade to a newer kernel (you have very recent hardware)

- Disable USB (I see a lot of those messages in your dmesg, this could *help* narrow down the problem)

- Try disabling CPU frequency scaling and any other power management related service (except for ACPI of course)

Hope that helps...

----------

## grooveman

Yeah, I think those are good tips... that nvidia driver in particular seems very suspect to me. I'll give that a shot in the next day or two here...

Oh yes, I forgot to mention, I am running a RAID 1 through mdadm/lvm2 -- though I would think that should not make a difference...

----------

## Rexilion

 *grooveman wrote:*   

> Yeah, I think those are good tips... that nvidia driver in particular seems very suspect to me. I'll give that a shot in the next day or two here...
> 
> Oh yes, I forgot to mention, I am running a RAID 1 through mdadm/lvm2 -- though I would think that should not make a difference...

 

Try without that too, but as a last resort as that seems highly unlikely to be the culprit.

----------

## transpetaflops

You wouldn't happen to have a Western Digital Green Power disk in that computer, would you? The Intelli-Park "feature" of that model is wreaking havoc in Linux-land, unfortunately, creating the exact behaviour you describe.

----------

## grooveman

 *transpetaflops wrote:*   

> You wouldn't happen to have a Western Digital Green Power disk in that computer, would you? The Intelli-Park "feature" of that model is wreaking havoc in Linux-land, unfortunately, creating the exact behaviour you describe.

 

Umm... yes.  I've raided two of them   :Shocked: 

And, of course, they are now no longer under warranty...

I have two of these in RAID 1:

Western Digital Caviar Green WD10EADS 1TB 32MB Cache SATA 3.0Gb/s 3.5" Internal Hard Drive

But I can't find anything google-wise on these drives to this effect... the newegg reviews have a few linux users who do not complain of this...

----------

## depontius

Is this by any chance when Firefox is active?  Does it happen when Firefox isn't active?  I presume your /home is local ext3...  What are your mount options?  Did you accept the standard kernel options for building ext3?

I have /home mounted over nfs4, and used to have a bad case of this.  I moved ~/.mozilla onto local disk and things got better, but I know that there have also been a lot of complaints about firefox/sqlite/ext3 interactions.

----------

## transpetaflops

 *grooveman wrote:*   

>  *transpetaflops wrote:*   You wouldn't happen to have a Western Digital Green Power disk in that computer, would you? The Intelli-Park "feature" of that model is wreaking havoc in Linux-land, unfortunately, creating the exact behaviour you describe. 
> 
> Umm... yes.  I've raided two of them  
> 
> But I can't find anything google-wise on these drives to this effect... the newegg reviews have a few linux users who do not complain of this...

 

Problems started to surface back in 2008. Here's one thread on the kernel list: http://lkml.org/lkml/2008/4/10/360

Check your Load_Cycle_Count with smartctl -a /dev/sdX

The consumer drives are rated for 300,000 of these head offloading cycles. If this value is unreasonably high, then this is your problem. Intelli-Park parks the heads if the disk is idle for 8 seconds. It won't unload them again for 30 seconds after that and during this period a Linux system appears frozen. Windows users don't report any problem with this so the "feature" is obviously tuned for that OS. WD provided updated firmware at one point that disabled Intelli-Park and they also provided a tool called wdidle3.exe that could be used to disable it. Neither is provided anymore and wdidle3.exe doesn't work with newer models. You may be lucky if you can pick up a copy. I know Qnap still provides it on their support site. Many of their users experienced this problem since Qnap's products are Linux-based.

----------

## grooveman

 *depontius wrote:*   

> Is this by any chance when Firefox is active?  Does it happen when Firefox isn't active?  I presume your /home is local ext3...  What are your mount options?  Did you accept the standard kernel options for building ext3?
> 
> I have /home mounted over nfs4, and used to have a bad case of this.  I moved ~/.mozilla onto local disk and things got better, but I know that there have also been a lot of complaints about firefox/sqlite/ext3 interactions.

 

It seems to happen both when Firefox is running and when it isn't (though it seems worse when FF is running).  A surefire way to get it to act up is to start playing videos while I do something, anything, else.

My mount options are "noatime" --that's it.

Yes, I formatted the drives using mke2fs -j, but I'm not entirely sure what you mean here:  *Quote:*   

> Did you accept the standard kernel options for building ext3?

 

But I'm not mounting over nfs or any other network connection.

----------

## grooveman

 *Quote:*   

> Problems started to surface back in 2008. Here's one thread on the kernel list: http://lkml.org/lkml/2008/4/10/360
> 
> Check your Load_Cycle_Count with smartctl -a /dev/sdX
> 
> The consumer drives are rated for 300,000 of these head offloading cycles. If this value is unreasonably high, then this is your problem. Intelli-Park parks the heads if the disk is idle for 8 seconds. It won't unload them again for 30 seconds after that and during this period a Linux system appears frozen. Windows users don't report any problem with this so the "feature" is obviously tuned for that OS. WD provided updated firmware at one point that disabled Intelli-Park and they also provided a tool called wdidle3.exe that could be used to disable it. Neither is provided anymore and wdidle3.exe doesn't work with newer models. You may be lucky if you can pick up a copy. I know Qnap still provides it on their support site. Many of their users experienced this problem since Qnap's products are Linux-based.

 

This does sound high, cuz these drives are only about 5 weeks old...  (I'm new to this utility, so can someone please confirm if this is high or not?)

```
smartctl -a /dev/sda
smartctl version 5.38 [x86_64-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD10EADS-00M2B0
Serial Number:    WD-WCAV53396012
Firmware Version: 01.00A01
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Feb 10 22:25:06 2010 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (19980) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 230) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3037) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   107   104   021    Pre-fail  Always       -       7650
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       74
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       431
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       72
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       18
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       35482
194 Temperature_Celsius     0x0022   122   115   000    Old_age   Always       -       25
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
```
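For anyone who would rather not eyeball the whole dump, the two rows that matter here (Load_Cycle_Count and Power_On_Hours) can be pulled out of the attribute table with a few lines of Python. This is only a rough sketch: the helper name `raw_attribute` is made up for illustration, and it assumes the ten-column attribute layout shown in the output above, which other smartctl versions or drives may not follow.

```python
def raw_attribute(smart_text, name):
    """Return the RAW_VALUE of a SMART attribute as an int, or None if absent.

    Assumes the ten-column table layout from the smartctl 5.38 output above:
    ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
    """
    for line in smart_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == name:
            return int(fields[9])
    return None


# Two rows copied from the smartctl output posted above.
sample = """\
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       431
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       35482
"""

print("Load_Cycle_Count:", raw_attribute(sample, "Load_Cycle_Count"))
print("Power_On_Hours:  ", raw_attribute(sample, "Power_On_Hours"))
```

In practice the text would come from saving the output of `smartctl -a /dev/sdX` to a file and reading it in, rather than from a pasted string.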

----------

## transpetaflops

Yes, unfortunately this confirms you suffer from the Intellipark issue. 35,000 load/unload cycles in 5 weeks means you will have passed the rated 300,000 in less than a year. The drive will of course continue to work after that but WD takes no responsibility for it. If your drives are that new they probably won't allow you to disable the feature. WD revoked their utility some time ago claiming it would hurt to use it on drives it wasn't designed for. I have seen some users run a script that writes to the drives every 5 seconds to circumvent the problem but that's an ugly hack in my opinion. I have two WD15EADS used as paperweights on my desk right now. One of them counted 216,000 load cycles during the 8 months they were used and I can never trust them with data again. I experienced the exact same lag as you describe and it took me several months to track down but when I finally got on the right track it was actually possible to hear the heads park if I listened carefully. I replaced the drives with cheap Samsungs and the problem disappeared.

EDIT: I noticed now that your drive has only been active 431 hours, that's 2.5 weeks so at that rate you'll pass 300,000 cycles in 5 months.
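The arithmetic behind that estimate can be checked with a short Python sketch. It takes the 35,482 load cycles and 431 power-on hours from the smartctl output and the 300,000-cycle rating quoted earlier in the thread, and it assumes the drive keeps parking at the same average rate and that time is counted in power-on hours, which is a simplification.

```python
RATED_CYCLES = 300_000   # rating quoted earlier in the thread
LOAD_CYCLES = 35_482     # Load_Cycle_Count raw value from the smartctl output
POWER_ON_HOURS = 431     # Power_On_Hours raw value from the same output


def days_to_reach(rated_cycles, cycles_so_far, hours_so_far):
    """Power-on days needed to hit rated_cycles at the average rate so far."""
    cycles_per_hour = cycles_so_far / hours_so_far
    return rated_cycles / cycles_per_hour / 24


cycles_per_hour = LOAD_CYCLES / POWER_ON_HOURS
total_days = days_to_reach(RATED_CYCLES, LOAD_CYCLES, POWER_ON_HOURS)

print(f"Average rate: {cycles_per_hour:.1f} load cycles per power-on hour")
print(f"Rated count reached after about {total_days:.0f} power-on days")
```

That works out to a bit over 80 cycles per hour and roughly five months of continuous running, in line with the figure above.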

----------

## grooveman

Ugh...

Soo... what drives can a Linux guy buy nowadays??  Everything at 1TB+ seems to have this technology, be they made by WD or someone else!  Which drives are safe??  Only Samsung, or only certain Samsungs?

BTW... I did find a copy of wdidle3, and I tried setting it to 25500 and I tried disabling it.  It didn't seem to change a thing. So........  Either my problem lies elsewhere, or the drives ignore the setting (because using wdidle3 after a reboot does show them to be where I set them).

----------

## grooveman

Okay....

As a test, I have moved my system to an old SATA WD 80GB Caviar disk.... and lo and behold... the problems have abated.  

*sigh*

So, I blew $250 on useless drives.  Now, what the heck do I buy that I can put into software raid?  Cash is tight right now, so I do not want a repeat experience.

Anything here a good drive for a linux system in a RAID 1 mdadm/lvm2 solution?

Thanks so much, especially to you, transpetaflops, for your help thus far.

-G

----------

## transpetaflops

Sorry to be the bringer of bad news but glad I could help you identify the problem. I've gone through incidents on every major brand of hard drives. In general the enterprise models don't have these "features" but they are usually way too expensive for regular users. I'm trying Samsung's Ecogreen F2 right now. The power management functions can be completely disabled on them with hdparm -B 255 (this doesn't work on WD). AFAIK, WD is the only one that actually offloads the heads this way to save power so you should probably be fine with any other brand or another WD model that lacks Intellipark.

----------

## transpetaflops

For future reference, here's also a link to Qnap's info about this: http://forum.qnap.com/viewtopic.php?f=182&t=22363

----------

## Monkeh

As far as I know, only WD's Green series drives do the head parking in that manner. Any other WD drive should be fine. And a hell of a lot faster.

----------

## grooveman

 *transpetaflops wrote:*   

> For future reference, here's also a link to Qnap's info about this: http://forum.qnap.com/viewtopic.php?f=182&t=22363

 

That's cool, but keep in mind that this utility does not work on the newer models, like my WD10EADSs.  It says it changes it, but the performance is still just as lousy.  

 *Monkeh wrote:*   

> As far as I know, only WD's Green series drives do the head parking in that manner. Any other WD drive should be fine. And a hell of a lot faster.

 

I'm getting some RE3 WD1002FBYSs to replace them, these are more for servers than home machines, so I think I should be okay (I have learned that WD nerfs their newer drives made for home users regarding their performance in RAID anyway).

Thanks again.

----------

