# [SOLVED] Major slow down after a day

## snelson

Hello,

This is my first post so hopefully I've put it in the right category..  :Smile: 

I've got an IBM ThinkCentre desktop machine which I'm re-using as a dedicated Gentoo server.

Everything has been installed on there and the system seems to work fine for a while - however after a few hours/days the machine slows to a crawl such that logging in via SSH takes about 3 minutes and running top can take half a day.

My complete guess as to the cause of the problem is that the machine is slowing down its clock when not in use and never increases it again. It's a P4 Prescott with 512 MB of RAM. It's currently running MySQL, Apache2 and Samba. We're also using Rails and PHP.

The output of /proc/meminfo is below:

```
MemTotal:       497216 kB
MemFree:          6876 kB
Buffers:        224664 kB
Cached:          39136 kB
SwapCached:          0 kB
Active:         178728 kB
Inactive:       156792 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       497216 kB
LowFree:          6876 kB
SwapTotal:      506036 kB
SwapFree:       503280 kB
Dirty:           12556 kB
Writeback:           0 kB
AnonPages:       71192 kB
Mapped:          23096 kB
Slab:           147648 kB
SReclaimable:   139208 kB
SUnreclaim:       8440 kB
PageTables:        868 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:    754644 kB
Committed_AS:   250200 kB
VmallocTotal:   524280 kB
VmallocUsed:      5372 kB
VmallocChunk:   518904 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     4096 kB
```

cpuinfo:

```
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 3
model name      : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping        : 4
cpu MHz         : 2992.646
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pni monitor ds_cpl cid xtpr
bogomips        : 5989.60
clflush size    : 64

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 3
model name      : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping        : 4
cpu MHz         : 2992.646
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pni monitor ds_cpl cid xtpr
bogomips        : 5985.44
clflush size    : 64
```
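As a rough way to test the clock-throttling guess, the `cpu MHz` field can be compared with the nominal speed in `model name`. The sketch below is only an illustration: the function name `clock_ratios` is made up here, and it assumes the model name ends in a figure like `3.00GHz`, which is true for this CPU but not for every processor.

```python
import re

# Two fields copied from the cpuinfo output above.
SAMPLE_CPUINFO = """\
model name      : Intel(R) Pentium(R) 4 CPU 3.00GHz
cpu MHz         : 2992.646
"""


def clock_ratios(cpuinfo_text):
    """Return current/nominal clock ratio for each 'cpu MHz' line found.

    Assumes each 'model name' line precedes its 'cpu MHz' line and
    contains a nominal speed written as '<number>GHz'.
    """
    ratios = []
    nominal_mhz = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "model name":
            match = re.search(r"([\d.]+)\s*GHz", value)
            nominal_mhz = float(match.group(1)) * 1000 if match else None
        elif key == "cpu MHz" and nominal_mhz:
            ratios.append(float(value) / nominal_mhz)
    return ratios


for ratio in clock_ratios(SAMPLE_CPUINFO):
    print(f"running at {ratio:.1%} of nominal clock")
```

On a live system the text would come from `open("/proc/cpuinfo").read()`. A ratio close to 1 while the machine is slow would argue against the clock being stuck at a reduced speed.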

If it's handy to show any other logs please let me know. I was gonna post dmesg but it's quite long so I'll wait till someone asks for it.

If anyone can give some pointers on what to check I'd be most grateful.

Many thanks

Stephen

Last edited by snelson on Mon Jul 16, 2007 10:37 am; edited 1 time in total

----------

## Rob1n

Are these details from at start-up time or from after the slow-down has occurred?

I'd hazard a guess that you're running into memory issues (this sort of slow-down is often caused by the system thrashing the swap disk) - the output of 'cat /proc/meminfo' and 'ps aux' once it's slowed down should show up any issues here.
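To make the swap-thrashing check concrete, here is a minimal sketch that parses `/proc/meminfo`-style text and reports swap in use plus a rough estimate of memory the kernel could hand back. The function names are made up for illustration, the sample values are copied from the first post, and summing `MemFree`, `Buffers`, `Cached` and `SReclaimable` is only an approximation, not an exact figure.

```python
# Selected fields copied from the /proc/meminfo output in the first post.
SAMPLE_MEMINFO = """\
MemTotal:       497216 kB
MemFree:          6876 kB
Buffers:        224664 kB
Cached:          39136 kB
SwapTotal:      506036 kB
SwapFree:       503280 kB
SReclaimable:   139208 kB
"""


def parse_meminfo(text):
    """Parse 'Name:  value kB' lines into a dict of integers (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(rest.split()[0])
    return fields


def summarize(fields):
    """Return (swap_used_kb, roughly_reclaimable_kb)."""
    swap_used = fields["SwapTotal"] - fields["SwapFree"]
    reclaimable = (
        fields["MemFree"]
        + fields["Buffers"]
        + fields["Cached"]
        + fields.get("SReclaimable", 0)
    )
    return swap_used, reclaimable


swap_used, reclaimable = summarize(parse_meminfo(SAMPLE_MEMINFO))
print(f"swap in use: {swap_used} kB, roughly reclaimable: {reclaimable} kB")
```

A system that is thrashing would normally show swap use in the hundreds of megabytes and very little reclaimable memory. On a live machine, pass in `open("/proc/meminfo").read()` instead of the sample text.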

----------

## ToeiRei

do not forget a cat /proc/swaps...

----------

## DirtyHairy

This sounds like a memory leak somewhere which, as Rob1n suggested, causes the system to swap itself to near-death. I can't imagine a clock speed at which ssh takes on the order of minutes (well I CAN, but that is unrealistic  :Smile:  ). If it is indeed a memory leak, then waiting a bit longer should make the out-of-memory killer kick in as a nice proof of this theory (I'm not sure if it can be disabled in the kernel config, but if it can, it should be enabled by default). After that, the system will be responsive again, and dmesg will show the massacre....
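One way to look for the out-of-memory killer is to scan the kernel log for its messages. This is a sketch under assumptions: the search phrases below ("out of memory", "oom-killer") are the wording I would expect, but the exact text varies between kernel versions, and `find_oom_lines` is a name made up for this example. The sample lines are taken from the dmesg output posted later in this thread.

```python
import re

# Assumed phrases; real kernel wording differs between versions.
OOM_PATTERN = re.compile(r"out of memory|oom-killer", re.IGNORECASE)

# A few lines from the dmesg output posted in this thread.
SAMPLE_DMESG = """\
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 272k freed
Adding 506036k swap on /dev/hda2.  Priority:-1 extents:1 across:506036k
"""


def find_oom_lines(log_text):
    """Return every log line that mentions the OOM killer."""
    return [line for line in log_text.splitlines() if OOM_PATTERN.search(line)]


hits = find_oom_lines(SAMPLE_DMESG)
print(f"{len(hits)} OOM-related line(s) found")
```

An empty result, as with these sample lines, would suggest the OOM killer has not fired, at least within the portion of the log that was scanned.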

----------

## snelson

Hi

Thanks for the replies. I didn't really know what the problem might be - a memory leak sounds like a possible candidate. We had to reboot it today so I can't produce any logs from when it's performing badly. Right now it's doing an emerge -e world as I updated the compiler flags for the Prescott processor.

The swaps file is currently:

```
Filename                                Type            Size    Used    Priority
/dev/hda2                               partition       506036  2760    -1
```

This is current dmesg:

```
hobnob ~ # tail -n 50 /var/log/dmesg
i810_audio: Defaulting to base 2 channel mode.
i810_audio: Resetting connection 0
i810_audio: Connection 0 with codec id 0
ac97_codec: AC97 Audio codec, id: ADS116 (Analog Devices AD1981B)
i810_audio: AC'97 codec 0 supports AMAP, total channels = 2
oprofile: using NMI interrupt.
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
Using IPI Shortcut mode
Time: tsc clocksource has been installed.
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
ReiserFS: hda3: found reiserfs format "3.6" with standard journal
ReiserFS: hda3: using ordered data mode
ReiserFS: hda3: journal params: device hda3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hda3: checking transaction log (hda3)
ReiserFS: hda3: Using r5 hash to sort names
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 272k freed
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com
ReiserFS: dm-0: found reiserfs format "3.6" with standard journal
ReiserFS: dm-0: using ordered data mode
ReiserFS: dm-0: journal params: device dm-0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-0: checking transaction log (dm-0)
ReiserFS: dm-0: Using r5 hash to sort names
ReiserFS: dm-1: found reiserfs format "3.6" with standard journal
ReiserFS: dm-1: using ordered data mode
ReiserFS: dm-1: journal params: device dm-1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-1: checking transaction log (dm-1)
ReiserFS: dm-1: Using r5 hash to sort names
ReiserFS: dm-2: found reiserfs format "3.6" with standard journal
ReiserFS: dm-2: using ordered data mode
ReiserFS: dm-2: journal params: device dm-2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-2: checking transaction log (dm-2)
ReiserFS: dm-2: Using r5 hash to sort names
ReiserFS: dm-3: found reiserfs format "3.6" with standard journal
ReiserFS: dm-3: using ordered data mode
ReiserFS: dm-3: journal params: device dm-3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-3: checking transaction log (dm-3)
ReiserFS: dm-3: Using r5 hash to sort names
ReiserFS: dm-4: found reiserfs format "3.6" with standard journal
ReiserFS: dm-4: using ordered data mode
ReiserFS: dm-4: journal params: device dm-4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-4: checking transaction log (dm-4)
ReiserFS: dm-4: Using r5 hash to sort names
Adding 506036k swap on /dev/hda2.  Priority:-1 extents:1 across:506036k
```

It's performing OK right now after the reboot. Will take a look after the weekend to see if there's anything in there. Thanks again and have a pleasant weekend.

Cheers

Stephen

----------

## snelson

Hi,

I'm back with a slow system!!

Here's the output of /proc/meminfo:

```
MemTotal:       497216 kB
MemFree:          6432 kB
Buffers:        220352 kB
Cached:          59216 kB
SwapCached:          0 kB
Active:         209716 kB
Inactive:       112732 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       497216 kB
LowFree:          6432 kB
SwapTotal:      506036 kB
SwapFree:       503276 kB
Dirty:           12480 kB
Writeback:           0 kB
AnonPages:       42932 kB
Mapped:          13032 kB
Slab:           161468 kB
SReclaimable:   151784 kB
SUnreclaim:       9684 kB
PageTables:        620 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:    754644 kB
Committed_AS:   220244 kB
VmallocTotal:   524280 kB
VmallocUsed:      5340 kB
VmallocChunk:   518936 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     4096 kB
```

cat /proc/swaps

```
Filename                                Type            Size    Used    Priority
/dev/hda2                               partition       506036  2760    -1
```

and ps aux

```
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   1528   472 ?        Ss   Jun29   0:00 init [3]
root         2  0.0  0.0      0     0 ?        S    Jun29   0:03 [migration/0]
root         3  0.0  0.0      0     0 ?        SN   Jun29   0:00 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S    Jun29   0:00 [watchdog/0]
root         5  0.0  0.0      0     0 ?        S    Jun29   0:04 [migration/1]
root         6  0.0  0.0      0     0 ?        SN   Jun29   0:00 [ksoftirqd/1]
root         7  0.0  0.0      0     0 ?        S    Jun29   0:00 [watchdog/1]
root         8  0.0  0.0      0     0 ?        S<   Jun29   0:00 [events/0]
root         9  0.0  0.0      0     0 ?        S<   Jun29   0:00 [events/1]
root        10  0.0  0.0      0     0 ?        S<   Jun29   0:00 [khelper]
root        11  0.0  0.0      0     0 ?        S<   Jun29   0:00 [kthread]
root        97  0.0  0.0      0     0 ?        S<   Jun29   0:01 [kblockd/0]
root        98  0.0  0.0      0     0 ?        S<   Jun29   0:00 [kblockd/1]
root        99  0.0  0.0      0     0 ?        S<   Jun29   0:00 [kacpid]
root       173  0.0  0.0      0     0 ?        S<   Jun29   0:00 [ata/0]
root       174  0.0  0.0      0     0 ?        S<   Jun29   0:00 [ata/1]
root       175  0.0  0.0      0     0 ?        S<   Jun29   0:00 [ata_aux]
root       176  0.0  0.0      0     0 ?        S<   Jun29   0:00 [ksuspend_usbd]
root       179  0.0  0.0      0     0 ?        S<   Jun29   0:00 [khubd]
root       181  0.0  0.0      0     0 ?        S<   Jun29   0:00 [kseriod]
root       191  0.0  0.0      0     0 ?        S<   Jun29   0:00 [khpsbpkt]
root       216  0.0  0.0      0     0 ?        S<   Jun29   0:05 [kswapd0]
root       217  0.0  0.0      0     0 ?        S<   Jun29   0:00 [aio/0]
root       218  0.0  0.0      0     0 ?        S<   Jun29   0:00 [aio/1]
root       606  0.0  0.1   2084   764 ?        S    00:01   0:00 /usr/sbin/cron
root       607  0.0  0.2   2412  1024 ?        Ss   00:01   0:00 /bin/sh -c /usr/bin/emerge --sync > /var/log/crontab/emerge.log
root       608  0.0  2.2  13140 11128 ?        S    00:01   0:00 /usr/bin/python -O /usr/bin/emerge --sync
root       950  0.0  0.0      0     0 ?        S<   Jun29   0:00 [kpsmoused]
root       960  0.0  0.0      0     0 ?        S<   Jun29   0:00 [reiserfs/0]
root       961  0.0  0.0      0     0 ?        S<   Jun29   0:00 [reiserfs/1]
root      1377  0.0  0.4   6944  2196 ?        Ss   08:16   0:00 sshd: nelsons [priv]
nelsons   1379  0.0  0.2   6944  1408 ?        S    08:17   0:00 sshd: nelsons@pts/0
nelsons   1380  0.0  0.3   2876  1540 pts/0    Ss   08:17   0:00 -bash
nelsons   1561  0.0  0.1   2152   848 pts/0    R+   10:14   0:00 ps aux
root      4147  0.0  0.1   1896   608 ?        Ss   Jun29   0:00 /usr/sbin/syslog-ng
mysql     4699  0.0  5.4 142344 26944 ?        Ssl  Jun29   0:00 /usr/sbin/mysqld --defaults-file=/etc/mysql/my.cnf --basedir=/usr --datadir=/var/lib/mysq
root      4776  0.0  0.2   3928  1016 ?        Ss   Jun29   0:00 /usr/sbin/sshd
root      4832  0.0  1.5  17052  7552 ?        Ss   Jun29   0:00 /usr/sbin/apache2 -D DEFAULT_VHOST -D PHP5 -d /usr/lib/apache2 -f /etc/apache2/httpd.conf
apache    4833  0.0  1.1  15976  5656 ?        S    Jun29   0:00 /usr/sbin/apache2 -D DEFAULT_VHOST -D PHP5 -d /usr/lib/apache2 -f /etc/apache2/httpd.conf
apache    4884  0.0  1.3  17052  6496 ?        S    Jun29   0:00 /usr/sbin/apache2 -D DEFAULT_VHOST -D PHP5 -d /usr/lib/apache2 -f /etc/apache2/httpd.conf
apache    4885  0.0  1.3  17052  6496 ?        S    Jun29   0:00 /usr/sbin/apache2 -D DEFAULT_VHOST -D PHP5 -d /usr/lib/apache2 -f /etc/apache2/httpd.conf
apache    4886  0.0  1.3  17052  6496 ?        S    Jun29   0:00 /usr/sbin/apache2 -D DEFAULT_VHOST -D PHP5 -d /usr/lib/apache2 -f /etc/apache2/httpd.conf
apache    4887  0.0  1.3  17052  6496 ?        S    Jun29   0:00 /usr/sbin/apache2 -D DEFAULT_VHOST -D PHP5 -d /usr/lib/apache2 -f /etc/apache2/httpd.conf
apache    4888  0.0  1.3  17052  6496 ?        S    Jun29   0:00 /usr/sbin/apache2 -D DEFAULT_VHOST -D PHP5 -d /usr/lib/apache2 -f /etc/apache2/httpd.conf
root      4889  0.0  0.5   9800  2572 ?        Ss   Jun29   0:00 /usr/sbin/smbd -D
root      4898  0.0  0.2   9800  1192 ?        S    Jun29   0:00 /usr/sbin/smbd -D
root      4899  0.0  0.2   5456  1328 ?        Ss   Jun29   0:02 /usr/sbin/nmbd -D
root      4955  0.0  0.1   1780   672 ?        Ss   Jun29   0:00 /usr/sbin/cron
root      5039  0.0  0.1   2740   652 ?        Ss   Jun29   0:00 /usr/kde/3.5/bin/kdm
root      5101  0.0  0.2   2348  1152 tty1     Ss   Jun29   0:00 /bin/login --
root      5103  0.0  0.1   1660   688 tty2     Ss+  Jun29   0:00 /sbin/agetty 38400 tty2 linux
root      5104  0.0  0.1   1664   692 tty3     Ss+  Jun29   0:00 /sbin/agetty 38400 tty3 linux
root      5107  0.0  0.1   1664   692 tty4     Ss+  Jun29   0:00 /sbin/agetty 38400 tty4 linux
root      5109  0.0  0.1   1664   692 tty5     Ss+  Jun29   0:00 /sbin/agetty 38400 tty5 linux
root      5111  0.0  0.1   1660   688 tty6     Ss+  Jun29   0:00 /sbin/agetty 38400 tty6 linux
nelsons   5139  0.0  0.3   2888  1516 tty1     S    Jun29   0:00 -bash
root      5145  0.0  0.1   2236   992 tty1     S    Jun29   0:00 su -
root      5146  0.0  0.3   2624  1516 tty1     S+   Jun29   0:00 -su
root      6495  0.0  0.0      0     0 ?        S    Jun29   0:00 [pdflush]
root     14866  0.0  0.0      0     0 ?        S    Jun29   0:02 [pdflush]
root     16119  0.0  0.1   1816   588 ?        S<s  Jun29   0:00 /sbin/udevd --daemon
```

To my (untrained) eye things don't look too bad but it's really slow to do anything. 

Thanks again

Stephen

----------

## Rob1n

Memory certainly looks fine.  It is running an "emerge sync" there - that could be hitting the disks pretty hard.  You could try running that manually after a reboot and seeing whether you can trigger the problem.  I can't see anything else that looks odd - what're the load averages (reported by top) like when it's slow?
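The load averages that top reports can also be read from `/proc/loadavg`, whose first three fields are the 1, 5 and 15 minute averages. Below is a minimal sketch, with made-up function names and a made-up sample line (no real load figures were posted in this thread). Comparing the load with the number of logical CPUs (2 on this machine, per the cpuinfo above) is a rule of thumb, not a hard limit.

```python
# Hypothetical sample in /proc/loadavg format; not taken from this thread.
SAMPLE_LOADAVG = "4.20 3.90 2.10 3/85 1561"


def parse_loadavg(text):
    """Return the (1 min, 5 min, 15 min) load averages as floats."""
    one, five, fifteen = (float(field) for field in text.split()[:3])
    return one, five, fifteen


def is_saturated(load, logical_cpus):
    """Rule of thumb: more runnable or blocked tasks than CPUs."""
    return load > logical_cpus


one, five, fifteen = parse_loadavg(SAMPLE_LOADAVG)
print(f"1 min load {one}, saturated on 2 CPUs: {is_saturated(one, 2)}")
```

On Linux the load average also counts tasks in uninterruptible sleep (usually waiting on disk), so a high load with low CPU usage tends to point at I/O rather than computation.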

----------

## snelson

Yeah, that seems a bit odd as it should be running that at around midnight. It rsyncs to a local mirror on the network. Possibly the sync is failing at that point and it's hanging? Will try and see if there are any issues there.

Thanks

Stephen

----------

## red-wolf76

I have a similar problem, although not on a P4 but an Athlon.

It seems as if pdflush grabs hold of the cpu, on and off, the usage goes up to 99% with that process. Once that hits, I can only reboot, which will fix it.

I'm still in the process of tracking this down, but it appears to be happening while using sudo, portage, gnome-terminal gnome and latest X, all under mm-sources.

I've read about pdflush misbehaving under mm with network file systems and reiserfs but that problem dated from 2004, if I read correctly.
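To see whether a kernel thread such as pdflush is the one eating CPU, the `ps aux` output can be parsed and sorted by a column. This is a sketch only: the function names are made up, and the sample rows are abridged from the `ps aux` output posted earlier in this thread. Every `%CPU` value there is 0.0, so the usage line sorts by `%MEM` to show something visible.

```python
# Rows abridged from the ps aux output posted earlier in this thread.
SAMPLE_PS = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       216  0.0  0.0      0     0 ?        S<   Jun29   0:05 [kswapd0]
root       608  0.0  2.2  13140 11128 ?        S    00:01   0:00 /usr/bin/python -O /usr/bin/emerge --sync
mysql     4699  0.0  5.4 142344 26944 ?        Ssl  Jun29   0:00 /usr/sbin/mysqld
root     14866  0.0  0.0      0     0 ?        S    Jun29   0:02 [pdflush]
"""


def parse_ps_aux(text):
    """Turn ps aux output into a list of dicts keyed by column header."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # COMMAND is the last column and may itself contain spaces.
        parts = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, parts)))
    return rows


def top_consumers(rows, column="%CPU", count=3):
    """Return the `count` rows with the highest value in `column`."""
    return sorted(rows, key=lambda row: float(row[column]), reverse=True)[:count]


rows = parse_ps_aux(SAMPLE_PS)
for row in top_consumers(rows, column="%MEM", count=2):
    print(row["PID"], row["%MEM"], row["COMMAND"])
```

Note that the `%CPU` figure from ps is averaged over the lifetime of the process, so a thread that only recently started spinning shows up much more clearly in top than in ps.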

----------

## snelson

Actually, yes, I remember pdflush taking up a lot of CPU time overall on one of the occasions where the slowdown occurred.

I'm using ReiserFS too, so there are a few similarities there. If you find a fix let me know!!

Thanks

Stephen

----------

## red-wolf76

I'm not at my box to test it, but it may be related to my kernel, which is mm-sources and locked to the 2.6.20-rc1-mm1 revision because nvidia-drivers didn't like compiling with something more recent.   :Crying or Very sad:  What kernel version are you using?

It also is most prevalent under Gnome and gnome-terminal. I'll try using xterm next and then compiling in the console to see if I can narrow it down to a detailed culprit. I expect a couple of weeks for that though...

----------

## snelson

Hi,

Thanks for the reply. 

I'm using the 2.6.20-gentoo-r8 kernel. It's running the latest of everything, I believe, as emerge -p -u world has nothing to merge. Just a quick question - what are the MM sources?

Cheers

Stephen

----------

## red-wolf76

They're a set of advanced and sometimes experimental patches maintained by Andrew Morton, one of the maintainers of the Linux kernel, just like Con Kolivas offers a separate set with his ck-sources. Both sys-kernel/mm-sources and sys-kernel/ck-sources are in portage even though they're keyword masked ~arch if I recall correctly, and they're usually unsupported with regard to bug reports, so unless you can rule out the kernel as the cause, you get to keep the pieces if something breaks on your box.

If you type emerge -p mm-sources or emerge -p ck-sources it should tell you that all packages of that description have been masked.

There's a number of kernel features not yet in the more stable run-of-the-mill patchsets such as gentoo-sources, and it's really a philosophical question whether a tickless timer or Reiser4 support is actually needed, but these "experimental" patchsets usually have more recent kernel versions and therefore sometimes also better support for the newest hardware.

There's a number of other patchsets out there (nitro, viper), all of which are more or less officially "unsupported" by gentoo, meaning that they'll run, but you'd better not file a bug report with them if you hit any rocks...  :Twisted Evil: 

EDIT: I was getting it in the console as well now. I've now done two things that appear to have fixed it. One was changing the hard drive 32-bit I/O parameter from "-c1" to "-c3" (32-bit w/ sync) in hdparm, the other was compiling and booting a ck-sources-2.6.21-ck2-r1 kernel. I'm going to reboot now to check if I need to stick to this kernel or if it was the hdparm sync setting that was needed.

UPDATE: Yes, it seems to be the kernel. ck fixes it.

----------

## snelson

I seem to have resolved it by putting noapic and apic=no in the kernel parameters. Seems to be a problem with this IBM and its power management under Linux. It has been running all weekend and is still responsive!
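To confirm that parameters like these actually reached the running kernel, the boot command line can be read back from `/proc/cmdline`. A minimal sketch follows; `parse_cmdline` is a made-up name, and the sample string is illustrative: only `noapic` and `apic=no` come from the post above, while the `root=` entry is an assumption based on the dmesg output earlier in the thread.

```python
# Illustrative command line; the real one was not posted in this thread.
SAMPLE_CMDLINE = "root=/dev/hda3 noapic apic=no"


def parse_cmdline(text):
    """Split a kernel command line into {name: value or None}.

    Naive whitespace split: quoted values containing spaces are not handled.
    """
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


params = parse_cmdline(SAMPLE_CMDLINE)
print("noapic present:", "noapic" in params)
print("apic value:", params.get("apic"))
```

On a live system, pass in `open("/proc/cmdline").read()`. A parameter missing there usually means the bootloader config was edited but not picked up at the last boot.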

Many thanks for all the help on this thread.

----------

