# Performance Tips for Kernel 2.6

## Pointer

The purpose of this thread is to collect performance-related hints for kernel configuration and parameter tuning. Please share your tricks!

I like to select only what I need and compile it all into the kernel.

 S E L E C T / U N S E L E C T 

S = select,  U = unselect

Processor type and features:

S: HPET Timer Support (not supported by all hardware)

S: ..Provide RTC interrupt (HPET_EMULATE_RTC)

S: MTRR

S: Local APIC support

S: ..IO-APIC support (not supported by all hardware)

S: Enable kernel irq balancing (IRQBALANCE)

http://lwn.net/Articles/33344/ (scroll down)

S: Use register arguments (EXPERIMENTAL) (REGPARM)

https://forums.gentoo.org/viewtopic.php?t=183954&highlight=register+arguments

U: Generic x86 support

U: Preemptible Kernel (for server use)

Bus options:

S: Message Signaled Interrupts (MSI and MSI-X) (PCI_MSI)

Related forum post

Kernel hacking:

S: Use 4Kb for kernel stacks instead of 8Kb (4KSTACKS)

http://lwn.net/Articles/84583/

Device Drivers:

-->ATA/ATAPI/...

S: Include IDE/ATA-2 DISK support

S:   Use multi-mode default

U: IDE Taskfile Access

U: IDE Taskfile IO

U: generic/default IDE chipset support

-->PCI IDE chipset support

U: Sharing PCI IDE interrupts support !!

S: Generic PCI bus-master DMA support (BLK_DEV_IDEDMA_PCI)

S:   Use PCI DMA by default when available (IDEDMA_PCI_AUTO)

S: The chipset driver of your board 

Networking support:

-->Packet socket

S: Packet socket:mmapped IO

Profiling support:

U: All
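A quick way to check a finished `.config` against the list above is a small script. The following is a minimal Python sketch and not part of the original post: it only covers the symbols that the list spells out in parentheses (written with the `CONFIG_` prefix they carry in `.config`), and the helper names `parse_config` and `check_config` are made up for illustration.

```python
# Hypothetical helper: report which of the "S:" symbols named in the post
# are not built into a given kernel .config.
WANTED_ON = [
    "CONFIG_HPET_EMULATE_RTC",
    "CONFIG_IRQBALANCE",
    "CONFIG_REGPARM",
    "CONFIG_PCI_MSI",
    "CONFIG_4KSTACKS",
    "CONFIG_BLK_DEV_IDEDMA_PCI",
    "CONFIG_IDEDMA_PCI_AUTO",
]


def parse_config(text):
    """Return {symbol: value} for every 'CONFIG_X=value' line."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Disabled options show up as '# CONFIG_X is not set'.
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def check_config(text, wanted=WANTED_ON):
    """Return the wanted symbols that are not set to 'y'."""
    options = parse_config(text)
    return [name for name in wanted if options.get(name) != "y"]


if __name__ == "__main__":
    # Illustrative fragment only, not a real configuration.
    sample = """\
CONFIG_PCI_MSI=y
# CONFIG_REGPARM is not set
CONFIG_4KSTACKS=y
"""
    for name in check_config(sample):
        print("not built in:", name)
```

To use it on a real tree, read the `.config` file from the kernel source directory and pass its contents to `check_config`.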

Gentoo needs:

http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=7

File systems --->

  Pseudo Filesystems --->

    [*] /proc file system support

    [*] Virtual memory file system support (former shm fs)

(Select one or more of the following options as needed by your system)

  <*> Reiserfs support

  <*> Ext3 journalling file system support

  <*> JFS filesystem support

  <*> Second extended fs support

  <*> XFS filesystem support

CFLAGS for kernel can be set to:

.../linux/Makefile

.../linux/arch/i386/Makefile

The options of make

make help - Lists all options of Makefile

For example, these are nice:

make allnoconfig  - New minimal config (a good starting point ;) )

make defconfig    - New config with default answer to all options

Other tips:

Native Posix Thread Library

http://linuxdevices.com/articles/AT6753699732.html

Related forum post

Select the I/O scheduler that best fits your workload; AS (anticipatory, the default) is only one possibility.

https://forums.gentoo.org/viewtopic-t-462268-highlight-.html
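On 2.6 kernels that support switching schedulers at runtime, the file `/sys/block/<device>/queue/scheduler` lists the available I/O schedulers and marks the active one with square brackets. Below is a minimal Python sketch (not from the original post) that parses such a line; the function name `parse_scheduler` and the sample line are assumptions for illustration.

```python
def parse_scheduler(line):
    """Split a scheduler line into (available, active).

    Expects the format used by /sys/block/<device>/queue/scheduler,
    where the active scheduler is wrapped in square brackets.
    """
    available, active = [], None
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return available, active


if __name__ == "__main__":
    # Sample line for illustration; on a real system read the sysfs file.
    available, active = parse_scheduler("noop anticipatory deadline [cfq]")
    print("available:", ", ".join(available))
    print("active:", active)
```

Which schedulers appear depends on what was compiled into the kernel.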

HDPARM-options

http://gentoo-wiki.com/HOWTO_Use_hdparm_to_improve_IDE_device_performance

Swappiness: if you think there is too much swapping going on, try tuning this.

http://kerneltrap.org/node/view/3000
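The current value can be read from `/proc/sys/vm/swappiness`. The snippet below is a small Python sketch (not from the original post) for checking it before and after a change; it assumes the 0..100 range used by 2.6 kernels, and the function names are made up for illustration.

```python
def parse_swappiness(text):
    """Parse the contents of the swappiness file into an int.

    Assumes the 0..100 range of 2.6 kernels; other values are rejected.
    """
    value = int(text.strip())
    if not 0 <= value <= 100:
        raise ValueError("swappiness out of range: %d" % value)
    return value


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Read and validate the current swappiness value."""
    with open(path) as handle:
        return parse_swappiness(handle.read())


if __name__ == "__main__":
    try:
        print("vm.swappiness =", read_swappiness())
    except (OSError, ValueError) as err:
        print("could not read swappiness:", err)
```

Lower values make the kernel less eager to swap; whether that helps depends on the workload, so measure before and after.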

Good reading for everyone:

Red Hat Linux Manuals

Red Hat Enterprise Linux Documentation

The Relation of kernel headers and kernel source (good to know)

http://uwsg.iu.edu/hypermail/linux/kernel/0007.3/0587.html

Related news group post

P.S. Gentoo should have a performance optimization and benchmarking HOWTO.

Last edited by Pointer on Thu Jun 01, 2006 2:39 pm; edited 15 times in total

----------

## ectospasm

 *Pointer wrote:*   

> 
> 
> Unselect:
> 
> Preemptible Kernel, standard kernel is very responsive even without this, at least in my machine.

 

Note that the preemptible kernel options will make your machine even more responsive. And some programs, notably JACK, require this. Unless you're running a lot of CPU-intensive applications that use a lot of kernel-space functions, you will generally want a preemptible kernel.

----------

## zieloo

I have this option disabled as there were some problems with it and my -ck kernel.

----------

## GaMMa

Nice guide, my system is a lot more responsive now. I always used the "Preemptible Kernel" and didn't know about some of the other things..

----------

## frilled

 *GaMMa wrote:*   

> Nice guide, my system is a lot more responsive now. I always used the "Preemptible Kernel" and didn't know about some of the other things..

 

I have always been using preemption on workstations as soon as I could get a grip on it, but interestingly, on this hyperthreaded P4 from which I'm typing this I decided to turn it off after some months, since responsiveness is actually better without preemption on this particular machine. I don't really have an explanation for this, but keep in mind that you might want to try both options to see what fits you best.

----------

## madmango

 *wgi wrote:*   

>  *GaMMa wrote:*   Nice guide, my system is a lot more responsive now. I always used the "Preemptible Kernel" and didn't know about some of the other things.. 
> 
> I have always been using preemption on workstations as soon as I could get a grip on it, but interestingly, on this hyperthreaded P4 from which I'm typing this I decided to turn it off after some months, since responsiveness is actually better without preemption on this particular machine. I don't really have an explanation for this, but keep in mind that you might want to try both options to see what fits you best.

 

Placebo? Honestly, the docs say that preemption is for desktop use because it decreases latency, but it also decreases throughput, which is what you need for servers.

----------

## frilled

 *Quote:*   

> Placebo? Honestly, the docs say that preemption is for desktop use because it decreases latency, but it also decreases throughput, which is what you need for servers.

 

Well, I specifically stated "workstation". No problems with my production servers :)

I have reproduced the effect on another P4 workstation with HT, although less dramatically. The worst part on the aforementioned workstation seems to be the (ATA) disk on the ICH5 controller. Somehow NOT using preemption seems to make things better. It is less noticeable on the other workstation, which also uses an ICH5, but with SATA disks.

And, as I said, this is just meant to make you try both settings, for it seems there's no absolute truth here. I have been using the preemption patches for a long time since it naturally seemed A Good Thing [TM] to do, but it isn't in all cases. That's my experience at least.

----------

## helmholtz

 *Quote:*   

> I have always been using preemption on workstations as soon as I could get a grip on it, but interestingly, on this hyperthreaded P4 from which I'm typing this I decided to turn it off after some months, since responsiveness is actually better without preemption on this particular machine. I don't really have an explanation for this, but keep in mind that you might want to try both options to see what fits you best.

 

I wonder if it could be because hyperthreading is in itself a form of "hardware preemption". The second logical processor, somewhat acting like a real second processor, can "immediately" pick up a new request. Of course, since there really is only one processor, your throughput will suffer, just like with kernel preemption. So adding kernel preemption on top of the hardware preemption really doesn't help any, and in some cases can hurt.

I'm certainly no engineer, but if I were in front of a firing squad, that's the hypothesis I'd put forward. ;)

----------

## frilled

The benefits of HT are without doubt debatable; I still don't really "feel" any major difference. I won't retell the whole sermon, but in the end it all depends. Given the right environment there may be benefits; on the other hand, there's the overhead and the sharing of limited resources that work against it. Nevertheless, preemption should be completely independent of that, because it works on a different layer. I did not really dive into this yet, but if it was related to HT/SMP I think it would be an unwanted side effect. I'd rather say it's got something to do with either I/O scheduling [I'm currently testing whether the different schedulers really give me different results] or maybe even something on a lower (=hardware) level with the chipset drivers. I have no clue yet, and I guess by the time I have dug into the sources somebody else will already have an explanation for this 8)

----------

## TheThaWav

I'd suggest setting the option section Processor Type and Features/Timer Frequency to 1000 Hz. That'll speed up the boot process.

----------

## frilled

 *TheThaWav wrote:*   

> I'd suggest setting the option section Processor Type and Features/Timer Frequency to 1000 Hz. That'll speed up the boot process.

 

Not if you use SMP or HT. You'll get sick desktop effects; at least I do.

----------

## Tazok

 *wgi wrote:*   

>  *TheThaWav wrote:*   I'd suggest setting the option section Processor Type and Features/Timer Frequency to 1000 Hz. That'll speed up the boot process. 
> 
> Not if you use SMP or HT. You'll get sick desktop effects; at least I do.

 

For example?

----------

## frilled

 *Tazok wrote:*   

>  *wgi wrote:*    *TheThaWav wrote:*   I'd suggest setting the option section Processor Type and Features/Timer Frequency to 1000 Hz. That'll speed up the boot process. 
> 
> Not if you use SMP or HT. You'll get sick desktop effects; at least I do. 
> 
> For example?

 

Recurring delays. Mouse chopping, applications grinding to a halt for a fraction of a second once every couple of seconds. Just try it, it's massively evil. It's easily reproducible on my boxes.

----------

## Tazok

So, which frequency do you suggest?

----------

## frilled

 *Tazok wrote:*   

> So, which frequency do you suggest?

 

100. IIRC it was mentioned somewhere, I guess LKML. But I'd be interested to hear if you get good results with something else!
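For orientation, the timer frequency sets how often the periodic timer interrupt fires, so each setting corresponds to a fixed tick length. The short Python sketch below (not from the original posts) just does that arithmetic for the three values discussed in this thread; it says nothing about why a given machine feels better with one setting or another.

```python
def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given HZ value."""
    return 1000.0 / hz


if __name__ == "__main__":
    # The three timer frequencies mentioned in the thread.
    for hz in (100, 250, 1000):
        print("HZ=%4d -> one tick every %g ms" % (hz, tick_period_ms(hz)))
```

So 100 Hz means a tick every 10 ms, 250 Hz every 4 ms, and 1000 Hz every 1 ms.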

----------

## moosh

Hi wgi,

I moved from gentoo-sources-2.6.12-r10 to gentoo-sources-2.6.13-r3 and noticed a considerable slowdown. For example, boot time (until the KDE login screen) was 1 minute 25 seconds before the upgrade and became 2 minutes 10 seconds after it. The system was choppy and screen redrawing was slow.

After reading your post, I tried changing the frequency from the 1000Hz setting to the 100Hz setting. This, however, made my system even slower, WAAAAAAAAAAY slower. Boot time increased to 8 minutes and 20 seconds!!!! and the screen redrawing was painfully slow. It's as if I need more than 1000Hz to get the performance I had in gentoo-sources-2.6.12-r10. (I have a Pentium 4 with HT, BTW.) Did any of you get the same slowdown in 2.6.13?

----------

## raf

I have the exact same performance as I had before upgrading from 2.6.12.5-vanilla to 2.6.13-gentoo-r3. I tried both 100Hz and 1000Hz with no noticeable performance hit/boost.

Did you use the EXACT same configuration as your previous kernel? I.e., make oldconfig and say no to all new options?

----------

## frilled

There's definitely something wrong there. I don't have any noticeable difference between 2.6.12 and 2.6.13. In fact, even with the frequency setting set to 100 this is by far the smoothest box I have. The setting _shouldn't_ affect overall speed ... must be some evil side effect. Like something that really hangs and completely hogs its timeslice. 8 minutes is completely insane. I haven't timed it but I doubt my box needs more than a minute to boot to the desktop.

Here's what I have under Processor type and features for my P4-3GHz with HT:

```
    Subarchitecture Type (PC-compatible)  --->
    Processor family (Pentium-4/Celeron(P4-based)/Pentium-4 M/Xeon)  --->
[ ] Generic x86 support
[*] HPET Timer Support
[*] Symmetric multi-processing support
(2)   Maximum number of CPUs (2-255)
[*]   SMT (Hyperthreading) scheduler support
    Preemption Model (Voluntary Kernel Preemption (Desktop))  --->
[*] Preempt The Big Kernel Lock
[*] Machine Check Exception
<*>   Check for non-fatal errors on AMD Athlon/Duron / Intel Pentium 4
[*]   check for P4 thermal throttling interrupt.
< > Toshiba Laptop support
< > Dell laptop support
[ ] Enable X86 board specific fixups for reboot
< > /dev/cpu/microcode - Intel IA32 CPU microcode support
< > /dev/cpu/*/msr - Model-specific register support
< > /dev/cpu/*/cpuid - CPU information support
    Firmware Drivers  --->
    High Memory Support (4GB)  --->
    Memory model (Flat Memory)  --->
[ ] Allocate 3rd-level pagetables from highmem
[ ] Math emulation
[*] MTRR (Memory Type Range Register) support
[ ] Boot from EFI support (EXPERIMENTAL)
[*] Enable kernel irq balancing
[ ] Use register arguments (EXPERIMENTAL)
[*] Enable seccomp to safely compute untrusted bytecode
    Timer frequency (100 HZ)  --->
[ ] kexec system call (EXPERIMENTAL)
```

... and I use the cfq I/O scheduler.

Things you might want to consider are:

- not using preemption at all (I had ill effects with another P4 with Intel chipset and parallel ATA hard drive [this one uses SATA])

- not using kernel IRQ balancing (also had strange effects with this on another machine)

- not using SMP/SMT at all. You may well set the timer to 250 or 1000 then (although I think 250 is really enough, but YMMV - as it might with all the other stuff)

Double check your dmesg and chipset support!
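When comparing a setup like the one above with another kernel's, diffing the two `.config` files option by option is less error-prone than reading menus side by side. The following is a minimal Python sketch and not part of the original post: the function names are made up, options that are absent or marked "is not set" are both treated as `n`, and the sample fragments are illustrative, not anyone's real configuration.

```python
NOT_SET_SUFFIX = " is not set"


def parse_config(text):
    """Return {symbol: value}; '# CONFIG_X is not set' becomes 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            options[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line and not line.startswith("#"):
            name, sep, value = line.partition("=")
            if sep:
                options[name] = value
    return options


def diff_configs(old_text, new_text):
    """Return {symbol: (old_value, new_value)} for options that differ."""
    old, new = parse_config(old_text), parse_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name, "n"), new.get(name, "n")
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    # Illustrative fragments using the 2.6.13 timer and preemption symbols.
    mine = "CONFIG_HZ_1000=y\nCONFIG_PREEMPT=y\nCONFIG_SMP=y\n"
    yours = "CONFIG_HZ_100=y\nCONFIG_PREEMPT_VOLUNTARY=y\nCONFIG_SMP=y\n"
    for name, (before, after) in diff_configs(mine, yours).items():
        print("%-28s %s -> %s" % (name, before, after))
```

Only the differing options are printed, which makes it easier to spot the few settings worth testing one at a time.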

----------

## moosh

Hi all,

Thanks for the really quick replies. Yes, I did use make oldconfig. Afterwards, seeing everything is so slow I started doing a methodical cleaning of everything which I did not need.

wgi, if I compare your configuration to mine I have but minor differences:

1) The preemption model is "preemptible kernel"

2) I don't have high memory support (since I don't have more than 1GB)

(there's some other stuff which is clearly unrelated, like Dell laptop support compiled as a module, and /dev/cpu/... stuff compiled in)

I'll try changing the preemption mode and see what happens. I already tried irq balancing on/off and nothing changed. dmesg shows nothing out of the ordinary (I'll post it in case nothing helps, maybe you can see something).

Thanks

----------

## moosh

Well, nothing really helped, so here is my dmesg:

```
Linux version 2.6.13-gentoo-r3 (root@rama) (gcc version 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)) #9 SMP Wed Oct 19 21:38:17 PDT 2005

BIOS-provided physical RAM map:

 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)

 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)

 BIOS-e820: 0000000000100000 - 000000001ffaa800 (usable)

 BIOS-e820: 000000001ffaa800 - 0000000020000000 (reserved)

 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)

 BIOS-e820: 00000000fecf0000 - 00000000fecf1000 (reserved)

 BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)

 BIOS-e820: 00000000feda0000 - 00000000fee10000 (reserved)

 BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)

511MB LOWMEM available.

On node 0 totalpages: 130986

  DMA zone: 4096 pages, LIFO batch:1

  Normal zone: 126890 pages, LIFO batch:31

  HighMem zone: 0 pages, LIFO batch:1

DMI 2.3 present.

ACPI: RSDP (v000 DELL                                  ) @ 0x000fde90

ACPI: RSDT (v001 DELL    CPi R   0x27d50308 ASL  0x00000061) @ 0x1ffefbcd

ACPI: FADT (v001 DELL    CPi R   0x27d50308 ASL  0x00000061) @ 0x1fff0400

ACPI: MADT (v001 DELL    CPi R   0x27d50308 ASL  0x00000047) @ 0x1fff0c00

ACPI: SSDT (v001  PmRef    CpuPm 0x00003000 INTL 0x20030522) @ 0x1ffefbfd

ACPI: DSDT (v001 INT430 SYSFexxx 0x00001001 MSFT 0x0100000e) @ 0x00000000

ACPI: Local APIC address 0xfee00000

ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)

Processor #0 15:3 APIC version 20

ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)

Processor #1 15:3 APIC version 20

ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])

ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])

ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])

IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23

ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)

ACPI: IRQ0 used by override.

ACPI: IRQ2 used by override.

ACPI: IRQ9 used by override.

Enabling APIC mode:  Flat.  Using 1 I/O APICs

Using ACPI (MADT) for SMP configuration information

Allocating PCI resources starting at 20000000 (gap: 20000000:dec00000)

Built 1 zonelists

Kernel command line: root=/dev/hda7 video=vesafb:1024x768-16@60,nomtrr splash=silent,theme:livecd-2005.1 quiet CONSOLE=/dev/tty1

mapped APIC to ffffd000 (fee00000)

mapped IOAPIC to ffffc000 (fec00000)

Initializing CPU#0

PID hash table entries: 2048 (order: 11, 32768 bytes)

Detected 2793.498 MHz processor.

Using tsc for high-res timesource

Console: colour VGA+ 80x25

Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)

Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)

Memory: 512032k/523944k available (3218k kernel code, 11404k reserved, 1229k data, 248k init, 0k highmem)

Checking if this processor honours the WP bit even in supervisor mode... Ok.

Calibrating delay using timer specific routine.. 5590.67 BogoMIPS (lpj=2795337)

Mount-cache hash table entries: 512

CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 0000441d 00000000 00000000

CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 0000441d 00000000 00000000

monitor/mwait feature present.

using mwait in idle threads.

CPU: Trace cache: 12K uops, L1 D cache: 16K

CPU: L2 cache: 1024K

CPU: Physical Processor ID: 0

CPU: After all inits, caps: bfebfbff 00000000 00000000 00000080 0000441d 00000000 00000000

Intel machine check architecture supported.

Intel machine check reporting enabled on CPU#0.

CPU0: Intel P4/Xeon Extended MCE MSRs (12) available

CPU0: Thermal monitoring enabled

mtrr: v2.0 (20020519)

Enabling fast FPU save and restore... done.

Enabling unmasked SIMD FPU exception support... done.

Checking 'hlt' instruction... OK.

CPU0: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 04

Booting processor 1/1 eip 2000

Initializing CPU#1

Calibrating delay using timer specific routine.. 5585.31 BogoMIPS (lpj=2792657)

CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 0000441d 00000000 00000000

CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 0000441d 00000000 00000000

monitor/mwait feature present.

CPU: Trace cache: 12K uops, L1 D cache: 16K

CPU: L2 cache: 1024K

CPU: Physical Processor ID: 0

CPU: After all inits, caps: bfebfbff 00000000 00000000 00000080 0000441d 00000000 00000000

Intel machine check architecture supported.

Intel machine check reporting enabled on CPU#1.

CPU1: Intel P4/Xeon Extended MCE MSRs (12) available

CPU1: Thermal monitoring enabled

CPU1: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 04

Total of 2 processors activated (11175.98 BogoMIPS).

ENABLING IO-APIC IRQs

..TIMER: vector=0x31 pin1=2 pin2=-1

checking TSC synchronization across 2 CPUs: 

CPU#0 had 0 usecs TSC skew, fixed it up.

CPU#1 had 0 usecs TSC skew, fixed it up.

Brought up 2 CPUs

checking if image is initramfs... it is

Freeing initrd memory: 1235k freed

NET: Registered protocol family 16

ACPI: bus type pci registered

PCI: PCI BIOS revision 2.10 entry at 0xfcc7e, last bus=2

PCI: Using configuration type 1

ACPI: Subsystem revision 20050408

ACPI: Interpreter enabled

ACPI: Using IOAPIC for interrupt routing

ACPI: PCI Root Bridge [PCI0] (0000:00)

PCI: Probing PCI hardware (bus 00)

ACPI: Assume root bridge [\_SB_.PCI0] segment is 0

ACPI: Assume root bridge [\_SB_.PCI0] bus is 0

PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1

Boot video device is 0000:01:00.0

PCI: Transparent bridge - 0000:00:1e.0

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]

ACPI: PCI Interrupt Link [LNKA] (IRQs 9 10 *11)

ACPI: PCI Interrupt Link [LNKB] (IRQs 5 7) *11

ACPI: PCI Interrupt Link [LNKC] (IRQs 9 10 *11)

ACPI: PCI Interrupt Link [LNKD] (IRQs 5 7 9 10 *11)

ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.

ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGP_._PRT]

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIE._PRT]

Linux Plug and Play Support v0.97 (c) Adam Belay

pnp: PnP ACPI init

pnp: PnP ACPI: found 10 devices

SCSI subsystem initialized

usbcore: registered new driver usbfs

usbcore: registered new driver hub

PCI: Using ACPI for IRQ routing

PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report

pnp: 00:01: ioport range 0x4d0-0x4d1 has been reserved

pnp: 00:01: ioport range 0x1000-0x1005 could not be reserved

pnp: 00:01: ioport range 0x1008-0x100f could not be reserved

pnp: 00:01: ioport range 0x800-0x80f has been reserved

pnp: 00:02: ioport range 0xf400-0xf4fe has been reserved

pnp: 00:02: ioport range 0x1006-0x1007 has been reserved

pnp: 00:02: ioport range 0x1010-0x105f could not be reserved

pnp: 00:02: ioport range 0x1060-0x107f has been reserved

pnp: 00:02: ioport range 0x1080-0x10bf has been reserved

pnp: 00:02: ioport range 0x10c0-0x10df has been reserved

pnp: 00:07: ioport range 0x900-0x97f has been reserved

PCI: Bridge: 0000:00:01.0

  IO window: c000-cfff

  MEM window: fc000000-fdffffff

  PREFETCH window: f0000000-f7ffffff

PCI: Bus 3, cardbus bridge: 0000:02:01.0

  IO window: 0000e000-0000efff

  IO window: 00002000-00002fff

  PREFETCH window: 20000000-21ffffff

  MEM window: 24000000-25ffffff

PCI: Bridge: 0000:00:1e.0

  IO window: e000-efff

  MEM window: fa000000-fbffffff

  PREFETCH window: 20000000-21ffffff

PCI: Setting latency timer of device 0000:00:1e.0 to 64

PCI: Enabling device 0000:02:01.0 (0000 -> 0003)

ACPI: PCI Interrupt 0000:02:01.0[A] -> GSI 19 (level, low) -> IRQ 169

Machine check exception polling timer started.

IA-32 Microcode Update Driver: v1.14 <tigran@veritas.com>

audit: initializing netlink socket (disabled)

audit(1129758039.963:1): initialized

Squashfs 2.2 (released 2005/07/03) (C) 2002-2005 Phillip Lougher

Installing knfsd (copyright (C) 1996 okir@monad.swb.de).

Initializing Cryptographic API

Real Time Clock Driver v1.12

Linux agpgart interface v0.101 (c) Dave Jones

agpgart: Detected an Intel 865 Chipset.

agpgart: AGP aperture is 128M @ 0xe8000000

vesafb: ATI Technologies Inc., P11 , 01.00 (OEM: ATI MOBILITY RADEON 9700   )

vesafb: VBE version: 2.0

vesafb: protected mode interface info at c000:5acf

vesafb: pmi: set display start = c00c5b63, set palette = c00c5baf

vesafb: pmi: ports = ec10 ec16 ec54 ec38 ec3c ec5c ec00 ec04 ecb0 ecb2 ecb4 

vesafb: monitor limits: vf = 0 Hz, hf = 0 kHz, clk = 0 MHz

vesafb: scrolling: redraw

Console: switching to colour frame buffer device 128x48

fbsplash: console 0 using theme 'livecd-2005.1'

fbsplash: switched splash state to 'on' on console 0

vesafb: framebuffer at 0xf0000000, mapped to 0xe0880000, using 15000k, total 65472k

fb0: VESA VGA frame buffer device

ACPI: AC Adapter [AC] (on-line)

ACPI: Battery Slot [BAT0] (battery present)

ACPI: Lid Switch [LID]

ACPI: Power Button (CM) [PBTN]

ACPI: Sleep Button (CM) [SBTN]

ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)

Using specific hotkey driver

ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])

ACPI: Processor [CPU0] (supports 8 throttling states)

ACPI: CPU1 (power states: C1[C1] C2[C2] C3[C3])

ACPI: Processor [CPU1] (supports 8 throttling states)

ACPI: Thermal Zone [THM] (64 C)

PNP: PS/2 Controller [PNP0303:KBC,PNP0f13:PS2M] at 0x60,0x64 irq 1,12

serio: i8042 AUX port at 0x60,0x64 irq 12

serio: i8042 KBD port at 0x60,0x64 irq 1

Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled

ACPI: PCI Interrupt 0000:00:1f.6[B] -> GSI 17 (level, low) -> IRQ 177

ACPI: PCI interrupt for device 0000:00:1f.6 disabled

mice: PS/2 mouse device common for all mice

input: PC Speaker

io scheduler noop registered

io scheduler cfq registered

RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize

loop: loaded (max 8 devices)

pktcdvd: v0.2.0a 2004-07-14 Jens Axboe (axboe@suse.de) and petero2@telia.com

b44.c:v0.95 (Aug 3, 2004)

ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 18 (level, low) -> IRQ 185

eth0: Broadcom 4400 10/100BaseT Ethernet 00:0f:1f:22:87:38

PPP generic driver version 2.4.2

PPP Deflate Compression module registered

Linux video capture interface: v1.00

Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2

ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx

ICH5: IDE controller at PCI slot 0000:00:1f.1

PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)

ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 16 (level, low) -> IRQ 193

ICH5: chipset revision 2

ICH5: not 100% native mode: will probe irqs later

    ide0: BM-DMA at 0xbfa0-0xbfa7, BIOS settings: hda:DMA, hdb:pio

    ide1: BM-DMA at 0xbfa8-0xbfaf, BIOS settings: hdc:DMA, hdd:pio

Probing IDE interface ide0...

input: AT Translated Set 2 keyboard on isa0060/serio0

hda: FUJITSU MHT2040AH, ATA DISK drive

input: PS/2 Mouse on isa0060/serio1

input: AlpsPS/2 ALPS GlidePoint on isa0060/serio1

ide0 at 0x1f0-0x1f7,0x3f6 on irq 14

Probing IDE interface ide1...

hdc: SAMSUNG CDRW/DVD SN-324S, ATAPI CD/DVD-ROM drive

ide1 at 0x170-0x177,0x376 on irq 15

hda: max request size: 128KiB

hda: 78140160 sectors (40007 MB) w/8192KiB Cache, CHS=65535/16/63, UDMA(100)

hda: cache flushes supported

 hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 >

hdc: ATAPI 24X DVD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33)

Uniform CD-ROM driver Revision: 3.20

libata version 1.12 loaded.

ohci1394: $Rev: 1299 $ Ben Collins <bcollins@debian.org>

ACPI: PCI Interrupt 0000:02:01.1[A] -> GSI 19 (level, low) -> IRQ 169

ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[169]  MMIO=[faffd800-faffdfff]  Max Packet=[2048]

ieee1394: raw1394: /dev/raw1394 device initialized

NFTL driver: nftlcore.c $Revision: 1.97 $, nftlmount.c $Revision: 1.40 $

INFTL: inftlcore.c $Revision: 1.18 $, inftlmount.c $Revision: 1.16 $

ACPI: PCI Interrupt 0000:02:01.0[A] -> GSI 19 (level, low) -> IRQ 169

Yenta: CardBus bridge found at 0000:02:01.0 [1028:0159]

Yenta: Using CSCINT to route CSC interrupts to PCI

Yenta: Routing CardBus interrupts to PCI

Yenta TI: socket 0000:02:01.0, mfunc 0x012c1202, devctl 0x64

Yenta: ISA IRQ mask 0x0cf8, PCI irq 169

Socket status: 30000086

pcmcia: parent PCI bridge I/O window: 0xe000 - 0xefff

pcmcia: parent PCI bridge Memory window: 0xfa000000 - 0xfbffffff

pcmcia: parent PCI bridge Memory window: 0x20000000 - 0x21ffffff

ACPI: PCI Interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 201

PCI: Setting latency timer of device 0000:00:1d.7 to 64

ehci_hcd 0000:00:1d.7: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller

ehci_hcd 0000:00:1d.7: debug port 1

ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1

ehci_hcd 0000:00:1d.7: irq 201, io mem 0xf8fffc00

PCI: cache line size of 128 is not supported by device 0000:00:1d.7

ehci_hcd 0000:00:1d.7: USB 2.0 initialized, EHCI 1.00, driver 10 Dec 2004

hub 1-0:1.0: USB hub found

hub 1-0:1.0: 8 ports detected

ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)

USB Universal Host Controller Interface driver v2.3

ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 193

PCI: Setting latency timer of device 0000:00:1d.0 to 64

uhci_hcd 0000:00:1d.0: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1

uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2

uhci_hcd 0000:00:1d.0: irq 193, io base 0x0000bf80

hub 2-0:1.0: USB hub found

hub 2-0:1.0: 2 ports detected

ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 169

PCI: Setting latency timer of device 0000:00:1d.1 to 64

uhci_hcd 0000:00:1d.1: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2

uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3

uhci_hcd 0000:00:1d.1: irq 169, io base 0x0000bf60

hub 3-0:1.0: USB hub found

hub 3-0:1.0: 2 ports detected

ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 185

PCI: Setting latency timer of device 0000:00:1d.2 to 64

uhci_hcd 0000:00:1d.2: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI #3

uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4

uhci_hcd 0000:00:1d.2: irq 185, io base 0x0000bf40

hub 4-0:1.0: USB hub found

hub 4-0:1.0: 2 ports detected

ACPI: PCI Interrupt 0000:00:1d.3[A] -> GSI 16 (level, low) -> IRQ 193

PCI: Setting latency timer of device 0000:00:1d.3 to 64

uhci_hcd 0000:00:1d.3: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #4

uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5

uhci_hcd 0000:00:1d.3: irq 193, io base 0x0000bf20

hub 5-0:1.0: USB hub found

hub 5-0:1.0: 2 ports detected

Initializing USB Mass Storage driver...

usbcore: registered new driver usb-storage

USB Mass Storage support registered.

usb 3-1: new full speed USB device using uhci_hcd and address 2

usbcore: registered new driver hiddev

usbcore: registered new driver usbhid

drivers/usb/input/hid-core.c: v2.01:USB HID core driver

i2c /dev entries driver

NET: Registered protocol family 2

IP route cache hash table entries: 8192 (order: 3, 32768 bytes)

TCP established hash table entries: 32768 (order: 7, 524288 bytes)

TCP bind hash table entries: 32768 (order: 6, 393216 bytes)

TCP: Hash tables configured (established 32768 bind 32768)

TCP reno registered

ip_conntrack version 2.1 (4093 buckets, 32744 max) - 216 bytes per conntrack

ip_tables: (C) 2000-2002 Netfilter core team

ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>.  http://snowman.net/projects/ipt_recent/

ClusterIP Version 0.7 loaded successfully

arp_tables: (C) 2002 David S. Miller

TCP bic registered

Initializing IPsec netlink socket

NET: Registered protocol family 1

NET: Registered protocol family 17

NET: Registered protocol family 15

Starting balanced_irq

Using IPI Shortcut mode

BIOS EDD facility v0.16 2004-Jun-25, 1 devices found

kjournald starting.  Commit interval 5 seconds

EXT3-fs: mounted filesystem with ordered data mode.

VFS: Mounted root (ext3 filesystem) readonly.

Freeing unused kernel memory: 248k freed

usb 5-1: new low speed USB device using uhci_hcd and address 2

ieee1394: Host added: ID:BUS[0-00:1023]  GUID[374fc00031d948a1]

input: USB HID v1.10 Mouse [Logitech Optical USB Mouse] on usb-0000:00:1d.3-1

Adding 1004020k swap on /dev/hda6.  Priority:-1 extents:1

EXT3 FS on hda7, internal journal

fglrx: module license 'Proprietary. (C) 2002 - ATI Technologies, Starnberg, GERMANY' taints kernel.

[fglrx] Maximum main memory to use for locked dma buffers: 429 MBytes.

ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 193

[fglrx] module loaded - fglrx 8.14.13 [Jun  8 2005] on minor 0

ACPI: PCI Interrupt 0000:00:1f.5[B] -> GSI 17 (level, low) -> IRQ 177

PCI: Setting latency timer of device 0000:00:1f.5 to 64

intel8x0_measure_ac97_clock: measured 49485 usecs

intel8x0: clocking to 48000

ACPI: PCI Interrupt 0000:00:1f.6[B] -> GSI 17 (level, low) -> IRQ 177

PCI: Setting latency timer of device 0000:00:1f.6 to 64

b44: eth0: Link is up at 100 Mbps, full duplex.

b44: eth0: Flow control is off for TX and off for RX.

NETDEV WATCHDOG: eth0: transmit timed out

b44: eth0: transmit timed out, resetting

b44: eth0: Link is down.

b44: eth0: Link is up at 100 Mbps, full duplex.

b44: eth0: Flow control is off for TX and off for RX.

kjournald starting.  Commit interval 5 seconds

EXT3 FS on hda3, internal journal

EXT3-fs: mounted filesystem with ordered data mode.

[fglrx] Kernel AGP support doesn't provide agplock functionality.

[fglrx] AGP detected, AgpState   = 0x1f004a1b (hardware caps of chipset)

agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.

agpgart: Putting AGP V3 device at 0000:00:00.0 into 8x mode

agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode

[fglrx] AGP enabled,  AgpCommand = 0x1f004312 (selected caps)

[fglrx] free  AGP = 121909248

[fglrx] max   AGP = 121909248

[fglrx] free  LFB = 43110400

[fglrx] max   LFB = 43110400

[fglrx] free  Inv = 0

[fglrx] max   Inv = 0

[fglrx] total Inv = 0

[fglrx] total TIM = 0

[fglrx] total FB  = 0

[fglrx] total AGP = 32768

```

I guess it's still a mystery why 2.6.13-r3 works so slowly for me.

----------

## moosh

Hi everyone,

wgi, I've read your posting again and noticed the HT removal suggestion. I tried it and it worked! Once I turn off SMP and HT in the 2.6.13-r3 kernel, everything works just fine (only without HT, of course). I guess something is broken in this kernel.

Thanks

----------

## frilled

Hm, well, it's not completely broken, at least. I'm running like ten boxes with HT. Maybe something to do with your specific board or APIC. I'll have a look at your dmesg, but I doubt I'll find something useful  :Smile: 

You don't really suffer a lot without HT support, IMHO. I never bothered to benchmark it, but I guess it is heavily situational, anyway.

(Edit:)

What strikes me is this quirky bit in your dmesg:

```
checking TSC synchronization across 2 CPUs:

CPU#0 had 0 usecs TSC skew, fixed it up.

CPU#1 had 0 usecs TSC skew, fixed it up.
```

Mine says 

```
checking TSC synchronization across 2 CPUs: passed.

Brought up 2 CPUs
```

Now, 0µs is not that much  :Very Happy: 

----------

## moosh

Hi wgi,

I just found these two links which explain the problem:

http://bugzilla.kernel.org/show_bug.cgi?id=5165 and https://bugs.gentoo.org/show_bug.cgi?id=110661.

----------

## das_leid

 *Pointer wrote:*   

> Purpose of this thread is to give performance related hints for
> 
> kernel configuration and parametrisation. Please share your tricks!
> 
> 

 

Well, I'd like to say thanks. Using your guide my overall network performance increased from 2.9MB/s to 6.6MB/s. 

I wonder if this is the practical maximum of a SCP connection in a 100Mbit network?!

Currently I use Kernel 2.6.13-r5 which seems to be stable. At least for me. However I will try plain 2.6.14 soon. Seems to have some nice features.

Cheers

Das_Leid

PS: NEVER EVER I am going to use a 3com 3c905B-TX in my network. They play ping-pong with your interrupts and effectively increase your RX error rate.

----------

## frilled

 *Quote:*   

> Well, I'd like to say thanks. Using your guide my overall network performance increased from 2.9MB/s to 6.6MB/s. 
> 
> I wonder if this is the practical maximum of a SCP connection in a 100Mbit network?!
> 
> 

 

No. I get 10.5 - 11 MB/s between our servers/workstations (full duplex). But you need some horsepower for ssh. Since there's virtually nothing slower than 2.4 GHz x86 here (at least of those who speak ssh  :Wink: ), I almost always get full speed. But with a slow machine the protocol and encryption become the bottleneck.
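
For anyone wondering where a figure like 10.5 - 11 MB/s comes from, here is a rough back-of-the-envelope sketch. The assumptions are mine, not from the posts above: a standard 1500-byte MTU, IPv4 and TCP headers without options, and no allowance for SSH's own packet framing or MAC, so real scp numbers will land a bit lower.

```python
# Rough upper bound for TCP payload throughput on 100 Mbit/s Ethernet.
# Assumptions (not from the thread): 1500-byte MTU, IPv4 and TCP headers
# without options, and no allowance for SSH's own framing.

LINK_BITS_PER_S = 100_000_000

MTU = 1500                      # bytes of IP packet per Ethernet frame
IP_TCP_HEADERS = 20 + 20        # IPv4 header + TCP header, no options
ETH_OVERHEAD = 8 + 14 + 4 + 12  # preamble, header, FCS, inter-frame gap


def max_payload_rate(link_bits_per_s=LINK_BITS_PER_S):
    """Return the best-case TCP payload rate in bytes per second."""
    payload = MTU - IP_TCP_HEADERS   # 1460 bytes of data per frame
    on_wire = MTU + ETH_OVERHEAD     # 1538 bytes of wire time per frame
    return link_bits_per_s / 8 * payload / on_wire


if __name__ == "__main__":
    rate = max_payload_rate()
    print(f"{rate / 1e6:.2f} MB/s (decimal), {rate / 2**20:.2f} MiB/s")
```

Under those assumptions the ceiling is a little under 12 MB/s, so anything far below that on a clean 100 Mbit link suggests the CPU or the cipher is the limit rather than the wire.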

----------

## das_leid

 *wgi wrote:*   

>  No. I get 10.5 - 11 MB/s between our Servers/Workstations (Full Duplex). 

 

I realized that too. NFS and Samba are way faster. But I wonder: do you consider a 2*550 MHz server too slow for SSH and SCP? Hmm, I need to check that with my 1.6 GHz notebook and my 1.5 GHz AMD.

Cheers,

Das_Leid

----------

## frilled

 *Quote:*   

> I realized that too. NFS and Samba are way faster. But I wonder: do you consider a 2*550 MHz server too slow for SSH and SCP? Hmm, I need to check that with my 1.6 GHz notebook and my 1.5 GHz AMD.

 

SMB sucks Royal Arse. I used it. It sucks. Now I use shfs for everything between the Linuxes. I'll have to get rid of the samba daemons, too, since it still doesn't work reliably in a Windoze 2003 domain environment. Guess I'll have to install Cygwin on all Windoze servers   :Cool: 

----------

## Monkeh

 *das_leid wrote:*   

> PS: NEVER EVER I am going to use a 3com 3c905B-TX in my network. They play ping-pong with your interrupts and effectively increase your RX error rate.

 

B as in Boomerang? I know how you feel. My home server has one built in and it likes to screw with me.

----------

## tSp

 *wgi wrote:*   

>  *Quote:*   Well, I'd like to say thanks. Using your guide my overall network performance increased from 2.9MB/s to 6.6MB/s. 
> 
> I wonder if this is the practical maximum of a SCP connection in a 100Mbit network?!
> 
>  
> ...

 

a little off topic for the thread, but when using scp you can get much faster speeds by using blowfish, i.e. 

`scp -c blowfish file user@host:/directory`
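
To check whether the cipher really is the bottleneck on a given pair of machines, one option is to time the same copy with different `-c` values. The following is only a sketch: the helper names are made up for illustration, and which cipher names are accepted depends on the OpenSSH version installed (recent versions can list them with `ssh -Q cipher`, and as far as I know newer releases no longer ship blowfish at all).

```python
import subprocess
import time


def scp_command(cipher, source, target):
    """Build the scp argument list; -c selects the cipher."""
    return ["scp", "-c", cipher, source, target]


def time_copy(cipher, source, target):
    """Run one copy and return elapsed seconds (raises if scp fails)."""
    start = time.monotonic()
    subprocess.run(scp_command(cipher, source, target), check=True)
    return time.monotonic() - start


def throughput_mb_s(size_bytes, seconds):
    """Convert a file size and a duration into decimal MB/s."""
    return size_bytes / seconds / 1e6
```

Calling `time_copy("blowfish", "file", "user@host:/directory")` once per cipher with the same test file, then feeding the file size and the result into `throughput_mb_s`, gives numbers that can be compared directly with the 6.6 MB/s and 10.5 - 11 MB/s mentioned earlier.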

----------

## JATMAN

 *moosh wrote:*   

> Hi wgi,
> 
> I just found these two links which explain the problem:
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=5165 and https://bugs.gentoo.org/show_bug.cgi?id=110661.

 

I had the same problem with a big slowdown after upgrading to the 2.6.14 kernel.  Unsetting SMP solved the problem for me too.  Thanks!  This thread helped me a lot.

JATMAN

----------

## brazzmonkey

great tips !! my system doesn't lag any more ! though there are a few options i couldn't find.

thanks !!

----------

## halfgaar

Whatever I do, my system keeps lagging too much. Every minute or so it hangs for 0.5 sec. In games this is particularly annoying. Things like cron, or regular-interval mailcheckers are suspect here, but disabling them doesn't help. Also, I find it rather low-tech that any task can lock up my system that hard for a time.

Well, I guess when my new AMD dual core arrives, it'll all be better  :Smile: 

----------

## purpler

halfgaar, probably a bad kernel config..

try doing it once more from scratch   :Twisted Evil: 

----------

## halfgaar

A reply to an old thread  :Smile: 

By now, I have that dual core Athlon X2, and it's indeed all better. I still think it shouldn't have happened on my old machine, but it's no longer really relevant. I do believe that my kernel config was proper BTW. I've gone by every option mentioned in kernel performance threads, and I pretty much know what I'm doing.

The problem may also have been just the mouse movement. My mouse is USB and USB doesn't have a very high interrupt priority. There is a reason the keyboard interrupt is the highest, besides the timer. It's pretty strange that we nowadays have an input device on one of the lowest priorities.
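
Whether interrupt priority was really the cause is hard to tell after the fact, but it is at least easy to see which devices share an IRQ line and how busy each line is by reading `/proc/interrupts`. Below is a minimal sketch of a parser; the sample text is invented for illustration and is not taken from any machine in this thread.

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count across CPUs, description)."""
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())       # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:n_cpus]:    # summary rows may have fewer columns
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        result[label.strip()] = (sum(counts), description)
    return result


# Hypothetical sample in the /proc/interrupts layout (made-up numbers).
SAMPLE = """\
           CPU0       CPU1
  0:     900000          0    IO-APIC-edge  timer
  1:       1500          0    IO-APIC-edge  i8042
177:      42000        300   IO-APIC-level  uhci_hcd:usb5, eth0
ERR:          0
"""

if __name__ == "__main__":
    for irq, (total, desc) in parse_interrupts(SAMPLE).items():
        print(irq, total, desc)
```

On a real system, pass in `open("/proc/interrupts").read()` instead of `SAMPLE`. Taking two snapshots a few seconds apart while moving the mouse shows which line's counter grows and what else sits on the same line.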

----------

## P0w3r3d

Any updates? There have been many improvements to the kernel code since then. Do these tips still apply?

----------

