# Gigabit router

## snape

Hello, this is my first post.

I have a Gentoo router in my network with two gigabit Intel PCI-E cards.

When outbound traffic is below 120Mbps the router works fine, and inbound traffic is ~250Mbps.

But if outbound traffic increases above 120Mbps I have a problem:

inbound traffic drops to ~172Mbps.

Before, I had PCI gigabit cards, and I expected an improvement on PCI-E with its faster bus.

Can anybody advise me on what I should do?

[sorry for my english]  :Smile: 

----------

## egberts

To boost the throughput, I've compiled a short checklist for you.

1. You'll need to check if your Gigabit Ethernet NIC card supports the following

  a. TCP/UDP/IP checksum offloading.

  b. segment offloading

  c. interrupt moderation

  d. multiple ports on a single NIC

(I use server-grade Intel PRO/1000-MF)

2. In your kernel config, enable CONFIG_IP_ADVANCED_ROUTER

3. Use CAT-5e cabling or better

4. Keep gigabit HUB and cheap gigabit switches away from your new network.
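For items (a) through (c), a quick way to inspect and toggle these features on Linux is ethtool. The exact feature names vary by driver, so treat the flags below as a sketch, and `eth0` is an assumed interface name:

```shell
# Show which offloads the NIC/driver supports and which are enabled
ethtool -k eth0

# Enable checksum offload and TCP segmentation offload (if supported)
ethtool -K eth0 rx on tx on tso on

# Show / tune interrupt coalescing (moderation) settings
ethtool -c eth0
ethtool -C eth0 rx-usecs 125
```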

----------

## snape

 *egberts wrote:*   

> To boost the throughput, I've compiled a short checklist for you.
> 
> 1. You'll need to check if your Gigabit Ethernet NIC card supports the following
> 
>   a. TCP/UDP/IP checksum offloading.
> ...

 

Intel PRO/1000 PT Desktop Adapter

1. a - Yes, b - Yes, c - Yes (from the product brief), d - 1 port

2. I have enabled CONFIG_IP_ADVANCED_ROUTER mode with FIB lookup algorithm = FIB_HASH

3. Cabling was CAT-5e, and is now CAT-6

4. One switch, a Dell PowerConnect 2724 with 2 gigabit ports (before, in this place, there was an Ellacoya 4000 with 2 GBICs)

lspci -v

02:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)

        Subsystem: Intel Corporation PRO/1000 PT Desktop Adapter

        Flags: bus master, fast devsel, latency 0, IRQ 1275

        Memory at dfee0000 (32-bit, non-prefetchable) [size=128K]

        Memory at dfec0000 (32-bit, non-prefetchable) [size=128K]

        I/O ports at dc00 [size=32]

        Expansion ROM at dfea0000 [disabled] [size=128K]

        Capabilities: [c8] Power Management version 2

        Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+

        Capabilities: [e0] Express Endpoint, MSI 00

03:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)

        Subsystem: Intel Corporation PRO/1000 PT Desktop Adapter

        Flags: bus master, fast devsel, latency 0, IRQ 1276

        Memory at dffe0000 (32-bit, non-prefetchable) [size=128K]

        Memory at dffc0000 (32-bit, non-prefetchable) [size=128K]

        I/O ports at ec00 [size=32]

        Expansion ROM at dffa0000 [disabled] [size=128K]

        Capabilities: [c8] Power Management version 2

        Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+

        Capabilities: [e0] Express Endpoint, MSI 00

ethtool eth0

Settings for eth0:

        Supported ports: [ TP ]

        Supported link modes:   10baseT/Half 10baseT/Full 

                                100baseT/Half 100baseT/Full 

                                1000baseT/Full 

        Supports auto-negotiation: Yes

        Advertised link modes:  10baseT/Half 10baseT/Full 

                                100baseT/Half 100baseT/Full 

                                1000baseT/Full 

        Advertised auto-negotiation: Yes

        Speed: 1000Mb/s

        Duplex: Full

        Port: Twisted Pair

        PHYAD: 0

        Transceiver: internal

        Auto-negotiation: on

        Supports Wake-on: umbg

        Wake-on: d

        Current message level: 0x00000007 (7)

        Link detected: yes

----------

## think4urs11

Is CONFIG_E1000_NAPI enabled for the NIC's device driver?

----------

## snape

It isn't.

----------

## egberts

Hate to burst your bubble.   You have no decent interrupt coalescence queue mechanism on those desktop NICs.

 *Quote:*   

>  Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+ 

 

Get a 'SERVER' PRO/1000, not those foo-foo underpowered NICs sold as 'Desktop PRO/1000'.

----------

## snape

Think4UrS11: 

I added NAPI to the kernel, and now I'm waiting for results.

> one day later -> no better with NAPI

egberts:

So, I can't get anything more out of these NICs?

Of course I'll replace these cards with the server model, but right now I need to get at least 0,5Mbps.

Is there any hope???   :Smile: 

----------

## egberts

You can always Google TCP ACK behaviour and how to expand the TCP window size buffering.

Google for the following /proc/sys/net/ipv4 settings:

   tcp_mem

   tcp_rmem

   tcp_wmem

   tcp_adv_win_scale  

   tcp_tso_win_divisor  

   tcp_workaround_signed_windows

   tcp_app_win

   tcp_window_scaling

That'll get you from 0.5 Mbps to at least 8 Mbps or more.
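The reasoning behind these knobs is the bandwidth-delay product: the TCP window must cover bandwidth × RTT, or the sender stalls waiting for ACKs. A minimal sketch follows; the 1 Gbit/s rate and 2 ms RTT are assumed example figures, and the sysctl values are illustrative, not tuned for any particular workload:

```shell
# Bandwidth-delay product: bytes that must be in flight to fill the pipe.
rate_bps=1000000000   # assumed 1 Gbit/s link
rtt_ms=2              # assumed 2 ms round-trip time
bdp=$(( rate_bps / 8 * rtt_ms / 1000 ))
echo "BDP: ${bdp} bytes"

# Matching /proc/sys/net/ipv4 sketch (run as root; rmem/wmem values are
# min/default/max in bytes and should be sized to cover the BDP):
#   sysctl -w net.ipv4.tcp_window_scaling=1
#   sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
#   sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"
```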

----------

## snape

thx egberts

I'm waiting now for new server NICs, not desktop ones,

and working with the TCP settings from your post.

PS

my mistake, of course I meant 0,5Gbps  :Smile: 

----------

## snape

Hi!

I've replaced the NICs in my server with Intel PRO/1000 PT Server cards on the 82572EI chipset and new 7.6.15-NAPI drivers,

but I still can't achieve more than 350Mbps (~120Mbps up and approx. 240Mbps down).

What can I do???

dmesg

Linux version 2.6.23-gentoo (root@serwer) (gcc version 4.2.2 (Gentoo 4.2.2 p1.0)) #12 SMP Tue Feb 12 10:41:48 CET 2008

Command line: root=/dev/sda5

BIOS-provided physical RAM map:

 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)

 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)

 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)

 BIOS-e820: 0000000000100000 - 000000007bfc0000 (usable)

 BIOS-e820: 000000007bfc0000 - 000000007bfce000 (ACPI data)

 BIOS-e820: 000000007bfce000 - 000000007bff0000 (ACPI NVS)

 BIOS-e820: 000000007bff0000 - 000000007c000000 (reserved)

 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)

 BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)

 BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)

Entering add_active_range(0, 0, 159) 0 entries of 3200 used

Entering add_active_range(0, 256, 507840) 1 entries of 3200 used

end_pfn_map = 1048576

DMI present.

ACPI: RSDP 000FB840, 0014 (r0 ACPIAM)

ACPI: RSDT 7BFC0000, 0040 (r1 _ASUS_ Notebook  5000728 MSFT       97)

ACPI: FACP 7BFC0200, 0084 (r2 A_M_I_ OEMFACP   5000728 MSFT       97)

ACPI: DSDT 7BFC05C0, 4E0C (r1  A0557 A0557000        0 INTL 20051117)

ACPI: FACS 7BFCE000, 0040

ACPI: APIC 7BFC0390, 0070 (r1 A_M_I_ OEMAPIC   5000728 MSFT       97)

ACPI: MCFG 7BFC0400, 003C (r1 A_M_I_ OEMMCFG   5000728 MSFT       97)

ACPI: SLIC 7BFC0440, 0176 (r1 _ASUS_ Notebook  5000728 MSFT       97)

ACPI: OEMB 7BFCE040, 0060 (r1 A_M_I_ AMI_OEM   5000728 MSFT       97)

ACPI: HPET 7BFC53D0, 0038 (r1 A_M_I_ OEMHPET0  5000728 MSFT       97)

ACPI: SSDT 7BFC5410, 028A (r1 A_M_I_ POWERNOW        1 AMD         1)

Scanning NUMA topology in Northbridge 24

CPU has 2 num_cores

No NUMA configuration found

Faking a node at 0000000000000000-000000007bfc0000

Entering add_active_range(0, 0, 159) 0 entries of 3200 used

Entering add_active_range(0, 256, 507840) 1 entries of 3200 used

Bootmem setup node 0 0000000000000000-000000007bfc0000

Zone PFN ranges:

  DMA             0 ->     4096

  DMA32        4096 ->  1048576

  Normal    1048576 ->  1048576

Movable zone start PFN for each node

early_node_map[2] active PFN ranges

    0:        0 ->      159

    0:      256 ->   507840

On node 0 totalpages: 507743

  DMA zone: 56 pages used for memmap

  DMA zone: 1682 pages reserved

  DMA zone: 2261 pages, LIFO batch:0

  DMA32 zone: 6887 pages used for memmap

  DMA32 zone: 496857 pages, LIFO batch:31

  Normal zone: 0 pages used for memmap

  Movable zone: 0 pages used for memmap

ACPI: PM-Timer IO Port: 0x508

ACPI: Local APIC address 0xfee00000

ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)

Processor #0 (Bootup-CPU)

ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)

Processor #1

ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])

IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23

ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 0 dfl dfl)

ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)

ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)

ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)

ACPI: IRQ0 used by override.

ACPI: IRQ9 used by override.

ACPI: IRQ14 used by override.

ACPI: IRQ15 used by override.

Setting APIC routing to flat

ACPI: HPET id: 0x10de8201 base: 0xfed00000

Using ACPI (MADT) for SMP configuration information

swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000

swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000e0000

swsusp: Registered nosave memory region: 00000000000e0000 - 0000000000100000

Allocating PCI resources starting at 80000000 (gap: 7c000000:82c00000)

SMP: Allowing 2 CPUs, 0 hotplug CPUs

PERCPU: Allocating 34664 bytes of per cpu data

Built 1 zonelists in Node order.  Total pages: 499118

Policy zone: DMA32

Kernel command line: root=/dev/sda5

Initializing CPU#0

PID hash table entries: 4096 (order: 12, 32768 bytes)

Marking TSC unstable due to TSCs unsynchronized

time.c: Detected 2611.845 MHz processor.

Console: colour VGA+ 80x25

console [tty0] enabled

Checking aperture...

CPU 0: aperture @ 1e24000000 size 32 MB

Aperture too small (32 MB)

No AGP bridge found

Memory: 1996020k/2031360k available (3441k kernel code, 34952k reserved, 1854k data, 348k init)

Calibrating delay using timer specific routine.. 5228.67 BogoMIPS (lpj=10457342)

Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)

Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)

Mount-cache hash table entries: 256

CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)

CPU: L2 Cache: 1024K (64 bytes/line)

CPU 0/0 -> Node 0

CPU: Physical Processor ID: 0

CPU: Processor Core ID: 0

SMP alternatives: switching to UP code

ACPI: Core revision 20070126

Using local APIC timer interrupts.

result 12556940

Detected 12.556 MHz APIC timer.

SMP alternatives: switching to SMP code

Booting processor 1/2 APIC 0x1

Initializing CPU#1

Calibrating delay using timer specific routine.. 5223.70 BogoMIPS (lpj=10447403)

CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)

CPU: L2 Cache: 1024K (64 bytes/line)

CPU 1/1 -> Node 0

CPU: Physical Processor ID: 0

CPU: Processor Core ID: 1

AMD Athlon(tm) 64 X2 Dual Core Processor 5200+ stepping 02

Brought up 2 CPUs

NET: Registered protocol family 16

ACPI: bus type pci registered

PCI: BIOS Bug: MCFG area at e0000000 is not E820-reserved

PCI: Not using MMCONFIG.

PCI: Using configuration type 1

ACPI: EC: Look up EC in DSDT

ACPI: Interpreter enabled

ACPI: (supports S0 S1 S3 S4 S5)

ACPI: Using IOAPIC for interrupt routing

ACPI: PCI Root Bridge [PCI0] (0000:00)

PCI: Transparent bridge - 0000:00:04.0

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P2._PRT]

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.BR11._PRT]

ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.BR12._PRT]

ACPI: PCI Interrupt Link [LNKA] (IRQs 16 17 18 19) *0, disabled.

ACPI: PCI Interrupt Link [LNKB] (IRQs 16 17 18 19) *0, disabled.

ACPI: PCI Interrupt Link [LNKC] (IRQs 16 17 18 19) *0, disabled.

ACPI: PCI Interrupt Link [LNKD] (IRQs 16 17 18 19) *0, disabled.

ACPI: PCI Interrupt Link [LNEA] (IRQs 16 17 18 19) *0, disabled.

ACPI: PCI Interrupt Link [LNEB] (IRQs 16 17 18 19) *11

ACPI: PCI Interrupt Link [LNEC] (IRQs 16 17 18 19) *0, disabled.

ACPI: PCI Interrupt Link [LNED] (IRQs 16 17 18 19) *11

ACPI: PCI Interrupt Link [LUB0] (IRQs 20 21 22 23) *11

ACPI: PCI Interrupt Link [LUB2] (IRQs 20 21 22 23) *10

ACPI: PCI Interrupt Link [LMAC] (IRQs 20 21 22 23) *0, disabled.

ACPI: PCI Interrupt Link [LAZA] (IRQs 20 21 22 23) *0, disabled.

ACPI: PCI Interrupt Link [LACI] (IRQs 20 21 22 23) *0, disabled.

ACPI: PCI Interrupt Link [LMC9] (IRQs 20 21 22 23) *11

ACPI: PCI Interrupt Link [LSMB] (IRQs 20 21 22 23) *10

ACPI: PCI Interrupt Link [LPMU] (IRQs 20 21 22 23) *0, disabled.

ACPI: PCI Interrupt Link [LSA0] (IRQs 20 21 22 23) *15

ACPI: PCI Interrupt Link [LSA1] (IRQs 20 21 22 23) *5

ACPI: PCI Interrupt Link [LATA] (IRQs 20 21 22 23) *0, disabled.

ACPI Warning (tbutils-0217): Incorrect checksum in table [OEMB] -  25, should be 18 [20070126]

Linux Plug and Play Support v0.97 (c) Adam Belay

pnp: PnP ACPI init

ACPI: bus type pnp registered

pnp: ACPI device : hid PNP0A03

pnp: ACPI device : hid PNP0200

pnp: ACPI device : hid PNP0B00

pnp: ACPI device : hid PNP0800

pnp: ACPI device : hid PNP0C04

pnp: ACPI device : hid PNP0700

pnp: ACPI device : hid PNP0401

pnp: ACPI device : hid PNP0C02

pnp: ACPI device : hid PNP0103

pnp: ACPI device : hid PNP0C02

pnp: ACPI device : hid PNP0303

pnp: ACPI device : hid PNP0C02

pnp: ACPI device : hid PNP0501

pnp: ACPI device : hid PNP0C02

pnp: ACPI device : hid PNP0C01

pnp: PnP ACPI: found 15 devices

ACPI: ACPI bus type pnp unregistered

SCSI subsystem initialized

libata version 2.21 loaded.

usbcore: registered new interface driver usbfs

usbcore: registered new interface driver hub

usbcore: registered new device driver usb

PCI: Using ACPI for IRQ routing

PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report

Sangoma WANPIPE Router v1.1 (c) 1995-2000 Sangoma Technologies Inc.

hpet0: at MMIO 0xfed00000, IRQs 2, 8, 31

hpet0: 3 32-bit timers, 25000000 Hz

Time: hpet clocksource has been installed.

pnp: the driver 'system' has been registered

pnp: match found with the PnP device '00:07' and the driver 'system'

pnp: 00:07: iomem range 0xfefe0000-0xfefe01ff has been reserved

pnp: 00:07: iomem range 0xfefe1000-0xfefe1fff has been reserved

pnp: 00:07: iomem range 0xfee01000-0xfeefffff has been reserved

pnp: 00:07: iomem range 0xffb80000-0xffffffff could not be reserved

pnp: match found with the PnP device '00:09' and the driver 'system'

pnp: 00:09: iomem range 0xfec00000-0xfec00fff could not be reserved

pnp: 00:09: iomem range 0xfee00000-0xfee00fff could not be reserved

pnp: 00:09: iomem range 0x7c000000-0x7fffffff has been reserved

pnp: match found with the PnP device '00:0b' and the driver 'system'

pnp: 00:0b: ioport range 0x230-0x23f has been reserved

pnp: 00:0b: ioport range 0x290-0x29f has been reserved

pnp: 00:0b: ioport range 0xa00-0xa0f has been reserved

pnp: 00:0b: ioport range 0xa10-0xa1f has been reserved

pnp: match found with the PnP device '00:0d' and the driver 'system'

pnp: 00:0d: iomem range 0xe0000000-0xefffffff has been reserved

pnp: match found with the PnP device '00:0e' and the driver 'system'

pnp: 00:0e: iomem range 0x0-0x9ffff could not be reserved

pnp: 00:0e: iomem range 0xc0000-0xcffff has been reserved

pnp: 00:0e: iomem range 0xe0000-0xfffff could not be reserved

pnp: 00:0e: iomem range 0x100000-0x7bffffff could not be reserved

PCI: Bridge: 0000:00:04.0

  IO window: disabled.

  MEM window: disabled.

  PREFETCH window: disabled.

PCI: Bridge: 0000:00:09.0

  IO window: d000-dfff

  MEM window: dfe00000-dfefffff

  PREFETCH window: disabled.

PCI: Bridge: 0000:00:0b.0

  IO window: e000-efff

  MEM window: dff00000-dfffffff

  PREFETCH window: disabled.

PCI: Bridge: 0000:00:0c.0

  IO window: disabled.

  MEM window: disabled.

  PREFETCH window: disabled.

PCI: Setting latency timer of device 0000:00:04.0 to 64

PCI: Setting latency timer of device 0000:00:09.0 to 64

PCI: Setting latency timer of device 0000:00:0b.0 to 64

PCI: Setting latency timer of device 0000:00:0c.0 to 64

NET: Registered protocol family 2

IP route cache hash table entries: 65536 (order: 7, 524288 bytes)

TCP established hash table entries: 262144 (order: 10, 6291456 bytes)

TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)

TCP: Hash tables configured (established 262144 bind 65536)

TCP reno registered

Total HugeTLB memory allocated, 0

Installing knfsd (copyright (C) 1996 okir@monad.swb.de).

io scheduler noop registered

io scheduler deadline registered (default)

io scheduler cfq registered

Boot video device is 0000:00:0d.0

PCI: Setting latency timer of device 0000:00:09.0 to 64

assign_interrupt_mode Found MSI capability

Allocate Port Service[0000:00:09.0:pcie00]

PCI: Setting latency timer of device 0000:00:0b.0 to 64

assign_interrupt_mode Found MSI capability

Allocate Port Service[0000:00:0b.0:pcie00]

PCI: Setting latency timer of device 0000:00:0c.0 to 64

assign_interrupt_mode Found MSI capability

Allocate Port Service[0000:00:0c.0:pcie00]

pci_hotplug: PCI Hot Plug PCI Core version: 0.5

fakephp: Fake PCI Hot Plug Controller Driver

acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5

pciehp: PCI Express Hot Plug Controller Driver version: 0.4

shpchp: Standard Hot Plug PCI Controller Driver version: 0.4

Real Time Clock Driver v1.12ac

hpet_resources: 0xfed00000 is busy

Linux agpgart interface v0.102

input: Power Button (FF) as /class/input/input0

ACPI: Power Button (FF) [PWRF]

input: Power Button (CM) as /class/input/input1

ACPI: Power Button (CM) [PWRB]

Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled

serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A

pnp: the driver 'serial' has been registered

pnp: match found with the PnP device '00:0c' and the driver 'serial'

00:0c: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A

pnp: the driver 'parport_pc' has been registered

pnp: match found with the PnP device '00:06' and the driver 'parport_pc'

parport_pc 00:06: reported by Plug and Play ACPI

parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE]

FDC 0 is a post-1991 82077

RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize

loop: module loaded

tun: Universal TUN/TAP device driver, 1.6

tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>

netconsole: not configured, aborting

Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2

ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx

NFORCE-MCP61: IDE controller at PCI slot 0000:00:06.0

NFORCE-MCP61: chipset revision 162

NFORCE-MCP61: not 100% native mode: will probe irqs later

NFORCE-MCP61: 0000:00:06.0 (rev a2) UDMA133 controller

    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio

Probing IDE interface ide0...

sata_nv 0000:00:08.0: version 3.5

ACPI: PCI Interrupt Link [LSA0] enabled at IRQ 23

ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LSA0] -> GSI 23 (level, low) -> IRQ 23

PCI: Setting latency timer of device 0000:00:08.0 to 64

scsi0 : sata_nv

scsi1 : sata_nv

ata1: SATA max UDMA/133 cmd 0x000000000001c480 ctl 0x000000000001c402 bmdma 0x000000000001bc00 irq 23

ata2: SATA max UDMA/133 cmd 0x000000000001c080 ctl 0x000000000001c002 bmdma 0x000000000001bc08 irq 23

ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

ata1.00: ATA-8: SAMSUNG HD501LJ, CR100-10, max UDMA7

ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)

ata1.00: configured for UDMA/133

ata2: SATA link down (SStatus 0 SControl 300)

scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG HD501LJ  CR10 PQ: 0 ANSI: 5

sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)

sd 0:0:0:0: [sda] Write Protect is off

sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00

sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)

sd 0:0:0:0: [sda] Write Protect is off

sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00

sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

 sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 >

sd 0:0:0:0: [sda] Attached SCSI disk

sd 0:0:0:0: Attached scsi generic sg0 type 0

ACPI: PCI Interrupt Link [LSA1] enabled at IRQ 22

ACPI: PCI Interrupt 0000:00:08.1[B] -> Link [LSA1] -> GSI 22 (level, low) -> IRQ 22

PCI: Setting latency timer of device 0000:00:08.1 to 64

scsi2 : sata_nv

scsi3 : sata_nv

ata3: SATA max UDMA/133 cmd 0x000000000001b880 ctl 0x000000000001b802 bmdma 0x000000000001b080 irq 22

ata4: SATA max UDMA/133 cmd 0x000000000001b480 ctl 0x000000000001b402 bmdma 0x000000000001b088 irq 22

ata3: SATA link down (SStatus 0 SControl 300)

ata4: SATA link down (SStatus 0 SControl 300)

Fusion MPT base driver 3.04.05

Copyright (c) 1999-2007 LSI Logic Corporation

Fusion MPT SPI Host driver 3.04.05

ieee1394: raw1394: /dev/raw1394 device initialized

ACPI: PCI Interrupt Link [LUB2] enabled at IRQ 21

ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [LUB2] -> GSI 21 (level, low) -> IRQ 21

PCI: Setting latency timer of device 0000:00:02.1 to 64

ehci_hcd 0000:00:02.1: EHCI Host Controller

ehci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 1

ehci_hcd 0000:00:02.1: debug port 1

PCI: cache line size of 64 is not supported by device 0000:00:02.1

ehci_hcd 0000:00:02.1: irq 21, io mem 0xdfdfec00

ehci_hcd 0000:00:02.1: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004

usb usb1: configuration #1 chosen from 1 choice

hub 1-0:1.0: USB hub found

hub 1-0:1.0: 10 ports detected

ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver

ACPI: PCI Interrupt Link [LUB0] enabled at IRQ 20

ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [LUB0] -> GSI 20 (level, low) -> IRQ 20

PCI: Setting latency timer of device 0000:00:02.0 to 64

ohci_hcd 0000:00:02.0: OHCI Host Controller

ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2

ohci_hcd 0000:00:02.0: irq 20, io mem 0xdfdff000

usb usb2: configuration #1 chosen from 1 choice

hub 2-0:1.0: USB hub found

hub 2-0:1.0: 10 ports detected

USB Universal Host Controller Interface driver v3.0

usbcore: registered new interface driver usblp

Initializing USB Mass Storage driver...

usbcore: registered new interface driver usb-storage

USB Mass Storage support registered.

pnp: the driver 'i8042 kbd' has been registered

pnp: match found with the PnP device '00:0a' and the driver 'i8042 kbd'

pnp: the driver 'i8042 aux' has been registered

PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1

PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp

serio: i8042 KBD port at 0x60,0x64 irq 1

mice: PS/2 mouse device common for all mice

device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com

usbcore: registered new interface driver usbhid

drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver

oprofile: using NMI interrupt.

Netfilter messages via NETLINK v0.30.

nf_conntrack version 0.5.0 (16384 buckets, 65536 max)

ctnetlink v0.93: registering with nfnetlink.

IPv4 over IPv4 tunneling driver

GRE over IPv4 tunneling driver

ip_tables: (C) 2000-2006 Netfilter Core Team

ClusterIP Version 0.8 loaded successfully

arp_tables: (C) 2002 David S. Miller

TCP bic registered

TCP cubic registered

TCP westwood registered

TCP highspeed registered

TCP hybla registered

TCP htcp registered

TCP vegas registered

TCP veno registered

TCP scalable registered

TCP yeah registered

TCP illinois registered

NET: Registered protocol family 1

NET: Registered protocol family 10

lo: Disabled Privacy Extensions

tunl0: Disabled Privacy Extensions

ip6_tables: (C) 2000-2006 Netfilter Core Team

NET: Registered protocol family 17

powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 5200+ processors (2 cpu cores) (version 2.00.00)

powernow-k8:    0 : fid 0x12 (2600 MHz), vid 0xa

powernow-k8:    1 : fid 0x10 (2400 MHz), vid 0xc

powernow-k8:    2 : fid 0xe (2200 MHz), vid 0xe

powernow-k8:    3 : fid 0xc (2000 MHz), vid 0x10

powernow-k8:    4 : fid 0xa (1800 MHz), vid 0x10

powernow-k8:    5 : fid 0x2 (1000 MHz), vid 0x12

EXT3-fs: INFO: recovery required on readonly filesystem.

EXT3-fs: write access will be enabled during recovery.

kjournald starting.  Commit interval 5 seconds

EXT3-fs: recovery complete.

EXT3-fs: mounted filesystem with ordered data mode.

VFS: Mounted root (ext3 filesystem) readonly.

Freeing unused kernel memory: 348k freed

Intel(R) PRO/1000 Network Driver - version 7.6.15

Copyright (c) 1999-2007 Intel Corporation.

ACPI: PCI Interrupt Link [LNED] enabled at IRQ 19

ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [LNED] -> GSI 19 (level, low) -> IRQ 19

PCI: Setting latency timer of device 0000:02:00.0 to 64

e1000: 0000:02:00.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:15:17:68:59:25

e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection

ACPI: PCI Interrupt Link [LNEB] enabled at IRQ 18

ACPI: PCI Interrupt 0000:03:00.0[A] -> Link [LNEB] -> GSI 18 (level, low) -> IRQ 18

PCI: Setting latency timer of device 0000:03:00.0 to 64

e1000: 0000:03:00.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:15:17:68:58:14

e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection

EXT3 FS on sda5, internal journal

kjournald starting.  Commit interval 5 seconds

EXT3 FS on sda1, internal journal

EXT3-fs: mounted filesystem with ordered data mode.

ReiserFS: sda6: found reiserfs format "3.6" with standard journal

ReiserFS: sda6: using ordered data mode

ReiserFS: sda6: journal params: device sda6, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30

ReiserFS: sda6: checking transaction log (sda6)

ReiserFS: sda6: Using r5 hash to sort names

ReiserFS: sda7: found reiserfs format "3.6" with standard journal

ReiserFS: sda7: using ordered data mode

ReiserFS: sda7: journal params: device sda7, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30

ReiserFS: sda7: checking transaction log (sda7)

ReiserFS: sda7: Using r5 hash to sort names

ReiserFS: sda8: found reiserfs format "3.6" with standard journal

ReiserFS: sda8: using ordered data mode

ReiserFS: sda8: journal params: device sda8, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30

ReiserFS: sda8: checking transaction log (sda8)

ReiserFS: sda8: Using r5 hash to sort names

ReiserFS: sda11: found reiserfs format "3.6" with standard journal

ReiserFS: sda11: using ordered data mode

ReiserFS: sda11: journal params: device sda11, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30

ReiserFS: sda11: checking transaction log (sda11)

ReiserFS: sda11: Using r5 hash to sort names

Adding 987956k swap on /dev/sda10.  Priority:-1 extents:1 across:987956k

----------

## Januszzz

 *Quote:*   

> I've replaced the NICs in my server with Intel PRO/1000 PT Server cards on the 82572EI chipset and new 7.6.15-NAPI drivers,
> 
> but I still can't achieve more than 350Mbps (~120Mbps up and approx. 240Mbps down).
> 
> What can I do??? 

 

huh, my three pounds:

I hit the same problem two years ago or so, but in my case I simply had too many netfilter rules (I was filtering ~900 MAC addresses). My way to shorten the list was to use the ipset module and to offload the bandwidth limiting onto fine SMC switches (that left only 60 rules or so).
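For anyone hitting the same wall: the ipset approach replaces one iptables rule per MAC with a single rule that matches against a kernel-side set, so the lookup becomes a hash probe instead of a linear walk through hundreds of rules. A sketch, assuming a reasonably recent ipset syntax; the set name, interface, and sample addresses are made up:

```shell
# Create a hash-based set of allowed source MAC addresses
ipset create allowed-macs hash:mac

# Populate it (in practice, loop over your address file)
ipset add allowed-macs 00:15:17:68:59:25
ipset add allowed-macs 00:15:17:68:58:14

# One rule replaces hundreds: drop forwarded frames from unknown MACs
iptables -A FORWARD -i eth0 -m set ! --match-set allowed-macs src -j DROP
```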

This, plus tweaking the interrupt coalescence, gave me about ~350Mbit download and ~150Mbit upload. With that amount of load conntrack was begging for a break: we had problems with the number of simultaneous connections (up to ~300,000 concurrent connections).

The problem with the number of connections is harder than one might assume - you can only limit TCP connections. We tried to limit the UDP rate, but that is somewhat self-defeating: every UDP connection lasts only milliseconds or so, and past a certain load their count begins to increase rapidly, effectively killing the router. And if UDP is limited too aggressively, overall throughput drops...

So I ended up splitting the load between two machines. Each is a 2.8 GHz Pentium with HT and 2 GB RAM, with some IBM cards inserted (on PCI-E; e1000 is the module). The routers are based on GNAP and hardened sources compiled by me. The bootable CDs include dnsmasq, syslog logging to an external machine, and some scripts to draw statistics on a local Boa server and to build the MAC filtering table from an external file located on a different HTTP (central) server.

The maximum for those machines is about 300Mbit download / 150Mbit upload each. For me that's enough.

EDIT: I may add that I once really tried squeezing the appliance and never got more than 440Mbit total. It virtually can't do more, except maybe with more NICs. Ah, and I have ~32 VLANs on those machines; 802.1Q tagging may also be a bottleneck here.

I have a campus network with ~900 machines turned on simultaneously.

----------

## egberts

Actually, Januszzz, you refreshed my memory a bit, and you are correct.   350 Mbps is a pretty good number all things considered.   The roadblock is now the OS and networking stack, no longer the hardware.  My bet is you're shooting for either a testing environment or a cluster farm, both of which need that kind of throughput.

Snape, other than going to FreeBSD-SMP, the only way to push it further is an embedded, mono-kernelized network protocol stack.  You're at that edge now.

Actually, you can speed up the netfiltering by bash-scripting the 600-odd MAC addresses into a hash tree that is only about ten levels deep (a Patricia trie comes to mind).

Sure, that'll cost you in the number of tables and in manageability, but the search will be way faster.

----------

## snape

I give up and am buying a Cisco router  :Sad: 

----------

## sf_alpha

How do you test your traffic?

Generating traffic exceeding 300Mbps from one host may be very difficult, compared to routing 1 Gbps of traffic.
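A common way around that, when a single sender can't saturate the router, is to use several parallel streams, and ideally several source hosts, with a traffic generator such as iperf. The host address and stream counts below are assumptions:

```shell
# On a machine behind the router, acting as the sink:
iperf -s

# On one or more senders: 8 parallel TCP streams for 60 seconds,
# reporting every 10 s. Add more sender hosts until the router,
# not the generator, is the bottleneck.
iperf -c 192.168.200.8 -P 8 -t 60 -i 10
```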

----------

## MorpheuS.Ibis

either go the cisco way or have a look at http://www.liberouter.org/

I don't know the availability of those network accelerators outside the academic field...

----------

## Akaihiryuu

 *Quote:*   

> 4. Keep gigabit HUB and cheap gigabit switches away from your new network.

 

There is no such thing as a gigabit hub.

----------

## Januszzz

Snape,

Cisco is a cool thing as long as you can afford a new one  :Wink:  We also had a Cisco for the installation I mentioned, but it broke. We couldn't afford a longer outage, and our campus paperwork would have taken years, so we decided to use Linux instead.

Nothing against Cisco, but have you heard of Vyatta?

http://www.vyatta.com/

We also tried it and it works fine; however, we still rely on those hand-made appliances, so we didn't push for a replacement (and we didn't test it beyond the basics). But Vyatta will definitely be an option for consideration in the next purchase decision.

----------

## abrand15

I'm not a network engineer, but from what I've been told in the past, 350Mbps is about the max you'll get out of a copper connection.  I worked at an ISP about four years ago and we used some home-grown Linux firewalls.  They routinely routed traffic in excess of 600Mbps.  We used the Intel Server PRO/1000MF cards.  If fiber is an option for you, it might solve your issue.

Cheers,

Allan

----------

## jonathanross

Hi All.

I've just read this thread with great interest. I admin a couple of border routers running Quagga/BGP and we're about to exceed 100Mbps.

I've read several papers and articles and the throughput claims seem to differ wildly.

We run Cisco in another Data Centre but I'd much rather stick to Gentoo up until performance is a problem and I have to change kit.

I trust this forum and believe the 300Mbps - 350Mbps maximum mentioned in this thread. Vyatta looks good, but we've spent time on TC doing some nice filtering and would rather carry on up to the 300Mbps level with the build we have. Will TC have any negative performance impact?

Can I definitely achieve around 300Mbps with the specs below ? The testing is really difficult to get realistic results from. I'm going to build at least one new box so hopefully I have a chance to get the specs right beforehand. It would be good to re-use a Core2Duo as a failover box with the Server NICs being the only change.

o The Intel PRO/1000-MF Gbps cards mentioned. Will the default driver settings service the interrupt coalescence queue mentioned?

o One Quad Core CPU (will a Quad Core make a massive difference over a Dual Core? Our Core2Duo box sits at 0.00 load at over 50Mbps on a default kernel).

o I looked at the IP Advanced Router option, but other than enabling klogd to log problems I can't see what else I need to enable. I've read about the TCP window stuff before. Is that a last resort, or a necessity to get to 300Mbps? Most of our traffic will be outbound HTTP on port 80.
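On the coalescence point, I gather the driver exposes this through ethtool; something like the following should show and adjust it (the interface name and the value are guesses on my part, not a tested config):

```
# show current interrupt coalescing settings (assumes the driver supports it)
ethtool -c eth0
# raise the receive interrupt delay so more packets are batched per interrupt
# (125 microseconds is only an example value)
ethtool -C eth0 rx-usecs 125
```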

I've read that, other than holding the routing table, RAM makes very little difference to forwarding. Is that correct? It seems to make sense, so a 2GB build should be fine.

Any help would be greatly appreciated ! We want to keep flying the Gentoo flag !

Many thanks,

JR   :Smile: 

----------

## richard.scott

Hi,

I'm in the process of configuring dual fail-over firewalls with synced iptables state tables! I'm just in the testing phase with the firewalls, but I've been using Gentoo as a dual fail-over load balancer for some time!

 *jonathanross wrote:*   

> o The Intel PRO/1000-MF Gbps cards mentioned. Will the default driver settings service the interrupt coalescence queue mentioned ?

 

How does your FC interface show up... is it still something like eth0, or does it get called something else?

What do you see with the "ip add list" command? I get the following for standard Ethernet cards:

```
# ip add list

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo

    inet6 ::1/128 scope host

       valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

    link/ether 00:0f:ea:5a:ac:4d brd ff:ff:ff:ff:ff:ff

    inet 192.168.200.8/24 brd 192.168.200.255 scope global eth0

    inet 192.168.200.5/32 scope global eth0

    inet6 fe80::20f:eaff:fe5a:ac4d/64 scope link

       valid_lft forever preferred_lft forever

3: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop

    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
```

Do you see the same sort of output..... any chance you could post your output? (or PM me?)

 *jonathanross wrote:*   

> o One Quad Core CPU (will Quad Core make a massive difference over Dual Core; our Core2Duo box sits at 0.00 load at over 50Mbps on a default kernel).

 

I don't think the additional CPU power will help. We have two load balancers running on a VIA 800MHz CPU!  :Smile: 

From what I understand about kernel-based routing (i.e. not using a daemon or any additional software), the effort required for routing won't show up as ordinary CPU usage. I don't know if uptime or top will show it in enough detail, as it's the system usage rather than the user usage that you need to look at.

In top you can just about see it listed as 2.4%sy, see my example output:

```
top - 12:14:04 up 23:28,  1 user,  load average: 0.43, 0.69, 0.70

Tasks: 102 total,   1 running, 101 sleeping,   0 stopped,   0 zombie

Cpu(s): 20.9%us,  2.4%sy,  1.3%ni, 70.1%id,  4.9%wa,  0.1%hi,  0.3%si,  0.0%st
```

 *jonathanross wrote:*   

> I've read that other than loading the Routing Table into memory RAM makes very little difference to forwarding. Is that correct ? It seems to make sense so a 2GB build should be fine.

 

As a forwarded packet shouldn't leave the kernel (unless it's being load-balanced or re-routed via a daemon process), your available system memory should make little difference.

----------

## jonathanross

Hi Richard,

Thanks for replying. Good to see someone else in the UK is flying the Gentoo flag   :Smile: 

The interfaces are simply eth0 and eth1. I may have misunderstood why you asked for that "ip add" output, so please just ask if you need more info. I haven't gone down the road of buying the superdooper NICs yet; at the moment they're just bog-standard 10/100s.

We use Quagga's bgpd which barely shows up on 'top' unless it's initiated a new session and is loading all the routes from the routing table in. 

However, that's a handy hint about system usage: our Core2Duo 2.0GHz box sits at 0.5%sy, which is a good sign I presume. 0.5%si peaked at 1.0%, but top's rather unreadable man page doesn't give away what that stands for easily  :Smile: 

Ahh, here we go:

us -> User

sy -> system

ni -> nice

id -> idle

wa -> iowait

hi -> H/w interrupt requests

si -> S/w interrupt requests

In cost terms that's good news about the Quad Core CPUs not making a difference. 

Are you using the aforementioned superdooper NICs now ?

JR

----------

## richard.scott

Hi JR,

Wow, where in the UK are you... I'm just south of Oxford   :Very Happy: 

I'm writing a web interface for the configuration of an appliance that boots from a solid state drive i.e. USB, Compact Flash, Disk-On-Module etc. 

I was curious to know what the Intel PRO/1000MF interface name would be for this. I'd googled and assumed it could be a fiber card. eth0 was what I was looking for, as I currently don't have access to any fiber hardware to see what the device looks like to Linux  :Crying or Very sad: 

Have you set up MRTG on your routers (or something else like Cacti) to record CPU usage via SNMP? I'm guessing that this should show total CPU usage rather than a split of user/system etc. like top does?

Cheers,

Rich

----------

## jonathanross

Hi Rich,

Edinburgh   :Smile: 

We haven't bought the NICs yet so I can't help at the moment I'm afraid. Maybe someone else on this thread can. I would guess they're just standard interface names.

If you use udev, you should be able to change them (this is from a SPARC box not an x86 so don't let the tulip drivers mentioned confuse things):

```

prompt> tail /etc/udev/rules.d/70-persistent-net.rules

# This file was automatically generated by the /lib/udev/write_net_rules

# program run by the persistent-net-generator.rules rules file.

#

# You can modify it, as long as you keep each rule on a single line.

# PCI device 0x1282:0x9102 (tulip)

SUBSYSTEM=="net", DRIVERS=="?*", ATTR{address}=="00:03:ba:04:e2:3f", NAME="eth1"

# PCI device 0x1282:0x9102 (tulip)

SUBSYSTEM=="net", DRIVERS=="?*", ATTR{address}=="00:03:ba:04:e2:3e", NAME="eth0"

```

We do use MRTG, but I know from watching the routing boxes that they do little or no work at all! Literally 0.00, 0.00, 0.00!! This is why, superdooper NICs aside, I'm hoping the current Core2Duo-spec boxes will forward a few hundred Mbps of traffic.

Are you familiar with the required kernel tweaks ? I have so many bookmarks ...   :Confused: 

JR   :Smile: 

----------

## richard.scott

Hi JR,

 *jonathanross wrote:*   

> Edinburgh  

 

Oh nice, I've not been that far north in years!

 *jonathanross wrote:*   

> 
> 
> We haven't bought the NICs yet so I can't help at the moment I'm afraid. Maybe someone else on this thread can. I would guess they're just standard interface names.

 

No, I may have to see if I can borrow an FC card from work... trouble is they are all in use on StorEdge 3510 arrays!  :Laughing: 

 *jonathanross wrote:*   

> 
> 
> If you use udev, you should be able to change them (this is from a SPARC box not an x86 so don't let the tulip drivers mentioned confuse things):
> 
> 

 

I'd not thought of that, Udev is such a flexible thing too! Neat idea!

 *jonathanross wrote:*   

> 
> 
> We do use MRTG but I know from watching the routing boxes they do little or no work at all ! Literally 0.00, 0.00, 0.00 !! This is why other than superdooper NICs I'm hoping the current Core2Duo spec boxes would forward a few hundred Mbps of traffic.

 

If you look at Kemp Technologies, they make Linux load balancers. They use a tweaked version of SUSE, but from what I can tell it's not tweaked very much... it still looks like they have the standard kernel config! From what I can see on their website, the highest spec they go up to is a P4 of some type in the LM-3500. I know it lists "Dual Core" as the CPU spec, but I seem to remember seeing that listed as a P4 before it got changed   :Wink: 

We have a pair of LM-1500s, which are actually 800MHz Mini-ITX VIA boards with 512MB RAM! They cost £1500 each, and as it's just a rack-mount server running Linux, I figured I could do better with Gentoo!   :Laughing: 

I was thinking about using a Xeon system so I could activate I/OAT and offload some of the pressure back onto the NIC. I'm not sure if this would have any advantage in your situation, but it apparently deals better with buffering of incoming traffic. From what I can see it only works on PCI-Express NICs, though. I've found this one that I was going to get to test with:

http://www.intel.com/products/server/adapters/pro1000pt-quadport/pro1000pt-quadport-overview.htm

 *jonathanross wrote:*   

> Are you familiar with the required kernel tweaks ? I have so many bookmarks ...  

 

I'd be interested in seeing what kernel tweaks you have found useful.

Cheers,

Rich.

----------

## Sysa

 *snape wrote:*   

> Hall, it's my first post/
> 
> I have Gentoo router in my network with 2 gigabit Intel PCI-E cards.
> 
> When out traffic is <120Mbps router works ok, and in traffic is ~250Mbps.
> ...

 

BTW: Did you check and evaluate the rest of your hardware (not just the NIC itself)? I mean PCI bus and RAM buffer throughput, etc.
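For the PCI-E side, lspci can at least show the negotiated link width and speed. A quick check (the bus address below is taken from your earlier lspci output; adjust for your box):

```
# negotiated PCI-E link state (LnkSta) for the NIC
lspci -vv -s 02:00.0 | grep -i LnkSta
```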

----------

## jonathanross

Thanks, Rich.

I'll have a good look at those URLs when I get a chance.

It looks like we're both after the correct kernel parameters to tweak for maximum forwarding performance.

When I said I had loads of bookmarks, that was from hunting I did a couple of months ago, and I only found partially confirmed parameters to tweak.
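The ones that kept coming up were sysctl settings along these lines, though I can't vouch for the values (treat them as unverified starting points, not recommendations):

```
# /etc/sysctl.conf fragment -- commonly cited tweaks, values unverified

# enable packet forwarding
net.ipv4.ip_forward = 1

# input queue length before the stack processes incoming packets
net.core.netdev_max_backlog = 2500

# max socket buffer sizes (these only matter for traffic terminated
# on the box itself, not for plain forwarding)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
```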

If I find any concrete settings to change then I'll be sure to post them and I hope you will too.

Please let me know how you get on with those superdooper NICs if you get a chance !

JR   :Smile: 

----------

