# Kernel stack trace during heavy network transfer

## cruzr

I've had this dedicated server for about a month from a Leaseweb reseller. It runs Gentoo Hardened on Linux 4.7.3 with the Grsecurity test patch. Out of the box, network speeds were inconsistent and poor. Granted, I'm in the US and the box is in the Netherlands.

rtorrent does most of the downloading and uploading on this box. For P2P, I adjusted sysctl.conf with the values below, which greatly improved download and upload speeds. But now, during simultaneous downloads and uploads, the machine really slows down. For example, unrelated services such as Postfix and Dovecot refuse to stop until the network transfers complete. I clipped the stack trace that prints to the kernel log (also below); it repeats at least 10 times back to back.

The machine is not running out of memory. It has a 1 GbE port, and at most we'll see 90mb down and 60mb up at the same time, so the line isn't getting saturated. I suspect either a kernel misconfiguration or a bug in the network module (Tigon3).

lspci output: https://gigebox.pw/tmp/lspci.txt

kernel config: https://gigebox.pw/tmp/kconfig.txt

And for those who are quick to blame Grsecurity: I have already booted a completely vanilla kernel.org 4.7.3 kernel, without the patch (not just with Grsecurity disabled), and I experienced the same issue. :)

```
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.core.wmem_max = 12582912
net.core.rmem_max = 12582912
net.core.rmem_default = 262140
net.ipv4.tcp_rmem = 10240 87380 12582912
net.ipv4.tcp_wmem = 10240 87380 12582912
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
fs.file-max = 65535
net.ipv4.ip_forward = 0
kernel.shmmax = 67108864
```
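For context on why the larger buffer limits helped: a single TCP connection can keep at most one window of data in flight per round trip, so the buffer ceiling has to be near the bandwidth-delay product of the path before one stream can fill the link. A rough sketch of that arithmetic follows. The 100 ms round-trip time is an assumption for a US to Netherlands path, not a measurement from this box, and `bdp_bytes` is only an illustrative helper.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bits_per_s * rtt_s / 8


LINK_BPS = 1_000_000_000  # 1 GbE port, as stated in the post
ASSUMED_RTT_S = 0.100     # ASSUMPTION: ~100 ms transatlantic round trip
BUFFER_MAX = 12582912     # net.core.rmem_max / wmem_max from the sysctl.conf above

needed = bdp_bytes(LINK_BPS, ASSUMED_RTT_S)
print(f"BDP at {ASSUMED_RTT_S * 1000:.0f} ms RTT: {needed / 2**20:.1f} MiB")
print(f"Configured max buffer: {BUFFER_MAX / 2**20:.1f} MiB")
```

Under that assumed RTT the product comes out just under 12 MiB, which is in the same range as the 12582912-byte maximums in the settings.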

```
[113797.143518] swapper/6: page allocation failure: order:0, mode:0x2080120(GFP_ATOMIC|__GFP_COLD)
[113797.143520] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 4.7.3-grsec #3
[113797.143521] Hardware name: HP ProLiant DL120 G6/ProLiant DL120 G6, BIOS O26    07/01/2013
[113797.143522]  0000000000000000 9d13b57302980b84 0000000000000086 0000000000000000
[113797.143524]  ffffffff8fef9602 0000000000000000 9d13b57302980b84 0000000000000000
[113797.143526]  ffff88043fd83bc8 ffffffff8fd46145 0208012000a68700 9d13b57302980b84
[113797.143528] Call Trace:
[113797.143529]  <IRQ>  [<ffffffff8fef9602>] ? 0xffffffff8fef9602
[113797.143531]  [<ffffffff8fd46145>] ? 0xffffffff8fd46145
[113797.143532]  [<ffffffff8ff20360>] ? 0xffffffff8ff20360
[113797.143533]  [<ffffffff8fd466d6>] ? 0xffffffff8fd466d6
[113797.143535]  [<ffffffff8fd46e79>] ? 0xffffffff8fd46e79
[113797.143536]  [<ffffffff8fd4730a>] ? 0xffffffff8fd4730a
[113797.143537]  [<ffffffff9005feb2>] ? 0xffffffff9005feb2
[113797.143538]  [<ffffffff90016b48>] ? 0xffffffff90016b48
[113797.143539]  [<ffffffff90017913>] ? 0xffffffff90017913
[113797.143541]  [<ffffffff9001b2c0>] ? 0xffffffff9001b2c0
[113797.143542]  [<ffffffff9007d692>] ? 0xffffffff9007d692
[113797.143543]  [<ffffffff8fc9edec>] ? 0xffffffff8fc9edec
[113797.143544]  [<ffffffff8fc9f037>] ? 0xffffffff8fc9f037
[113797.143545]  [<ffffffff8fc1bb6a>] ? 0xffffffff8fc1bb6a
[113797.143547]  [<ffffffff901d0b4b>] ? 0xffffffff901d0b4b
[113797.143547]  <EOI>  [<ffffffff8fc2295d>] ? 0xffffffff8fc2295d
[113797.143549]  [<ffffffff8fcd9450>] ? 0xffffffff8fcd9450
[113797.143551]  [<ffffffff8fc3231f>] ? 0xffffffff8fc3231f
[113797.143552] Mem-Info:
[113797.143555] active_anon:76960 inactive_anon:1372 isolated_anon:0
                 active_file:3416252 inactive_file:452419 isolated_file:32
                 unevictable:0 dirty:456383 writeback:17772 unstable:0
                 slab_reclaimable:110135 slab_unreclaimable:7865
                 mapped:105822 shmem:2591 pagetables:2555 bounce:0
                 free:18292 free_pcp:1622 free_cma:0
[113797.143563] DMA free:15360kB min:12kB low:24kB high:36kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15360kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[113797.143564] lowmem_reserve[]: 0 2960 15990 15990
[113797.143571] DMA32 free:53000kB min:2992kB low:6020kB high:9048kB active_anon:65672kB inactive_anon:176kB active_file:2175476kB inactive_file:587716kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:3120768kB managed:3033924kB mlocked:0kB dirty:599216kB writeback:20500kB mapped:128100kB shmem:1408kB slab_reclaimable:127052kB slab_unreclaimable:7072kB kernel_stack:432kB pagetables:2296kB unstable:0kB bounce:0kB free_pcp:2860kB local_pcp:800kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[113797.143572] lowmem_reserve[]: 0 0 13030 13030
[113797.143579] Normal free:4808kB min:13180kB low:26520kB high:39860kB active_anon:242168kB inactive_anon:5312kB active_file:11489532kB inactive_file:1221960kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13631488kB managed:13343272kB mlocked:0kB dirty:1226316kB writeback:50588kB mapped:295188kB shmem:8956kB slab_reclaimable:313488kB slab_unreclaimable:24388kB kernel_stack:3488kB pagetables:7924kB unstable:0kB bounce:0kB free_pcp:3628kB local_pcp:456kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[113797.143580] lowmem_reserve[]: 0 0 0 0
[113797.143582] DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (M) = 15360kB
[113797.143589] DMA32: 1436*4kB (UME) 3665*8kB (UMEH) 539*16kB (UME) 165*32kB (UMH) 1*64kB (H) 1*128kB (H) 1*256kB (H) 1*512kB (H) 1*1024kB (H) 1*2048kB (H) 0*4096kB = 53000kB
[113797.143599] Normal: 834*4kB (UMEH) 0*8kB 4*16kB (H) 2*32kB (H) 1*64kB (H) 0*128kB 1*256kB (H) 2*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 4808kB
[113797.143607] 3871296 total pagecache pages
[113797.143608] 4192060 pages RAM
[113797.143608] 0 pages HighMem/MovableOnly
[113797.143609] 93921 pages reserved
```

----------

## ct85711

 *Quote:*   

>  It has a 1 GbE port, and at most we'll see 90mb down and 60mb up at the same time, so the line isn't getting saturated. 

 

Just a note: your Ethernet port is rarely what limits your speed; it's usually the internet connection coming into your modem. In general, the modem will almost always be the slowest point in your network (unless you have a 100M or 10M Ethernet port, which may then become the slow point). For traffic on the internal network, the gigabit port will have more of an effect.

An easy way to think about it: you're coming up to a toll booth doing 50 mph, and the traffic at the booth is only moving at 5 mph. It doesn't matter what you want; you're going to have to slow down to the speed of everyone at the booth. Anywhere before the toll booth, you're perfectly free to go whatever speed you want.

----------

## cruzr

Here's speedtest output:

```
~ # speedtest --simple
Ping: 3.867 ms
Download: 927.47 Mbit/s
Upload: 713.35 Mbit/s
```

Here's an iperf3 test; the endpoint has a 2.5GbE port.

```
Connecting to host online.gigebox.pw, port 5201
[  4] local 85.17.147.135 port 46840 connected to 163.172.215.30 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   112 MBytes   942 Mbits/sec    0    457 KBytes
[  4]   1.00-2.00   sec   112 MBytes   938 Mbits/sec    1    329 KBytes
[  4]   2.00-3.00   sec   109 MBytes   912 Mbits/sec    0    385 KBytes
[  4]   3.00-4.00   sec   112 MBytes   937 Mbits/sec    0    397 KBytes
[  4]   4.00-5.00   sec   112 MBytes   937 Mbits/sec    0    414 KBytes
[  4]   5.00-6.00   sec   112 MBytes   941 Mbits/sec    0    436 KBytes
[  4]   6.00-7.00   sec   112 MBytes   938 Mbits/sec    0    436 KBytes
[  4]   7.00-8.00   sec   112 MBytes   942 Mbits/sec    0    436 KBytes
[  4]   8.00-9.00   sec   112 MBytes   938 Mbits/sec    0    457 KBytes
[  4]   9.00-10.00  sec   112 MBytes   941 Mbits/sec    0    457 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.09 GBytes   937 Mbits/sec    1             sender
[  4]   0.00-10.00  sec  1.09 GBytes   936 Mbits/sec                  receiver
```

The box is capable of reaching greater speeds but I get your point.

----------

## toralf

 *Quote:*   

> I adjusted the sysctl.conf 

What about starting with a vanilla sysctl.conf to verify that your config values aren't the culprit?
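One way to go about that is to list exactly which keys the custom file overrides, so they can be commented out and brought back one group at a time. A minimal sketch, using the settings posted at the top of the thread; `parse_sysctl` is a hypothetical helper, not part of any sysctl tooling. It also flags keys that are set more than once.

```python
from collections import Counter

# Settings as posted at the top of the thread.
SYSCTL_CONF = """\
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.core.wmem_max = 12582912
net.core.rmem_max = 12582912
net.core.rmem_default = 262140
net.ipv4.tcp_rmem = 10240 87380 12582912
net.ipv4.tcp_wmem = 10240 87380 12582912
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
fs.file-max = 65535
net.ipv4.ip_forward = 0
kernel.shmmax = 67108864
"""


def parse_sysctl(text):
    """Return (key, value) pairs, skipping blank lines and '#' or ';' comments."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue
        key, _, value = line.partition("=")
        pairs.append((key.strip(), value.strip()))
    return pairs


pairs = parse_sysctl(SYSCTL_CONF)
counts = Counter(key for key, _ in pairs)
duplicates = [key for key, n in counts.items() if n > 1]

for key in counts:
    print("overridden:", key)
print("set more than once:", duplicates)
```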

----------

## Ant P.

It's failing to allocate memory.

The reason why, we don't know. The stack trace provided is useless because all the symbols are missing.

----------

## cruzr

 *Ant P. wrote:*   

> It's failing to allocate memory.
> 
> The reason why, we don't know. The stack trace provided is useless because all the symbols are missing.

 

Should a config debugging option in the kernel show us the symbols if it were enabled?

 *Quote:*   

> What about starting with a vanilla sysctl.conf to verify that your config values aren't the culprit?

 

I will try that and report back, but then we're left with poor connection speeds again. I might have to find a Linux forum with someone who has better knowledge of these settings. I got them by researching "Linux TCP Settings" in some RHEL and Oracle articles, then played with combinations until I got good results.

----------

## Ant P.

 *cruzr wrote:*   

> Should a config debugging option in the kernel show us the symbols if it were enabled?

 

You need at least CONFIG_FRAME_POINTER=y. Turning it off doesn't speed anything up on 64-bit, it just makes debugging impossible.
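A quick way to check a config for this is sketched below. It also looks at `CONFIG_KALLSYMS`, which is the option that lets the kernel print symbol names in backtraces instead of raw addresses. The sample config text is made up for illustration (it is not the kconfig linked in the first post), and `enabled_options` is a hypothetical helper.

```python
def enabled_options(config_text):
    """Return the set of options set to 'y' or 'm' in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: '# CONFIG_FOO is not set'
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


# ASSUMPTION: illustrative sample, not the real config from this box.
SAMPLE_CONFIG = """\
CONFIG_64BIT=y
# CONFIG_KALLSYMS is not set
# CONFIG_FRAME_POINTER is not set
CONFIG_TIGON3=y
"""

WANTED = ("CONFIG_KALLSYMS", "CONFIG_FRAME_POINTER")
enabled = enabled_options(SAMPLE_CONFIG)
missing = [opt for opt in WANTED if opt not in enabled]
print("missing debug options:", missing)
```

On the real box the text could come from the kernel's `.config`, or from `/proc/config.gz` if `CONFIG_IKCONFIG_PROC` is enabled.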

----------

## s4e8

Increase the sysctl vm.min_free_kbytes.
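The zone lines in the trace at the top of the thread point the same way: at the time of the failure the Normal zone had 4808kB free against a min watermark of 13180kB. A small sketch that pulls those two numbers out of each zone line; the strings are shortened copies of the log lines, and `zone_watermarks` is a hypothetical helper.

```python
import re

# Shortened copies of the per-zone lines from the kernel log above.
ZONE_LINES = [
    "DMA free:15360kB min:12kB low:24kB high:36kB",
    "DMA32 free:53000kB min:2992kB low:6020kB high:9048kB",
    "Normal free:4808kB min:13180kB low:26520kB high:39860kB",
]

ZONE_RE = re.compile(r"^(\w+) free:(\d+)kB min:(\d+)kB")


def zone_watermarks(lines):
    """Map zone name -> (free_kb, min_kb) for each line that parses."""
    zones = {}
    for line in lines:
        m = ZONE_RE.match(line)
        if m:
            zones[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return zones


zones = zone_watermarks(ZONE_LINES)
below_min = [name for name, (free, minimum) in zones.items() if free < minimum]
total_min_kb = sum(minimum for _, minimum in zones.values())
print("zones below min watermark:", below_min)
print("sum of zone min watermarks (kB):", total_min_kb)
```

The per-zone min values are derived from `vm.min_free_kbytes`, so their sum should be roughly the current setting, and raising the sysctl raises those floors.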

----------

