# HTB (I guess) megaconfusion

## Lurch

Hi there,

this is a cry for help to all the Netfilter/iproute2 tc/HTB wizards out there.

Recently I had to install and configure a Linux router for a small, improvised "ISP" that some friends of mine are running in their neighbourhood. Unfortunately, configuring the traffic shaper turned out to be rather tricky, and since I'm not really acquainted with iproute2's tc command and HTB-based traffic shapers/QoS schedulers, things got quite mixed up. But first things first.

Everything is running on a Pentium III 500 MHz machine with 512 MB RAM and two RealTek 8139-based Fast Ethernet PCI cards (I'm using the 8139too driver). (I don't know whether any other hardware parameter matters, but if it does I'd be glad to fill it in.) I'm using kernel 2.6.7 ck-sources. Netfilter and the QoS packet scheduler are compiled as modules.

There is a simple Netfilter-based firewall which filters inbound and outbound traffic while leaving forwarded traffic untouched (in a "free-for-all" mode ^__^). There are also a couple of NAT routines: some clients are "hard-linked" to particular IP addresses while the majority are NAT-ed to a pool of IPs.

The biggest challenge for me was building the traffic shaper. It has to divide every client's traffic into three categories based on the destination (or rather source, since for now it only limits download speed) IP address. The categorization is needed because our ISP has high-bandwidth connections with the other big ISP in the country but very poor international connectivity (BTW, the third category is the ISP's FTP server for clients, the speed to which is "unlimited"). The general idea is to cap "local" traffic at one level and "international" traffic at another, much lower level, so that users who generate lots of bulk international traffic (say, peer-to-peer networks) won't overload the international channel. In addition, every user is placed in one of four subscription plans.

Confused already?

Now comes the real hodge-podge. In some online tutorials and reference materials about HTB I read that in an HTB-based scheduler, filters should be attached to the root queueing discipline and cannot be distributed along the class hierarchy (i.e. as in CBQ, where the packet travels through the filters all the way up to a leaf, gets scheduled, and only then goes back down to the root). (Recently I found some circumstantial evidence from various sources that this may not be quite true and that hierarchical HTB filtering is possible, but I'm not certain, so any comments on this are welcome.) This is why I decided to mangle the IP packet's TOS field in Netfilter to mark what comes from a local (in-country) network, and then check the value of the TOS field in the shaper.

It goes like this (the list is big - more than 100 IP ranges):

```
$IPTABLES -t mangle -A PREROUTING -i eth1 -s 192.168.56.0/255.255.255.0 -j TOS --set-tos 0x02
$IPTABLES -t mangle -A PREROUTING -i eth1 -s 192.168.56.0/255.255.255.0 -j ACCEPT
$IPTABLES -t mangle -A PREROUTING -i eth1 -s 172.19.0.0/255.255.0.0 -j TOS --set-tos 0x02
$IPTABLES -t mangle -A PREROUTING -i eth1 -s 172.19.0.0/255.255.0.0 -j ACCEPT
```
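Since every range gets the same two actions, one possible way to halve the number of rules that PREROUTING has to walk is to put the actions in a user-defined chain and jump to it once per range. This is only a sketch: the chain name `MARK_LOCAL` and the `LOCAL_RANGES` list are hypothetical, and `IPTABLES` is set to `echo iptables` so that the script prints the commands instead of running them.

```shell
#!/bin/sh
# Dry run: prints the commands. Point IPTABLES at the real binary to apply them.
IPTABLES="echo iptables"

# Hypothetical list; the real one holds more than 100 ranges.
LOCAL_RANGES="192.168.56.0/255.255.255.0 172.19.0.0/255.255.0.0"

mark_local_ranges() {
    # One chain that sets the TOS byte, then ACCEPTs so the packet
    # skips the remaining mangle PREROUTING rules.
    $IPTABLES -t mangle -N MARK_LOCAL
    $IPTABLES -t mangle -A MARK_LOCAL -j TOS --set-tos 0x02
    $IPTABLES -t mangle -A MARK_LOCAL -j ACCEPT

    # One jump per range instead of two rules per range.
    for range in $LOCAL_RANGES; do
        $IPTABLES -t mangle -A PREROUTING -i eth1 -s "$range" -j MARK_LOCAL
    done
}

mark_local_ranges
```

Packets from a non-local source still traverse every jump rule, so this trims the rule count but does not change the linear search itself.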

And the shaper itself (excerpt of course):

```
tc qdisc add dev $LOCALNET_INTERFACE root handle 1: htb default 14
tc class add dev $LOCALNET_INTERFACE parent 1: classid 1:1 htb rate 30Mbit ceil 30Mbit quantum 5760000
tc class add dev $LOCALNET_INTERFACE parent 1:1 classid 1:11 htb rate 5Mbit ceil 5Mbit quantum 960000
tc class add dev $LOCALNET_INTERFACE parent 1:1 classid 1:12 htb rate 640kbit ceil 640kbit quantum 120000
tc class add dev $LOCALNET_INTERFACE parent 1:1 classid 1:13 htb rate 10Mbit ceil 10Mbit quantum 1920000
tc qdisc add dev $LOCALNET_INTERFACE parent 1:12 handle 13: sfq perturb 10
tc class add dev $LOCALNET_INTERFACE parent 1:1 classid 1:14 htb rate 8kbit ceil 8kbit quantum 1500
tc qdisc add dev $LOCALNET_INTERFACE parent 1:14 handle 14: sfq perturb 10

##------------------------(one of these for every client)---------------------------
##---CLASSES
tc class add dev $LOCALNET_INTERFACE parent 1:11 classid 1:11002 htb rate 256kbit ceil 1Mbit quantum 48000
tc class add dev $LOCALNET_INTERFACE parent 1:12 classid 1:12002 htb rate 64kbit ceil 128kbit quantum 12000
##---SFQ QDISCs
tc qdisc add dev $LOCALNET_INTERFACE parent 1:11002 handle 11002: sfq perturb 10
tc qdisc add dev $LOCALNET_INTERFACE parent 1:12002 handle 12002: sfq perturb 10
##---FILTERS
####-----------------(here goes everything from the FTP)
tc filter add dev $LOCALNET_INTERFACE protocol ip parent 1: prio 1 u32 match ip dst 192.168.0.2 match ip src 10.0.0.1 flowid 1:13
####-----------------(here goes the highspeed "local" traffic)
tc filter add dev $LOCALNET_INTERFACE protocol ip parent 1: prio 1 u32 match ip dst 192.168.0.2 match ip tos 0x02 0xff flowid 1:11004
####-----------------(and finally all else is considered "international" and slow)
tc filter add dev $LOCALNET_INTERFACE protocol ip parent 1: prio 1 u32 match ip dst 192.168.0.2 flowid 1:12004
```
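Because the per-client part repeats for every client, it could be wrapped in a small shell function so that the classes, SFQ qdiscs and filters of one client always refer to the same IDs. The sketch below rests on several assumptions that are not in the excerpt: `add_client` and its arguments are hypothetical, `LOCALNET_INTERFACE` is set to `eth0` only for illustration, and the numbering is shortened to `1:1NNN` / `1:2NNN` because, as far as I know, tc reads class minors as hexadecimal numbers limited to 16 bits. The filters use prio 1, 2 and 3 to make the evaluation order explicit, and `quantum` is left out. `TC` is `echo tc`, so nothing is actually applied.

```shell
#!/bin/sh
# Dry run: prints the commands. Set TC=tc to apply them.
TC="echo tc"
LOCALNET_INTERFACE=eth0   # assumption: the LAN-facing interface
FTP_SERVER=10.0.0.1       # source address used by the FTP filter in the excerpt

# add_client ID IP LOCAL_RATE LOCAL_CEIL INTL_RATE INTL_CEIL
# ID is a hypothetical three-digit hex client number, e.g. 002.
add_client() {
    id=$1
    ip=$2
    # Leaf classes under the "local" (1:11) and "international" (1:12) branches.
    $TC class add dev $LOCALNET_INTERFACE parent 1:11 classid 1:1$id htb rate $3 ceil $4
    $TC class add dev $LOCALNET_INTERFACE parent 1:12 classid 1:2$id htb rate $5 ceil $6
    # One SFQ per leaf.
    $TC qdisc add dev $LOCALNET_INTERFACE parent 1:1$id handle 1$id: sfq perturb 10
    $TC qdisc add dev $LOCALNET_INTERFACE parent 1:2$id handle 2$id: sfq perturb 10
    # FTP first, then TOS-marked "local" traffic, then everything else.
    $TC filter add dev $LOCALNET_INTERFACE protocol ip parent 1: prio 1 u32 \
        match ip dst $ip match ip src $FTP_SERVER flowid 1:13
    $TC filter add dev $LOCALNET_INTERFACE protocol ip parent 1: prio 2 u32 \
        match ip dst $ip match ip tos 0x02 0xff flowid 1:1$id
    $TC filter add dev $LOCALNET_INTERFACE protocol ip parent 1: prio 3 u32 \
        match ip dst $ip flowid 1:2$id
}

add_client 002 192.168.0.2 256kbit 1Mbit 64kbit 128kbit
```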

Or as a diagram (ok, I know this is really bad ASCII-art ^__^):

```
       +--->O--->O                  -->here goes everything from the FTP
       |
       |
       |               +--O
O------+--->O----------+--O         -->this is the "international" branch
       |               +--O
       |
       |               +--O
       +--->O----------+--O         -->this is "local" branch
                       +--O
```

Now on to the problem: this just doesn't work. OK, it actually works, but only for some time, and then all forwarding of packets ceases (the router itself can still reach both networks, though). Unfortunately it is hard to troubleshoot because it seems to occur randomly (there is no specific event or time interval that explicitly triggers it). The one thing I was able to figure out is that the scheduler itself is most probably the cause, because after I stop it (delete the root qdisc) forwarding starts again (usually, but not always ?!?). I suspect that the quantum settings are the problem, because they took me the longest to get working and I'm not sure I got them right (at first I didn't set any and let HTB calculate the defaults, but that gave me "Jul 13 14:16:32 localhost kernel: HTB: mindelay=2333, some class has too small rate" messages in the system log). Here are all the rate, ceil and quantum values used:

```
###--- This is the total bandwidth between our router and our ISP (WiFi wireless)
rate 30Mbit ceil 30Mbit quantum 5760000

###--- These are the "local" bandwidth and "international" bandwidth respectively
rate 5Mbit ceil 5Mbit quantum 960000
rate 640kbit ceil 640kbit quantum 120000

###--- This is the speed we decided to limit the access to the FTP to
rate 10Mbit ceil 10Mbit quantum 1920000

###--- And the four subscription plans
rate 256kbit ceil 1Mbit quantum 48000
rate 64kbit ceil 128kbit quantum 12000
rate 144kbit ceil 320kbit quantum 27000
rate 24kbit ceil 64kbit quantum 4500
rate 64kbit ceil 168kbit quantum 12000
rate 16kbit ceil 40kbit quantum 3000
rate 32kbit ceil 128kbit quantum 6000
rate 8kbit ceil 24kbit quantum 1500
```

Any ideas what the problem could be and how to solve it? Or perhaps you know a better way to do all this? Any reasonable suggestion is appreciated.

Thanks anyway (at least for having the patience to read all this ^__^)

PS: I don't know if it matters, but I ran the same configuration on the old machine, which ran Slackware (kernel 2.4.21, patched with kernel.org patches to 2.4.26), and for some strange reason, instead of forwarding stopping, the interfaces (either one, or even both) just died (no connection whatsoever), also randomly, but a simple "restart" with ifconfig eth* down/up did the trick.

----------

## mrness

Your problem is lack of CPU power to do the job "brute force" style.

I've made a program called mipclasses which does what you want (what can I tell you, I have the same problem here in Romania ;) ). If you use it, with a Celeron 500MHz, you will be able to classify packets at 100Mbit throughput.

You can find the program either on sf.net or on http://metropolitana.loginet.ro. The readme of my program and the documents found on http://lartc.org should be enough.

----------

## Lurch

Thanks

The program looks great and will surely save me some sleepless weeks. I've always suspected a lack of resources (either CPU time or RAM).

However, since the iptables mangling part and the tc commands that create the traffic shaper are in two separate scripts, I've tried stopping them one at a time, and shutting down only the packet mangling doesn't seem to solve the problem (it surely frees CPU cycles, but unfortunately forwarding still doesn't work).

This doesn't mean that compacting IP ranges and thus cutting down the required checks won't solve the problem. I'm not even close to experienced enough to say that. It merely means that I'm still open to other suggestions.

Nevertheless, I will try it this afternoon and keep you posted of the results.

Once again, thanks! ^__^

----------

## mrness

I use 2 IMQ interfaces to control international traffic. That way, the differences between rates are small enough to use a certain r2q without worrying about quantums. However, if I have a small class with a rate in kbit smaller than 12*$R2Q, I force a quantum of 1514 on that class.
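The 12*$R2Q threshold appears to follow from how HTB derives a default quantum: the class rate in bytes per second divided by r2q. Assuming 1 kbit = 1000 bit/s (125 bytes/s), a rate below 12*R2Q kbit gives a quantum under roughly 1500 bytes, i.e. less than one full-size Ethernet frame. A minimal sketch of the rule as a shell helper; `quantum_opt` is a hypothetical name and `R2Q=10` is HTB's default:

```shell
#!/bin/sh
R2Q=10   # assumption: HTB's default r2q

# Prints "quantum 1514" when the rate (in kbit) is below 12*R2Q,
# and nothing otherwise, so the result can be appended to a tc class line.
quantum_opt() {
    if [ "$1" -lt $((12 * R2Q)) ]; then
        echo "quantum 1514"
    fi
}

# Example with three of the plan rates from the first post.
for rate in 256 64 8; do
    echo "htb rate ${rate}kbit $(quantum_opt "$rate")"
done
```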

Choose R2Q wisely: it should be the smallest integer at which you do not receive messages like "class has too big rate". Also, do not run additional services on that computer, at least not until you are convinced you have enough CPU power.
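One way to estimate that smallest R2Q is to work backwards from the fastest class. The sketch below assumes that HTB complains once the derived quantum (rate in bytes per second divided by r2q) exceeds 200000 bytes, which is the limit I remember from the kernel's HTB source, and that 1 kbit = 1000 bit/s. `min_r2q` is a hypothetical helper name.

```shell
#!/bin/sh
# Smallest r2q for which rate/r2q stays at or below 200000 bytes.
# $1 = rate of the fastest class in kbit
min_r2q() {
    bytes_per_sec=$(( $1 * 125 ))
    # Integer ceiling of bytes_per_sec / 200000.
    echo $(( (bytes_per_sec + 199999) / 200000 ))
}

# For the 30Mbit root class from the first post:
R2Q=$(min_r2q 30000)
echo "tc qdisc add dev \$LOCALNET_INTERFACE root handle 1: htb default 14 r2q $R2Q"
```

With a larger r2q the slow classes end up with an even smaller derived quantum, which is where forcing an explicit quantum on them comes in.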

I've looked at your bandwidth distribution and it doesn't seem to add up. You have a bandwidth of 30Mbit==30720Kbit and the sum of your child classes' rates is 16608Kbit. Btw, I think you are more than optimistic if you expect to get 30Mbit on a wireless link ;) .

----------

## tnt

I use the following script for my eth0 card:

```
#!/bin/bash

UPLINK=100
P2P=24
PUNISH=80
DEV=eth0

tc qdisc del dev $DEV root    2> /dev/null > /dev/null
tc qdisc del dev $DEV ingress 2> /dev/null > /dev/null

tc qdisc add dev $DEV root handle 1: htb default 10

tc class add dev $DEV parent 1: classid 1:1 htb rate ${UPLINK}mbit
tc class add dev $DEV parent 1:1 classid 1:10 htb rate ${UPLINK}mbit prio 1
tc class add dev $DEV parent 1:1 classid 1:20 htb rate ${P2P}kbit prio 2
tc class add dev $DEV parent 1:1 classid 1:30 htb rate ${PUNISH}kbit prio 3

tc qdisc add dev $DEV parent 1:10 handle 100: sfq perturb 11
tc qdisc add dev $DEV parent 1:20 handle 200: sfq perturb 13
tc qdisc add dev $DEV parent 1:30 handle 300: sfq perturb 15

tc filter add dev $DEV parent 1:0 protocol ip prio 10 handle 0x201 fw classid 1:20
tc filter add dev $DEV parent 1:0 protocol ip prio 15 handle 0x202 fw classid 1:20
tc filter add dev $DEV parent 1:0 protocol ip prio 20 handle 0x203 fw classid 1:20
tc filter add dev $DEV parent 1:0 protocol ip prio 25 handle 0x204 fw classid 1:20
tc filter add dev $DEV parent 1:0 protocol ip prio 30 handle 0x205 fw classid 1:20
tc filter add dev $DEV parent 1:0 protocol ip prio 35 handle 0x206 fw classid 1:20
tc filter add dev $DEV parent 1:0 protocol ip prio 40 handle 0x200 fw classid 1:20
tc filter add dev $DEV parent 1:0 protocol ip prio 45 handle 0x109 fw classid 1:30
```

and I always get these messages:

```
Jun 10 18:28:00 titan HTB: quantum of class 10001 is big. Consider r2q change.
Jun 10 18:28:00 titan HTB: quantum of class 10010 is big. Consider r2q change.
Jun 10 18:28:00 titan HTB: quantum of class 10020 is small. Consider r2q change.
```

What quantum values should I use for those 100Mbps, 24 and 80kbps classes?
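For reference, the three messages can be reproduced with a little arithmetic. The sketch below assumes that HTB derives the quantum as rate in bytes per second divided by r2q (default 10), warns below 1000 and above 200000 bytes (limits as I remember them from the kernel source), and that 1 kbit = 1000 bit/s. `check_quantum` is a hypothetical helper.

```shell
#!/bin/sh
R2Q=10   # assumption: HTB's default, since the script sets no r2q

# $1 = class rate in kbit
check_quantum() {
    q=$(( $1 * 125 / R2Q ))
    if [ "$q" -gt 200000 ]; then
        verdict=big
    elif [ "$q" -lt 1000 ]; then
        verdict=small
    else
        verdict=ok
    fi
    echo "${1}kbit: derived quantum $q ($verdict)"
}

check_quantum 100000   # 1:1 and 1:10 (UPLINK)
check_quantum 24       # 1:20 (P2P)
check_quantum 80       # 1:30 (PUNISH)
```

Under these assumptions the two 100mbit classes come out too big, the 24kbit class too small, and the 80kbit class just inside the limits, which matches the log. One option in line with the advice earlier in this thread would be to pass an explicit `quantum` (for instance 1514) on the slow class rather than letting HTB derive it.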


----------

