# [SOLVED] Ethernet Bridging under 2.6.22 broken

## didymos

So, here's the current (working) setup:

2.6.21-gentoo-r4, Hostapd-0.6.0, madwifi-ng-0.9.3.1

The onboard device is

```

02:00.0 Ethernet controller: Intel Corporation 82573V Gigabit Ethernet Controller (Copper) (rev 03)

```

using the e1000 driver.  The wifi device is:

```

01:0a.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC (rev 01)

```

My /etc/conf.d/net is:

```

# 802.1d Bridge
RC_NEED_br0="net.eth0 net.ath0"
bridge_br0="eth0 ath0"
config_br0=( "192.168.0.2/24" )
brctl_br0=( "setfd 0" "sethello 0" "stp off" )

# Ethernet
config_eth0=( "null" )

# Wireless
modules=( "iwconfig" )
config_ath0=( "null" )
essid_ath0="1122_2841_wireless"
mode_ath0="master"
iwpriv_ath0="mode 3"

# Recreate the VAP if ath0 is missing when the interface is brought up
preup() {
    local exit_status
    if [ "${IFACE}" = "ath0" ]; then
        if [ ! -d /sys/class/net/ath0 ]; then
            einfo "Creating VAP..."
            # Holds wlanconfig's output, which is compared against the VAP name
            exit_status=$(/sbin/wlanconfig ath0 create wlandev wifi0 wlanmode ap)
            if [ "$exit_status" = "ath0" ]; then
                einfo "VAP ath0 created"
                return 0
            else
                eerror "VAP creation failed"
                return 1
            fi
        fi
    fi
    return 0
}

# Destroy the VAP once the interface has been taken down
postdown() {
    if [ "${IFACE}" = "ath0" ]; then
        if [ -d /sys/class/net/ath0 ]; then
            einfo "Destroying VAP"
            /sbin/wlanconfig ath0 destroy
            einfo "VAP ath0 destroyed"
            return 0
        fi
    fi
    return 0
}

# PPP
config_ppp0=( "ppp" )
link_ppp0="br0"
plugins_ppp0=( "pppoe" )
username_ppp0='<username>'
pppd_ppp0=(
    "noauth"
    "defaultroute"
    "default-asyncmap"
    "ipcp-accept-remote"
    "ipcp-accept-local"
    "lcp-echo-interval 60"
    "lcp-echo-failure 5"
    "persist"
    "holdoff 2"
    "debug"
    "sync"
    "mru 1492"
    "mtu 1492"
    "lock"
)

```

The preup/postdown functions are there solely in case I have to restart.  They don't get executed at boot, since the ath_pci driver autocreates ath0 in AP mode.  Anyway, all this works just fine (and has worked for months).

Then I tested 2.6.22-gentoo(-r1).  Suddenly, nothing can go back over the bridge to either of the wireless stations: an XP box and a Wii.  They both associate and can send any traffic: ARP, IP, PPPoE, et cetera.  Everything will pass through ath0 to eth0 and then out ppp0 (if the internet is the destination, of course), but all incoming traffic only makes it as far as eth0.  On br0, it's just gone: tcpdump is completely silent.

I've just finished testing with the latest madwifi-ng svn revision, and nothing has changed.  All packets forwarded to the stations simply vanish.  Rebuilding hostapd makes no difference, and neither does stopping iptables (or not starting it in the first place).  On the stations, every connection times out.  The only "clue" so far is what I get when I try to shut down in order to reboot with the working 2.6.21-gentoo-r4 kernel:

```

unregister_netdevice: waiting for br0 to become free. Usage count = 1

```

It will do that endlessly, until I use the magic sysrq key to do an emergency sync/remount read-only and force a reboot.  Once I'm running under the 2.6.21 kernel again, everything works.  I can only test this using a wireless/wired bridge since that's the only hardware available (only have the onboard wired NIC), and it's not practical to hijack the XP box anyway. Also, I'm not desperate to fix the problem, as 2.6.21 works.  Mainly, I'm curious if anyone else has run into bridging trouble on 2.6.22, but if someone knows how to fix it, well, great.

----------

## didymos

Turns out, there's a new sysctl key:

```

bridge-nf-filter-pppoe-tagged - BOOLEAN
        1 : pass bridged pppoe-tagged IP/IPv6 traffic to {ip,ip6}tables.
        0 : disable this.
        Default: 1

```

Default value (1) == dead bridge with DSL and PPPoE.  The other bit:

```

unregister_netdevice: waiting for br0 to become free. Usage count = 1 

```

is a separate issue entirely.  Possibly a regression, as I found a lot of old bug reports and posts through Google:

http://www.google.com/search?q=unregister_netdevice%3A+waiting+for+br0+to+become+free.+Usage+count+%3D+1

Mostly, it seemed to affect 2.6.11 or .12, though later versions pop up.  Supposedly, it was a reference leak in the bridge code, but some of the stuff I read blamed it on apps keeping stuff in a packet queue.
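
For the `bridge-nf-filter-pppoe-tagged` key quoted above, here is a minimal sketch of how it could be switched off at runtime. It assumes the key is exposed under `/proc/sys/net/bridge`, as the other `bridge-nf-*` keys are. The function name and the `BRIDGE_SYSCTL_DIR` variable are made up for illustration; the variable exists only so the snippet can be dry-run against a scratch directory.

```shell
#!/bin/sh
# Sketch: stop bridged PPPoE frames from being handed to {ip,ip6}tables.
# BRIDGE_SYSCTL_DIR is a hypothetical override used for dry runs;
# the default is where the bridge-nf-* keys are assumed to live.
BRIDGE_SYSCTL_DIR="${BRIDGE_SYSCTL_DIR:-/proc/sys/net/bridge}"

# Hypothetical helper: writes 0 to the key, or fails if the key is absent.
disable_pppoe_bridge_filter() {
    key="${BRIDGE_SYSCTL_DIR}/bridge-nf-filter-pppoe-tagged"
    if [ ! -e "$key" ]; then
        echo "no such key: $key (bridge netfilter not available?)" >&2
        return 1
    fi
    printf '0\n' > "$key"
}
```

Running `disable_pppoe_bridge_filter` as root should change the running kernel only. To keep the setting across reboots, a line such as `net.bridge.bridge-nf-filter-pppoe-tagged = 0` in /etc/sysctl.conf should do it, provided the bridge code is available by the time sysctl settings are applied.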

----------

## didymos

OK, well the problem with this:

```

unregister_netdevice: waiting for br0 to become free. Usage count = 1 

```

has been fixed in sys-kernel/gentoo-sources-2.6.22-r4.  I never got around to running -r3, so that may have fixed the problem as well.  Based on the kernel.org changelog, I suspect it was -r3 that did it (which corresponds to vanilla-sources-2.6.22.2), but I have no idea what specific change it was.

----------

## KUV

I have this problem with 2.6.22-gentoo-r8. A nasty bug! Any thoughts on what to do?

----------

## mistik1

This problem still persists on my current 2.6.24-gentoo-r8:

```

# uname -a

Linux nehemiah 2.6.24-gentoo-r8 #3 SMP Tue May 20 03:35:55 MDT 2008 i686 Intel(R) Pentium(R) Dual  CPU  E2180  @ 2.00GHz GenuineIntel GNU/Linux

```

```

03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)

```

```

04:06.0 Ethernet controller: D-Link System Inc DGE-530T Gigabit Ethernet Adapter (rev 11) (rev 11)

```

My bridge config is as follows:

```

config_lan0=( "null" )
config_wan0=( "null" )
bridge_br0="lan0 wan0"
config_br0=( "10.5.1.2/24" )
routes_br0=(
        "default via 10.5.1.1"
        "10.5.0.0/16 via 10.5.1.10"
)
brctl_br0=( "stp on" )

```

I have several machines running this exact config and hardware, and all of them have this problem. As a temporary measure I have removed the wait code from the kernel. Since these machines do not reboot very often, I have instituted a policy of rebooting for netdev changes (I hate doing this) so I don't end up with a kernel panic if something tries to access this reference.

This is really quite nasty, and if kernel.org can't get it fixed, can the Gentoo kernel herd give it a shot? Rebooting a machine that hundreds are depending on, just to add a new route for example, is bad practice at best. These machines also run things like squid that take some time to shut down, so the wait can be a bit tedious.

Here's to hoping someone can address this soon; this bug has been around a long time.

----------

