# Problems with bonding and ucarp

## Iced-Tux

Hello all,

I am trying to set up 2 SSH gateways on different ports of a Cisco Catalyst 3550.

Every PC has 2 NICs. To provide some high availability, I am trying to bond the NICs (kernel 2.6) in active-backup bonding mode and connect the 2 PCs through ucarp.

The bonding driver tells me that all links are up (100BaseT-FD), as does mii-tool.

ifconfig shows that only one interface has the ARP flag set, as it should be.

My problems are:

1) Switching to the other Ethernet device is awfully slow (takes up to 1 min)

2) If all eths are down and one eventually comes back up, the connection to the bonding device is not restored

3) Switching between master and slave with ucarp is also very slow

So in the worst case my high availability is broken: if all the eths on one PC go down temporarily, that PC is no longer reachable afterwards.

I have tried playing around with other module settings (downdelay/updelay and such), but with no success.
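When tuning these settings it can help to look at what the bonding driver itself reports. The following is only a sketch, assuming the driver exposes its state in `/proc/net/bonding/bond0`; the function names `active_slave` and `slave_states` and the overridable `BOND_STATUS` variable are made up for illustration.

```sh
#!/bin/sh
# Sketch: summarise the bonding driver's view of bond0.
# BOND_STATUS is an illustrative variable; it defaults to the status
# file that the bonding driver normally provides.
BOND_STATUS="${BOND_STATUS:-/proc/net/bonding/bond0}"

# Print the slave the driver currently uses (e.g. "eth0").
active_slave() {
    sed -n 's/^Currently Active Slave: //p' "$BOND_STATUS"
}

# Print one "slave status" pair per enslaved interface.
# The first "MII Status" line belongs to the bond itself and is skipped.
slave_states() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "$BOND_STATUS"
}
```

Calling `active_slave` before and after pulling a cable shows whether the driver itself fails over promptly, which helps separate a driver-side delay from a delay somewhere else on the path.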

My bonding script:

```
#!/bin/sh
# Init script for starting/stopping the bonding service

export PATH="${PATH:+$PATH:}/usr/sbin:/sbin"

case "$1" in
   start)
      echo "Bonding starting"
      # mode=1 is active-backup; miimon, downdelay and updelay are in ms
      modprobe bonding miimon=100 mode=1 use_carrier=1 downdelay=100 updelay=100 primary=eth0
      ifconfig bond0 192.168.0.2 netmask 255.255.255.0 broadcast 192.168.0.255 up
      route add default gw 192.168.0.1
      ifenslave bond0 eth0 eth1
      ;;
   stop)
      echo "Bonding stopping"
      ifconfig bond0 down
      rmmod bonding
      ;;
   *)
      echo "Usage: /etc/init.d/bonding {start|stop}"
      exit 1
      ;;
esac

exit 0
```
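The `miimon`, `downdelay` and `updelay` values on the `modprobe` line are all in milliseconds, and the bonding driver documentation asks for both delays to be multiples of `miimon` (otherwise they are rounded down). A small sketch that checks a set of values before loading the module; `check_bond_delays` is a hypothetical helper, not part of any bonding tool.

```sh
# Sketch: verify that downdelay/updelay are multiples of miimon.
# Usage: check_bond_delays MIIMON DOWNDELAY UPDELAY   (all in ms)
check_bond_delays() {
    miimon=$1 downdelay=$2 updelay=$3
    # miimon=0 disables MII link monitoring altogether
    if [ "$miimon" -le 0 ]; then
        echo "miimon must be > 0 for link monitoring" >&2
        return 1
    fi
    for delay in "$downdelay" "$updelay"; do
        if [ $((delay % miimon)) -ne 0 ]; then
            echo "delay $delay is not a multiple of miimon $miimon" >&2
            return 1
        fi
    done
    return 0
}
```

With the values from the script above (`check_bond_delays 100 100 100`) the check passes, so the parameters themselves are at least consistent with each other.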

My ucarp_service script (the addresses for the -s switch of ucarp and in the slave script naturally differ per machine):

```
#!/bin/sh
# /etc/init.d/ucarp: start/stop ucarp

export PATH="${PATH:+$PATH:}/usr/sbin:/sbin"

case "$1" in
   start)
      echo "Starting UCARP"
      ucarp -i bond0 -s 192.168.0.2 -v 42 -a 192.168.0.100 -p failover -u /etc/ucarp/master -d /etc/ucarp/slave -z -f local0 &
      #-i interface to bind to
      #-s "real ip" of interface
      #-v virtual server ID
      #-a virtual ip
      #-p shared password, encrypted
      #-u script executed when MASTER
      #-d script executed when BACKUP
      #-z use -d command when ucarp shuts down
      #-f facility for syslog
      # capture the PID of the backgrounded ucarp process
      echo $! > /var/run/ucarp.pid
      ;;
   stop)
      echo "Stopping UCARP"
      # kill the captured PID
      kill "$(cat /var/run/ucarp.pid)"
      ;;
   *)
      echo "Usage: /etc/init.d/ucarp {start|stop}"
      exit 1
      ;;
esac

exit 0
```
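The `stop` branch above complains if the PID file is missing or stale. A hedged sketch of a more defensive variant; `stop_from_pidfile` is a hypothetical helper name, and the path passed to it is whatever the `start` branch wrote.

```sh
# Sketch: stop a process recorded in a PID file, tolerating a missing
# or stale file.
stop_from_pidfile() {
    pidfile=$1
    [ -r "$pidfile" ] || return 0      # never started, nothing to do
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process exists
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        kill "$pid"
    fi
    rm -f "$pidfile"
}
```

In the init script this would be called as `stop_from_pidfile /var/run/ucarp.pid`.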

Master and slave files, which are called by ucarp_service:

```
#!/bin/sh
# /etc/ucarp/master: assigns the "virtual-ip" to the bonding device
ifconfig "$1" down
ifconfig "$1" 192.168.0.100 netmask 255.255.255.0 broadcast 192.168.0.255 up
ifenslave bond0 eth0 eth1
route add default gw 192.168.0.1

#------------------------------------------------------------------------------------

#!/bin/sh
# /etc/ucarp/slave: assigns the "real-ip" to the bonding device
ifconfig "$1" down
ifconfig "$1" 192.168.0.2 netmask 255.255.255.0 broadcast 192.168.0.255 up
ifenslave "$1" eth0 eth1
route add default gw 192.168.0.1
```

----------

## frilled

Hm, I am just about to configure ucarp with bonding, too. It seems a little strange to me that you re-assign the IP addresses of the bonding interface instead of just upping and downing the virtual IP, because the virtual IP is supposed to exist in addition to the 'real' IPs (because that's where they exchange the heartbeat protocol).

I have no problems with bonding (without ucarp), though. Takeover is instant, the preferred master is always activated if it gets a link (which is important since it's 1 Gbit whereas the failover is only 100 Mbit), and even after unplugging and reconnecting both NICs, everything still works as expected.

My approach was slightly different, but I can't see any big difference. I load the bonding module via /etc/modules.autoload.d/kernel-2.6, set all parameters in /etc/conf.d/net (iface_bond0 ....) and added a "postup()" function to /etc/init.d/net.bond0 where the two physical interfaces are (if)enslaved. So no big difference other than I feel it's a little more 'the Gentoo way' like this.

I'll get back to you as soon as I got ucarp up and running (still emerging mysql to have a real testbed).

----------

## frilled

So, I changed the up and down scripts to this:

```
#! /bin/sh
# "up" script
/sbin/ifconfig $1:1 10.2.<#>.<#> netmask 255.255.0.0 broadcast 10.2.255.255 up
```

```
#! /bin/sh
# "down" script
/sbin/ifconfig $1:1 down
```

and it works, switching is *very* fast. As you can see I simply put up / drop an alias (:1) for the given interface.

It's not nice yet, since during initialization the down-script is called and produces an error since the alias does not exist yet. But this was just a proof of concept. I'm going to start working on the real solution now.
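One way around that start-up error would be a down script that tolerates a missing alias. A minimal sketch, assuming ucarp passes the interface name as the first argument (as in the scripts above); the `drop_alias` function and the overridable `IFCONFIG` variable are illustrative only, not part of ucarp.

```sh
#!/bin/sh
# Sketch of a "down" script that stays quiet when the alias was never
# brought up. IFCONFIG can be overridden, e.g. for testing.
IFCONFIG="${IFCONFIG:-/sbin/ifconfig}"

drop_alias() {
    # $1 is the interface name, e.g. bond0.
    # Discard the error message and always report success.
    "$IFCONFIG" "$1:1" down 2>/dev/null || true
}

# Run only when an interface argument was given.
if [ -n "${1:-}" ]; then
    drop_alias "$1"
fi
```

This simply hides the error rather than checking for the alias first, which keeps the script short but also hides genuine failures.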

BTW: I changed the init script to Gentoo conventions, too ;)

----------

## UberLord

*wgi wrote:*

> It's not nice yet, since during initialization the down-script is called and produces an error since the alias does not exist yet. But this was just a proof of concept. I'm going to start working on the real solution now.
>
> BTW: I changed the init script to Gentoo conventions, too

 

In that case you may want to think about writing a network module instead of a full init script. baselayout-1.11.12-r4 is a good starting point and will go stable tomorrow - unless Some Bad Bug gets reported.

Emerge that and check out the sample modules in /lib/rcscripts/net.modules.d

----------

## frilled

Thanks for the tip. Been waiting for a new stable baselayout for a while (because support for bonding is hacked in via postup() now). I hope the "Bad Bugs" stay away for now :)

----------

