How do I keep a bridge enabled on a bonded interface?
- by jlawer
I'm setting up a pair of CentOS 6.3 servers that will run a couple of KVM VMs, and I have come across a problem setting up a bridge on top of a bond.
I am using mode 4 (802.3ad) bonding on a pair of stacked Dell PowerConnect 5524 switches connecting to R320 servers. Each server has 2 links (1 to each switch) that form a Link Aggregation Group (802.3ad / LACP bonding). On top of the bond I have VLAN tagging.
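A layout like this is usually expressed on CentOS 6 with ifcfg files along the following lines. This is an illustrative sketch, not a copy of the files on these servers: the slave names eth0/eth1, the miimon=100 setting and the lack of IP settings are assumptions. Only bond0, bond0.8 (VLAN 8), br1, mode 4 and STP being off come from this post.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 (assumed name; eth1 is the same with DEVICE=eth1)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
# mode=4 is 802.3ad; miimon=100 is an assumed link-monitoring interval in ms
BONDING_OPTS="mode=4 miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-bond0.8 (VLAN 8 on top of the bond)
DEVICE=bond0.8
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
BRIDGE=br1

# /etc/sysconfig/network-scripts/ifcfg-br1 (bridge used by the KVM guests)
DEVICE=br1
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
STP=off
DELAY=0
```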
I've verified that the problem also occurs with several other bonding modes, so it isn't just a mode 4 issue.
I am testing what happens when 1 link is dropped (i.e. a switch dies, a cable breaks, etc.).
If I don't have a bridge (for KVM), everything works fine and failover happens as expected.
If I have the bridge enabled, it works fine until failover (unplugging a cable). When failover happens, /var/log/messages shows the slave link going down, followed within a second by:
kernel: br1: port 1(bond0.8) entering disabled state
The thing is, /proc/net/bonding/bond0 shows the bond is up as expected (simply with only 1 slave instead of 2). If I plug the cable back in, it recovers and brings the bridge port back to an enabled state.
I have also tested this while a ping is running, and if the timing is right a packet will actually leave the system after the link is lost but before the disabled message appears.
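To see which layer first reports the loss of link, the three views can be read side by side. The script below is a rough sketch. It takes the device names from the log line (bond0, bond0.8, br1) and reads the standard /proc/net/bonding and /sys/class/net files. NET_ROOT and BOND_ROOT are hypothetical overrides added only so it can be dry-run against a fake tree. As far as I understand the bridge code, a port is also moved to the disabled state when its underlying device reports no carrier, independently of STP, so the carrier of bond0.8 is the value to watch.

```shell
#!/bin/sh
# Compare what the bond, the VLAN device and the bridge port each report.
# Device names (bond0, bond0.8, br1) come from the log line in the post.
# NET_ROOT and BOND_ROOT are hypothetical overrides for dry runs only.
NET_ROOT=${NET_ROOT:-/sys/class/net}
BOND_ROOT=${BOND_ROOT:-/proc/net/bonding}

# Map the numeric bridge port state from sysfs to its usual name.
port_state_name() {
    case "$1" in
        0) echo disabled ;;
        1) echo listening ;;
        2) echo learning ;;
        3) echo forwarding ;;
        4) echo blocking ;;
        *) echo unknown ;;
    esac
}

# Print a file's contents, or "n/a" if it cannot be read.
read_or_na() {
    cat "$1" 2>/dev/null || echo "n/a"
}

# Usage: snapshot BOND VLAN_DEVICE BRIDGE
snapshot() {
    bond=$1; vlan=$2; bridge=$3
    # The first "MII Status" line in the bonding file is the bond's own status.
    mii=$(awk '/^MII Status:/ {print $3; exit}' "$BOND_ROOT/$bond" 2>/dev/null)
    echo "$bond MII status: ${mii:-n/a}"
    echo "$vlan carrier: $(read_or_na "$NET_ROOT/$vlan/carrier")"
    echo "$vlan operstate: $(read_or_na "$NET_ROOT/$vlan/operstate")"
    state=$(read_or_na "$NET_ROOT/$bridge/brif/$vlan/state")
    echo "$bridge port $vlan: $(port_state_name "$state")"
}

# Only take a snapshot when all three device names are given.
if [ "$#" -eq 3 ]; then
    snapshot "$1" "$2" "$3"
fi
```

Saved under a name such as snapshot.sh (hypothetical), it would be run as `sh snapshot.sh bond0 bond0.8 br1` once with both cables in and once with one pulled. The two outputs can then be compared with the timestamps in /var/log/messages.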
I assumed this disabled state was caused by STP, but I have disabled STP in the bridge configuration and the issue still occurs.
brctl showstp br1
still shows the port as disabled when the bond is running on only one slave.
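To rule STP out without relying on the brctl output alone, the bridge's STP setting can also be read from sysfs. This is a minimal sketch, assuming the usual /sys/class/net/BRIDGE/bridge/stp_state file, where 0 means STP is off, 1 means kernel STP and 2 means userspace STP. NET_ROOT is a hypothetical override for dry runs.

```shell
#!/bin/sh
# Report whether STP is really off on a bridge, using sysfs.
# NET_ROOT is a hypothetical override so the script can be dry-run.
NET_ROOT=${NET_ROOT:-/sys/class/net}

# Map the numeric stp_state value to a readable name.
stp_state_name() {
    case "$1" in
        0) echo "off" ;;
        1) echo "kernel STP" ;;
        2) echo "userspace STP" ;;
        *) echo "unknown" ;;
    esac
}

# Usage: bridge_stp BRIDGE
bridge_stp() {
    stp_state_name "$(cat "$NET_ROOT/$1/bridge/stp_state" 2>/dev/null)"
}

# Only report when a bridge name is given, e.g. br1.
if [ "$#" -eq 1 ]; then
    echo "$1 STP: $(bridge_stp "$1")"
fi
```

If this reports "off" for br1 while the port is still shown as disabled, STP is unlikely to be what disables the port.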
I also switched between the NICs in the server (I have 2x Broadcom and 4x Intel). The result is the same whichever combination I use.
Does anyone know of a way to force the bridge port to stay enabled, or why it's detecting the bond as disabled when it isn't?