Stop duplicate icmp echo replies when bridging to a dummy interface?
- by mbrownnyc
I recently configured a bridge br0 with members as eth0 (real if) and dummy0 (dummy.ko if).
When I ping this machine, I receive duplicate replies as:
# ping SERVERA
PING SERVERA.domain.local (192.168.100.115) 56(84) bytes of data.
64 bytes from SERVERA.domain.local (192.168.100.115): icmp_seq=1 ttl=62 time=113 ms
64 bytes from SERVERA.domain.local (192.168.100.115): icmp_seq=1 ttl=62 time=114 ms (DUP!)
64 bytes from SERVERA.domain.local (192.168.100.115): icmp_seq=2 ttl=62 time=113 ms
64 bytes from SERVERA.domain.local (192.168.100.115): icmp_seq=2 ttl=62 time=113 ms (DUP!)
Using tcpdump on SERVERA, I was able to see icmp echo replies being sent from eth0 and br0 itself as follows (oddly two echo request packets arrive "from" my Windows box myhost):
23:19:05.324192 IP myhost.domain.local > SERVERA.domain.local: ICMP echo request, id 512, seq 43781, length 40
23:19:05.324212 IP SERVERA.domain.local > myhost.domain.local: ICMP echo reply, id 512, seq 43781, length 40
23:19:05.324217 IP myhost.domain.local > SERVERA.domain.local: ICMP echo request, id 512, seq 43781, length 40
23:19:05.324221 IP SERVERA.domain.local > myhost.domain.local: ICMP echo reply, id 512, seq 43781, length 40
23:19:05.324264 IP SERVERA.domain.local > myhost.domain.local: ICMP echo reply, id 512, seq 43781, length 40
23:19:05.324272 IP SERVERA.domain.local > myhost.domain.local: ICMP echo reply, id 512, seq 43781, length 40
It's worth noting, testing reveals that hosts on the same physical switch do not see DUP icmp echo responses (a host on the same VLAN on another switch does see a dup icmp echo response).
I've read that this could be due to the ARP table of a switch, but I can't find any info directly related to bridges, just bonds. I have a feeling my problem lay in the stack on linux, not the switch, but am opened to any suggestions.
The system is running centos6/el6 kernel 2.6.32-71.29.1.el6.i686.
How do I stop ICMP echo replies from being sent in duplicate when dealing with a bridge interface/bridged interfaces?
Thanks,
Matt
[edit]
Quick note:
It was recommended in #linux to:
[08:53] == mbrownnyc [gateway/web/freenode/] has joined ##linux
[08:57] <lkeijser> mbrownnyc: what happens if you set arp_ignore to 1 for the dummy interface?
[08:59] <lkeijser> also set arp_announce to 2 for that interface
[09:24] <mbrownnyc> lkeijser: I set arp_annouce to 2, arp_ignore to 2 in /etc/sysctl.conf and rebooted the machine... verifying that the bits are set after boot... the problem is still present
I did this and came up empty. Same dup problem.
I will be moving away from including the dummy interface in the bridge as:
[09:31] == mbrownnyc [gateway/web/freenode/] has joined #Netfilter
[09:31] <mbrownnyc> Hello all... I'm wondering, is it correct that even with an interface in PROMISC that the kernel will drop /some/ packets before they reach applications?
[09:31] <whaffle> What would you make think so?
[09:32] <mbrownnyc> I ask because I am receiving ICMP echo replies after configuring a bridge with a dummy interface in order for ipt_netflow to see all packets, only as reported in it's documentation: http://ipt-netflow.git.sourceforge.net/git/gitweb.cgi?p=ipt-netflow/ipt-netflow;a=blob;f=README.promisc
[09:32] <mbrownnyc> but I do not know if PROMISC will do the same job
[09:33] <mbrownnyc> I was referred here from #linux. any assistance is appreciated
[09:33] <whaffle> The following conditions need to be met: PROMISC is enabled (bridges and applications like tcpdump will do this automatically, otherwise they won't function).
[09:34] <whaffle> If an interface is part of a bridge, then all packets that enter the bridge should already be visible in the raw table.
[09:35] <mbrownnyc> thanks whaffle PROMISC must be set manually for ipt_netflow to function, but
[09:36] <whaffle> promisc does not need to be set manually, because the bridge will do it for you.
[09:36] <whaffle> When you do not have a bridge, you can easily create one, thereby rendering any kernel patches moot.
[09:36] <mbrownnyc> whaffle: I speak without the bridge
[09:36] <whaffle> It is perfectly valid to have a "half-bridge" with only a single interface in it.
[09:36] <mbrownnyc> whaffle: I am unfamiliar with the raw table, does this mean that PROMISC allows the raw table to be populated with packets the same as if the interface was part of a bridge?
[09:37] <whaffle> Promisc mode will cause packets with {a dst MAC address that does not equal the interface's MAC address} to be delivered from the NIC into the kernel nevertheless.
[09:37] <mbrownnyc> whaffle: I suppose I mean to clearly ask: what benefit would creating a bridge have over setting an interface PROMISC?
[09:38] <mbrownnyc> whaffle: from your last answer I feel that the answer to my question is "none," is this correct?
[09:39] <whaffle> Furthermore, the linux kernel itself has a check for {packets with a non-local MAC address}, so that packets that will not enter a bridge will be discarded as well, even in the face of PROMISC.
[09:46] <mbrownnyc> whaffle: so, this last bit of information is quite clearly why I would need and want a bridge in my situation
[09:46] <mbrownnyc> okay, the ICMP echo reply duplicate issue is likely out of the realm of this channel, but I sincerely appreciate the info on the kernels inner-workings
[09:52] <whaffle> mbrownnyc: either the kernel patch, or a bridge with an interface. Since the latter is quicker, yes
[09:54] <mbrownnyc> thanks whaffle
[edit2]
After removing the bridge, and removing the dummy kernel module, I only had a single interface chilling out, lonely. I still received duplicate icmp echo replies... in fact I received a random amount: http://pastebin.com/2LNs0GM8
The same thing doesn't happen on a few other hosts on the same switch, so it has to do with the linux box itself. I'll likely end up rebuilding it next week. Then... you know... this same thing will occur again.
[edit3]
Guess what? I rebuilt the box, and I'm still receiving duplicate ICMP echo replies. Must be the network infrastructure, although the ARP tables do not contain multiple entries.
[edit4]
How ridiculous.
The machine was a network probe, so I was (ingress and egress) mirroring an uplink port to a node that was the NIC. So, the flow (must have) gone like this:
ICMP echo request comes in through the mirrored uplink port.
(the real) ICMP echo request is received by the NIC
(the mirrored) ICMP echo request is received by the NIC
ICMP echo reply is sent for both.
I'm ashamed of myself, but now I know. It was suggested on #networking to either isolate the mirrored traffic to an interface that does not have IP enabled, or tag the mirrored packets with dot1q.