Unexplained packet drops with 5 Ethernet NICs and low traffic on Ubuntu 12.04
- by jon
I'm stuck on a problem where my machine started dropping packets after an upgrade to Ubuntu 12.04, with no sign of ANY system load or high interrupt usage. The server is a network monitoring sensor running Ubuntu 12.04 LTS; it passively collects packets from 5 interfaces, doing network-intrusion-detection type work. Before the upgrade, with the help of CPU affinity and NIC-IRQ-to-CPU bindings, I managed to collect 200+ GB of packets a day while writing them to disk, with around 0% packet loss depending on the day. Now I lose a great deal of packets with none of my applications running, and at a very low PPS rate that a modern workstation NIC would have no trouble with.
Specs:
x64 Xeon, 4 cores @ 3.2 GHz
16 GB RAM
NICs: 5 Intel PRO/1000 NICs using the e1000 driver (NAPI). [1]
eth0 and eth1 are integrated NICs (on the motherboard).
There are 2 other PCI-X network cards, each with 2 Ethernet ports.
3 of the interfaces are running at Gigabit Ethernet; the others are not, because they're attached to hubs.
Full specs for the server: [2]
# uptime
17:36:00 up 1:43, 2 users, load average: 0.00, 0.01, 0.05
# uname -a
Linux nms 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
I also have the CPU governor set to performance mode and irqbalance turned off. The problem still occurs with them left on.
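For reference, this is roughly how I set those two things (a sketch of my own setup; the cpu[0-3] paths assume a 4-core box like this one):
# for c in /sys/devices/system/cpu/cpu[0-3]; do echo performance > $c/cpufreq/scaling_governor; done
# service irqbalance stop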
# lspci -t -vv
-[0000:00]-+-00.0 Intel Corporation E7520 Memory Controller Hub
+-02.0-[01-03]--+-00.0-[02]----0e.0 Dell PowerEdge Expandable RAID controller 4
| \-00.2-[03]--
+-04.0-[04]--
+-05.0-[05-07]--+-00.0-[06]----07.0 Intel Corporation 82541GI Gigabit Ethernet Controller
| \-00.2-[07]----08.0 Intel Corporation 82541GI Gigabit Ethernet Controller
+-06.0-[08-0a]--+-00.0-[09]--+-04.0 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
| | \-04.1 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
| \-00.2-[0a]--+-02.0 Digium, Inc. Wildcard TE210P/TE212P dual-span T1/E1/J1 card 3.3V
| +-03.0 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
| \-03.1 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
+-1d.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1
+-1d.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2
+-1d.2 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3
+-1d.7 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller
+-1e.0-[0b]----0d.0 Advanced Micro Devices [AMD] nee ATI RV100 QY [Radeon 7000/VE]
+-1f.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge
\-1f.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller
I believe neither the NICs nor the NIC driver are dropping the packets, because ethtool reports
0 under rx_missed_errors and rx_no_buffer_count for each interface. On the old system, if it couldn't keep up, this is where the drops would show. I drop packets on multiple interfaces just about every second, usually in small increments of 2-4.
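To double-check that claim, I snapshot the drop-related counters on every capture interface with a quick loop like this (my own throwaway; the counter names are the ones e1000 exposes in the ethtool output further down):
for i in eth0 eth1 eth2 eth3 eth4; do
    echo "== $i =="
    ethtool -S $i | egrep 'rx_missed_errors|rx_no_buffer_count|alloc_rx_buff_failed'
done
All three counters read 0 on every interface for me, even while the drop counters in /proc/net/dev climb.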
I tried all of these sysctl values; I'm currently using the uncommented ones.
# cat /etc/sysctl.conf
# high
net.core.netdev_max_backlog = 3000000
net.core.rmem_max = 16000000
net.core.rmem_default = 8000000
# defaults
#net.core.netdev_max_backlog = 1000
#net.core.rmem_max = 131071
#net.core.rmem_default = 163480
# moderate
#net.core.netdev_max_backlog = 10000
#net.core.rmem_max = 33554432
#net.core.rmem_default = 33554432
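Since netdev_max_backlog is one of the knobs above, I also check whether the kernel's per-CPU input backlog is actually overflowing. If I read the format right, the second hex column of /proc/net/softnet_stat is the per-CPU drop count from exactly that queue:
# awk '{ print "cpu" NR-1 ": dropped=0x" $2 }' /proc/net/softnet_stat
(One row per CPU; a non-zero second column would point at the backlog rather than the NIC.)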
Here's an example of an interface stats report from ethtool. They are all the same and nothing looks out of the ordinary (I think), so I'm only going to show one:
# ethtool -S eth2
NIC statistics:
rx_packets: 7498
tx_packets: 0
rx_bytes: 2722585
tx_bytes: 0
rx_broadcast: 327
tx_broadcast: 0
rx_multicast: 1504
tx_multicast: 0
rx_errors: 0
tx_errors: 0
tx_dropped: 0
multicast: 1504
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_no_buffer_count: 0
rx_missed_errors: 0
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_deferred_ok: 0
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
tx_timeout_count: 0
tx_restart_queue: 0
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_good: 0
tx_tcp_seg_failed: 0
rx_flow_control_xon: 0
rx_flow_control_xoff: 0
tx_flow_control_xon: 0
tx_flow_control_xoff: 0
rx_long_byte_count: 2722585
rx_csum_offload_good: 0
rx_csum_offload_errors: 0
alloc_rx_buff_failed: 0
tx_smbus: 0
rx_smbus: 0
dropped_smbus: 01
# ifconfig
eth0 Link encap:Ethernet HWaddr 00:11:43:e0:e2:8c
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:373348 errors:16 dropped:95 overruns:0 frame:16
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:356830572 (356.8 MB) TX bytes:0 (0.0 B)
eth1 Link encap:Ethernet HWaddr 00:11:43:e0:e2:8d
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:13616 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:8690528 (8.6 MB) TX bytes:0 (0.0 B)
eth2 Link encap:Ethernet HWaddr 00:04:23:e1:77:6a
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:7750 errors:0 dropped:471 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2780935 (2.7 MB) TX bytes:0 (0.0 B)
eth3 Link encap:Ethernet HWaddr 00:04:23:e1:77:6b
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:5112 errors:0 dropped:206 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:639472 (639.4 KB) TX bytes:0 (0.0 B)
eth4 Link encap:Ethernet HWaddr 00:04:23:b6:35:6c
UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1
RX packets:961467 errors:0 dropped:935 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:958561305 (958.5 MB) TX bytes:0 (0.0 B)
eth5 Link encap:Ethernet HWaddr 00:04:23:b6:35:6d
inet addr:192.168.1.6 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:4264 errors:0 dropped:16 overruns:0 frame:0
TX packets:699 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:572228 (572.2 KB) TX bytes:124456 (124.4 KB)
I tried the defaults, then started to play around with settings. Before the upgrade I was already running without flow control and with RxDescriptors raised to 4096, without any problems.
# cat /etc/modprobe.d/e1000.conf
options e1000 XsumRX=0,0,0,0,0 RxDescriptors=4096,4096,4096,4096,4096 FlowControl=0,0,0,0,0 debug=16
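To be sure those options actually stuck after the upgrade, rather than trusting the modprobe file, I verify the runtime state per interface with ethtool:
# ethtool -g eth2
# ethtool -k eth2
# ethtool -a eth2
-g should report an RX ring of 4096, -k should show rx-checksumming off, and -a should show autonegotiate and RX/TX pause all off.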
Here's my network configuration file. I turned off checksumming and various offloading mechanisms, and set CPU affinity so that heavy-use interfaces get an entire CPU and light-use interfaces share a CPU. I used these settings prior to the upgrade without problems.
# cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet manual
pre-up /sbin/ethtool -G eth0 rx 4096 tx 0
pre-up /sbin/ethtool -K eth0 gro off gso off rx off
pre-up /sbin/ethtool -A eth0 rx off autoneg off
up ifconfig eth0 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "4" > /proc/irq/48/smp_affinity
down ifconfig eth0 down
post-down /sbin/ethtool -G eth0 rx 256 tx 256
post-down /sbin/ethtool -K eth0 gro on gso on rx on
post-down /sbin/ethtool -A eth0 rx on autoneg on
auto eth1
iface eth1 inet manual
pre-up /sbin/ethtool -G eth1 rx 4096 tx 0
pre-up /sbin/ethtool -K eth1 gro off gso off rx off
pre-up /sbin/ethtool -A eth1 rx off autoneg off
up ifconfig eth1 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "4" > /proc/irq/49/smp_affinity
down ifconfig eth1 down
post-down /sbin/ethtool -G eth1 rx 256 tx 256
post-down /sbin/ethtool -K eth1 gro on gso on rx on
post-down /sbin/ethtool -A eth1 rx on autoneg on
auto eth2
iface eth2 inet manual
pre-up /sbin/ethtool -G eth2 rx 4096 tx 0
pre-up /sbin/ethtool -K eth2 gro off gso off rx off
pre-up /sbin/ethtool -A eth2 rx off autoneg off
up ifconfig eth2 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "1" > /proc/irq/82/smp_affinity
down ifconfig eth2 down
post-down /sbin/ethtool -G eth2 rx 256 tx 256
post-down /sbin/ethtool -K eth2 gro on gso on rx on
post-down /sbin/ethtool -A eth2 rx on autoneg on
auto eth3
iface eth3 inet manual
pre-up /sbin/ethtool -G eth3 rx 4096 tx 0
pre-up /sbin/ethtool -K eth3 gro off gso off rx off
pre-up /sbin/ethtool -A eth3 rx off autoneg off
up ifconfig eth3 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "2" > /proc/irq/83/smp_affinity
down ifconfig eth3 down
post-down /sbin/ethtool -G eth3 rx 256 tx 256
post-down /sbin/ethtool -K eth3 gro on gso on rx on
post-down /sbin/ethtool -A eth3 rx on autoneg on
auto eth4
iface eth4 inet manual
pre-up /sbin/ethtool -G eth4 rx 4096 tx 0
pre-up /sbin/ethtool -K eth4 gro off gso off rx off
pre-up /sbin/ethtool -A eth4 rx off autoneg off
up ifconfig eth4 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
post-up echo "4" > /proc/irq/77/smp_affinity
down ifconfig eth4 down
post-down /sbin/ethtool -G eth4 rx 256 tx 256
post-down /sbin/ethtool -K eth4 gro on gso on rx on
post-down /sbin/ethtool -A eth4 rx on autoneg on
auto eth5
iface eth5 inet static
pre-up /etc/fw.conf
address 192.168.1.1
netmask 255.255.255.0
broadcast 192.168.1.255
gateway 192.168.1.1
dns-nameservers 192.168.1.2 192.168.1.3
up ifconfig eth5 up
post-up echo "8" > /proc/irq/77/smp_affinity
down ifconfig eth5 down
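One thing I'm now suspicious of: the IRQ numbers in those post-up lines are hardcoded, and they can move around across kernel or ACPI changes. In fact, eth5's post-up writes to /proc/irq/77/smp_affinity, but /proc/interrupts further down shows eth5 on IRQ 78 and eth4 on 77, so eth5's line clobbers eth4's mask. A small helper (my own sketch; it assumes the /proc/interrupts line ends with the interface name, which holds for e1000 here) looks the IRQ up by name instead:
#!/bin/sh
# set-affinity.sh IFACE MASK - pin IFACE's IRQ to the CPUs in MASK
IFACE=$1
MASK=$2
IRQ=$(awk -v ifc="$IFACE" '$NF == ifc { sub(":", "", $1); print $1 }' /proc/interrupts)
[ -n "$IRQ" ] && echo "$MASK" > /proc/irq/$IRQ/smp_affinity
Each post-up echo line would then become something like: post-up /usr/local/sbin/set-affinity.sh eth2 1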
Here are a few examples of packet drops. I ran these one after another, probably totaling
3 or 4 seconds, and you can see the drop counters increase between the 1st and 3rd runs. This was a non-busy time with very little traffic.
# awk '{ print $1,$5 }' /proc/net/dev
Inter-|
face drop
eth3: 225
lo: 0
eth2: 505
eth1: 0
eth5: 17
eth0: 105
eth4: 1034
# awk '{ print $1,$5 }' /proc/net/dev
Inter-|
face drop
eth3: 225
lo: 0
eth2: 507
eth1: 0
eth5: 17
eth0: 105
eth4: 1034
# awk '{ print $1,$5 }' /proc/net/dev
Inter-|
face drop
eth3: 227
lo: 0
eth2: 512
eth1: 0
eth5: 17
eth0: 105
eth4: 1039
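To make those per-second increments easier to see, watch with difference highlighting works well (same column 5 as above):
# watch -d -n 1 "awk '{ print \$1, \$5 }' /proc/net/dev"
Any counter that ticks up between refreshes gets highlighted.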
I tried the pci=noacpi option; with and without it, the drops are the same. The interrupt stats below are also what they looked like before the upgrade. After the upgrade, with ACPI handling PCI, multiple NICs were bound to the same interrupt and shared it with other devices such as the USB controllers, which I didn't like, so I think I'm going to keep ACPI off since it's easier to dedicate interrupts to a single purpose. Is there any advantage to using the default, i.e. ACPI with PCI?
# cat /etc/default/grub | grep CMD_LINE
GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 noacpi pci=noacpi"
GRUB_CMDLINE_LINUX=""
# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
0: 45 0 0 16 IO-APIC-edge timer
1: 1 0 0 7936 IO-APIC-edge i8042
2: 0 0 0 0 XT-PIC-XT-PIC cascade
6: 0 0 0 3 IO-APIC-edge floppy
8: 0 0 0 1 IO-APIC-edge rtc0
9: 0 0 0 0 IO-APIC-edge acpi
12: 0 0 0 1809 IO-APIC-edge i8042
14: 1 0 0 4498 IO-APIC-edge ata_piix
15: 0 0 0 0 IO-APIC-edge ata_piix
16: 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb2
18: 0 0 0 1350 IO-APIC-fasteoi uhci_hcd:usb4, radeon
19: 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb3
23: 0 0 0 4099 IO-APIC-fasteoi ehci_hcd:usb1
38: 0 0 0 61963 IO-APIC-fasteoi megaraid
48: 0 0 1002319 4 IO-APIC-fasteoi eth0
49: 0 0 38772 3 IO-APIC-fasteoi eth1
77: 0 0 130076 432159 IO-APIC-fasteoi eth4
78: 0 0 0 23917 IO-APIC-fasteoi eth5
82: 1329033 0 0 4 IO-APIC-fasteoi eth2
83: 0 4886525 0 6 IO-APIC-fasteoi eth3
NMI: 5 6 4 5 Non-maskable interrupts
LOC: 61409 57076 64257 114764 Local timer interrupts
SPU: 0 0 0 0 Spurious interrupts
IWI: 0 0 0 0 IRQ work interrupts
RES: 17956 25333 13436 14789 Rescheduling interrupts
CAL: 22436 607 539 478 Function call interrupts
TLB: 1525 1458 4600 4151 TLB shootdowns
TRM: 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 Machine check exceptions
MCP: 16 16 16 16 Machine check polls
ERR: 0
MIS: 0
Here's sample vmstat output, showing the system essentially idle. It's a barebones system right now.
root@nms:~# vmstat -S m 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 14992 192 1029 0 0 56 2 419 29 1 0 99 0
0 0 0 14992 192 1029 0 0 0 0 922 27 0 0 100 0
0 0 0 14991 192 1029 0 0 0 36 763 50 0 0 100 0
0 0 0 14991 192 1029 0 0 0 0 646 35 0 0 100 0
0 0 0 14991 192 1029 0 0 0 0 722 54 0 0 100 0
0 0 0 14991 192 1029 0 0 0 0 793 27 0 0 100 0
^C
Here's the dmesg output. I can't figure out why my PCI-X slots are negotiated as PCI.
The network cards are all PCI-X, with the exception of the integrated NICs that came with the server. In the output below it looks as if eth2 and eth3 negotiated at PCI-X speeds rather than PCI:66MHz. Wouldn't they all drop to PCI:66MHz? If the integrated NICs are PCI, as labeled below (eth0, eth1), wouldn't every device on that bus drop down to the slower bus speed? If not, I still don't know why only one of my cards (each has two Ethernet ports) is labeled as PCI-X in the output below. Does that mean it is actually running at PCI-X speeds, or is it just showing that it's capable?
# dmesg | grep e1000
[ 3678.349337] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[ 3678.349342] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 3678.349394] e1000 0000:06:07.0: PCI->APIC IRQ transform: INT A -> IRQ 48
[ 3678.409725] e1000 0000:06:07.0: Receive Descriptors set to 4096
[ 3678.409730] e1000 0000:06:07.0: Checksum Offload Disabled
[ 3678.409734] e1000 0000:06:07.0: Flow Control Disabled
[ 3678.586409] e1000 0000:06:07.0: eth0: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8c
[ 3678.586419] e1000 0000:06:07.0: eth0: Intel(R) PRO/1000 Network Connection
[ 3678.586642] e1000 0000:07:08.0: PCI->APIC IRQ transform: INT A -> IRQ 49
[ 3678.649854] e1000 0000:07:08.0: Receive Descriptors set to 4096
[ 3678.649859] e1000 0000:07:08.0: Checksum Offload Disabled
[ 3678.649863] e1000 0000:07:08.0: Flow Control Disabled
[ 3678.826436] e1000 0000:07:08.0: eth1: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8d
[ 3678.826444] e1000 0000:07:08.0: eth1: Intel(R) PRO/1000 Network Connection
[ 3678.826627] e1000 0000:09:04.0: PCI->APIC IRQ transform: INT A -> IRQ 82
[ 3679.093266] e1000 0000:09:04.0: Receive Descriptors set to 4096
[ 3679.093271] e1000 0000:09:04.0: Checksum Offload Disabled
[ 3679.093275] e1000 0000:09:04.0: Flow Control Disabled
[ 3679.130239] e1000 0000:09:04.0: eth2: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6a
[ 3679.130246] e1000 0000:09:04.0: eth2: Intel(R) PRO/1000 Network Connection
[ 3679.130449] e1000 0000:09:04.1: PCI->APIC IRQ transform: INT B -> IRQ 83
[ 3679.397312] e1000 0000:09:04.1: Receive Descriptors set to 4096
[ 3679.397318] e1000 0000:09:04.1: Checksum Offload Disabled
[ 3679.397321] e1000 0000:09:04.1: Flow Control Disabled
[ 3679.434350] e1000 0000:09:04.1: eth3: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6b
[ 3679.434360] e1000 0000:09:04.1: eth3: Intel(R) PRO/1000 Network Connection
[ 3679.434553] e1000 0000:0a:03.0: PCI->APIC IRQ transform: INT A -> IRQ 77
[ 3679.704072] e1000 0000:0a:03.0: Receive Descriptors set to 4096
[ 3679.704077] e1000 0000:0a:03.0: Checksum Offload Disabled
[ 3679.704081] e1000 0000:0a:03.0: Flow Control Disabled
[ 3679.738364] e1000 0000:0a:03.0: eth4: (PCI:33MHz:64-bit) 00:04:23:b6:35:6c
[ 3679.738371] e1000 0000:0a:03.0: eth4: Intel(R) PRO/1000 Network Connection
[ 3679.738538] e1000 0000:0a:03.1: PCI->APIC IRQ transform: INT B -> IRQ 78
[ 3680.046060] e1000 0000:0a:03.1: eth5: (PCI:33MHz:64-bit) 00:04:23:b6:35:6d
[ 3680.046067] e1000 0000:0a:03.1: eth5: Intel(R) PRO/1000 Network Connection
[ 3682.132415] e1000: eth0 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.224423] e1000: eth1 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.316385] e1000: eth2 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.408391] e1000: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 3682.500396] e1000: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 3682.708401] e1000: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
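As a partial data point on the PCI-X question above, the PCI-X capability registers can be dumped per device (bus addresses taken from the lspci tree earlier). I'm honestly not sure how much of what they report is "capable" versus "currently negotiated", but it's something to compare against the dmesg lines:
# lspci -vv -s 09:04.0 | grep -i -A3 PCI-X
# lspci -vv -s 0a:03.0 | grep -i -A3 PCI-X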
At first I thought it was the NIC drivers, but now I'm not so sure. I really have no idea where else to look at the moment.
Any help is greatly appreciated, as I'm struggling with this. If you need more information, just ask.
Thanks!
[1] http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/Documentation/networking/e1000.txt?v=2.6.11.8
[2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm