Unexplainable packet drops with 5 Ethernet NICs and low traffic on Ubuntu

Posted by jon on Server Fault
Published on 2012-09-05T21:54:09Z

I'm stuck on a problem: after an upgrade to Ubuntu 12.04, my machine started dropping packets with no sign of any system load or high interrupt usage. The server is a network monitoring sensor running Ubuntu 12.04 LTS; it passively collects packets from 5 interfaces for network intrusion detection type work. Before the upgrade I collected 200+ GB of packets a day, writing them to disk with around 0% packet loss (depending on the day), with the help of CPU affinity and NIC IRQ-to-CPU bindings. Now I lose a great many packets with none of my applications running, at a very low PPS rate that a modern workstation NIC would have no trouble with.

Specs:
x64 Xeon, 4 cores, 3.2 GHz
16 GB RAM
NICs: 5 Intel Pro NICs using the e1000 driver (NAPI) [1]
eth0 and eth1 are integrated NICs (on the motherboard).
There are 2 other PCI-X network cards, each with 2 Ethernet ports.

3 of the interfaces are running at Gigabit Ethernet, the others are not because they're attached to hubs.

Specs: [2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm

# uptime
 17:36:00 up  1:43,  2 users,  load average: 0.00, 0.01, 0.05

# uname -a
Linux nms 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64     x86_64 GNU/Linux

I also have the CPU governor set to performance mode and irqbalance off. The problem still occurs with them on.

# lspci -t -vv
-[0000:00]-+-00.0  Intel Corporation E7520 Memory Controller Hub
           +-02.0-[01-03]--+-00.0-[02]----0e.0  Dell PowerEdge Expandable RAID controller 4
           |               \-00.2-[03]--
           +-04.0-[04]--
           +-05.0-[05-07]--+-00.0-[06]----07.0  Intel Corporation 82541GI Gigabit Ethernet Controller
           |               \-00.2-[07]----08.0  Intel Corporation 82541GI Gigabit Ethernet Controller
           +-06.0-[08-0a]--+-00.0-[09]--+-04.0  Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
           |               |            \-04.1  Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
           |               \-00.2-[0a]--+-02.0  Digium, Inc. Wildcard TE210P/TE212P dual-span T1/E1/J1 card 3.3V
           |                            +-03.0  Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
           |                            \-03.1  Intel Corporation 82546EB Gigabit Ethernet Controller (Copper)
           +-1d.0  Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1
           +-1d.1  Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2
           +-1d.2  Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3
           +-1d.7  Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller
           +-1e.0-[0b]----0d.0  Advanced Micro Devices [AMD] nee ATI RV100 QY [Radeon 7000/VE]
           +-1f.0  Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge
           \-1f.1  Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller

I believe neither the NICs nor the NIC driver are dropping the packets, because ethtool reports 0 under rx_missed_errors and rx_no_buffer_count for each interface. On the old system, if it couldn't keep up, this is where the drops would show. I drop packets on multiple interfaces just about every second, usually in small increments of 2-4.

I tried all of these sysctl values; I'm currently using the uncommented ones.

# cat /etc/sysctl.conf
# high
net.core.netdev_max_backlog = 3000000
net.core.rmem_max = 16000000
net.core.rmem_default = 8000000
# defaults
#net.core.netdev_max_backlog = 1000
#net.core.rmem_max = 131071
#net.core.rmem_default = 163480
# moderate
#net.core.netdev_max_backlog = 10000
#net.core.rmem_max = 33554432
#net.core.rmem_default = 33554432

Here's an example of the interface statistics reported by ethtool. They all look the same and nothing is out of the ordinary (I think), so I'm only going to show one:

# ethtool -S eth2
NIC statistics:
     rx_packets: 7498
     tx_packets: 0
     rx_bytes: 2722585
     tx_bytes: 0
     rx_broadcast: 327
     tx_broadcast: 0
     rx_multicast: 1504
     tx_multicast: 0
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 1504
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 2722585
     rx_csum_offload_good: 0
     rx_csum_offload_errors: 0
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 01
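
With this many counters per interface, it can help to print only the ones that are not zero. The following is a minimal sketch, assuming the `name: value` layout shown above; `nonzero_stats` is a hypothetical helper name, not part of ethtool.

```shell
# Print only the counters whose value is not zero from `ethtool -S` output
# read on stdin. The "NIC statistics:" header carries no value and is skipped.
nonzero_stats() {
    awk -F': *' 'NF == 2 && $2 + 0 != 0 {
        sub(/^[ \t]+/, "", $1)      # strip the leading indentation
        print $1 ": " $2
    }'
}

# Usage (eth2 is the interface shown above):
#   ethtool -S eth2 | nonzero_stats
```

Running it once per interface makes it easier to compare the NICs side by side without scrolling through pages of zeros.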

# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:11:43:e0:e2:8c  
          UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:373348 errors:16 dropped:95 overruns:0 frame:16
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:356830572 (356.8 MB)  TX bytes:0 (0.0 B)

eth1      Link encap:Ethernet  HWaddr 00:11:43:e0:e2:8d  
          UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:13616 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:8690528 (8.6 MB)  TX bytes:0 (0.0 B)

eth2      Link encap:Ethernet  HWaddr 00:04:23:e1:77:6a  
          UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:7750 errors:0 dropped:471 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2780935 (2.7 MB)  TX bytes:0 (0.0 B)

eth3      Link encap:Ethernet  HWaddr 00:04:23:e1:77:6b  
          UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:5112 errors:0 dropped:206 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:639472 (639.4 KB)  TX bytes:0 (0.0 B)

eth4      Link encap:Ethernet  HWaddr 00:04:23:b6:35:6c  
          UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:961467 errors:0 dropped:935 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:958561305 (958.5 MB)  TX bytes:0 (0.0 B)

eth5      Link encap:Ethernet  HWaddr 00:04:23:b6:35:6d  
          inet addr:192.168.1.6  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4264 errors:0 dropped:16 overruns:0 frame:0
          TX packets:699 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:572228 (572.2 KB)  TX bytes:124456 (124.4 KB)
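
To put the dropped counts in proportion, the RX lines can be reduced to a rough drop percentage per interface. This is only a sketch, assuming the older net-tools `ifconfig` layout shown above; `rx_drop_ratio` is a hypothetical helper name, and whether dropped packets are also included in "RX packets" may depend on the driver, so treat the figure as approximate.

```shell
# Read `ifconfig` output on stdin and print, for each interface, the RX
# dropped count as a percentage of RX packets.
rx_drop_ratio() {
    awk '
        /^[^ \t]/ { iface = $1 }                 # a stanza starts at column 0
        $1 == "RX" && $2 ~ /^packets:/ {
            split($2, p, ":")                    # packets:<n>
            split($4, d, ":")                    # dropped:<n>
            pct = (p[2] + 0 > 0) ? 100 * d[2] / p[2] : 0
            printf "%s dropped=%d packets=%d (%.3f%%)\n", iface, d[2], p[2], pct
        }
    '
}

# Usage:
#   ifconfig | rx_drop_ratio
```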

I tried the driver defaults first, then started to play around with the settings. Before the upgrade I also ran without flow control and with the RxDescriptors count raised to 4096, without any problems.

# cat /etc/modprobe.d/e1000.conf
options e1000 XsumRX=0,0,0,0,0 RxDescriptors=4096,4096,4096,4096,4096 FlowControl=0,0,0,0,0 debug=16

Here's my network configuration file. I turned off checksumming and various offloading mechanisms, and set CPU affinity so that heavy-use interfaces get an entire CPU and light-use interfaces share one. I used these settings prior to the upgrade without problems.

# cat /etc/network/interfaces

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface

auto eth0 
iface eth0 inet manual
    pre-up /sbin/ethtool -G eth0 rx 4096 tx 0
    pre-up /sbin/ethtool -K eth0 gro off gso off rx off
    pre-up /sbin/ethtool -A eth0 rx off autoneg off
    up ifconfig eth0 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
    post-up echo "4" > /proc/irq/48/smp_affinity
    down ifconfig eth0 down
    post-down /sbin/ethtool -G eth0 rx 256 tx 256
    post-down /sbin/ethtool -K eth0 gro on gso on rx on
    post-down /sbin/ethtool -A eth0 rx on autoneg on

auto eth1
iface eth1 inet manual
    pre-up /sbin/ethtool -G eth1 rx 4096 tx 0
    pre-up /sbin/ethtool -K eth1 gro off gso off rx off
    pre-up /sbin/ethtool -A eth1 rx off autoneg off
    up ifconfig eth1 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
    post-up echo "4" > /proc/irq/49/smp_affinity
    down ifconfig eth1 down
    post-down /sbin/ethtool -G eth1 rx 256 tx 256
    post-down /sbin/ethtool -K eth1 gro on gso on rx on
    post-down /sbin/ethtool -A eth1 rx on autoneg on

auto eth2
iface eth2 inet manual
    pre-up /sbin/ethtool -G eth2 rx 4096 tx 0
    pre-up /sbin/ethtool -K eth2 gro off gso off rx off
    pre-up /sbin/ethtool -A eth2 rx off autoneg off
    up ifconfig eth2 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
    post-up echo "1" > /proc/irq/82/smp_affinity
    down ifconfig eth2 down
    post-down /sbin/ethtool -G eth2 rx 256 tx 256
    post-down /sbin/ethtool -K eth2 gro on gso on rx on 
    post-down /sbin/ethtool -A eth2 rx on autoneg on

auto eth3
iface eth3 inet manual
    pre-up /sbin/ethtool -G eth3 rx 4096 tx 0
    pre-up /sbin/ethtool -K eth3 gro off gso off rx off
    pre-up /sbin/ethtool -A eth3 rx off autoneg off
    up ifconfig eth3 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
    post-up echo "2" > /proc/irq/83/smp_affinity
    down ifconfig eth3 down
    post-down /sbin/ethtool -G eth3 rx 256 tx 256
    post-down /sbin/ethtool -K eth3 gro on gso on rx on
    post-down /sbin/ethtool -A eth3 rx on autoneg on

auto eth4
iface eth4 inet manual
    pre-up /sbin/ethtool -G eth4 rx 4096 tx 0
    pre-up /sbin/ethtool -K eth4 gro off gso off rx off
    pre-up /sbin/ethtool -A eth4 rx off autoneg off
    up ifconfig eth4 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up
    post-up echo "4" > /proc/irq/77/smp_affinity
    down ifconfig eth4 down
    post-down /sbin/ethtool -G eth4 rx 256 tx 256
    post-down /sbin/ethtool -K eth4 gro on gso on rx on
    post-down /sbin/ethtool -A eth4 rx on autoneg on

auto eth5
iface eth5 inet static
    pre-up /etc/fw.conf
    address 192.168.1.1
    netmask 255.255.255.0
    broadcast 192.168.1.255
    gateway 192.168.1.1
    dns-nameservers 192.168.1.2 192.168.1.3
    up ifconfig eth5 up
    post-up echo "8" > /proc/irq/77/smp_affinity
    down ifconfig eth5 down
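
The values echoed into `smp_affinity` above are hexadecimal CPU bitmasks, where bit N selects CPU N. The sketch below decodes such a mask into CPU numbers; `mask_to_cpus` is a hypothetical helper name, and it assumes a plain hex mask without the comma-separated groups used on machines with many CPUs.

```shell
# Translate a hexadecimal smp_affinity mask into the CPU numbers it selects.
# Assumes a single hex word such as "4" or "0000000f" (no commas).
mask_to_cpus() {
    mask=$((0x$1))
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out${out:+ }$cpu"     # append, space-separated
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "$out"
}

# Usage, with the IRQ numbers from this post:
#   for irq in 48 49 77 78 82 83; do
#       printf 'IRQ %s -> CPU ' "$irq"
#       mask_to_cpus "$(cat /proc/irq/$irq/smp_affinity)"
#   done
```

For example, `mask_to_cpus 4` prints `2` and `mask_to_cpus 8` prints `3`. Reading the masks back from `/proc/irq/N/smp_affinity` after the interfaces come up confirms whether each `post-up` line took effect.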

Here are a few examples of the packet drops. I ran the commands one after another, probably totaling 3 or 4 seconds. You can see the drop counters increase between the 1st and 3rd runs. This was a non-busy time with very little traffic.

# awk '{ print $1,$5 }' /proc/net/dev 
Inter-| 
face drop
eth3: 225
lo: 0
eth2: 505
eth1: 0
eth5: 17
eth0: 105
eth4: 1034

# awk '{ print $1,$5 }' /proc/net/dev 
Inter-| 
face drop
eth3: 225
lo: 0
eth2: 507
eth1: 0
eth5: 17
eth0: 105
eth4: 1034

# awk '{ print $1,$5 }' /proc/net/dev 
Inter-| 
face drop
eth3: 227
lo: 0
eth2: 512
eth1: 0
eth5: 17
eth0: 105
eth4: 1039
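
Instead of comparing the samples by eye, the growth per interface can be computed from two saved copies of `/proc/net/dev`. A minimal sketch, assuming the same field layout as the awk command above (field 5 is the RX drop column); `drop_delta` and the `/tmp` file names are hypothetical.

```shell
# Print how much each interface's RX drop counter grew between two saved
# copies of /proc/net/dev: $1 is the earlier snapshot, $2 the later one.
drop_delta() {
    awk '
        FNR <= 2 { next }                        # skip the two header lines
        { name = $1; sub(/:$/, "", name) }       # "eth2:" -> "eth2"
        NR == FNR { before[name] = $5; next }    # first file: remember counters
        name in before { print name, $5 - before[name] }
    ' "$1" "$2"
}

# Usage:
#   cat /proc/net/dev > /tmp/before
#   sleep 10
#   cat /proc/net/dev > /tmp/after
#   drop_delta /tmp/before /tmp/after
```

A longer sleep interval gives a steadier drops-per-second figure than back-to-back runs.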

I tried the pci=noacpi option; with and without it, the drops are the same. The interrupt stats below are what they looked like before the upgrade. After the upgrade, with ACPI handling PCI, multiple NICs were bound to a single interrupt that was also shared with other devices such as USB drives. I didn't like that, so I think I'm going to keep ACPI off, as it makes it easier to dedicate an interrupt to a single device. Is there any advantage to using the default, i.e. ACPI with PCI?

# cat /etc/default/grub | grep CMDLINE

GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 noacpi pci=noacpi"
GRUB_CMDLINE_LINUX=""

# cat /proc/interrupts 
            CPU0       CPU1       CPU2       CPU3       
   0:         45          0          0         16   IO-APIC-edge      timer
   1:          1          0          0       7936   IO-APIC-edge      i8042
   2:          0          0          0          0    XT-PIC-XT-PIC    cascade
   6:          0          0          0          3   IO-APIC-edge      floppy
   8:          0          0          0          1   IO-APIC-edge      rtc0
   9:          0          0          0          0   IO-APIC-edge      acpi
  12:          0          0          0       1809   IO-APIC-edge      i8042
  14:          1          0          0       4498   IO-APIC-edge      ata_piix
  15:          0          0          0          0   IO-APIC-edge      ata_piix
  16:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
  18:          0          0          0       1350   IO-APIC-fasteoi   uhci_hcd:usb4, radeon
  19:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
  23:          0          0          0       4099   IO-APIC-fasteoi   ehci_hcd:usb1
  38:          0          0          0      61963   IO-APIC-fasteoi   megaraid
  48:          0          0    1002319          4   IO-APIC-fasteoi   eth0
  49:          0          0      38772          3   IO-APIC-fasteoi   eth1
  77:          0          0     130076     432159   IO-APIC-fasteoi   eth4
  78:          0          0          0      23917   IO-APIC-fasteoi   eth5
  82:    1329033          0          0          4   IO-APIC-fasteoi   eth2
  83:          0    4886525          0          6   IO-APIC-fasteoi   eth3
 NMI:          5          6          4          5   Non-maskable interrupts
 LOC:      61409      57076      64257     114764   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 IWI:          0          0          0          0   IRQ work interrupts
 RES:      17956      25333      13436      14789   Rescheduling interrupts
 CAL:      22436        607        539        478   Function call interrupts
 TLB:       1525       1458       4600       4151   TLB shootdowns
 TRM:          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 MCE:          0          0          0          0   Machine check exceptions
 MCP:         16         16         16         16   Machine check polls
 ERR:          0
 MIS:          0
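
To check whether each NIC interrupt really lands on a single CPU, the table above can be reduced to a per-CPU share for every ethN line. This is a sketch under the assumption that the device name is the last field on the line (so IRQs shared between several devices are not handled); `irq_cpu_share` is a hypothetical helper name.

```shell
# For every ethN line of /proc/interrupts (read on stdin), print the share of
# interrupts handled by each CPU, so an IRQ serviced by several CPUs stands out.
irq_cpu_share() {
    awk '
        NR == 1 { ncpu = NF; next }              # header: one column per CPU
        $NF ~ /^eth[0-9]+$/ {
            total = 0
            for (i = 2; i <= ncpu + 1; i++) total += $i
            line = $NF " irq " $1
            for (i = 2; i <= ncpu + 1; i++)
                line = line sprintf(" CPU%d=%.0f%%", i - 2, (total ? 100 * $i / total : 0))
            print line
        }
    '
}

# Usage:
#   irq_cpu_share < /proc/interrupts
```

Applied to the table above, the eth4 line comes out as `eth4 irq 77: CPU0=0% CPU1=0% CPU2=23% CPU3=77%`, while the other NICs are each handled almost entirely by one CPU.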

Here's sample output from vmstat. The system is bare-bones right now, with none of my applications running.

 root@nms:~# vmstat -S m 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0  14992    192   1029    0    0    56     2  419   29  1  0 99  0
 0  0      0  14992    192   1029    0    0     0     0  922   27  0  0 100  0
 0  0      0  14991    192   1029    0    0     0    36  763   50  0  0 100  0
 0  0      0  14991    192   1029    0    0     0     0  646   35  0  0 100  0
 0  0      0  14991    192   1029    0    0     0     0  722   54  0  0 100  0
 0  0      0  14991    192   1029    0    0     0     0  793   27  0  0 100  0
 ^C

Here's the dmesg output. I can't figure out why my PCI-X slots are negotiated as PCI. The network cards are all PCI-X, with the exception of the integrated NICs that came with the server. In the output below it looks as if eth2 and eth3 negotiated at PCI-X speeds rather than PCI:66MHz. If the integrated NICs (eth0, eth1) are PCI, as labeled below, wouldn't every device on the bus drop down to that slower bus speed? If not, I still don't know why only one of my NICs (each has two Ethernet ports) is labeled as PCI-X. Does that mean it is running at PCI-X speeds, or is it only showing that it's capable?

# dmesg | grep e1000
[ 3678.349337] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[ 3678.349342] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 3678.349394] e1000 0000:06:07.0: PCI->APIC IRQ transform: INT A -> IRQ 48
[ 3678.409725] e1000 0000:06:07.0: Receive Descriptors set to 4096
[ 3678.409730] e1000 0000:06:07.0: Checksum Offload Disabled
[ 3678.409734] e1000 0000:06:07.0: Flow Control Disabled
[ 3678.586409] e1000 0000:06:07.0: eth0: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8c
[ 3678.586419] e1000 0000:06:07.0: eth0: Intel(R) PRO/1000 Network Connection
[ 3678.586642] e1000 0000:07:08.0: PCI->APIC IRQ transform: INT A -> IRQ 49
[ 3678.649854] e1000 0000:07:08.0: Receive Descriptors set to 4096
[ 3678.649859] e1000 0000:07:08.0: Checksum Offload Disabled
[ 3678.649863] e1000 0000:07:08.0: Flow Control Disabled
[ 3678.826436] e1000 0000:07:08.0: eth1: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8d
[ 3678.826444] e1000 0000:07:08.0: eth1: Intel(R) PRO/1000 Network Connection
[ 3678.826627] e1000 0000:09:04.0: PCI->APIC IRQ transform: INT A -> IRQ 82
[ 3679.093266] e1000 0000:09:04.0: Receive Descriptors set to 4096
[ 3679.093271] e1000 0000:09:04.0: Checksum Offload Disabled
[ 3679.093275] e1000 0000:09:04.0: Flow Control Disabled
[ 3679.130239] e1000 0000:09:04.0: eth2: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6a
[ 3679.130246] e1000 0000:09:04.0: eth2: Intel(R) PRO/1000 Network Connection
[ 3679.130449] e1000 0000:09:04.1: PCI->APIC IRQ transform: INT B -> IRQ 83
[ 3679.397312] e1000 0000:09:04.1: Receive Descriptors set to 4096
[ 3679.397318] e1000 0000:09:04.1: Checksum Offload Disabled
[ 3679.397321] e1000 0000:09:04.1: Flow Control Disabled
[ 3679.434350] e1000 0000:09:04.1: eth3: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6b
[ 3679.434360] e1000 0000:09:04.1: eth3: Intel(R) PRO/1000 Network Connection
[ 3679.434553] e1000 0000:0a:03.0: PCI->APIC IRQ transform: INT A -> IRQ 77
[ 3679.704072] e1000 0000:0a:03.0: Receive Descriptors set to 4096
[ 3679.704077] e1000 0000:0a:03.0: Checksum Offload Disabled
[ 3679.704081] e1000 0000:0a:03.0: Flow Control Disabled
[ 3679.738364] e1000 0000:0a:03.0: eth4: (PCI:33MHz:64-bit) 00:04:23:b6:35:6c
[ 3679.738371] e1000 0000:0a:03.0: eth4: Intel(R) PRO/1000 Network Connection
[ 3679.738538] e1000 0000:0a:03.1: PCI->APIC IRQ transform: INT B -> IRQ 78
[ 3680.046060] e1000 0000:0a:03.1: eth5: (PCI:33MHz:64-bit) 00:04:23:b6:35:6d
[ 3680.046067] e1000 0000:0a:03.1: eth5: Intel(R) PRO/1000 Network Connection
[ 3682.132415] e1000: eth0 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.224423] e1000: eth1 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.316385] e1000: eth2 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None
[ 3682.408391] e1000: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 3682.500396] e1000: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 3682.708401] e1000: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
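
The bus mode lines are easy to lose among the other driver messages, so here is a small sketch that pulls out one line per interface with its PCI address and the mode string the driver logged. It assumes the e1000 message format shown above; `bus_modes` is a hypothetical helper name. It only restates what the driver printed and does not settle whether that is the negotiated speed or just the capability.

```shell
# Read e1000 dmesg lines on stdin and print "<iface> <pci address> <bus mode>"
# for each line of the form "e1000 <addr>: ethN: (<mode>) <mac>".
bus_modes() {
    sed -n 's/.*e1000 \([0-9a-f:.]*\): \(eth[0-9]*\): (\([^)]*\)).*/\2 \1 \3/p'
}

# Usage:
#   dmesg | bus_modes
```

The PCI address in the second column can be matched against the `lspci -t` tree to see which bridge each port sits behind.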

At first I thought it was the NIC drivers but I'm not so sure. I really have no idea where else to look at the moment.

Any help is greatly appreciated as I'm struggling with this. If you need more information just ask.

Thanks!

[1] http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/Documentation/networking/e1000.txt?v=2.6.11.8
[2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm
