FreeBSD 8.1 unstable network connection
- by frankcheong
I have three FreeBSD 8.1 servers running on three different hardware platforms, each with a different network adapter (bce, bge and igb). The network connection is unstable on all of them: when I scp a file of about 10MB, the transfer does not always complete successfully. I checked with my network admin, who claims the problem is caused by the network driver being unable to handle the load. He pinged my server with a large packet size (around 15k) and it dropped packets consistently at regular intervals. I doubt this explanation, since the three servers use three different network adapters and therefore three different drivers, and it seems very unlikely that all three drivers would have the same problem.
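One thing worth keeping in mind about the 15k ping test: a packet that size cannot fit in a single Ethernet frame, so it is sent as several IP fragments, and losing any one fragment loses the whole ping. The sketch below estimates the fragment count and the resulting loss rate. It assumes a standard 1500-byte MTU, a 20-byte IPv4 header without options, an 8-byte ICMP header, and that "15k" means a 15000-byte payload. None of these values were measured on the actual network, and the function names are made up for illustration.

```python
import math

def fragment_count(icmp_payload, mtu=1500, ip_header=20, icmp_header=8):
    """Estimate how many IPv4 fragments one ICMP echo request needs.

    All defaults are assumptions (plain Ethernet, no IP options).
    """
    ip_payload = icmp_payload + icmp_header
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    return math.ceil(ip_payload / per_fragment)

def datagram_loss(fragment_loss, fragments):
    """Chance that a datagram is lost when each fragment is lost
    independently with probability fragment_loss."""
    return 1 - (1 - fragment_loss) ** fragments

frags = fragment_count(15000)
print(frags, round(datagram_loss(0.01, frags), 3))
```

With these assumptions a 15000-byte ping payload needs 11 fragments, so even a 1% per-fragment loss would show up as roughly 10% ping loss. A large-packet ping therefore exaggerates whatever loss exists on the path and does not by itself point at the driver.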
Since then I have tried to tune performance by adjusting the values in /etc/sysctl.conf, with no luck. My current settings are:
kern.ipc.somaxconn=1024
kern.ipc.shmall=3276800
kern.ipc.shmmax=1638400000
# Security
net.inet.ip.redirect=0
net.inet.ip.sourceroute=0
net.inet.ip.accept_sourceroute=0
net.inet.icmp.maskrepl=0
net.inet.icmp.log_redirect=0
net.inet.icmp.drop_redirect=1
net.inet.tcp.drop_synfin=1
# Security
net.inet.udp.blackhole=1
net.inet.tcp.blackhole=2
# Required by pf
net.inet.ip.forwarding=1
#Network Performance Tuning
kern.ipc.maxsockbuf=16777216
net.inet.tcp.rfc1323=1
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
# Setting specifically for 1 or even 10Gbps network
net.local.stream.sendspace=262144
net.local.stream.recvspace=262144
net.inet.tcp.local_slowstart_flightsize=10
net.inet.tcp.nolocaltimewait=1
net.inet.tcp.mssdflt=1460
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.sendspace=262144
net.inet.tcp.recvspace=262144
net.inet.udp.recvspace=262144
kern.ipc.maxsockbuf=16777216
kern.ipc.nmbclusters=32768
net.inet.tcp.delayed_ack=1
net.inet.tcp.delacktime=100
net.inet.tcp.slowstart_flightsize=179
net.inet.tcp.inflight.enable=1
net.inet.tcp.inflight.min=6144
# Reduce the cache size of slow start connection
net.inet.tcp.hostcache.expire=1
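The list above sets kern.ipc.maxsockbuf twice. When the same key is set more than once, the later line is generally the one that ends up in effect, so repeats are worth spotting before tuning further. A minimal sketch for finding repeated keys follows; the function name and the sample text are only for illustration, and the parsing assumes simple key=value lines with # comments.

```python
from collections import defaultdict

def find_duplicate_keys(conf_text):
    """Return {key: [values...]} for every key set more than once."""
    seen = defaultdict(list)
    for raw in conf_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        seen[key].append(value)
    return {key: values for key, values in seen.items() if len(values) > 1}

# Three lines taken from the settings above.
sample = """\
kern.ipc.maxsockbuf=16777216
net.inet.tcp.rfc1323=1
kern.ipc.maxsockbuf=16777216
"""
print(find_duplicate_keys(sample))
```

To check the real file, pass the contents of /etc/sysctl.conf instead of the sample string.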
Our network admin also claims to see quite a lot of link up/down events in the Cisco switch log, but I cannot find any link up/down messages in dmesg. I have also checked the output of netstat -s but cannot draw any concrete conclusion from it:
tcp:
133695291 packets sent
39408539 data packets (3358837321 bytes)
61868 data packets (89472844 bytes) retransmitted
24 data packets unnecessarily retransmitted
0 resends initiated by MTU discovery
50756141 ack-only packets (2148 delayed)
0 URG only packets
0 window probe packets
4372385 window update packets
39781869 control packets
134898031 packets received
72339403 acks (for 3357601899 bytes)
190712 duplicate acks
0 acks for unsent data
59339201 packets (3647021974 bytes) received in-sequence
114 completely duplicate packets (135202 bytes)
27 old duplicate packets
0 packets with some dup. data (0 bytes duped)
42090 out-of-order packets (60817889 bytes)
0 packets (0 bytes) of data after window
0 window probes
3953896 window update packets
64181 packets received after close
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
45192 discarded due to memory problems
19945391 connection requests
1323420 connection accepts
0 bad connection attempts
0 listen queue overflows
0 ignored RSTs in the windows
21133581 connections established (including accepts)
21268724 connections closed (including 32737 drops)
207874 connections updated cached RTT on close
207874 connections updated cached RTT variance on close
132439 connections updated cached ssthresh on close
42392 embryonic connections dropped
72339338 segments updated rtt (of 69477829 attempts)
390871 retransmit timeouts
0 connections dropped by rexmit timeout
0 persist timeouts
0 connections dropped by persist timeout
0 Connections (fin_wait_2) dropped because of timeout
13990 keepalive timeouts
2 keepalive probes sent
13988 connections dropped by keepalive
173044 correct ACK header predictions
36947371 correct data packet header predictions
1323420 syncache entries added
0 retransmitted
0 dupsyn
0 dropped
1323420 completed
0 bucket overflow
0 cache overflow
0 reset
0 stale
0 aborted
0 badack
0 unreach
0 zone failures
1323420 cookies sent
0 cookies received
1864 SACK recovery episodes
18005 segment rexmits in SACK recovery episodes
26066896 byte rexmits in SACK recovery episodes
147327 SACK options (SACK blocks) received
87473 SACK options (SACK blocks) sent
0 SACK scoreboard overflow
0 packets with ECN CE bit set
0 packets with ECN ECT(0) bit set
0 packets with ECN ECT(1) bit set
0 successful ECN handshakes
0 times ECN reduced the congestion window
udp:
5141258 datagrams received
0 with incomplete header
0 with bad data length field
0 with bad checksum
1 with no checksum
0 dropped due to no socket
129616 broadcast/multicast datagrams undelivered
0 dropped due to full socket buffers
0 not for hashed pcb
5011642 delivered
5016050 datagrams output
0 times multicast source filter matched
sctp:
0 input packets
0 datagrams
0 packets that had data
0 input SACK chunks
0 input DATA chunks
0 duplicate DATA chunks
0 input HB chunks
0 HB-ACK chunks
0 input ECNE chunks
0 input AUTH chunks
0 chunks missing AUTH
0 invalid HMAC ids received
0 invalid secret ids received
0 auth failed
0 fast path receives all one chunk
0 fast path multi-part data
0 output packets
0 output SACKs
0 output DATA chunks
0 retransmitted DATA chunks
0 fast retransmitted DATA chunks
0 FR's that happened more than once to same chunk
0 intput HB chunks
0 output ECNE chunks
0 output AUTH chunks
0 ip_output error counter
Packet drop statistics:
0 from middle box
0 from end host
0 with data
0 non-data, non-endhost
0 non-endhost, bandwidth rep only
0 not enough for chunk header
0 not enough data to confirm
0 where process_chunk_drop said break
0 failed to find TSN
0 attempt reverse TSN lookup
0 e-host confirms zero-rwnd
0 midbox confirms no space
0 data did not match TSN
0 TSN's marked for Fast Retran
Timeouts:
0 iterator timers fired
0 T3 data time outs
0 window probe (T3) timers fired
0 INIT timers fired
0 sack timers fired
0 shutdown timers fired
0 heartbeat timers fired
0 a cookie timeout fired
0 an endpoint changed its cookiesecret
0 PMTU timers fired
0 shutdown ack timers fired
0 shutdown guard timers fired
0 stream reset timers fired
0 early FR timers fired
0 an asconf timer fired
0 auto close timer fired
0 asoc free timers expired
0 inp free timers expired
0 packet shorter than header
0 checksum error
0 no endpoint for port
0 bad v-tag
0 bad SID
0 no memory
0 number of multiple FR in a RTT window
0 RFC813 allowed sending
0 RFC813 does not allow sending
0 times max burst prohibited sending
0 look ahead tells us no memory in interface
0 numbers of window probes sent
0 times an output error to clamp down on next user send
0 times sctp_senderrors were caused from a user
0 number of in data drops due to chunk limit reached
0 number of in data drops due to rwnd limit reached
0 times a ECN reduced the cwnd
0 used express lookup via vtag
0 collision in express lookup
0 times the sender ran dry of user data on primary
0 same for above
0 sacks the slow way
0 window update only sacks sent
0 sends with sinfo_flags !=0
0 unordered sends
0 sends with EOF flag set
0 sends with ABORT flag set
0 times protocol drain called
0 times we did a protocol drain
0 times recv was called with peek
0 cached chunks used
0 cached stream oq's used
0 unread messages abandonded by close
0 send burst avoidance, already max burst inflight to net
0 send cwnd full avoidance, already max burst inflight to net
0 number of map array over-runs via fwd-tsn's
ip:
137814085 total packets received
0 bad header checksums
0 with size smaller than minimum
0 with data size < data length
0 with ip length > max ip packet size
0 with header length < data size
0 with data length < header length
0 with bad options
0 with incorrect version number
1200 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped after timeout
300 packets reassembled ok
137813009 packets for this host
530 packets for unknown/unsupported protocol
0 packets forwarded (0 packets fast forwarded)
61 packets not forwardable
0 packets received for unknown multicast group
0 redirects sent
137234598 packets sent from this host
0 packets sent with fabricated ip header
685307 output packets dropped due to no bufs, etc.
52 output packets discarded due to no route
300 output datagrams fragmented
1200 fragments created
0 datagrams that can't be fragmented
0 tunneling packets that can't find gif
0 datagrams with bad address in header
icmp:
0 calls to icmp_error
0 errors not generated in response to an icmp message
Output histogram:
echo reply: 305
0 messages with bad code fields
0 messages less than the minimum length
0 messages with bad checksum
0 messages with bad length
0 multicast echo requests ignored
0 multicast timestamp requests ignored
Input histogram:
destination unreachable: 530
echo: 305
305 message responses generated
0 invalid return addresses
0 no return routes
ICMP address mask responses are disabled
igmp:
0 messages received
0 messages received with too few bytes
0 messages received with wrong TTL
0 messages received with bad checksum
0 V1/V2 membership queries received
0 V3 membership queries received
0 membership queries received with invalid field(s)
0 general queries received
0 group queries received
0 group-source queries received
0 group-source queries dropped
0 membership reports received
0 membership reports received with invalid field(s)
0 membership reports received for groups to which we belong
0 V3 reports received without Router Alert
0 membership reports sent
arp:
376748 ARP requests sent
3207 ARP replies sent
245245 ARP requests received
80845 ARP replies received
326090 ARP packets received
267712 total packets dropped due to no ARP entry
108876 ARP entrys timed out
0 Duplicate IPs seen
ip6:
2226633 total packets received
0 with size smaller than minimum
0 with data size < data length
0 with bad options
0 with incorrect version number
0 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped after timeout
0 fragments that exceeded limit
0 packets reassembled ok
2226633 packets for this host
0 packets forwarded
0 packets not forwardable
0 redirects sent
2226633 packets sent from this host
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
8 output packets discarded due to no route
0 output datagrams fragmented
0 fragments created
0 datagrams that can't be fragmented
0 packets that violated scope rules
0 multicast packets which we don't join
Input histogram:
UDP: 2226633
Mbuf statistics:
962679 one mbuf
1263954 one ext mbuf
0 two or more ext mbuf
0 packets whose headers are not continuous
0 tunneling packets that can't find gif
0 packets discarded because of too many headers
0 failures of source address selection
Source addresses selection rule applied:
icmp6:
0 calls to icmp6_error
0 errors not generated in response to an icmp6 message
0 errors not generated because of rate limitation
0 messages with bad code fields
0 messages < minimum length
0 bad checksums
0 messages with bad length
Histogram of error messages to be generated:
0 no route
0 administratively prohibited
0 beyond scope
0 address unreachable
0 port unreachable
0 packet too big
0 time exceed transit
0 time exceed reassembly
0 erroneous header field
0 unrecognized next header
0 unrecognized option
0 redirect
0 unknown
0 message responses generated
0 messages with too many ND options
0 messages with bad ND options
0 bad neighbor solicitation messages
0 bad neighbor advertisement messages
0 bad router solicitation messages
0 bad router advertisement messages
0 bad redirect messages
0 path MTU changes
rip6:
0 messages received
0 checksum calculations on inbound
0 messages with bad checksum
0 messages dropped due to no socket
0 multicast messages dropped due to no socket
0 messages dropped due to full socket buffers
0 delivered
0 datagrams output
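To put the raw netstat -s counters in proportion, they can be turned into ratios. The following is a rough sketch only: the helper names are hypothetical, the sample holds a few lines copied from the output above, and the choice of denominators (TCP data packets sent, IP packets sent from this host) is an assumption about what makes a sensible comparison.

```python
import re

def first_count(text, phrase_regex):
    """Return the leading integer of the first 'N description' line
    whose description matches phrase_regex."""
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.*)", line)
        if m and re.search(phrase_regex, m.group(2).strip()):
            return int(m.group(1))
    raise KeyError(phrase_regex)

# A few lines copied from the netstat -s output above.
SAMPLE = """\
tcp:
133695291 packets sent
39408539 data packets (3358837321 bytes)
61868 data packets (89472844 bytes) retransmitted
ip:
137234598 packets sent from this host
685307 output packets dropped due to no bufs, etc.
"""

def tcp_retransmit_ratio(text):
    sent = first_count(text, r"^data packets \(\d+ bytes\)$")
    resent = first_count(text, r"^data packets \(\d+ bytes\) retransmitted$")
    return resent / sent

def ip_output_drop_ratio(text):
    sent = first_count(text, r"^packets sent from this host$")
    dropped = first_count(text, r"^output packets dropped due to no bufs")
    return dropped / sent

print(f"TCP retransmit ratio: {tcp_retransmit_ratio(SAMPLE):.4%}")
print(f"IP output drop ratio: {ip_output_drop_ratio(SAMPLE):.4%}")
```

On the figures above this gives roughly 0.16% of TCP data packets retransmitted and about 0.5% of outgoing IP packets dropped for lack of buffers. Since the counters accumulate from boot, comparing two snapshots taken around a failing scp would show more than the lifetime totals.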
Output of netstat -m:
516/5124/5640 mbufs in use (current/cache/total)
512/1634/2146/32768 mbuf clusters in use (current/cache/total/max)
512/1536 mbuf+clusters out of packet secondary zone in use (current/cache)
0/1303/1303/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
1153K/9761K/10914K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/8/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
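The two netstat -m figures that matter most for buffer exhaustion are the mbuf cluster usage against its maximum and the denied-request counters. A small sketch for pulling them out is below; the function names are invented for illustration, and the regular expressions assume the line layout shown above.

```python
import re

def cluster_usage(text):
    """Return total/max for the 'mbuf clusters in use' line."""
    m = re.search(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", text)
    if m is None:
        raise ValueError("no mbuf cluster line found")
    current, cache, total, maximum = map(int, m.groups())
    return total / maximum

def denied_requests(text):
    """Return (mbufs, clusters, mbuf+clusters) denied-request counts."""
    m = re.search(r"(\d+)/(\d+)/(\d+) requests for mbufs denied", text)
    if m is None:
        raise ValueError("no denied-requests line found")
    return tuple(map(int, m.groups()))

# Two lines copied from the netstat -m output above.
SAMPLE = """\
512/1634/2146/32768 mbuf clusters in use (current/cache/total/max)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
"""
print(f"cluster usage: {cluster_usage(SAMPLE):.1%}")
print("denied:", denied_requests(SAMPLE))
```

For the output above this reports about 6.5% of the cluster limit in use and no denied requests at the moment the snapshot was taken.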
Does anyone have an idea what the possible cause might be?