I want to share what I know about tuning FreeBSD via sysctls, so I'm posting mine with comments. They are based on a presentation by Igor Sysoev (the author of nginx) about tuning FreeBSD for 100,000-200,000 active connections.
These sysctls are for FreeBSD 7.x. Since 7.2 on amd64, some of them are already tuned well by default.
Prior to 7.0, some of them are boot-time only (set via /boot/loader.conf) or do not exist at all.
High-load web server sysctls:
# Max. backlog size
kern.ipc.somaxconn=4096
# Shared memory // 7.2+ can use shared memory > 2GB
kern.ipc.shmmax=2147483648
# Sockets
kern.ipc.maxsockets=204800
# Do not use larger sockbufs on 8.0
# ( http://old.nabble.com/Significant-performance-regression-for-increased-maxsockbuf-on-8.0-RELEASE-tt26745981.html#a26745981 )
kern.ipc.maxsockbuf=262144
# Receive clusters (on amd64 7.2+ 65k is default)
# For a value this high, vm.kmem_size must be increased to 3G
#kern.ipc.nmbclusters=229376
# Jumbo pagesize(4k/8k) clusters
# Used as general packet storage for jumbo frames
# can be monitored via `netstat -m`
#kern.ipc.nmbjumbop=192000
# Jumbo 9k/16k clusters
# If you are using them
#kern.ipc.nmbjumbo9=24000
#kern.ipc.nmbjumbo16=10240
# Every socket is a file, so increase them
kern.maxfiles=204800
kern.maxfilesperproc=200000
kern.maxvnodes=200000
# Turn off receive autotuning
#net.inet.tcp.recvbuf_auto=0
# Small receive space, only usable on an HTTP server; on a file server this
# should be increased to 65535 or even more
#net.inet.tcp.recvspace=8192
# Small send space is useful for HTTP servers that serve small files
# Autotuned since 7.x
net.inet.tcp.sendspace=16384
# This should be enabled if you are going to use big spaces (>64k)
#net.inet.tcp.rfc1323=1
# Turn this off on highspeed, lossless connections (LAN 1Gbit+)
#net.inet.tcp.delayed_ack=0
# This feature is useful if you are serving data over modems, Gigabit Ethernet,
# or even high speed WAN links (or any other link with a high bandwidth delay product),
# especially if you are also using window scaling or have configured a large send window.
# You can try setting it to 0 on fileserver with 1GBit+ interfaces
# Automatically disabled on small RTTs ( http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_subr.c?#rev1.237 )
#net.inet.tcp.inflight.enable=0
# Disable port randomization to avoid false RSTs
# Before using this, check the SA here: www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf
# (it also says that port randomization auto-disables at some connection rates, but I haven't tested that)
#net.inet.ip.portrange.randomized=0
# Increase portrange
# For outgoing connections only. Good for seed-boxes and ftp servers.
net.inet.ip.portrange.first=1024
net.inet.ip.portrange.last=65535
# Security
net.inet.ip.redirect=0
net.inet.ip.sourceroute=0
net.inet.ip.accept_sourceroute=0
net.inet.icmp.maskrepl=0
net.inet.icmp.log_redirect=0
net.inet.icmp.drop_redirect=1
net.inet.tcp.drop_synfin=1
# Security
net.inet.udp.blackhole=1
net.inet.tcp.blackhole=2
# Increases default TTL, sometimes useful
# Default is 64
net.inet.ip.ttl=128
# Lessen max segment life to conserve resources
# ACK waiting time in milliseconds (default: 30000 from RFC)
net.inet.tcp.msl=5000
# Max number of timewait sockets
net.inet.tcp.maxtcptw=40960
# Don't use timewait on local connections
# As of 15 Apr 2009, Igor Sysoev says that nolocaltimewait has a buggy implementation.
# So keep it disabled for now until it gets fixed
#net.inet.tcp.nolocaltimewait=1
# FIN_WAIT_2 state fast recycle
net.inet.tcp.fast_finwait2_recycle=1
# Time before tcp keepalive probe is sent
# default is 2 hours (7200000)
#net.inet.tcp.keepidle=60000
# Should be increased until net.inet.ip.intr_queue_drops is zero
net.inet.ip.intr_queue_maxlen=4096
# Interrupt handling via multiple CPU, but with context switch.
# You can play with it. Default is 1;
#net.isr.direct=0
# This is for routers only
#net.inet.ip.forwarding=1
#net.inet.ip.fastforwarding=1
# This speeds up dummynet when the channel isn't saturated
net.inet.ip.dummynet.io_fast=1
# Increase dummynet(4) hash
#net.inet.ip.dummynet.hash_size=2048
#net.inet.ip.dummynet.max_chain_len
# Should be increased when you have A LOT of files on the server
# (Increase until vfs.ufs.dirhash_mem becomes lower)
vfs.ufs.dirhash_maxmem=67108864
# Explicit Congestion Notification (see http://en.wikipedia.org/wiki/Explicit_Congestion_Notification)
net.inet.tcp.ecn.enable=1
# Flowtable - flow caching mechanism
# Useful for routers
#net.inet.flowtable.enable=1
#net.inet.flowtable.nmbflows=65535
# Extreme polling tuning
#kern.polling.burst_max=1000
#kern.polling.each_burst=1000
#kern.polling.reg_frac=100
#kern.polling.user_frac=1
#kern.polling.idle_poll=0
# IPFW dynamic rules and timeouts tuning
# Increase dyn_buckets till net.inet.ip.fw.curr_dyn_buckets is lower
net.inet.ip.fw.dyn_buckets=65536
net.inet.ip.fw.dyn_max=65536
net.inet.ip.fw.dyn_ack_lifetime=120
net.inet.ip.fw.dyn_syn_lifetime=10
net.inet.ip.fw.dyn_fin_lifetime=2
net.inet.ip.fw.dyn_short_lifetime=10
# Make packets pass the firewall only once when using dummynet
# i.e. packets coming out of a pipe leave the firewall as accepted
#net.inet.ip.fw.one_pass=1
# shm_use_phys Wires all shared pages, making them unswappable
# Use this to lessen Virtual Memory Manager's work when using Shared Mem.
# Useful for databases
#kern.ipc.shm_use_phys=1
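Since newer releases already ship better defaults for some of these, it can help to see which values actually differ from what the box is running before applying the list. Here is a minimal sketch, not a polished tool. The function name diff_sysctls and its optional second argument are my own inventions for illustration; the only real command it relies on is `sysctl -n`, and it only understands plain name=value lines.

```shell
#!/bin/sh
# diff_sysctls FILE [GETTER]
# For every uncommented name=value line in FILE, print
# "name: current -> wanted" when the running value differs.
# GETTER is a hypothetical hook for testing; it defaults to "sysctl -n".
diff_sysctls() {
    file=$1
    getter=${2:-"sysctl -n"}
    # Drop comments and blank lines, then split each line on the first '='.
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$file" |
    while IFS='=' read -r name wanted; do
        current=$($getter "$name" 2>/dev/null </dev/null) || current="(unknown)"
        [ "$current" = "$wanted" ] || echo "$name: $current -> $wanted"
    done
}

# Usage (on the FreeBSD box itself):
#   diff_sysctls /etc/sysctl.conf
```

Names that do not exist on your release show up as "(unknown)", which is a quick way to spot sysctls that are boot-only or missing on that version.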
/boot/loader.conf:
# Accept filters for data, http and DNS requests
# Useful when your software uses select() instead of kevent/kqueue or when you are under DDoS
# DNS accf available on 8.0+
accf_data_load="YES"
accf_http_load="YES"
accf_dns_load="YES"
# Async IO system calls
aio_load="YES"
# Adds NCQ support in FreeBSD
# WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+
# 8.0+ only
#ahci_load=
#siis_load=
# Increase kernel memory size to 3G.
#
# Use ONLY if you have KVA_PAGES in kernel configuration, and you have more than 3G RAM
# Otherwise panic will happen on next reboot!
#
# It's required for high buffer sizes: kern.ipc.nmbjumbop, kern.ipc.nmbclusters, etc
# Useful on highload stateful firewalls, proxies or ZFS fileservers
# (FreeBSD 7.2+ amd64 users: Check that current value is lower!)
#vm.kmem_size="3G"
# Older versions of FreeBSD can't tune maxfiles on the fly
#kern.maxfiles="200000"
# Useful for databases
# Sets maximum data size to 1G
# (FreeBSD 7.2+ amd64 users: Check that current value is lower!)
#kern.maxdsiz="1G"
# Maximum buffer size(vfs.maxbufspace)
# You can check current one via vfs.bufspace
# Should be lowered/upped depending on server's load-type
# Usually decreased to preserve kmem
# (default is 200M)
#kern.maxbcache="512M"
# Sendfile buffers
# For i386 only
#kern.ipc.nsfbufs=10240
# syncache Hash table tuning
net.inet.tcp.syncache.hashsize=1024
net.inet.tcp.syncache.bucketlimit=100
# Increased hostcache
net.inet.tcp.hostcache.hashsize="16384"
net.inet.tcp.hostcache.bucketlimit="100"
# TCP control-block Hash table tuning
net.inet.tcp.tcbhashsize=4096
# Enable superpages, for 7.2+ only
# Also read http://lists.freebsd.org/pipermail/freebsd-hackers/2009-November/030094.html
vm.pmap.pg_ps_enabled=1
# Useful if you are using an Intel Gigabit NIC
#hw.em.rxd=4096
#hw.em.txd=4096
#hw.em.rx_process_limit="-1"
# Also, if you have A LOT of interrupts on the NIC, play with the following parameters
# NOTE: You should set them for every NIC (these are runtime sysctls, so they go in /etc/sysctl.conf)
#dev.em.0.rx_int_delay=250
#dev.em.0.tx_int_delay=250
#dev.em.0.rx_abs_int_delay=250
#dev.em.0.tx_abs_int_delay=250
# A multithreaded version of the em driver can also be found here:
# http://people.yandex-team.ru/~wawa/
#
# for additional em monitoring and statistics use
# `sysctl dev.em.0.stats=1 ; dmesg`
#
# Same tunings for igb
#hw.igb.rxd=4096
#hw.igb.txd=4096
#hw.igb.rx_process_limit=100
# Some useful netisr tunables. See sysctl net.isr
#net.isr.defaultqlimit=4096
#net.isr.maxqlimit=10240
# Bind netisr threads to CPUs
#net.isr.bindthreads=1
# Nicer boot logo =)
loader_logo="beastie"
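One thing to watch with the hash-size tunables above (syncache, hostcache, tcbhashsize): as far as I know, the kernel expects them to be powers of two and otherwise prints a warning at boot and falls back to a built-in default. That is my own understanding rather than something from Igor's presentation, so verify it on your release. A quick sanity check before you reboot, where is_pow2 is just a helper name I made up:

```shell
#!/bin/sh
# is_pow2 N: succeeds when N is a positive power of two.
# (Hypothetical helper, not part of FreeBSD.)
is_pow2() {
    n=$1
    # Non-numeric input fails the first test, so the arithmetic is never reached.
    [ "$n" -gt 0 ] 2>/dev/null && [ $(( n & (n - 1) )) -eq 0 ]
}

# The three hash sizes used in the loader.conf listing above.
for v in 1024 16384 4096; do
    if is_pow2 "$v"; then
        echo "$v ok"
    else
        echo "$v is not a power of two"
    fi
done
```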
And finally, here are my additions to the GENERIC kernel:
# Just some of them, see also
# cat /sys/{i386,amd64,}/conf/NOTES
# This one is useful only on i386
#options KVA_PAGES=512
# You can play with HZ in environments with high interrupt rate (default is 1000)
# 100 is for my notebook to prolong its battery life
#options HZ=100
# Polling is good on network loads with high packet rates and low-end NICs
# NB! Do not enable it if you want more than one netisr thread
#options DEVICE_POLLING
# Eliminate datacopy on socket read-write
# To take advantage of zero copy sockets you should have an MTU of 8K (amd64)
# (4k for i386). This requirement applies only to receiving data.
# Read more in man zero_copy_sockets
#options ZERO_COPY_SOCKETS
# TCP signature support. Used with IPsec
options TCP_SIGNATURE
options IPSEC
# These can be loaded as modules. They are described in the loader.conf section
#options ACCEPT_FILTER_DATA
#options ACCEPT_FILTER_HTTP
# Adding ipfw, also can be loaded as modules
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_VERBOSE_LIMIT=10
options IPFIREWALL_DEFAULT_TO_ACCEPT
options IPFIREWALL_FORWARD
# Adding kernel NAT
options IPFIREWALL_NAT
options LIBALIAS
# Traffic shaping
options DUMMYNET
# Divert, i.e. for userspace NAT
options IPDIVERT
# This is for OpenBSD's pf firewall
device pf
device pflog
# pf's QoS - ALTQ
options ALTQ
options ALTQ_CBQ # Class Based Queuing (CBQ)
options ALTQ_RED # Random Early Detection (RED)
options ALTQ_RIO # RED In/Out
options ALTQ_HFSC # Hierarchical Packet Scheduler (HFSC)
options ALTQ_PRIQ # Priority Queuing (PRIQ)
options ALTQ_NOPCC # Required for SMP build
# Pretty console
# Manual can be found here http://forums.freebsd.org/showthread.php?t=6134
#options VESA
#options SC_PIXEL_MODE
# Disable reboot on Ctrl Alt Del
#options SC_DISABLE_REBOOT
# Change normal|kernel messages color
options SC_NORM_ATTR=(FG_GREEN|BG_BLACK)
options SC_KERNEL_CONS_ATTR=(FG_YELLOW|BG_BLACK)
# More scroll space
options SC_HISTORY_SIZE=8192
# Adding hardware crypto device
device crypto
device cryptodev
# Useful network interfaces
device vlan
device tap #Virtual Ethernet driver
device gre #IP over IP tunneling
device if_bridge #Bridge interface
device pfsync #synchronization interface for PF
device carp #Common Address Redundancy Protocol
device enc #IPsec interface
device lagg #Link aggregation interface
device stf #IPv4-IPv6 port
# Also for my notebook, but may be used with Opteron
#device amdtemp
# Support for ECMP. More than one route for destination
# Works even with default route so one can use it as LB for two ISP
# For now code is unstable and panics (panic: rtfree 2) on route deletions.
#options RADIX_MPATH
# Multicast routing
#options MROUTING
#options PIM
# DTrace
options KDTRACE_HOOKS # all architectures - enable general DTrace hooks
options DDB_CTF # all architectures - kernel ELF linker loads CTF data
#options KDTRACE_FRAME # amd64-only
# Adaptive spinning in lockmgr (8.x+)
# See http://www.mail-archive.com/[email protected]/msg10782.html
options ADAPTIVE_LOCKMGRS
# UTF-8 in console (9.x+)
#options TEKEN_UTF8
#options TEKEN_XTERM
# NCQ support
# WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+
#options ATA_CAM
# FreeBSD 9+
# Deadlock resolver thread
# For additional information see http://www.mail-archive.com/[email protected]/msg18124.html
#options DEADLKRES
PS. Most of FreeBSD's limits can also be monitored with
# vmstat -z
and
# limits
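When reading `vmstat -z`, the column I look at first is the failure counter, because a non-zero value there usually means some limit (mbuf clusters, sockets and so on) is too low. A rough sketch of a filter for that, assuming the layout I have seen on 7.x/8.x: a zone name followed by a colon, then comma-separated numbers with the failure count in the sixth column. The function name zone_failures is made up, and the column position is an assumption you should check against the header line on your release.

```shell
#!/bin/sh
# zone_failures: read `vmstat -z` style output on stdin and print
# only the zones whose failure counter (assumed 6th column) is non-zero.
zone_failures() {
    awk -F: 'NF >= 2 {
        n = split($2, f, ",")
        if (n >= 6 && f[6] + 0 > 0)
            printf "%s: %d failed allocations\n", $1, f[6] + 0
    }'
}

# Usage (on the FreeBSD box itself):
#   vmstat -z | zone_failures
```

The header line has no colon, so it is skipped automatically.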
PPS. A variety of network counters can be monitored via
# netstat -s
FreeBSD 9 added netstat's -Q option; try the following command to display netisr stats
# netstat -Q
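The "increase until net.inet.ip.intr_queue_drops is zero" advice from the sysctl list is easier to follow if you watch how fast the counter grows rather than its absolute value, since the counter only resets at boot. A small sketch under that assumption; drops_delta and its optional getter argument are names I made up for illustration, and only `sysctl -n` is a real command here.

```shell
#!/bin/sh
# drops_delta INTERVAL [GETTER]
# Print how many packets were dropped from the IP input queue
# during INTERVAL seconds. GETTER is a hypothetical test hook and
# defaults to "sysctl -n".
drops_delta() {
    interval=$1
    getter=${2:-"sysctl -n"}
    before=$($getter net.inet.ip.intr_queue_drops) || return 1
    sleep "$interval"
    after=$($getter net.inet.ip.intr_queue_drops) || return 1
    echo $(( after - before ))
}

# Usage (on the FreeBSD box itself):
#   drops_delta 60    # drops during the last minute
```

If this keeps printing a non-zero number under normal load, raise net.inet.ip.intr_queue_maxlen and measure again.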
PPPS. Also see
# man 7 tuning
PPPPS. I want to thank the FreeBSD community, especially the author of nginx, Igor Sysoev, and the nginx-ru@ and FreeBSD-performance@ mailing lists, for providing useful information about FreeBSD tuning.
So here is the question:
What tunings are you using on your FreeBSD servers?
You can also post your /etc/sysctl.conf, /boot/loader.conf, kernel options, etc. with a description of what each setting means (do not copy-paste from sysctl -d). Don't forget to specify the server type (web, smb, gateway, etc.)
Let's share experience!