FreeBSD performance tuning. Sysctls, loader.conf, kernel.

Posted by SaveTheRbtz on Server Fault
Published on 2009-09-10T23:52:13Z Indexed on 2010/03/15 22:00 UTC

I wanted to share what I know about tuning FreeBSD via sysctls, so I'm posting them with comments. They are based on a presentation by Igor Sysoev (author of nginx) about tuning FreeBSD for 100,000-200,000 active connections.

These sysctls are for FreeBSD 7.x. Since 7.2 on amd64, some of them are tuned well by default. Prior to 7.0, some of them are boot-only (set via /boot/loader.conf) or do not exist at all.

High-load web server sysctls:

# Max. backlog size
kern.ipc.somaxconn=4096

# Shared memory // 7.2+ can use shared memory > 2 GB
kern.ipc.shmmax=2147483648

# Sockets
kern.ipc.maxsockets=204800
# Do not use larger sockbufs on 8.0
# ( http://old.nabble.com/Significant-performance-regression-for-increased-maxsockbuf-on-8.0-RELEASE-tt26745981.html#a26745981 )
kern.ipc.maxsockbuf=262144 

# Receive clusters (on amd64 7.2+ 65k is default)
# For such a high value vm.kmem_size must be increased to 3G
#kern.ipc.nmbclusters=229376

# Jumbo pagesize(4k/8k) clusters
# Used as general packet storage for jumbo frames
# can be monitored via `netstat -m`
#kern.ipc.nmbjumbop=192000

# Jumbo 9k/16k clusters
# If you are using them
#kern.ipc.nmbjumbo9=24000
#kern.ipc.nmbjumbo16=10240
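
To see why the cluster limits above need a bigger vm.kmem_size, here is a rough sketch that estimates how much memory each pool could take if it were completely filled. The cluster sizes are my assumptions: 2 KiB for standard clusters, a 4 KiB page for nmbjumbop, and 9 KiB / 16 KiB for the jumbo pools. The helper name is made up for this example.

```python
# Assumed cluster sizes in bytes (4 KiB page size assumed for nmbjumbop).
CLUSTER_BYTES = {
    "kern.ipc.nmbclusters": 2048,
    "kern.ipc.nmbjumbop": 4096,
    "kern.ipc.nmbjumbo9": 9 * 1024,
    "kern.ipc.nmbjumbo16": 16 * 1024,
}


def cluster_memory_mib(settings):
    """Return {sysctl name: MiB used if that pool is completely full}."""
    return {
        name: count * CLUSTER_BYTES[name] / 2**20
        for name, count in settings.items()
    }


# Values taken from the commented-out settings in the listing.
example = {
    "kern.ipc.nmbclusters": 229376,
    "kern.ipc.nmbjumbop": 192000,
    "kern.ipc.nmbjumbo9": 24000,
    "kern.ipc.nmbjumbo16": 10240,
}

usage = cluster_memory_mib(example)
total_mib = sum(usage.values())

for name, mib in usage.items():
    print(f"{name}: {mib:.1f} MiB")
print(f"total: {total_mib:.1f} MiB")
```

This is a worst case, not what the pools use at idle, but it shows that these four limits together can reach about 1.5 GiB of kernel memory.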

# Every socket is a file, so increase them
kern.maxfiles=204800
kern.maxfilesperproc=200000
kern.maxvnodes=200000

# Turn off receive autotuning
#net.inet.tcp.recvbuf_auto=0

# Small receive space, only usable on http-server, on file server this 
# should be increased to 65535 or even more
#net.inet.tcp.recvspace=8192

# Small send space is useful for http servers that serve small files 
# Autotuned since 7.x
net.inet.tcp.sendspace=16384

# This should be enabled if you are going to use big spaces (>64k)
#net.inet.tcp.rfc1323=1
# Turn this off on highspeed, lossless connections (LAN 1Gbit+)
#net.inet.tcp.delayed_ack=0
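
A quick way to decide whether you need "big spaces" (and therefore rfc1323 window scaling) is the bandwidth-delay product. The sketch below is only an illustration; the link speeds and round-trip times are example numbers I picked, not measurements.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def needs_window_scaling(window_bytes):
    """The unscaled TCP window field tops out at 65535 bytes."""
    return window_bytes > 65535


# Hypothetical links: (description, bits per second, RTT in ms)
links = [
    ("10 Mbit/s, 40 ms", 10_000_000, 40),
    ("100 Mbit/s, 80 ms", 100_000_000, 80),
    ("1 Gbit/s, 1 ms", 1_000_000_000, 1),
]

for label, bw, rtt in links:
    window = bdp_bytes(bw, rtt)
    print(label, window, "bytes, scaling needed:", needs_window_scaling(window))
```

If the product is well under 64k, as it is for a typical HTTP server talking to slow clients, the small send/receive spaces in the listing are enough.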

# This feature is useful if you are serving data over modems, Gigabit Ethernet, 
# or even high speed WAN links (or any other link with a high bandwidth delay product), 
# especially if you are also using window scaling or have configured a large send window.
# You can try setting it to 0 on fileserver with 1GBit+ interfaces
# Automatically disables on small RTT ( http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_subr.c?#rev1.237 )
#net.inet.tcp.inflight.enable=0

# Disable randomizing of ports to avoid false RST
# Before usage check SA here www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf
# (it also says that port randomization auto-disables at some connection rates, but I haven't tested that)
#net.inet.ip.portrange.randomized=0

# Increase portrange
# For outgoing connections only. Good for seed-boxes and ftp servers.
net.inet.ip.portrange.first=1024
net.inet.ip.portrange.last=65535

# Security
net.inet.ip.redirect=0
net.inet.ip.sourceroute=0
net.inet.ip.accept_sourceroute=0
net.inet.icmp.maskrepl=0
net.inet.icmp.log_redirect=0
net.inet.icmp.drop_redirect=1
net.inet.tcp.drop_synfin=1

# Security
net.inet.udp.blackhole=1
net.inet.tcp.blackhole=2

# Increases default TTL, sometimes useful
# Default is 64
net.inet.ip.ttl=128

# Lessen max segment life to conserve resources
# ACK waiting time in milliseconds (default: 30000 from RFC)
net.inet.tcp.msl=5000
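
To put the msl value in perspective: a closed connection sits in TIME_WAIT for 2 * MSL, and while it sits there its local port cannot be reused for the same destination. The sketch below assumes this host is always the side that closes first and ignores the maxtcptw limit, so treat the result as a rough ceiling. The function names are made up for this example.

```python
def time_wait_seconds(msl_ms):
    """TIME_WAIT lasts twice the maximum segment lifetime."""
    return 2 * msl_ms / 1000


def max_outgoing_rate(first_port, last_port, msl_ms):
    """Rough ceiling on new connections per second to ONE destination ip:port.

    Assumes every connection occupies its local port for the whole TIME_WAIT.
    """
    ports = last_port - first_port + 1
    return ports / time_wait_seconds(msl_ms)


# Port range from the listing (1024-65535), default MSL vs. the tuned one.
default_rate = max_outgoing_rate(1024, 65535, 30000)
tuned_rate = max_outgoing_rate(1024, 65535, 5000)

print(time_wait_seconds(30000), default_rate)
print(time_wait_seconds(5000), tuned_rate)
```

So lowering msl from 30000 to 5000 shortens TIME_WAIT from 60 to 10 seconds and raises that ceiling sixfold, which mostly matters for proxies that open many connections to one backend.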

# Max number of timewait sockets
net.inet.tcp.maxtcptw=40960
# Don't use tw on local connections
# As of 15 Apr 2009, Igor Sysoev says that nolocaltimewait has a buggy implementation,
# so leave it disabled for now until it gets fixed
#net.inet.tcp.nolocaltimewait=1

# FIN_WAIT_2 state fast recycle
net.inet.tcp.fast_finwait2_recycle=1

# Time before tcp keepalive probe is sent
# default is 2 hours (7200000)
#net.inet.tcp.keepidle=60000

# Should be increased until net.inet.ip.intr_queue_drops is zero
net.inet.ip.intr_queue_maxlen=4096

# Interrupt handling via multiple CPU, but with context switch.
# You can play with it. Default is 1;
#net.isr.direct=0

# This is for routers only
#net.inet.ip.forwarding=1
#net.inet.ip.fastforwarding=1

# This speeds up dummynet when the channel isn't saturated
net.inet.ip.dummynet.io_fast=1
# Increase dummynet(4) hash
#net.inet.ip.dummynet.hash_size=2048
#net.inet.ip.dummynet.max_chain_len

# Should be increased when you have A LOT of files on the server
# (Increase until vfs.ufs.dirhash_mem becomes lower)
vfs.ufs.dirhash_maxmem=67108864

# Explicit Congestion Notification (see http://en.wikipedia.org/wiki/Explicit_Congestion_Notification)
net.inet.tcp.ecn.enable=1

# Flowtable - flow caching mechanism
# Useful for routers
#net.inet.flowtable.enable=1
#net.inet.flowtable.nmbflows=65535

# Extreme polling tuning
#kern.polling.burst_max=1000
#kern.polling.each_burst=1000
#kern.polling.reg_frac=100
#kern.polling.user_frac=1
#kern.polling.idle_poll=0

# IPFW dynamic rules and timeouts tuning
# Increase dyn_buckets till net.inet.ip.fw.curr_dyn_buckets is lower
net.inet.ip.fw.dyn_buckets=65536
net.inet.ip.fw.dyn_max=65536
net.inet.ip.fw.dyn_ack_lifetime=120
net.inet.ip.fw.dyn_syn_lifetime=10
net.inet.ip.fw.dyn_fin_lifetime=2
net.inet.ip.fw.dyn_short_lifetime=10
# Make packets pass firewall only once when using dummynet
# i.e. packets going thru pipe are passing out from firewall with accept
#net.inet.ip.fw.one_pass=1

# shm_use_phys Wires all shared pages, making them unswappable
# Use this to lessen Virtual Memory Manager's work when using Shared Mem.
# Useful for databases
#kern.ipc.shm_use_phys=1
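
Since the listing above mixes active settings with commented-out ones, it can help to split them before copying anything into /etc/sysctl.conf. Below is a minimal sketch of such a splitter; the function name and the heuristic ("a dotted name, no spaces, followed by =") are my own assumptions, not part of any FreeBSD tool.

```python
def parse_sysctl_conf(text):
    """Split sysctl.conf-style text into (active, disabled) dicts.

    A line counts as a setting only if it looks like dotted.name=value;
    everything else (prose comments, blank lines) is skipped.
    """
    active, disabled = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        target = active
        if line.startswith("#"):
            line = line.lstrip("#").strip()
            target = disabled
        name, sep, value = line.partition("=")
        name = name.strip()
        if sep and "." in name and " " not in name:
            target[name] = value.strip()
    return active, disabled


sample = """
# Max. backlog size
kern.ipc.somaxconn=4096
# Turn off receive autotuning
#net.inet.tcp.recvbuf_auto=0
net.inet.tcp.sendspace=16384
"""

active, disabled = parse_sysctl_conf(sample)
print("active:", active)
print("disabled:", disabled)
```

The active dict can then be compared by hand against the output of sysctl for the same names to see which values are already the default on your release.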

/boot/loader.conf:

# Accept filters for data, http and DNS requests
# Useful when your software uses select() instead of kevent/kqueue or when you are under DDoS
# DNS accf available on 8.0+
accf_data_load="YES" 
accf_http_load="YES"
accf_dns_load="YES"

# Async IO system calls
aio_load="YES"

# Adds NCQ support in FreeBSD
# WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+
# 8.0+ only
#ahci_load=
#siis_load=

# Increase kernel memory size to 3G. 
#
# Use ONLY if you have KVA_PAGES in kernel configuration, and you have more than 3G RAM 
# Otherwise panic will happen on next reboot!
#
# It's required for high buffer sizes: kern.ipc.nmbjumbop, kern.ipc.nmbclusters, etc
# Useful on highload stateful firewalls, proxies or ZFS fileservers
# (FreeBSD 7.2+ amd64 users: Check that current value is lower!)
#vm.kmem_size="3G"

# Older versions of FreeBSD can't tune maxfiles on the fly
#kern.maxfiles="200000"

# Useful for databases 
# Sets maximum data size to 1G
# (FreeBSD 7.2+ amd64 users: Check that current value is lower!)
#kern.maxdsiz="1G"

# Maximum buffer size(vfs.maxbufspace)
# You can check current one via vfs.bufspace
# Should be lowered/upped depending on server's load-type
# Usually decreased to preserve kmem
# (default is 200M)
#kern.maxbcache="512M"

# Sendfile buffers
# For i386 only
#kern.ipc.nsfbufs=10240

# syncache Hash table tuning
net.inet.tcp.syncache.hashsize=1024
net.inet.tcp.syncache.bucketlimit=100

# Increased hostcache
net.inet.tcp.hostcache.hashsize="16384"
net.inet.tcp.hostcache.bucketlimit="100"

# TCP control-block Hash table tuning
net.inet.tcp.tcbhashsize=4096
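
As far as I know, the TCP hash table sizes above (syncache, hostcache, tcbhashsize) are expected to be powers of two, and the kernel complains at boot otherwise. A small sketch for checking a value, or rounding it up, before putting it in loader.conf (helper names are made up):

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


# Values from the listing.
for name, value in [
    ("net.inet.tcp.syncache.hashsize", 1024),
    ("net.inet.tcp.hostcache.hashsize", 16384),
    ("net.inet.tcp.tcbhashsize", 4096),
]:
    print(name, value, is_power_of_two(value))

print(next_power_of_two(5000))
```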

# Enable superpages, for 7.2+ only
# Also read http://lists.freebsd.org/pipermail/freebsd-hackers/2009-November/030094.html
vm.pmap.pg_ps_enabled=1

# Useful if you are using an Intel Gigabit NIC
#hw.em.rxd=4096
#hw.em.txd=4096
#hw.em.rx_process_limit="-1"
# Also if you have A LOT of interrupts on the NIC, play with the following parameters
# NOTE: You should set them for every NIC
#dev.em.0.rx_int_delay: 250
#dev.em.0.tx_int_delay: 250
#dev.em.0.rx_abs_int_delay: 250
#dev.em.0.tx_abs_int_delay: 250
# There is also a multithreaded version of the em driver, which can be found here:
# http://people.yandex-team.ru/~wawa/
#
# for additional em monitoring and statistics use 
# `sysctl dev.em.0.stats=1 ; dmesg`
#
#Same tunings for igb
#hw.igb.rxd=4096
#hw.igb.txd=4096
#hw.igb.rx_process_limit=100

# Some useful netisr tunables. See sysctl net.isr
#net.isr.defaultqlimit=4096
#net.isr.maxqlimit: 10240
# Bind netisr threads to CPUs
#net.isr.bindthreads=1

# Nicer boot logo =)
loader_logo="beastie"
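
Several loader.conf values above use size suffixes (vm.kmem_size="3G", kern.maxdsiz="1G", kern.maxbcache="512M"), while sysctl reports plain byte counts. To compare the two, a small converter helps. This is a sketch; I'm assuming binary multiples (K = 1024) and the usual K/M/G/T suffixes.

```python
_UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def to_bytes(value):
    """Convert a loader.conf-style size such as "3G" or 512M to bytes."""
    text = value.strip().strip('"').upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


print(to_bytes('"3G"'))        # vm.kmem_size
print(to_bytes("512M"))        # kern.maxbcache
print(to_bytes("2147483648"))  # plain number, e.g. kern.ipc.shmmax
```

This makes the "check that current value is lower" warnings easy to follow: convert the loader.conf value and compare it with what sysctl prints for the same name.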

And finally, here are my additions to the GENERIC kernel

# Just some of them, see also
# cat /sys/{i386,amd64,}/conf/NOTES

# This one useful only on i386
#options         KVA_PAGES=512

# You can play with HZ in environments with high interrupt rate (default is 1000) 
# 100 is for my notebook to prolong its battery life
#options         HZ=100
# Polling is good on network loads with high packet rates and low-end NICs
# NB! Do not enable it if you want more than one netisr thread
#options         DEVICE_POLLING

# Eliminate datacopy on socket read-write
# To take advantage of zero copy sockets you should have an MTU of 8K (amd64)
# (4K for i386). This requirement applies only to receiving data.
# Read more in man zero_copy_sockets
#options         ZERO_COPY_SOCKETS

# Support TCP signatures (TCP-MD5, RFC 2385). Requires IPSEC
options         TCP_SIGNATURE
options         IPSEC

# These can be loaded as modules. They are described in the loader.conf section
#options         ACCEPT_FILTER_DATA
#options         ACCEPT_FILTER_HTTP

# Adding ipfw, also can be loaded as modules
options         IPFIREWALL
options         IPFIREWALL_VERBOSE
options         IPFIREWALL_VERBOSE_LIMIT=10
options         IPFIREWALL_DEFAULT_TO_ACCEPT
options         IPFIREWALL_FORWARD
# Adding kernel NAT
options         IPFIREWALL_NAT
options         LIBALIAS
# Traffic shaping
options         DUMMYNET          
# Divert, i.e. for userspace NAT
options         IPDIVERT

# This is for OpenBSD's pf firewall
device          pf
device          pflog
# pf's QoS - ALTQ
options         ALTQ
options         ALTQ_CBQ        # Class Based Queueing (CBQ)
options         ALTQ_RED        # Random Early Detection (RED)
options         ALTQ_RIO        # RED In/Out
options         ALTQ_HFSC       # Hierarchical Packet Scheduler (HFSC)
options         ALTQ_PRIQ       # Priority Queuing (PRIQ)
options         ALTQ_NOPCC      # Required for SMP build

# Pretty console 
# Manual can be found here http://forums.freebsd.org/showthread.php?t=6134
#options         VESA
#options         SC_PIXEL_MODE

# Disable reboot on Ctrl Alt Del
#options         SC_DISABLE_REBOOT
# Change normal|kernel messages color
options         SC_NORM_ATTR=(FG_GREEN|BG_BLACK)
options         SC_KERNEL_CONS_ATTR=(FG_YELLOW|BG_BLACK)
# More scroll space
options         SC_HISTORY_SIZE=8192

# Adding hardware crypto device
device          crypto
device          cryptodev

# Useful network interfaces
device          vlan
device          tap                     #Virtual Ethernet driver
device          gre                     #IP over IP tunneling
device          if_bridge               #Bridge interface
device          pfsync                  #synchronization interface for PF
device          carp                    #Common Address Redundancy Protocol
device          enc                     #IPsec interface
device          lagg                    #Link aggregation interface
device          stf                     #6to4 IPv6 over IPv4 encapsulation

# Also for my notebook, but may be used with Opteron
#device         amdtemp

# Support for ECMP. More than one route for destination
# Works even with the default route, so one can use it as a load balancer for two ISPs
# For now code is unstable and panics (panic: rtfree 2) on route deletions.
#options         RADIX_MPATH

# Multicast routing
#options         MROUTING
#options         PIM

# DTrace
options         KDTRACE_HOOKS        # all architectures - enable general DTrace hooks
options         DDB_CTF              # all architectures - kernel ELF linker loads CTF data
#options         KDTRACE_FRAME        # amd64-only

# Adaptive spinning in lockmgr (8.x+)
# See http://www.mail-archive.com/[email protected]/msg10782.html
options         ADAPTIVE_LOCKMGRS

# UTF-8 in console (9.x+) 
#options         TEKEN_UTF8
#options         TEKEN_XTERM

# NCQ support
# WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+
#options         ATA_CAM

# FreeBSD 9+
# Deadlock resolver thread 
# For additional information see http://www.mail-archive.com/[email protected]/msg18124.html 
#options         DEADLKRES

PS. Also most of FreeBSD's limits can be monitored by

# vmstat -z

and

# limits  
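
When watching vmstat -z, the interesting part is usually the zones whose failure counter is non-zero, since that means a limit was hit. Below is a rough sketch that picks those out of saved output. The column layout differs between FreeBSD versions, so the parser looks the column up by name in the header; the sample text and its numbers are invented for illustration.

```python
def zones_with_failures(vmstat_z_output):
    """Return {zone name: failure count} for zones with failures > 0."""
    lines = [l for l in vmstat_z_output.splitlines() if l.strip()]
    # First header word is ITEM; the rest label the comma-separated fields.
    columns = lines[0].split()[1:]
    fail_idx = next(
        i for i, c in enumerate(columns) if c.upper().startswith("FAIL")
    )
    failing = {}
    for line in lines[1:]:
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) <= fail_idx:
            continue
        count = int(fields[fail_idx])
        if count > 0:
            failing[name.strip()] = count
    return failing


# Invented sample in the style of `vmstat -z` output.
sample = """\
ITEM                     SIZE     LIMIT      USED      FREE  REQUESTS  FAILURES

UMA Kegs:                 128,        0,       90,        0,       90,        0
mbuf_cluster:            2048,    25600,    25600,        0,  1234567,       42
socket:                   416,   204800,     1200,      300,   987654,        0
"""

failing = zones_with_failures(sample)
print(failing)
```

A zone that shows up here (mbuf_cluster in the invented sample) is a hint that the matching limit, such as kern.ipc.nmbclusters, is too low for the load.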

PPS. A variety of network counters can be monitored via

# netstat -s

FreeBSD 9 added netstat's -Q option; try the following command to display netisr stats

# netstat -Q

PPPS. Also see

# man 7 tuning

PPPPS. I want to thank the FreeBSD community, especially the author of nginx, Igor Sysoev, and the nginx-ru@ and FreeBSD-performance@ mailing lists, for providing useful information about FreeBSD tuning.

So here is the question:
What tunings are you using on your FreeBSD servers?

You can also post your /etc/sysctl.conf, /boot/loader.conf, kernel options, etc. with a description of their meaning (do not copy-paste from sysctl -d). Don't forget to specify the server type (web, smb, gateway, etc.).

Let's share experience!

© Server Fault or respective owner
