We have a CentOS 6.3 iSCSI server (16 GB RAM) running over InfiniBand (IPoIB).
Under high load I see multiple errors like this one:
Sep 3 23:22:20 stor4 kernel: tgtd: page allocation failure. order:2, mode:0x20
Sep 3 23:22:20 stor4 kernel: Pid: 3637, comm: tgtd Not tainted 2.6.32 #1
Sep 3 23:22:20 stor4 kernel: Call Trace:
Sep 3 23:22:20 stor4 kernel: [] ? __alloc_pages_nodemask+0x77f/0x940
Sep 3 23:22:20 stor4 kernel: [] ? kmem_getpages+0x62/0x170
Sep 3 23:22:20 stor4 kernel: [] ? fallback_alloc+0x1ba/0x270
Sep 3 23:22:20 stor4 kernel: [] ? cache_grow+0x2cf/0x320
Sep 3 23:22:20 stor4 kernel: [] ? ____cache_alloc_node+0x99/0x160
Sep 3 23:22:20 stor4 kernel: [] ? pskb_expand_head+0x64/0x270
Sep 3 23:22:20 stor4 kernel: [] ? __kmalloc+0x189/0x220
Sep 3 23:22:20 stor4 kernel: [] ? pskb_expand_head+0x64/0x270
Sep 3 23:22:20 stor4 kernel: [] ? __pskb_pull_tail+0x2aa/0x360
Sep 3 23:22:20 stor4 kernel: [] ? tcp_init_tso_segs+0x37/0x50
Sep 3 23:22:20 stor4 kernel: [] ? dev_queue_xmit+0x4bb/0x6f0
Sep 3 23:22:20 stor4 kernel: [] ? neigh_connected_output+0xbd/0x100
Sep 3 23:22:20 stor4 kernel: [] ? ip_finish_output+0x237/0x310
Sep 3 23:22:20 stor4 kernel: [] ? ip_output+0xb8/0xc0
Sep 3 23:22:20 stor4 kernel: [] ? __ip_local_out+0x9f/0xb0
Sep 3 23:22:20 stor4 kernel: [] ? ip_local_out+0x25/0x30
Sep 3 23:22:20 stor4 kernel: [] ? ip_queue_xmit+0x190/0x420
Sep 3 23:22:20 stor4 kernel: [] ? sock_aio_write+0x167/0x180
Sep 3 23:22:20 stor4 kernel: [] ? tcp_transmit_skb+0x3fe/0x7b0
Sep 3 23:22:20 stor4 kernel: [] ? tcp_write_xmit+0x1fb/0xa20
Sep 3 23:22:20 stor4 kernel: [] ? __tcp_push_pending_frames+0x30/0xe0
Sep 3 23:22:20 stor4 kernel: [] ? tcp_push_pending_frames+0x33/0x40
Sep 3 23:22:20 stor4 kernel: [] ? do_tcp_setsockopt+0x3d6/0x480
Sep 3 23:22:20 stor4 kernel: [] ? tcp_setsockopt+0x2a/0x30
Sep 3 23:22:20 stor4 kernel: [] ? sock_common_setsockopt+0x14/0x20
Sep 3 23:22:20 stor4 kernel: [] ? sys_setsockopt+0x7f/0xe0
Sep 3 23:22:20 stor4 kernel: [] ? system_call_fastpath+0x16/0x1b
Sep 3 23:22:20 stor4 kernel: Mem-Info:
Sep 3 23:22:20 stor4 kernel: Node 0 DMA per-cpu:
Sep 3 23:22:20 stor4 kernel: CPU 0: hi: 0, btch: 1 usd: 0
Sep 3 23:22:20 stor4 kernel: CPU 1: hi: 0, btch: 1 usd: 0
Sep 3 23:22:20 stor4 kernel: CPU 2: hi: 0, btch: 1 usd: 0
Sep 3 23:22:20 stor4 kernel: CPU 3: hi: 0, btch: 1 usd: 0
Sep 3 23:22:20 stor4 kernel: Node 0 DMA32 per-cpu:
Sep 3 23:22:20 stor4 kernel: CPU 0: hi: 186, btch: 31 usd: 183
Sep 3 23:22:20 stor4 kernel: CPU 1: hi: 186, btch: 31 usd: 23
Sep 3 23:22:20 stor4 kernel: CPU 2: hi: 186, btch: 31 usd: 183
Sep 3 23:22:20 stor4 kernel: CPU 3: hi: 186, btch: 31 usd: 181
Sep 3 23:22:20 stor4 kernel: Node 0 Normal per-cpu:
Sep 3 23:22:20 stor4 kernel: CPU 0: hi: 186, btch: 31 usd: 171
Sep 3 23:22:20 stor4 kernel: CPU 1: hi: 186, btch: 31 usd: 29
Sep 3 23:22:20 stor4 kernel: CPU 2: hi: 186, btch: 31 usd: 32
Sep 3 23:22:20 stor4 kernel: CPU 3: hi: 186, btch: 31 usd: 32
Sep 3 23:22:20 stor4 kernel: active_anon:1875 inactive_anon:2473 isolated_anon:0
Sep 3 23:22:20 stor4 kernel: active_file:1243637 inactive_file:2505055 isolated_file:0
Sep 3 23:22:20 stor4 kernel: unevictable:0 dirty:268338 writeback:0 unstable:0
Sep 3 23:22:20 stor4 kernel: free:86050 slab_reclaimable:132377 slab_unreclaimable:23744
Sep 3 23:22:20 stor4 kernel: mapped:1293 shmem:222 pagetables:720 bounce:0
Sep 3 23:22:20 stor4 kernel: Node 0 DMA free:15732kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15332kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Sep 3 23:22:20 stor4 kernel: lowmem_reserve[]: 0 2172 16060 16060
Sep 3 23:22:20 stor4 kernel: Node 0 DMA32 free:107544kB min:18268kB low:22832kB high:27400kB active_anon:468kB inactive_anon:2364kB active_file:566208kB inactive_file:976112kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2224900kB mlocked:0kB dirty:96816kB writeback:0kB mapped:908kB shmem:12kB slab_reclaimable:176940kB slab_unreclaimable:968kB kernel_stack:64kB pagetables:192kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 3 23:22:20 stor4 kernel: lowmem_reserve[]: 0 0 13887 13887
Sep 3 23:22:20 stor4 kernel: Node 0 Normal free:220924kB min:116772kB low:145964kB high:175156kB active_anon:7032kB inactive_anon:7528kB active_file:4408340kB inactive_file:9044108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14220800kB mlocked:0kB dirty:976536kB writeback:0kB mapped:4264kB shmem:876kB slab_reclaimable:352568kB slab_unreclaimable:94008kB kernel_stack:2048kB pagetables:2688kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 3 23:22:20 stor4 kernel: lowmem_reserve[]: 0 0 0 0
Sep 3 23:22:20 stor4 kernel: Node 0 DMA: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15732kB
Sep 3 23:22:20 stor4 kernel: Node 0 DMA32: 16305*4kB 4381*8kB 353*16kB 8*32kB 1*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 107900kB
Sep 3 23:22:20 stor4 kernel: Node 0 Normal: 14548*4kB 14808*8kB 2420*16kB 31*32kB 5*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 220784kB
Sep 3 23:22:20 stor4 kernel: 3748822 total pagecache pages
Sep 3 23:22:20 stor4 kernel: 0 pages in swap cache
Sep 3 23:22:20 stor4 kernel: Swap cache stats: add 0, delete 0, find 0/0
Sep 3 23:22:20 stor4 kernel: Free swap = 975864kB
Sep 3 23:22:20 stor4 kernel: Total swap = 975864kB
Sep 3 23:22:20 stor4 kernel: 4194303 pages RAM
Sep 3 23:22:20 stor4 kernel: 126915 pages reserved
Sep 3 23:22:20 stor4 kernel: 3753534 pages shared
Sep 3 23:22:20 stor4 kernel: 213500 pages non-shared
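For reference, the per-zone buddy lists in the trace above can be tallied to see how much of the free memory sits in blocks large enough for the failed order:2 request (16 kB, assuming 4 kB pages). This is only a rough sketch: the helper name `tally_buddy_line` is made up for illustration, the two sample lines are copied from the log, and it ignores watermarks and lowmem_reserve, which the kernel also takes into account.

```python
import re

# Matches entries such as "14548*4kB" (block count * block size) in a buddy-list line.
BLOCK_RE = re.compile(r"(\d+)\*(\d+)kB")


def tally_buddy_line(line, min_block_kb=16):
    """Return (total_free_kb, free_kb_in_blocks_of_at_least_min_block_kb).

    min_block_kb=16 corresponds to order 2 when the page size is 4 kB
    (an assumption; it matches the smallest block size shown in the log).
    """
    total_kb = 0
    large_kb = 0
    for count, size_kb in BLOCK_RE.findall(line):
        kb = int(count) * int(size_kb)
        total_kb += kb
        if int(size_kb) >= min_block_kb:
            large_kb += kb
    return total_kb, large_kb


# Buddy-list lines copied from the kernel log above.
DMA32_LINE = ("Node 0 DMA32: 16305*4kB 4381*8kB 353*16kB 8*32kB 1*64kB 1*128kB "
              "0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 107900kB")
NORMAL_LINE = ("Node 0 Normal: 14548*4kB 14808*8kB 2420*16kB 31*32kB 5*64kB 0*128kB "
               "0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 220784kB")

for name, line in (("DMA32", DMA32_LINE), ("Normal", NORMAL_LINE)):
    total_kb, large_kb = tally_buddy_line(line)
    share = 100.0 * large_kb / total_kb
    print(f"{name}: {total_kb} kB free, {large_kb} kB in blocks >= 16 kB ({share:.1f}%)")
```

The totals computed this way should agree with the `= ...kB` figure the kernel prints at the end of each line, which gives a quick check that the parsing is right.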
TCP stack and VM config:
net.core.rmem_max = 83886080
net.core.wmem_max = 83886080
net.core.rmem_default = 65536
net.core.wmem_default = 65536
net.ipv4.tcp_rmem = 40960 1048560 4194304
net.ipv4.tcp_wmem = 40960 196608 4194304
net.ipv4.tcp_mem = 16388608 16388608 16388608
vm.min_free_kbytes = 135168
Additional tweaks:
/sbin/blockdev --setra 16384 /dev/sdb
echo 2048 > /sys/block/sdb/queue/nr_requests
Where might the problem be? Thank you.