Why are my Opteron cores running at only 75% capacity each? (25% CPU idle)

Posted by Tim Cooper on Stack Overflow See other posts from Stack Overflow or by Tim Cooper
Published on 2012-10-05T03:16:00Z Indexed on 2012/10/05 3:37 UTC
Read the original article Hit count: 298

Filed under:
|

We've just taken delivery of a powerful 32-core AMD Opteron server with 128Gb. We have 2 x 6272 CPU's with 16 cores each. We are running a big long-running java task on 30 threads. We have the NUMA optimisations for Linux and java turned on. Our Java threads are mainly using objects that are private to that thread, sometimes reading memory that other threads will be reading, and very very occasionally writing or locking shared objects.

We can't explain why the CPU cores are 25% idle. Below is a dump of "top":

top - 23:06:38 up 1 day, 23 min,  3 users,  load average: 10.84, 10.27, 9.62
Tasks: 676 total,   1 running, 675 sleeping,   0 stopped,   0 zombie
Cpu(s): 64.5%us,  1.3%sy,  0.0%ni, 32.9%id,  1.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132138168k total, 131652664k used,   485504k free,    92340k buffers
Swap:  5701624k total,   230252k used,  5471372k free, 13444344k cached
...
top - 22:37:39 up 23:54,  3 users,  load average: 7.83, 8.70, 9.27
Tasks: 678 total,   1 running, 677 sleeping,   0 stopped,   0 zombie
Cpu0  : 75.8%us,  2.0%sy,  0.0%ni, 22.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 77.2%us,  1.3%sy,  0.0%ni, 21.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 77.3%us,  1.0%sy,  0.0%ni, 21.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 77.8%us,  1.0%sy,  0.0%ni, 21.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  : 76.9%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  : 76.3%us,  2.0%sy,  0.0%ni, 21.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  : 12.6%us,  3.0%sy,  0.0%ni, 84.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  8.6%us,  2.0%sy,  0.0%ni, 89.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  : 77.0%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  : 77.0%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 : 77.6%us,  1.7%sy,  0.0%ni, 20.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 : 75.7%us,  2.0%sy,  0.0%ni, 21.4%id,  1.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 : 76.6%us,  2.3%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 : 76.6%us,  2.3%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 : 76.2%us,  2.6%sy,  0.0%ni, 15.9%id,  5.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 : 76.6%us,  2.0%sy,  0.0%ni, 21.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 : 73.6%us,  2.6%sy,  0.0%ni, 23.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 : 74.5%us,  2.3%sy,  0.0%ni, 23.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 : 73.9%us,  2.3%sy,  0.0%ni, 23.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 : 72.9%us,  2.6%sy,  0.0%ni, 24.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 : 72.8%us,  2.6%sy,  0.0%ni, 24.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 : 72.7%us,  2.3%sy,  0.0%ni, 25.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu22 : 72.5%us,  2.6%sy,  0.0%ni, 24.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 : 73.0%us,  2.3%sy,  0.0%ni, 24.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu24 : 74.7%us,  2.7%sy,  0.0%ni, 22.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu25 : 74.5%us,  2.6%sy,  0.0%ni, 22.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu26 : 73.7%us,  2.0%sy,  0.0%ni, 24.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu27 : 74.1%us,  2.3%sy,  0.0%ni, 23.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu28 : 74.1%us,  2.3%sy,  0.0%ni, 23.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu29 : 74.0%us,  2.0%sy,  0.0%ni, 24.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu30 : 73.2%us,  2.3%sy,  0.0%ni, 24.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu31 : 73.1%us,  2.0%sy,  0.0%ni, 24.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132138168k total, 131711704k used,   426464k free,    88336k buffers
Swap:  5701624k total,   229572k used,  5472052k free, 13745596k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13865 root      20   0  122g 112g 3.1g S 2334.3 89.6  20726:49 java
27139 jayen     20   0 15428 1728  952 S  2.6  0.0   0:04.21 top
27161 sysadmin  20   0 15428 1712  940 R  1.0  0.0   0:00.28 top
   33 root      20   0     0    0    0 S  0.3  0.0   0:06.24 ksoftirqd/7
  131 root      20   0     0    0    0 S  0.3  0.0   0:09.52 events/0
 1858 root      20   0     0    0    0 S  0.3  0.0   1:35.14 kondemand/0

A dump of the java stack confirms that none of the threads are anywhere near the few places where locks are used, nor are they anywhere near any disk or network i/o.

I had trouble finding a clear explanation of what 'top' means by "idle" versus "wait", but I get the impression that "idle" means "no more threads that need to be run" but this doesn't make sense in our case. We're using a "Executors.newFixedThreadPool(30)". There are a large number of tasks pending and each task lasts for 10 seconds or so.

I suspect that the explanation requires a good understanding of NUMA. Is the "idle" state what you see when a CPU is waiting for a non-local access? If not, then what is the explanation?

© Stack Overflow or respective owner

Related posts about java

Related posts about numa