what can cause large discrepancy between minor GC time and total pause time?
- by cxcg
We have a latency-sensitive application, and are experiencing some GC-related pauses we don't fully understand. We occasionally have a minor GC that results in application pause times that are much longer than the reported GC time itself. Here is an example log snippet:
485377.257: [GC 485378.857: [ParNew: 105845K-621K(118016K), 0.0028070 secs] 136492K-31374K(1035520K), 0.0028720 secs] [Times: user=0.01 sys=0.00, real=1.61 secs]
Total time for which application threads were stopped: 1.6032830 seconds
The total pause time here is orders of magnitude longer than the reported GC time. These are isolated and occasional events: the immediately preceding and succeeding minor GC events do not show this large discrepancy.
The process is running on a dedicated machine, with lots of free memory, 8 cores, running Red Hat Enterprise Linux ES Release 4 Update 8 with kernel 2.6.9-89.0.1EL-smp. We have observed this with (32 bit) JVM versions 1.6.0_13 and 1.6.0_18.
We are running with these flags:
-server -ea -Xms512m -Xmx512m -XX:+UseConcMarkSweepGC -XX:NewSize=128m -XX:MaxNewSize=128m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:-TraceClassUnloading
Can anybody offer some explanation as to what might be going on here, and/or some avenues for further investigation?