The Unspoken - The Why of GC Ergonomics

Posted by jonthecollector on Oracle Blogs See other posts from Oracle Blogs or by jonthecollector
Published on Fri, 28 Jun 2013 22:23:35 +0000 Indexed on 2013/06/29 4:25 UTC
Read the original article Hit count: 471

Filed under:
Do you use GC ergonomics, -XX:+UseAdaptiveSizePolicy, with the UseParallelGC collector? The jist of GC ergonomics for that collector is that it tries to grow or shrink the heap to meet a specified goal. The goals that you can choose are maximum pause time and/or throughput. Don't get too excited there. I'm speaking about UseParallelGC (the throughput collector) so there are definite limits to what pause goals can be achieved. When you say out loud "I don't care about pause times, give me the best throughput I can get" and then say to yourself "Well, maybe 10 seconds really is too long", then think about a pause time goal. By default there is no pause time goal and the throughput goal is high (98% of the time doing application work and 2% of the time doing GC work). You can get more details on this in my very first blog.

GC ergonomics

The UseG1GC has its own version of GC ergonomics, but I'll be talking only about the UseParallelGC version.

If you use this option and wanted to know what it (GC ergonomics) was thinking, try

-XX:AdaptiveSizePolicyOutputInterval=1

This will print out information every i-th GC (above i is 1) about what the GC ergonomics to trying to do. For example,

    UseAdaptiveSizePolicy actions to meet  *** throughput goal ***
                       GC overhead (%)
    Young generation:       16.10         (attempted to grow)
    Tenured generation:      4.67         (attempted to grow)
    Tenuring threshold:    (attempted to decrease to balance GC costs) = 1

GC ergonomics tries to meet (in order)

  • Pause time goal
  • Throughput goal
  • Minimum footprint

    The first line says that it's trying to meet the throughput goal.

        UseAdaptiveSizePolicy actions to meet  *** throughput goal ***
    

    This run has the default pause time goal (i.e., no pause time goal) so it is trying to reach a 98% throughput. The lines

        Young generation:       16.10         (attempted to grow)
        Tenured generation:      4.67         (attempted to grow)
    

    say that we're currently spending about 16% of the time doing young GC's and about 5% of the time doing full GC's. These percentages are a decaying, weighted average (earlier contributions to the average are given less weight). The source code is available as part of the OpenJDK so you can take a look at it if you want the exact definition. GC ergonomics is trying to increase the throughput by growing the heap (so says the "attempted to grow").

    The last line

        Tenuring threshold:    (attempted to decrease to balance GC costs) = 1
    

    says that the ergonomics is trying to balance the GC times between young GC's and full GC's by decreasing the tenuring threshold. During a young collection the younger objects are copied to the survivor spaces while the older objects are copied to the tenured generation. Younger and older are defined by the tenuring threshold. If the tenuring threshold hold is 4, an object that has survived fewer than 4 young collections (and has remained in the young generation by being copied to the part of the young generation called a survivor space) it is younger and copied again to a survivor space. If it has survived 4 or more young collections, it is older and gets copied to the tenured generation. A lower tenuring threshold moves objects more eagerly to the tenured generation and, conversely a higher tenuring threshold keeps copying objects between survivor spaces longer. The tenuring threshold varies dynamically with the UseParallelGC collector. That is different than our other collectors which have a static tenuring threshold. GC ergonomics tries to balance the amount of work done by the young GC's and the full GC's by varying the tenuring threshold. Want more work done in the young GC's? Keep objects longer in the survivor spaces by increasing the tenuring threshold.

    This is an example of the output when GC ergonomics is trying to achieve a pause time goal

        UseAdaptiveSizePolicy actions to meet  *** pause time goal ***
                           GC overhead (%)
        Young generation:       20.74         (no change)
        Tenured generation:     31.70         (attempted to shrink)
    

    The pause goal was set at 50 millisecs and the last GC was

    0.415: [Full GC (Ergonomics) [PSYoungGen: 2048K->0K(26624K)] [ParOldGen: 26095K->9711K(28992K)] 28143K->9711K(55616K), [Metaspace: 1719K->1719K(2473K/6528K)], 0.0758940 secs] [Times: user=0.28 sys=0.00, real=0.08 secs]
    

    The full collection took about 76 millisecs so GC ergonomics wants to shrink the tenured generation to reduce that pause time. The previous young GC was

    0.346: [GC (Allocation Failure) [PSYoungGen: 26624K->2048K(26624K)] 40547K->22223K(56768K), 0.0136501 secs] [Times: user=0.06 sys=0.00, real=0.02 secs]
    

    so the pause time there was about 14 millisecs so no changes are needed.

    If trying to meet a pause time goal, the generations are typically shrunk. With a pause time goal in play, watch the GC overhead numbers and you will usually see the cost of setting a pause time goal (i.e., throughput goes down). If the pause goal is too low, you won't achieve your pause time goal and you will spend all your time doing GC.

    GC ergonomics is meant to be simple because it is meant to be used by anyone. It was not meant to be mysterious and so this output was added. If you don't like what GC ergonomics is doing, you can turn it off with -XX:-UseAdaptiveSizePolicy, but be pre-warned that you have to manage the size of the generations explicitly. If UseAdaptiveSizePolicy is turned off, the heap does not grow. The size of the heap (and the generations) at the start of execution is always the size of the heap. I don't like that and tried to fix it once (with some help from an OpenJDK contributor) but it unfortunately never made it out the door. I still have hope though.

    Just a side note. With the default throughput goal of 98% the heap often grows to it's maximum value and stays there. Definitely reduce the throughput goal if footprint is important. Start with -XX:GCTimeRatio=4 for a more modest throughput goal (%20 of the time spent in GC). A higher value means a smaller amount of time in GC (as the throughput goal).

  • © Oracle Blogs or respective owner

    Related posts about /Java