Dealing with Fine-Grained Cache Entries in Coherence
- by jpurdy
On occasion we have seen significant memory overhead when using very small cache entries. Consider the case where there is a small key (say a synthetic key stored in a long) and a small value (perhaps a number or short string). With most backing maps, each cache entry will require an instance of Map.Entry, and in the case of a LocalCache backing map (used for expiry and eviction), there is additional metadata stored (such as last access time). Given the size of this data (usually a few dozen bytes) and the granularity of Java memory allocation (often a minimum of 32 bytes per object, depending on the specific JVM implementation), it is easily possible to end up with the case where the cache entry appears to be a couple dozen bytes but ends up occupying several hundred bytes of actual heap, resulting in anywhere from a 5x to 10x increase in stated memory requirements. In most cases, this increase applies to only a few small NamedCaches, and is inconsequential -- but in some cases it might apply to one or more very large NamedCaches, in which case it may dominate memory sizing calculations.
Ultimately, the requirement is to avoid the per-entry overhead, which can be done either at the application level by grouping multiple logical entries into single cache entries, or at the backing map level, again by combining multiple entries into a smaller number of larger heap objects.
At the application level, it may be possible to combine objects based on parent-child or sibling relationships (basically the same requirements that would apply to using partition affinity). If there is no natural relationship, it may still be possible to combine objects, effectively using a Coherence NamedCache as a "map of maps". This forces the application to first find a collection of objects (by performing a partial hash) and then to look within that collection for the desired object. This is most naturally implemented as a collection of entry processors to avoid pulling unnecessary data back to the client (and also to encapsulate that logic within a service layer).
At the backing map level, the NIO storage option keeps keys on heap, and so has limited benefit for this situation. The Elastic Data features of Coherence naturally combine entries into larger heap objects, with the caveat that only data -- and not indexes -- can be stored in Elastic Data.