Authors: Abdulrahman Alattas and Aleks Micic
Introduction
In OpenJ9’s Balanced GC policy, the Java heap is divided into up to 2047 fixed-size regions to reduce pause times and increase the efficiency of garbage collection. This layout poses a challenge for large arrays: if an array exceeds the size of a region, it cannot be allocated contiguously. To address this problem, OpenJ9 uses arraylets, which split arrays into a “spine” (holding metadata and pointers) and “leaves” (holding the actual data across regions).
While arraylets solved the allocation problem, they introduced performance overhead due to noncontiguous data and pointer indirection. To overcome these limitations, arraylets are now replaced by a new layout based on OffHeap memory, in which array data is always contiguous. A new data address field, “dataAddr”, is added to the array object header to point to the array data, which is stored either adjacent to the object header in the Java heap or nonadjacent in the virtual-heap (OffHeap) memory. This field is specific to the Balanced GC policy and is not added for other GC policies.
This blog explores the new layout, its implementation, and the performance benefits it brings, especially for JIT optimizations and native access.
New Object Layout
The new layout replaces arraylets with a simpler, more efficient model in which array data is always contiguous in memory, regardless of the array size. Depending on the size of the array, the data is stored either adjacent to the object header in the Java heap or nonadjacent in OffHeap memory. In both cases, the object header contains the data address pointer that references the start of the data.
Adjacent Layout (Java Heap)
For arrays that fit within a region, the data is stored directly after the object header in the main heap. The data address pointer points to the data adjacent to the header. This layout is straightforward and efficient for smaller arrays.
· ArrayObjectHeader (in Java heap) → “dataAddr” Pointer → Data (contiguous, in Java heap)
Nonadjacent Layout (OffHeap)
For larger arrays, the data is allocated in a separate OffHeap memory space, and the data address pointer in the object header points to that OffHeap location, thereby allowing large arrays to be contiguous in memory.
· ArrayObjectHeader (in Java heap) → “dataAddr” Pointer → Data (contiguous, OffHeap)
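The choice between the two layouts can be sketched as a simple size check. This is an illustrative Java model only, not OpenJ9 code: the class and method names, the header size, the region size, and the exact threshold (header plus data compared against one region) are all assumptions made for the example.

```java
// Illustrative model only; not OpenJ9 code. All names are hypothetical.
final class ArrayLayoutSketch {
    enum Layout { ADJACENT_IN_HEAP, NONADJACENT_OFF_HEAP }

    // Assumption: an array "fits within a region" when header plus data
    // is no larger than one region. The real threshold may differ.
    static Layout chooseLayout(long headerBytes, long dataBytes, long regionBytes) {
        if (headerBytes < 0 || dataBytes < 0 || regionBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and region size positive");
        }
        return (headerBytes + dataBytes <= regionBytes)
                ? Layout.ADJACENT_IN_HEAP
                : Layout.NONADJACENT_OFF_HEAP;
    }

    public static void main(String[] args) {
        long region = 4L * 1024 * 1024; // hypothetical 4 MiB region
        long header = 24;               // hypothetical header size, including dataAddr
        System.out.println(chooseLayout(header, 1024, region));
        System.out.println(chooseLayout(header, 16L * 1024 * 1024, region));
    }
}
```

In both cases the caller would reach the data through the same “dataAddr” pointer, so code that reads the array does not need to know which layout was chosen.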

OffHeap Memory Design
OffHeap memory is a separately managed region within the JVM’s virtual address space that is used to store large array data contiguously. It is distinct from the region-based Java heap, and it is managed differently. By default, the OffHeap virtual space is set to be a few times larger than the Java heap, providing ample room for large allocations without fragmentation.
Unlike the Java heap, which reserves space for up to 2047 regions and commits them gradually, OffHeap memory is committed on demand. When a large array is allocated, only as many regions as are needed are committed in OffHeap memory, and they are immediately associated with the object header via the data address pointer.
To maintain a stable memory footprint, an equivalent amount of memory is decommitted from the Java heap. This helps ensure that the total committed memory does not increase, even as memory shifts between the Java heap and OffHeap. Later, when garbage collection reclaims the OffHeap memory, that memory is immediately decommitted and an equivalent amount is recommitted in the Java heap for general use.
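The commit and decommit bookkeeping described above can be modeled as two counters whose sum stays constant. The following Java sketch is a simplified accounting model under stated assumptions, not the OpenJ9 implementation: the class and method names are hypothetical, amounts are tracked in whole regions, and the behavior when the heap has too few committed regions is a simplification chosen for the example.

```java
// Illustrative accounting model only; not OpenJ9 code. All names are hypothetical.
final class FootprintSketch {
    private long heapCommittedRegions;
    private long offHeapCommittedRegions;

    FootprintSketch(long initialHeapCommittedRegions) {
        if (initialHeapCommittedRegions < 0) {
            throw new IllegalArgumentException("initial region count must be non-negative");
        }
        this.heapCommittedRegions = initialHeapCommittedRegions;
    }

    // Large-array allocation: commit OffHeap regions on demand and
    // decommit the same number of regions from the Java heap.
    void allocateLargeArray(long regions) {
        if (regions <= 0) throw new IllegalArgumentException("regions must be positive");
        if (regions > heapCommittedRegions) {
            // Simplification: the real VM's behavior in this case is not modeled.
            throw new IllegalStateException("not enough committed heap regions to trade");
        }
        offHeapCommittedRegions += regions;
        heapCommittedRegions -= regions;
    }

    // Reclamation during GC: decommit the OffHeap regions and
    // recommit the same number of regions in the Java heap.
    void reclaimLargeArray(long regions) {
        if (regions <= 0 || regions > offHeapCommittedRegions) {
            throw new IllegalArgumentException("invalid region count to reclaim");
        }
        offHeapCommittedRegions -= regions;
        heapCommittedRegions += regions;
    }

    long heapCommitted() { return heapCommittedRegions; }
    long offHeapCommitted() { return offHeapCommittedRegions; }
    long totalCommitted() { return heapCommittedRegions + offHeapCommittedRegions; }

    public static void main(String[] args) {
        FootprintSketch f = new FootprintSketch(100);
        f.allocateLargeArray(8);
        System.out.println("after allocation: total=" + f.totalCommitted());
        f.reclaimLargeArray(8);
        System.out.println("after reclaim: total=" + f.totalCommitted());
    }
}
```

The point of the model is the invariant: `totalCommitted()` returns the same value before allocation, after allocation, and after reclamation.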

Object Movement
With OffHeap, the data address pointer in the object header always points to the start of the array data, regardless of where that data is stored.
In the adjacent layout, where the data is stored directly after the object header in the Java heap, the object and its data are physically close. If the object is moved during garbage collection, the data address pointer is updated to reflect the new location of the data.
In the nonadjacent layout, the data resides in OffHeap memory. Since OffHeap allocations never move, the data address pointer remains valid even if the object header itself is relocated in the Java heap. No update is needed in this case.
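The two movement cases can be illustrated with a small Java model in which addresses are plain numbers. It is a sketch under assumptions, not OpenJ9 code: the class name, the explicit `adjacent` flag, and the 24-byte header size are hypothetical and exist only to make the pointer arithmetic visible.

```java
// Illustrative model only; not OpenJ9 code. Addresses are plain longs and
// all names (including HEADER_BYTES) are hypothetical.
final class ArrayHeaderSketch {
    static final long HEADER_BYTES = 24; // hypothetical header size

    long headerAddr;        // where the header currently lives in the Java heap
    long dataAddr;          // models the "dataAddr" field: start of the array data
    final boolean adjacent; // true if the data directly follows the header

    private ArrayHeaderSketch(long headerAddr, long dataAddr, boolean adjacent) {
        this.headerAddr = headerAddr;
        this.dataAddr = dataAddr;
        this.adjacent = adjacent;
    }

    static ArrayHeaderSketch newAdjacent(long headerAddr) {
        return new ArrayHeaderSketch(headerAddr, headerAddr + HEADER_BYTES, true);
    }

    static ArrayHeaderSketch newOffHeap(long headerAddr, long offHeapDataAddr) {
        return new ArrayHeaderSketch(headerAddr, offHeapDataAddr, false);
    }

    // Models the collector relocating the object header.
    void moveTo(long newHeaderAddr) {
        headerAddr = newHeaderAddr;
        if (adjacent) {
            // The data moved together with the header, so dataAddr must follow it.
            dataAddr = newHeaderAddr + HEADER_BYTES;
        }
        // Nonadjacent case: the OffHeap data did not move, so dataAddr stays as is.
    }

    public static void main(String[] args) {
        ArrayHeaderSketch small = newAdjacent(0x1000L);
        ArrayHeaderSketch large = newOffHeap(0x2000L, 0x7000_0000L);
        small.moveTo(0x8000L);
        large.moveTo(0x9000L);
        System.out.println(Long.toHexString(small.dataAddr));
        System.out.println(Long.toHexString(large.dataAddr));
    }
}
```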
Benefits of the New Layout
Transitioning from arraylets to a contiguous layout, particularly with OffHeap support, brings several performance and architectural benefits.
First, since array data are always contiguous, the JIT compiler can apply more aggressive optimizations. Although accessing the data requires dereferencing the data address pointer, this overhead is often eliminated in hot paths through standard optimization techniques.
Second, for large arrays, especially those exceeding the region size, the new layout avoids the need to copy data during JNI critical access. This is possible because the data is both contiguous and immovable in OffHeap memory. This change significantly reduces latency and memory pressure in native-heavy workloads.
Overall, the new layout simplifies memory access, improves cache locality, and reduces GC-related overhead, especially in applications with large arrays or frequent native interactions.
Performance
To assess the impact of the new OffHeap layout, we ran benchmarks on an IBM Power10 processor using two workloads: AcmeAir Monolithic and IBM ILOG.
AcmeAir is a Java-based airline booking system that is designed to simulate enterprise-scale web applications. It models a fictitious airline and is built to handle billions of web API calls per day, supporting deployment in cloud environments and multiple user interaction channels. IBM ILOG allows complex business logic to be externalized from applications and run dynamically. The benchmark tests rule evaluation performance under different rule set sizes to simulate varying computational loads and native interaction patterns.
To quantify the performance impact of OffHeap, we ran the benchmarks with the Balanced GC policy both with and without OffHeap support. The results show clear throughput improvements across the different workloads. In AcmeAir, OffHeap boosts throughput by over 26%, while ILOG sees improvements of 17% for the 300-rule configuration and 6.4% for the 5-rule configuration.

These results also reflect the Balanced GC policy’s progress toward becoming competitive with the default Gencon GC policy. Across all benchmarks, Balanced GC with OffHeap narrows the throughput gap to Gencon GC, which makes it a more attractive alternative given the other benefits of the Balanced GC policy. The following table summarizes the relative throughput gap to Gencon for both Balanced GC configurations.
Performance gap relative to Gencon GC policy
| Benchmark | Balanced without OffHeap | Balanced with OffHeap |
| --- | --- | --- |
| AcmeAir Monolithic | 25.45% | 5.68% |
| ILOG (300 rules) | 16.86% | 2.72% |
| ILOG (5 rules) | 7.36% | 1.43% |
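The gaps in the table can be related back to the throughput improvements quoted earlier with a little arithmetic. The Java sketch below assumes that "gap" means (Gencon throughput minus Balanced throughput) divided by Gencon throughput; the post does not define the term, so this is a sanity check under that assumption rather than a description of how the figures were produced. The class and method names are hypothetical.

```java
// Illustrative arithmetic only. Assumes "gap" means (Gencon - Balanced) / Gencon;
// the post does not define it, so treat the results as a sanity check.
final class GapArithmeticSketch {
    // Throughput of a Balanced configuration relative to Gencon (Gencon = 1.0).
    static double relativeThroughput(double gapPercent) {
        return 1.0 - gapPercent / 100.0;
    }

    // Percentage throughput gain of "with OffHeap" over "without OffHeap".
    static double improvementPercent(double gapWithoutPercent, double gapWithPercent) {
        double withoutOffHeap = relativeThroughput(gapWithoutPercent);
        double withOffHeap = relativeThroughput(gapWithPercent);
        return (withOffHeap / withoutOffHeap - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        System.out.printf("AcmeAir Monolithic: %.1f%%%n", improvementPercent(25.45, 5.68));
        System.out.printf("ILOG (300 rules):   %.1f%%%n", improvementPercent(16.86, 2.72));
        System.out.printf("ILOG (5 rules):     %.1f%%%n", improvementPercent(7.36, 1.43));
    }
}
```

Under that reading, the table implies gains of approximately 26.5%, 17.0%, and 6.4%, which agrees with the improvements stated in the text.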
Current Limitations
While the new OffHeap layout brings clear benefits, a few limitations exist.
One tradeoff is memory commitment. Although OffHeap is sparsely committed, it is committed in region-size increments. As a result, when large arrays don’t fully use their allocated regions, more memory can be committed than with arraylets. Also, OffHeap is not yet NUMA-aware, which can affect performance on large multi-socket systems. Both limitations are targeted for future enhancement.
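The region-granularity tradeoff comes down to rounding up to a whole number of regions. The following Java sketch shows that arithmetic with hypothetical names and a hypothetical 4 MiB region size; it models only the OffHeap rounding and makes no claim about how much memory the arraylet layout would have committed for the same array.

```java
// Illustrative arithmetic only; not OpenJ9 code. All names and sizes are hypothetical.
final class RegionRoundingSketch {
    // Number of whole regions needed to hold dataBytes (ceiling division).
    static long regionsNeeded(long dataBytes, long regionBytes) {
        if (dataBytes < 0 || regionBytes <= 0) {
            throw new IllegalArgumentException("dataBytes must be >= 0 and regionBytes > 0");
        }
        return (dataBytes + regionBytes - 1) / regionBytes;
    }

    // Bytes committed when memory is committed in region-size increments.
    static long committedBytes(long dataBytes, long regionBytes) {
        return regionsNeeded(dataBytes, regionBytes) * regionBytes;
    }

    // Committed bytes that the array data does not use.
    static long unusedBytes(long dataBytes, long regionBytes) {
        return committedBytes(dataBytes, regionBytes) - dataBytes;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long region = 4 * mib; // hypothetical region size
        long data = 9 * mib;   // hypothetical array data size
        System.out.println("regions: " + regionsNeeded(data, region));
        System.out.println("unused MiB: " + unusedBytes(data, region) / mib);
    }
}
```

In this example a 9 MiB array occupies three 4 MiB regions, so 3 MiB of committed memory goes unused; an array whose size is an exact multiple of the region size wastes nothing.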
Finally, while the new layout enables many JIT optimizations due to contiguous data, not all existing optimizations currently work with OffHeap. That said, the performance improvements from contiguous data typically outweigh the optimizations that are lost, as the performance results above demonstrate. Expanding JIT support for OffHeap is an active area of development.
Conclusion
The move from arraylets to a contiguous OffHeap-based layout marks a significant step forward in OpenJ9’s memory management strategy under the Balanced GC policy. By eliminating the fragmentation and indirection of large arrays, this new design simplifies memory access, improves throughput, and enables more powerful JIT optimizations, especially for large arrays and native interactions.
Although opportunities for enhancement remain, the substantial throughput gains shown above indicate that the current benefits already outweigh the known limitations. This change not only modernizes how arrays are handled in the region-based Balanced GC policy but also lays the groundwork for future enhancements in scalability and throughput.
Acknowledgments
Special thanks to Vijay Sundaresan, Dimitri Pivkine, and Shubham Verma for their valuable feedback and thoughtful review of this blog.
