So you want to work on the JIT compiler in OpenJ9, but it isn’t obvious how to use the Compiler’s Memory Manager? Then this is the blog post for you!
Background
It is important to note that because OpenJ9 is built as an extension of Eclipse OMR (henceforth referred to as OMR), the Compiler Memory Manager in OpenJ9 is also an extension of the Compiler Memory Manager in OMR. It is worth reading the documentation here and here. You might have noticed the term “Compiler Memory Manager” as opposed to “JVM Memory Manager” or just “Memory Manager”. This is because the Compiler manages its own memory.
There are a few reasons why the Compiler does so. Using an allocator provided by the Port Library, such as j9mem_allocate_memory, is expensive for the Compiler’s allocation patterns. Because the Port Library is not in the same shared library as the Compiler, making too many of these cross-component function calls can have a noticeable negative impact on compilation time. Additionally, there are too many participants for informal trust, book-keeping can get expensive, and attempting to optimize the Memory Manager for divergent component use-cases is difficult or even impossible (GC vs VM vs JIT memory allocation patterns are wildly different), which negatively affects memory footprint.
Thus, in order to minimize compilation time and memory footprint, the Compiler manages its own memory, only allocating large segments as and when it needs to. However, in order to ensure that all memory in the JVM is accounted for, as well as to minimize the likelihood of memory leaks, the OpenJ9 Compiler Memory Manager uses the Port Library APIs to allocate the memory it manages, instead of directly using OS APIs.
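The idea of requesting memory in large segments and carving small allocations out of them can be sketched as follows. This is a self-contained illustration, not OMR or OpenJ9 code: BackingAllocator, SegmentedScratch and backingCallsFor are hypothetical names, the 64 KB segment size is arbitrary, and std::malloc merely stands in for the Port Library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for an expensive backing allocator (the role the
// Port Library plays in the text). It counts how often it is called.
struct BackingAllocator {
    std::size_t calls = 0;
    void *allocate(std::size_t bytes) {
        ++calls;
        void *p = std::malloc(bytes);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    void release(void *p) { std::free(p); }
};

// Hands out small allocations by bumping an offset inside large segments,
// and only asks the backing allocator for a new segment when the current
// one cannot satisfy the request.
class SegmentedScratch {
public:
    SegmentedScratch(BackingAllocator &backing, std::size_t segmentSize)
        : _backing(backing), _segmentSize(segmentSize) {}
    SegmentedScratch(const SegmentedScratch &) = delete;
    SegmentedScratch &operator=(const SegmentedScratch &) = delete;

    // All segments are returned to the backing allocator in one go.
    ~SegmentedScratch() {
        for (void *segment : _segments) _backing.release(segment);
    }

    void *allocate(std::size_t bytes) {
        // Round up so every returned pointer stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        bytes = (bytes + align - 1) / align * align;
        assert(bytes <= _segmentSize && "request larger than a segment");
        if (_segments.empty() || _used + bytes > _segmentSize) {
            _segments.push_back(nullptr);                      // reserve the slot first
            _segments.back() = _backing.allocate(_segmentSize);
            _used = 0;
        }
        void *result = static_cast<char *>(_segments.back()) + _used;
        _used += bytes;
        return result;
    }

private:
    BackingAllocator &_backing;
    std::size_t _segmentSize;
    std::size_t _used = 0;
    std::vector<void *> _segments;
};

// Performs `count` small allocations and reports how many calls to the
// backing allocator were needed to satisfy them.
std::size_t backingCallsFor(std::size_t count, std::size_t bytesEach) {
    BackingAllocator backing;
    {
        SegmentedScratch scratch(backing, 64 * 1024);
        for (std::size_t i = 0; i < count; ++i) scratch.allocate(bytesEach);
    }
    return backing.calls;
}
```

Calling backingCallsFor(10000, 32) in this sketch reaches the backing allocator only a handful of times, because each 64 KB segment serves 2048 requests of 32 bytes.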
Scratch vs Persistent Memory
The OpenJ9 JIT can be, loosely speaking, split into two main components: the JIT Compiler and the JIT Runtime. The JIT Compiler component deals with compiling Java bytecodes into native assembly whereas the JIT Runtime component deals with compilation control, runtime assumptions, code/data cache management, memory management, etc. The data these two components work with have different lifetimes; therefore, conceptually, the Compiler Memory Manager provides the ability to allocate two different kinds of memory:
- Scratch Memory. This refers to memory required to perform the compilation (e.g. representing IL, CFG, Instructions, etc.). This memory is allocated by the JIT Compiler component, and is released at the end of a compilation. There is a Scratch Space Limit specified for each compilation; reaching this limit causes the compilation to abort, after which it may be retried at a lower optimization level. There are two sub-types of Scratch Memory:
  - Heap Memory is just dynamically allocated memory, analogous to new or malloc.
  - Stack Memory is NOT memory on the C/C++ stack. It is also dynamically allocated memory, but memory that is allocated using Stack Mark / Stack Release Semantics (described below).
- Persistent Memory. This refers to memory that persists throughout the lifetime of the JVM (e.g. Class Hierarchy Table entries, IProfiler data, etc.). This memory is generally allocated by the JIT Runtime component.
If you take a look at TRMemory.hpp in OMR, you will see a bunch of #defines. These are used to add a bunch of placement new overrides into various classes, as well as a per-class version of jitPersistentAlloc/jitPersistentFree. These defines are added to a class via the TR_ALLOC macro, which takes a TR_MemoryBase::ObjectType parameter. This is used to keep track of persistent memory allocations, which helps debug memory leaks. There also exists the TR_Memory class; as described here, it is the old way through which memory in the Compiler was allocated.
Thus, you might see (old) code where memory is allocated using TR_Memory::allocateMemory as well as calls to per-class versions of jitPersistentAlloc/jitPersistentFree. The problem with these approaches is that they require one to manually allocate memory and then construct objects or cast data onto the memory. The aforementioned #defines do, however, provide placement new overrides to minimize the need for a malloc-like API.
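To make the idea of a macro that injects placement new overrides more concrete, here is a minimal standalone sketch. It is not the TR_ALLOC macro from TRMemory.hpp: DEMO_ALLOC, Arena, ObjectKind and Node are hypothetical names invented for illustration, and the per-kind byte counter only loosely mirrors the per-type tracking described above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical object kinds, standing in for an ObjectType-style enum.
enum ObjectKind { UnknownKind, NodeKind, NumKinds };

// Hypothetical arena: owns raw blocks and records bytes requested per kind.
class Arena {
public:
    Arena() = default;
    Arena(const Arena &) = delete;
    Arena &operator=(const Arena &) = delete;
    ~Arena() { for (void *block : _blocks) ::operator delete(block); }

    void *allocate(std::size_t bytes, ObjectKind kind) {
        _blocks.push_back(nullptr);              // reserve the slot first
        _blocks.back() = ::operator new(bytes);
        _bytesByKind[kind] += bytes;
        return _blocks.back();
    }
    std::size_t bytesFor(ObjectKind kind) const { return _bytesByKind[kind]; }

private:
    std::vector<void *> _blocks;
    std::size_t _bytesByKind[NumKinds] = {};
};

// Injects an arena-aware placement new into a class, tagged with `kind`.
// The matching placement delete runs only if a constructor throws; the
// arena still owns the block, so it has nothing to do.
#define DEMO_ALLOC(kind) \
    static void *operator new(std::size_t size, Arena &arena) { return arena.allocate(size, kind); } \
    static void operator delete(void *, Arena &) {}

struct Node {
    DEMO_ALLOC(NodeKind)
    int value;
    explicit Node(int v) : value(v) {}
};

// Constructs two Nodes directly in the arena, with no separate
// "allocate, then cast" step, and reports the bytes tracked for NodeKind.
std::size_t demoTrackedBytes() {
    Arena arena;
    Node *a = new (arena) Node(1);
    Node *b = new (arena) Node(2);
    assert(a->value + b->value == 3);
    return arena.bytesFor(NodeKind);
}
```

The point of the design is that the call site reads like ordinary object construction while the class itself decides where the bytes come from and which bucket they are counted in.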
In general, however, new code should use C++ Standard Template Library (STL) containers; in keeping with C++ best practices, explicitly allocating memory should be avoided as much as possible.
Allocating Scratch Memory
There are four main ways of allocating Scratch Memory:
Compilation Allocator
This approach is used to allocate memory from the pool of memory that will exist until the end of the compilation. However, because this approach does not yield a C++ Allocator, it needs to be wrapped in a TR::typed_allocator via a call to getTypedAllocator. This extra step exists because of the CS2 structures present in the code; these structures need the use of the CS2 Allocators. Eventually, all this should be simplified to just use a TR::Region. To get an instance of this allocator, do:
TR::Allocator compAllocator = comp()->allocator();
To initialize a C++ container with this allocator, you would do something like:
_jniCallSites(getTypedAllocator<TR_Pair<TR_ResolvedMethod,TR::Instruction> *>(TR::comp()->allocator()));
Note the need for getTypedAllocator.
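Why an untyped allocator needs a typed wrapper before a C++ container can use it can be shown with a small standalone sketch. None of this is the actual TR::typed_allocator or getTypedAllocator implementation: RawAllocator, TypedAdapter and makeTypedAdapter are hypothetical names, and the adapter implements only the minimal C++11 allocator interface.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical untyped allocator: hands out raw bytes and counts requests.
struct RawAllocator {
    std::size_t bytesRequested = 0;
    void *allocate(std::size_t bytes) {
        bytesRequested += bytes;
        return ::operator new(bytes);
    }
    void deallocate(void *p) { ::operator delete(p); }
};

// Minimal C++11 allocator that forwards every request to a RawAllocator.
// Containers need value_type, allocate/deallocate, rebinding via the
// converting constructor, and equality comparison.
template <typename T>
class TypedAdapter {
public:
    using value_type = T;

    explicit TypedAdapter(RawAllocator &raw) : _raw(&raw) {}
    template <typename U>
    TypedAdapter(const TypedAdapter<U> &other) : _raw(other.raw()) {}

    T *allocate(std::size_t n) {
        return static_cast<T *>(_raw->allocate(n * sizeof(T)));
    }
    void deallocate(T *p, std::size_t) { _raw->deallocate(p); }
    RawAllocator *raw() const { return _raw; }

private:
    RawAllocator *_raw;
};

template <typename T, typename U>
bool operator==(const TypedAdapter<T> &a, const TypedAdapter<U> &b) { return a.raw() == b.raw(); }
template <typename T, typename U>
bool operator!=(const TypedAdapter<T> &a, const TypedAdapter<U> &b) { return !(a == b); }

// Helper playing the role of the explicit wrapping step.
template <typename T>
TypedAdapter<T> makeTypedAdapter(RawAllocator &raw) { return TypedAdapter<T>(raw); }

// Fills a vector through the adapter and reports how many bytes the
// underlying untyped allocator was asked for.
std::size_t demoTypedAdapter() {
    RawAllocator raw;
    {
        std::vector<int, TypedAdapter<int>> values(makeTypedAdapter<int>(raw));
        values.reserve(8);
        for (int i = 0; i < 8; ++i) values.push_back(i);
    }
    return raw.bytesRequested;
}
```

An allocator type that offers an implicit conversion to such an adapter can skip the explicit makeTypedAdapter step at the call site.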
Heap Memory Region
This approach is also used to allocate memory from the pool of memory that exists until the end of the compilation. The difference is that this approach will yield a TR::Region, which has an automatic conversion (i.e. it wraps it in a TR::typed_allocator), so there’s no need to explicitly do so when using C++ containers. The trade-off is, in order to get an instance, you will need to do something like:
TR::Region &heapMemoryRegion = comp()->region();
which is less canonical than comp()->allocator().
To initialize a C++ container with this allocator, you would do something like:
_jniCallSites(comp()->region());
As an aside, comp()->trMemory()->heapMemoryRegion() is equivalent to comp()->region().
TR::Region
This is used when you want Region Semantics. TR::Region is an object that defines a region from which memory can be allocated; when the TR::Region object is destroyed (either explicitly, or more commonly, when it goes out of scope), all memory that was allocated through it gets freed. The above TR::Compilation::region() is actually just a TR::Region that does not get destroyed until the end of a compilation. As mentioned before, TR::Region has an automatic conversion operator which means it can be used as a C++ allocator. In order to create a new region, simply pass in a reference to an existing region. Usually this “existing region” will just be TR::Compilation::region():
TR::Region myRegion(comp()->region());
However, there’s nothing stopping you from building your own nested regions, though if you’re going to do so, you should make sure the lifetimes of the objects allocated in the various nested levels are extremely well understood.
To initialize a C++ container with this allocator, you would do something like:
TR::Region myRegion(comp()->region());
_jniCallSites(myRegion);
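Region Semantics and nesting can be illustrated with a toy model. This is only a sketch under stated assumptions: DemoRegion is a hypothetical class, it allocates each block with ::operator new rather than from segments, and the shared liveBytes counter exists purely so that the effect of destroying a region can be observed.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical region: every block it hands out is released together when
// the region object is destroyed.
class DemoRegion {
public:
    explicit DemoRegion(std::size_t &liveBytes) : _liveBytes(liveBytes) {}
    // Creating a region "from" an existing one yields a new, empty region
    // that shares the same bookkeeping, mirroring how a nested region is
    // built by passing in a reference to an existing region.
    DemoRegion(const DemoRegion &parent) : _liveBytes(parent._liveBytes) {}
    DemoRegion &operator=(const DemoRegion &) = delete;

    ~DemoRegion() {
        for (const Block &block : _blocks) {
            ::operator delete(block.ptr);
            _liveBytes -= block.size;
        }
    }

    void *allocate(std::size_t bytes) {
        _blocks.push_back(Block{nullptr, 0});    // reserve the slot first
        _blocks.back().ptr = ::operator new(bytes);
        _blocks.back().size = bytes;
        _liveBytes += bytes;
        return _blocks.back().ptr;
    }

private:
    struct Block { void *ptr; std::size_t size; };
    std::size_t &_liveBytes;
    std::vector<Block> _blocks;
};

struct RegionDemoResult {
    std::size_t insideInner;  // live bytes while both regions exist
    std::size_t afterInner;   // live bytes once the nested region is gone
    std::size_t afterOuter;   // live bytes once the outer region is gone
};

RegionDemoResult demoRegions() {
    std::size_t live = 0;
    RegionDemoResult result{};
    {
        DemoRegion outer(live);
        outer.allocate(100);
        {
            DemoRegion inner(outer);   // nested region
            inner.allocate(40);
            result.insideInner = live;
        }                              // inner destroyed: its 40 bytes are freed
        result.afterInner = live;
    }                                  // outer destroyed: its 100 bytes are freed
    result.afterOuter = live;
    return result;
}
```

The sketch also shows why lifetimes matter with nesting: a pointer obtained from inner must not be used once the inner scope has ended.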
TR::StackMemoryRegion
This is used when you want Stack Semantics (Stack Mark / Stack Release); for more details, see this. TR::StackMemoryRegion extends TR::Region, and so has all the same properties. To perform a Stack Mark, simply create a new TR::StackMemoryRegion:
TR::StackMemoryRegion stackMemoryRegion(comp()->trMemory());
A Stack Release occurs automatically when the TR::StackMemoryRegion goes out of scope. TR::StackMemoryRegion exists to aid in creating nested regions, which can be particularly useful if you wish to have a scratch space for memory needed during some optimization. For example:
int32_t TR_AmazingNewOpt::perform()
{
   // Stack Mark
   TR::StackMemoryRegion stackMemoryRegion(comp()->trMemory());
   // Do optimization
   ...
   // Automatic Stack Release
   return ...;
}
All memory allocated during TR_AmazingNewOpt is limited to the optimization. Any data that needs to persist longer than the optimization can just be allocated using the Heap Memory Region.
To get the current Stack Memory Region:
TR::StackMemoryRegion &currentStackRegion = comp()->trMemory()->currentStackRegion();
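Stack Mark / Stack Release Semantics can be modelled in a few lines. The following is a conceptual sketch only, not how TR::StackMemoryRegion is implemented: DemoStackMemory and ScopedMark are hypothetical names, and a single fixed-size buffer replaces the real segment machinery.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stack-style memory: allocation bumps a "top" offset, and
// releasing to a mark discards everything allocated after that mark.
class DemoStackMemory {
public:
    explicit DemoStackMemory(std::size_t capacity) : _buffer(capacity), _top(0) {}

    void *allocate(std::size_t bytes) {
        // Round up so every returned pointer stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        bytes = (bytes + align - 1) / align * align;
        if (_top + bytes > _buffer.size()) throw std::bad_alloc();
        void *result = _buffer.data() + _top;
        _top += bytes;
        return result;
    }
    std::size_t mark() const { return _top; }
    void release(std::size_t mark) { _top = mark; }

private:
    std::vector<unsigned char> _buffer;
    std::size_t _top;
};

// RAII helper: the constructor performs the Stack Mark and the destructor
// performs the Stack Release.
class ScopedMark {
public:
    explicit ScopedMark(DemoStackMemory &memory) : _memory(memory), _mark(memory.mark()) {}
    ScopedMark(const ScopedMark &) = delete;
    ScopedMark &operator=(const ScopedMark &) = delete;
    ~ScopedMark() { _memory.release(_mark); }

private:
    DemoStackMemory &_memory;
    std::size_t _mark;
};

struct StackDemoResult {
    std::size_t duringPass;  // bytes in use while the pass is running
    std::size_t afterPass;   // bytes in use after the automatic release
};

StackDemoResult demoStackRelease() {
    DemoStackMemory memory(4096);
    memory.allocate(64);                   // allocated before the mark, so it survives
    StackDemoResult result{};
    {
        ScopedMark mark(memory);           // Stack Mark
        memory.allocate(128);              // scratch data for the pass
        memory.allocate(256);
        result.duringPass = memory.mark();
    }                                      // Stack Release
    result.afterPass = memory.mark();
    return result;
}
```

Because the release simply resets an offset, freeing everything a pass allocated costs the same regardless of how many allocations it made.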
Allocating Persistent Memory
There are two ways to get an allocator for persistent memory when using a C++ container.
Persistent Allocator
TR::PersistentAllocator should be used when trying to allocate persistent objects whose class does not have TR_ALLOC in its definition. To get an instance of this allocator, do:
TR::PersistentAllocator &allocator = TR::Compiler->persistentAllocator();
This allocator also contains an automatic conversion operator which allows it to be used as a C++ allocator.
Typed Persistent Allocator
TR_TypedPersistentAllocator should be used when trying to allocate persistent objects whose class does have TR_ALLOC or TR_ALLOC_SPECIALIZED in its definition (see this for more details). The difference between the two macros (specifically with respect to TR_TypedPersistentAllocator) is that the former increments the amount of memory allocated in the UnknownType bucket, whereas the latter increments it in the bucket of the type specified by the macro. The reason for this distinction is to provide an easy way of measuring the JIT Shared Library Memory Footprint cost of switching from TR_ALLOC to TR_ALLOC_SPECIALIZED (since the latter defines a template class; take a look at TRMemory.hpp if you’re curious).
TR_TypedPersistentAllocator is a wrapper around TR_PersistentMemory, which enables tracking of persistent allocations per type (the jitPersistentAlloc/jitPersistentFree APIs are also wrappers around TR_PersistentMemory). It also contains an automatic conversion operator which allows it to be used as a C++ allocator. To get an instance, you would do something like:
MyClass::TrackedPersistentAllocator allocator = MyClass::getPersistentAllocator();
Usually you won’t need to explicitly assign the allocator to some variable, eg:
std::list<MyClass, MyClass::TrackedPersistentAllocator> myClassList(MyClass::getPersistentAllocator());
unless you wish to track the memory allocated for something in the same bucket as the object the allocator came from, eg:
MyClass::TrackedPersistentAllocator allocator = MyClass::getPersistentAllocator();
std::list<MyClass, MyClass::TrackedPersistentAllocator> myClassList(allocator);
std::list<MyClassMetadata, MyClass::TrackedPersistentAllocator> myClassMetadata(allocator);
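The effect of sharing one tracking allocator between several containers can be sketched in standalone C++. This is an illustration only and does not reproduce TR_TypedPersistentAllocator: Bucket and TrackedAllocator are hypothetical names, and the single liveBytes counter is a simplified stand-in for a per-type bucket.

```cpp
#include <cstddef>
#include <list>
#include <new>

// Hypothetical accounting bucket: the bytes currently allocated against it.
struct Bucket { std::size_t liveBytes = 0; };

// Minimal C++11 allocator that charges every allocation to a Bucket.
template <typename T>
class TrackedAllocator {
public:
    using value_type = T;

    explicit TrackedAllocator(Bucket &bucket) : _bucket(&bucket) {}
    template <typename U>
    TrackedAllocator(const TrackedAllocator<U> &other) : _bucket(other.bucket()) {}

    T *allocate(std::size_t n) {
        T *p = static_cast<T *>(::operator new(n * sizeof(T)));
        _bucket->liveBytes += n * sizeof(T);
        return p;
    }
    void deallocate(T *p, std::size_t n) {
        _bucket->liveBytes -= n * sizeof(T);
        ::operator delete(p);
    }
    Bucket *bucket() const { return _bucket; }

private:
    Bucket *_bucket;
};

template <typename T, typename U>
bool operator==(const TrackedAllocator<T> &a, const TrackedAllocator<U> &b) { return a.bucket() == b.bucket(); }
template <typename T, typename U>
bool operator!=(const TrackedAllocator<T> &a, const TrackedAllocator<U> &b) { return !(a == b); }

struct TrackingDemoResult {
    bool trackedWhileAlive;  // both lists were charged to the one bucket
    std::size_t liveAfter;   // bytes still charged after both lists are gone
};

TrackingDemoResult demoSharedBucket() {
    Bucket bucket;
    TrackingDemoResult result{};
    {
        TrackedAllocator<int> allocator(bucket);
        // Two containers with different element types share one bucket.
        std::list<int, TrackedAllocator<int>> ids(allocator);
        std::list<double, TrackedAllocator<double>> weights{TrackedAllocator<double>(allocator)};
        ids.push_back(1);
        weights.push_back(2.5);
        result.trackedWhileAlive = bucket.liveBytes >= sizeof(int) + sizeof(double);
    }
    result.liveAfter = bucket.liveBytes;
    return result;
}
```

The exact byte count while the lists are alive depends on the standard library's list node layout, which is why the sketch only checks a lower bound.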
Explicit Allocations
As previously mentioned, explicitly allocating memory is not the preferred approach. Interactions with legacy code might require doing so, but attempts should be made in all new code to use C++ containers, or at the very least, minimize or localize the use of explicit allocations/frees. However, for the sake of completeness, and in order to give a sense of what you might find in older code, this section outlines how to explicitly allocate both Scratch and Persistent Memory.
Scratch Memory
To explicitly allocate memory for an object that has the TR_ALLOC macro specified in its class’ definition, you would do something like:
TR_MyClass *myClass = new (comp()->trHeapMemory()) TR_MyClass(...);
or
TR_MyClass *myClass = new (comp()->trMemory()) TR_MyClass(...);
or
TR_MyClass *myClass = new (comp()->region()) TR_MyClass(...);
or
TR::Region myRegion(comp()->region());
TR_MyClass *myClass = new (myRegion) TR_MyClass(...);
or, for Stack Memory allocations:
TR_MyClass *myClass = new (comp()->trStackMemory()) TR_MyClass(...);
or
TR::StackMemoryRegion stackMemoryRegion(comp()->trMemory());
TR_MyClass *myClass = new (stackMemoryRegion) TR_MyClass(...);
or
TR_MyClass *myClass = new (comp()->trMemory()->currentStackRegion()) TR_MyClass(...);
If the class does not have the TR_ALLOC macro, then you would do something like:
void *storage = comp()->trHeapMemory().allocate(sizeof(TR_MyClass));
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or
void *storage = comp()->trMemory()->allocateHeapMemory(sizeof(TR_MyClass));
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or
void *storage = comp()->trMemory()->allocateMemory(sizeof(TR_MyClass), heapAlloc);
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or
TR::Region myRegion(comp()->region());
void *storage = myRegion.allocate(sizeof(TR_MyClass));
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or, for Stack Memory allocations:
void *storage = comp()->trStackMemory().allocate(sizeof(TR_MyClass));
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or
void *storage = comp()->trMemory()->allocateStackMemory(sizeof(TR_MyClass));
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or
void *storage = comp()->trMemory()->allocateMemory(sizeof(TR_MyClass), stackAlloc);
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or
TR::StackMemoryRegion stackMemoryRegion(comp()->trMemory());
void *storage = stackMemoryRegion.allocate(sizeof(TR_MyClass));
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or
void *storage = comp()->trMemory()->currentStackRegion().allocate(sizeof(TR_MyClass));
TR_MyClass *myClass = new (storage) TR_MyClass(...);
Technically these APIs (both the placement new overrides and the function calls) also take a default parameter of type TR_MemoryBase::ObjectType. However, because currently per-type Scratch Memory allocations are not tracked, the type can be omitted.
Persistent Memory
To explicitly allocate memory for an object that has the TR_ALLOC macro specified in its class’ definition, you would do something like:
TR_MyClass *myClass = new (PERSISTENT_NEW) TR_MyClass(...);
or, within TR_MyClass:
TR_SomeData *someData = (TR_SomeData *)jitPersistentAlloc(sizeof(TR_SomeData));
If the class does not have the TR_ALLOC macro, then you would do something like:
void *storage = comp()->trMemory()->allocateMemory(sizeof(TR_MyClass), persistentAlloc);
TR_MyClass *myClass = new (storage) TR_MyClass(...);
or
void *storage = TR_Memory::jitPersistentAlloc(sizeof(TR_MyClass));
TR_MyClass *myClass = new (storage) TR_MyClass(...);
There exist both TR_Memory and per-class versions of jitPersistentAlloc/jitPersistentFree. The TR_Memory version will increment the amount of memory allocated in the UnknownType bucket, unless the TR_MemoryBase::ObjectType is specified, e.g.:
void *storage = TR_Memory::jitPersistentAlloc(sizeof(TR_MyClass), TR_MemoryBase::ObjectTypeOfMyClass);
TR_MyClass *myClass = new (storage) TR_MyClass(...);
The per-class version will increment the bucket of the type specified in the parameter of TR_ALLOC. Any call to jitPersistentAlloc/jitPersistentFree from within a class that has TR_ALLOC specified in its definition will automatically use the per-class versions.
Conclusion
And that’s it! Hopefully this post helped you better understand memory allocation in the Compiler. There are more advanced topics, such as how to create a TR::Region outside a compilation (for example if you wished to create a region using persistent memory), and what all the other memory structures are for (e.g. TR::SegmentAllocator, TR::SegmentProvider, etc.), but that’s best left for another blog post. If you have any questions, concerns, ideas for improvements, etc., feel free to start a conversation in the OpenJ9 Slack Instance, or via the OpenJ9 Mailing List.