This post describes Relocation; as mentioned in the previous post, it is one of the two actions the JVM must perform to generate and execute AOT code. Because Validations are a bit more involved, we first focus on the task of relocating code before delving into the complexities and subtleties of validations.
What is Relocation?
Relocation is “the process of assigning load addresses to position-dependent code and data of a program and adjusting the code and data to reflect the assigned addresses”1. For example, a linker performs relocation along with symbol resolution.
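To make the definition concrete, here is a minimal sketch of the simplest kind of relocation: rebasing absolute addresses embedded in a block of code. The function name, the 8-byte slot size, and the `oldBase`/`newBase` parameters are assumptions made for illustration; none of them come from OpenJ9.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: every 8-byte slot listed in `offsets` holds an absolute
// address that was computed against `oldBase`. Rebase each one onto `newBase`.
void relocateAbsoluteSlots(std::vector<std::uint8_t> &code,
                           const std::vector<std::size_t> &offsets,
                           std::uint64_t oldBase,
                           std::uint64_t newBase)
   {
   for (std::size_t offset : offsets)
      {
      assert(offset + sizeof(std::uint64_t) <= code.size());

      std::uint64_t value;
      std::memcpy(&value, code.data() + offset, sizeof(value)); // alignment-safe read
      value = value - oldBase + newBase;                        // adjust to the assigned load address
      std::memcpy(code.data() + offset, &value, sizeof(value)); // write the adjusted address back
      }
   }
```

The list of offsets plays the role that relocation information plays for a linker: it records which locations in the code are position-dependent.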
How to Relocate?
In OpenJ9, a specific relocation is described by a Relocation Record. There are several types of relocations, and so there are several relocation records. The general process of relocation is as follows:
AOT Compilation
- During compilation, the compiler generates External Relocations (TR::ExternalRelocation)2. These contain information that the AOT infrastructure uses to generate the Relocation Records. In general, the compiler creates these via the TR::CodeGenerator::addExternalRelocation API.
- After code binary encoding, J9::CodeGenerator::processRelocations is called, which calls TR::ExternalRelocation::addExternalRelocation on each of the External Relocations. This effectively groups similar External Relocations into Iterated External Relocations (TR::IteratedExternalRelocation)3.
- TR::initializeAOTRelocationHeader is called, which allocates and writes into the buffer the header information for each of the Iterated External Relocations. The header provides the information needed to materialize the value used to relocate a location in the code in the subsequent AOT load run.
- TR::ExternalRelocation::apply is called on each of the External Relocations. This stores into the buffer the offsets (into the AOT code) of the various locations that need to be updated with the same value. This ensures that if there are multiple locations that have to be updated with the same value, they are described by only one Relocation Record.
- Finally, the buffer containing the Relocation Records, along with the AOT code, is written out to the shared class cache (SCC).
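The grouping and serialization steps above can be sketched as follows. This is only an illustration of the idea (one header per group of similar relocations, followed by all of the offsets that share it). The `ExternalReloc` struct, the `buildRelocationRecords` function, and the byte layout are hypothetical and do not reflect OpenJ9's actual record format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <utility>
#include <vector>

struct ExternalReloc
   {
   std::uint8_t  kind;       // hypothetical relocation type tag
   std::uint64_t target;     // compile-time value the code refers to
   std::uint32_t codeOffset; // where in the AOT code that value is embedded
   };

template <typename T>
void appendRaw(std::vector<std::uint8_t> &buffer, T value)
   {
   std::uint8_t bytes[sizeof(T)];
   std::memcpy(bytes, &value, sizeof(T));
   buffer.insert(buffer.end(), bytes, bytes + sizeof(T));
   }

// Record layout used by this sketch only:
//   u16 total record size | u8 kind | u64 target | u32 offsets...
std::vector<std::uint8_t> buildRelocationRecords(const std::vector<ExternalReloc> &relocs)
   {
   // Step 1: group "similar" relocations (same kind and value, different location).
   std::map<std::pair<std::uint8_t, std::uint64_t>, std::vector<std::uint32_t>> groups;
   for (const ExternalReloc &reloc : relocs)
      groups[std::make_pair(reloc.kind, reloc.target)].push_back(reloc.codeOffset);

   std::vector<std::uint8_t> buffer;
   for (const auto &group : groups)
      {
      const std::size_t headerSize = sizeof(std::uint16_t) + sizeof(std::uint8_t) + sizeof(std::uint64_t);
      const std::size_t recordSize = headerSize + group.second.size() * sizeof(std::uint32_t);
      assert(recordSize <= UINT16_MAX);

      // Step 2: write the header once per group.
      appendRaw<std::uint16_t>(buffer, static_cast<std::uint16_t>(recordSize));
      appendRaw<std::uint8_t>(buffer, group.first.first);
      appendRaw<std::uint64_t>(buffer, group.first.second);

      // Step 3: write every code offset that shares this header.
      for (std::uint32_t offset : group.second)
         appendRaw<std::uint32_t>(buffer, offset);
      }
   return buffer; // in the real flow, this is what would be stored alongside the AOT code
   }
```

With three relocations of which two share the same kind and target, this sketch emits two records rather than three, which is the space saving the grouping step is after.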
AOT Load
After validation, the AOT infrastructure goes through the buffer of Relocation Records loaded from the SCC. For each record:
- preparePrivateData is called, which uses the binary templates to obtain the data from the buffer and caches it
- applyRelocationAtAllOffsets is called, which does the work of relocating all the necessary locations
Depending on the type of relocation, the work done by applyRelocationAtAllOffsets can range from simply computing the new pointer that is valid in the current address space and updating the location, to adding runtime assumptions and/or patching guards if the assumptions are no longer valid.
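The two-phase shape of the load-time loop can be sketched like this. The sketch covers only the simplest case (materialize a pointer once, then write it at every offset). The `LoadedRecord` and `PrivateData` structs, the function names, and the idea of resolving a symbol ID through a map are assumptions for illustration, not OpenJ9's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical, already-decoded record: one header value plus the code offsets that share it.
struct LoadedRecord
   {
   std::uint32_t symbolId;             // what the header says the code refers to
   std::vector<std::uint32_t> offsets; // locations in the code to patch
   };

struct PrivateData
   {
   std::uint64_t runtimeAddress; // computed once per record, reused for every offset
   };

// Phase 1: do the (potentially expensive) lookup once and cache the result.
PrivateData preparePrivateDataSketch(const LoadedRecord &record,
                                     const std::map<std::uint32_t, std::uint64_t> &runtimeAddresses)
   {
   return PrivateData{runtimeAddresses.at(record.symbolId)}; // throws if the symbol is unknown
   }

// Phase 2: patch every location described by the record with the cached value.
void applyAtAllOffsetsSketch(const LoadedRecord &record,
                             const PrivateData &data,
                             std::vector<std::uint8_t> &code)
   {
   for (std::uint32_t offset : record.offsets)
      {
      assert(offset + sizeof(data.runtimeAddress) <= code.size());
      std::memcpy(code.data() + offset, &data.runtimeAddress, sizeof(data.runtimeAddress));
      }
   }

void relocateLoadedCode(const std::vector<LoadedRecord> &records,
                        const std::map<std::uint32_t, std::uint64_t> &runtimeAddresses,
                        std::vector<std::uint8_t> &code)
   {
   for (const LoadedRecord &record : records)
      {
      PrivateData data = preparePrivateDataSketch(record, runtimeAddresses);
      applyAtAllOffsetsSketch(record, data, code);
      }
   }
```

Splitting the work this way means the lookup cost is paid once per record rather than once per patched location.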
Three Structures?
A Relocation Record consists of a header as well as offsets into the code that requires the relocation. In OpenJ9, Relocation Records are described by three data structures:
- TR_RelocationRecord: This is used to perform the relocation
- TR_RelocationRecordBinaryTemplate: This describes the structure of the header of the Relocation Record
- TR_RelocationRecordPrivateData: This is used to cache data that is required when relocating multiple locations
There is a reason why there are three structures to describe the Relocation Record. TR_RelocationRecord has many child classes that override its APIs. Thus, the specific child of TR_RelocationRecord that needs to be instantiated depends on the type of the Relocation Record. Additionally, to allow for the potential of cross compilation, for example with JIT-as-a-Service, it is better to access the data via an API that can handle endianness and other platform-specific subtleties.
Thus, in TR_RelocationRecordGroup::applyRelocations, TR_RelocationRecord is instantiated on the stack (to prevent fragmentation caused by several small dynamic allocations) and uses APIs along with the binary template to access the data. TR_RelocationRecordPrivateData exists because, in order to allow TR_RelocationRecord or its child classes to be instantiated on the stack, all of these classes have to be the same size. However, some relocations require data that has to be queried (either from the Relocation Records loaded from the SCC or from the environment). To avoid repeating that computation, TR_RelocationRecordPrivateData serves, in essence, as the private member variables of these classes.
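The same-size constraint can be illustrated with a small sketch. All names here (`RecordSketch`, `PointerRecordSketch`, `PrivateDataSketch`, the one-byte type tag) are hypothetical, and the sketch uses a raw, base-sized stack buffer with placement new as one way to get the effect described above. It is not a copy of how OpenJ9 does it.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Shared scratch space standing in for the private data: one variant per record kind.
union PrivateDataSketch
   {
   struct { std::uint64_t resolvedAddress; } pointer;
   struct { std::uint64_t classAddress; bool needsGuardPatch; } klass;
   };

class RecordSketch
   {
public:
   explicit RecordSketch(const std::uint8_t *header) : _header(header) {}
   virtual ~RecordSketch() {}
   virtual void preparePrivateData(PrivateDataSketch &) const {}

   // Picks the child class from the type tag and constructs it in caller-provided storage.
   static RecordSketch *create(void *storage, const std::uint8_t *header);

protected:
   const std::uint8_t *_header; // points at the binary template in the loaded buffer
   };

class PointerRecordSketch : public RecordSketch
   {
public:
   explicit PointerRecordSketch(const std::uint8_t *header) : RecordSketch(header) {}

   void preparePrivateData(PrivateDataSketch &data) const override
      {
      // Per-kind state goes into the shared private data, not into new member fields.
      data.pointer.resolvedAddress = 0x1000u + _header[1];
      }
   };

// Children may not add members, otherwise they would not fit in the base-sized slot.
static_assert(sizeof(PointerRecordSketch) == sizeof(RecordSketch),
              "child classes must be the same size as the base class");

RecordSketch *RecordSketch::create(void *storage, const std::uint8_t *header)
   {
   switch (header[0]) // first header byte is the (hypothetical) type tag
      {
      case 1:  return new (storage) PointerRecordSketch(header);
      default: return nullptr;
      }
   }
```

A caller would reserve `alignas(RecordSketch) unsigned char storage[sizeof(RecordSketch)]` on the stack, pass it to `create`, and invoke the destructor explicitly when done. The `static_assert` is what makes the design constraint visible: any child that needs extra state has to keep it in the private data instead of in its own fields.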
Conclusion
Hopefully this gives you a general understanding of the relocation process. The next post will cover Validations, and all of the complexities therein.
1. https://en.wikipedia.org/wiki/Relocation_(computing)
2. The term External Relocation exists because the TR::ExternalRelocation class inherits from TR::Relocation, which the compiler uses for labels.
3. Similar here means that the location is different, but the value that gets applied is the same