OpenJ9 CRIU Support: A look under the hood

A recent blog post on OpenJ9 CRIU support introduced the motivations and concept of this new feature. In this blog post, we delve deeper into the support built for this feature in the OpenJ9 JVM and JCL (Java Class Library) components.

You may be wondering what the difference is between checkpointing a container running a Java application using CRIU directly and the behavior one gets when using the OpenJ9 CRIU support. The biggest difference is that OpenJ9 CRIU support allows the application and the JVM to cooperate in increasing the likelihood of a successful restore without unexpected problems. This cooperation takes many forms (all of which we will explore in more depth):

The org.eclipse.openj9.criu.CRIUSupport.checkpointJVM() API that allows a Java application to request a CRIU checkpoint programmatically at a program point of choice.
A hook method architecture/API that allows a user to register methods to be run before taking a checkpoint and upon restore, along with the order in which they ought to run.
Compensation in common JCL methods that allows, for example, time related classes, Random, etc. to account for differences between the checkpoint and restore environments.
A distinct approach in how Java security components are managed to avoid embedding any sensitive information in CRIU checkpoint image files.
Changes to the JIT to generate code that is portable enough to be executed on any architecture version that the container image with the CRIU checkpoint gets deployed to (as it may include some JIT compiled code).

For details on the hook method architecture and compensation, please see OpenJ9 CRIU Support: A look under the hood (part II). The rest of this blog post goes over the checkpointJVM API, Java security components considerations, and JIT compiled code considerations.

The `checkpointJVM()` API

org.eclipse.openj9.criu.CRIUSupport is a Java class that provides APIs such as checkpointJVM() and helpers such as setWorkDir(), setLeaveRunning(), setShellJob(), and so on. The helper methods allow a user to specify the arguments that are passed to the native CRIU API criu_dump() and to the associated setup routines such as criu_init_opts(). Examples of the kinds of arguments that can be specified include the directory where the checkpoint files are to be stored, whether the Java process that calls the API should be left running or be forced to exit, whether an “unprivileged” CRIU checkpoint is to be attempted or not, etc. Please see CRIUSupport.java and criusupport.cpp for a more complete list of parameters that are currently supported by our Java API, and the CRIU CLI documentation for more details on how the different parameters affect CRIU behavior. If your use case requires more parameters to be handled, please let us know.

Before taking a checkpoint, one should check if CRIU is enabled on the machine with CRIUSupport.isCRIUSupportEnabled(). This call will return true if the -XX:+EnableCRIUSupport JVM option was provided (as CRIU Support is not enabled by default) and if the JVM was able to load libcriu.so. If CRIUSupport.isCRIUSupportEnabled() returns false, then CRIUSupport.getErrorMessage() will indicate what the problem was. In addition, CRIUSupport.isCheckpointAllowed() can be used to check if the VM allows a checkpoint to be taken currently (also invoked internally by checkpointJVM()). By default, the CRIU enabled VM runs with the -XX:+CRIURestoreNonPortableMode option, meaning that only one checkpoint can be taken (more on this in the portable JIT section) but multiple checkpoints may be supported in the future by toggling that option.
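As a rough illustration of the checks described above, the sketch below probes for the two static methods by reflection so that it also compiles and runs on a JVM that does not ship the CRIUSupport class. The class name CriuProbe and its helper methods are made up for this example. On an OpenJ9 build with CRIU support, one would more naturally call CRIUSupport.isCRIUSupportEnabled() and CRIUSupport.isCheckpointAllowed() directly.

```java
import java.lang.reflect.Method;

public class CriuProbe {
    static final String CRIU_SUPPORT = "org.eclipse.openj9.criu.CRIUSupport";

    /** Calls a no-argument static boolean method reflectively; returns false if it is unavailable. */
    public static boolean callStaticBoolean(String className, String methodName) {
        try {
            Method m = Class.forName(className).getMethod(methodName);
            return Boolean.TRUE.equals(m.invoke(null));
        } catch (ReflectiveOperationException | LinkageError e) {
            // Class or method missing (e.g. not an OpenJ9 JVM), or the call itself failed.
            return false;
        }
    }

    /** Summarizes whether a checkpoint could be requested right now. */
    public static String describe() {
        if (!callStaticBoolean(CRIU_SUPPORT, "isCRIUSupportEnabled")) {
            // On OpenJ9, CRIUSupport.getErrorMessage() would explain why.
            return "CRIU support is not enabled";
        }
        if (!callStaticBoolean(CRIU_SUPPORT, "isCheckpointAllowed")) {
            return "CRIU support is enabled, but a checkpoint is not allowed right now";
        }
        return "a checkpoint is allowed";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this with and without -XX:+EnableCRIUSupport on an OpenJ9 JVM should show the difference between the first two outcomes.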

There are actually many useful tasks triggered by this API aside from invoking the native CRIU APIs. One such task is for the JVM to check if we can take a checkpoint at the point where the API is called, by checking if the @NotCheckpointSafe annotation is present on any of the methods present on the stacks of the executing threads that are still running Java code (i.e., not in JNI native methods).
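To make the annotation check more concrete, here is a small user-level analogy. It is not how the JVM implements the check: the annotation UnsafeForCheckpoint is a hypothetical stand-in for @NotCheckpointSafe, only the current thread's stack is inspected, and methods are matched by name only (so overloads are not distinguished).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class CheckpointSafetyDemo {
    /** Hypothetical stand-in for OpenJ9's @NotCheckpointSafe annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface UnsafeForCheckpoint {}

    /** Returns true if no method on the current thread's stack is marked unsafe. */
    public static boolean checkpointAllowedHere() {
        return StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE)
                .walk(frames -> frames.noneMatch(CheckpointSafetyDemo::isMarked));
    }

    private static boolean isMarked(StackWalker.StackFrame frame) {
        try {
            for (Method m : frame.getDeclaringClass().getDeclaredMethods()) {
                if (m.getName().equals(frame.getMethodName())
                        && m.isAnnotationPresent(UnsafeForCheckpoint.class)) {
                    return true;
                }
            }
        } catch (LinkageError e) {
            // The class could not be inspected; this sketch treats the frame as unmarked.
        }
        return false;
    }

    @UnsafeForCheckpoint
    public static boolean fromUnsafeMethod() {
        return checkpointAllowedHere(); // the marked frame is on the stack
    }

    public static boolean fromSafeMethod() {
        return checkpointAllowedHere(); // no marked frame on the stack
    }

    public static void main(String[] args) {
        System.out.println("from safe method:   " + fromSafeMethod());
        System.out.println("from unsafe method: " + fromUnsafeMethod());
    }
}
```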

This API also manages a new notion that we introduced for CRIU support, i.e., a “single threaded mode of execution” that the JVM enters before calling the CRIU native API. In the “single threaded mode”, all Java threads other than the one calling into the API to create the checkpoint are halted, to allow the JVM to take certain actions before the checkpoint and after the restore. These are actions that would be hard (bordering on impractical) to do with other threads changing the state of the JVM and the Java heap. For example, one such action involves updating instances of certain types (e.g. java.util.Random) on the Java heap to compensate in different ways when restoring from a given checkpoint multiple times and this would be hard to do if another Java thread can run concurrently and create instances of those types (see OpenJ9 CRIU Support: A look under the hood (part II) for more details on CRIU related compensations).

The notion of a single threaded mode introduces subtleties around wait/notify and locking. The wait/notify idiom is commonly used by application code to coordinate activities across multiple threads, e.g., a consumer thread could wait on a lock and be notified by the producer thread when new work arrives. This idiom could run into problems if the notification is sent by a thread in application code that is running when the JVM is in single threaded mode. In such cases, the notification would be sent to another (halted) thread and progress in the application as a whole may depend on that thread acting on the notification (impossible in single threaded mode). In order to avoid such deadlocks, OpenJ9 changes the locking sequences in the JVM to raise a RestoreException if the lone active thread in single threaded mode attempts to acquire a lock owned by some other thread.
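The fail-fast idea can be sketched at the library level with java.util.concurrent locks. This is only an analogy under stated assumptions: the real change lives inside the JVM's monitor implementation, and the names SingleThreadedLockDemo, LockOwnedByHaltedThreadException, and the singleThreadedMode flag are invented for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class SingleThreadedLockDemo {
    /** Hypothetical stand-in for the exception the post refers to as RestoreException. */
    public static class LockOwnedByHaltedThreadException extends RuntimeException {
        public LockOwnedByHaltedThreadException(String message) { super(message); }
    }

    private static volatile boolean singleThreadedMode;

    /** Blocks as usual in normal mode; in single threaded mode it fails fast instead of waiting forever. */
    public static void acquire(ReentrantLock lock) {
        if (!singleThreadedMode) {
            lock.lock();
        } else if (!lock.tryLock()) {
            throw new LockOwnedByHaltedThreadException("lock is owned by a halted thread");
        }
    }

    /** Returns true if acquiring a lock held by a "halted" thread raised the exception. */
    public static boolean contendedAcquireFailsFast() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread owner = new Thread(() -> {
            lock.lock();
            try {
                lockHeld.countDown();
                release.await(); // simulates a thread halted while owning the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        owner.setDaemon(true);
        owner.start();
        try {
            lockHeld.await();
            singleThreadedMode = true;
            acquire(lock);
            lock.unlock(); // not expected to be reached
            return false;
        } catch (LockOwnedByHaltedThreadException expected) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            singleThreadedMode = false;
            release.countDown();
        }
    }

    public static void main(String[] args) {
        System.out.println("failed fast: " + contendedAcquireFailsFast());
    }
}
```

Raising an exception turns what would otherwise be a silent hang into a visible error that a hook author can act on.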

Furthermore, OpenJ9 also adds a novel notion of “delayed notifications” to ensure that notify calls that occur during single threaded mode do not risk a deadlock. This is required because of a subtle issue, namely that a thread that gets notified becomes the owner of the lock that it was waiting on. Normally, this would not be an issue since the threads that send and receive the notifications can run concurrently. However, in single threaded mode, the thread receiving the notification won’t be eligible to run and if it gains ownership of the lock in question, this risks a RestoreException happening later if the lone active thread attempts to acquire the same lock (owned by the notified thread) while the JVM is still in single threaded mode. OpenJ9 “delays” notifications by queueing them up while in single threaded mode and only delivers those notifications (thus transferring lock ownership) when the JVM exits single threaded mode, thus reducing the likelihood of a RestoreException. A method can be marked with the @NotCheckpointSafe annotation if it is known to take some actions that would complicate the task of taking a checkpoint (e.g., certain patterns with locking or notify calls) if that method was running.
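The queueing behavior described above can be approximated in plain Java as follows. This is a minimal sketch, not OpenJ9's implementation: DelayedNotifier and its methods are hypothetical names, and the real mechanism operates on JVM monitors rather than on a user-level wrapper.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DelayedNotifier {
    private final Deque<Object> pending = new ArrayDeque<>();
    private boolean singleThreadedMode;

    public synchronized void enterSingleThreadedMode() {
        singleThreadedMode = true;
    }

    /** Notifies waiters immediately in normal mode; queues the request in single threaded mode. */
    public synchronized void notifyAllOn(Object monitor) {
        if (singleThreadedMode) {
            pending.addLast(monitor); // no waiter is woken, so no lock ownership is handed over yet
        } else {
            deliver(monitor);
        }
    }

    /** Leaves single threaded mode and delivers queued notifications in order; returns how many. */
    public synchronized int exitSingleThreadedMode() {
        singleThreadedMode = false;
        int delivered = 0;
        while (!pending.isEmpty()) {
            deliver(pending.removeFirst());
            delivered++;
        }
        return delivered;
    }

    public synchronized int pendingCount() {
        return pending.size();
    }

    private static void deliver(Object monitor) {
        synchronized (monitor) {
            monitor.notifyAll();
        }
    }

    public static void main(String[] args) {
        DelayedNotifier notifier = new DelayedNotifier();
        Object workAvailable = new Object();
        notifier.enterSingleThreadedMode();
        notifier.notifyAllOn(workAvailable);
        System.out.println("queued: " + notifier.pendingCount());
        System.out.println("delivered on exit: " + notifier.exitSingleThreadedMode());
    }
}
```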

Finally, another interesting action taken by the checkpointJVM() API is calling the GC to compact the Java heap before the checkpoint gets taken. Since CRIU stores all the memory pages in use in the checkpoint that it creates, compacting the Java heap reduces the disk footprint of the checkpoint files.

“Restricted” Java Security Components

In normal JVM mode, Java security providers/components as specified in the security properties file get initialized during startup. This means that these components are available for use by any application code that needs to use any Java security functionality. However, this eager initialization of Java security components means that there is some “execution state” related to those components that is already part of the JVM process memory. A CRIU checkpoint contains information about all the mapped memory pages in use by the process, and so this Java security state would also be available in it (and by extension, in a container image that shipped with the CRIU checkpoint). This represents a potential security risk if (for example) the container images for an application are made available publicly. We considered a couple of alternatives to address these kinds of security concerns.

The first approach involved locating all the Java security state that is relevant/represents the risk and cleansing that memory before taking the checkpoint. The biggest issue with this approach is that it wasn’t clear to us (even as JDK developers) how we would go about locating and cleansing all the relevant state in the complex Java security components (not to mention also figuring out how to repopulate this state on restore).

Instead, we went with a simpler approach wherein we introduced a new “restrictive mode” for how Java security components get initialized before the checkpoint is taken. In particular, we disable a vast majority of the Java security functionality and disable all native exploitation (either via the JIT compiler or via use of native packages such as OpenSSL) before the checkpoint is taken. If the Java application attempts to use any Java security functionality outside the “restricted set” of operations permitted before the checkpoint, an exception is raised thus making it clear that this operation is not supported by OpenJ9 before a CRIU checkpoint is taken. Since the default “restricted set” is quite small, OpenJ9 cleanses the Java security state before the checkpoint operation, and upon restore, a fresh set of Java security providers/components are initialized as specified in the security properties file since there is no security risk in doing so.
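Conceptually, the restricted mode behaves like an allow-list that is lifted on restore. The sketch below illustrates only that idea. RestrictedSecurityGate is a hypothetical class, and the single allowed entry is a made-up placeholder, since the actual restricted set is not listed here.

```java
import java.util.Set;

public class RestrictedSecurityGate {
    // Placeholder allow-list for illustration only; not the real restricted set.
    private static final Set<String> ALLOWED_BEFORE_CHECKPOINT = Set.of("SHA-256");

    private boolean restored;

    /** Called once the process has been restored; the full set of operations becomes available. */
    public void markRestored() {
        restored = true;
    }

    /** Returns the algorithm name if it may be used now, otherwise raises an exception. */
    public String request(String algorithm) {
        if (!restored && !ALLOWED_BEFORE_CHECKPOINT.contains(algorithm)) {
            throw new UnsupportedOperationException(
                    algorithm + " is not available before the checkpoint is taken");
        }
        return algorithm;
    }

    public static void main(String[] args) {
        RestrictedSecurityGate gate = new RestrictedSecurityGate();
        System.out.println(gate.request("SHA-256"));
        gate.markRestored();
        System.out.println(gate.request("AES"));
    }
}
```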

Portable JIT Compiled Code

Normally, a JIT compiler is expected to exploit the hardware on the system that it is running on since the code is generated on the fly with awareness of the underlying machine’s capabilities. However, this can cause issues when taking CRIU checkpoints on one machine (containing JIT compiled code that exploited the hardware on which we took the checkpoint) and restoring from it on another machine. In particular, the JIT compiled code in the CRIU snapshot could crash when run after restoring on a machine if the machine lacks a hardware feature the code tries to exploit.

The solution we went with was to generate only “portable” JIT compiled code before the checkpoint is taken. Portability here means assuming that only a minimum set of features is available on the hardware, thus preventing exploitations in JIT compiled code that could cause problems when run on older generations of hardware. OpenJ9 already has a notion of portable AOT compiled code that is used for embedding AOT code in container images that ship with a shared classes cache. We simply extended the same concept to JIT compiled code when we detect that we are in a mode where a CRIU checkpoint could be generated. Note that once a restore occurs, the JIT compiler can go back to platform exploitation by default (i.e., it no longer generates portable code) since it is running on the restore machine with the expectation that the hardware capabilities won’t change in the future.
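The selection logic can be pictured as intersecting the host's features with a portable baseline while a checkpoint is still possible. The following is a toy model rather than the JIT's actual code: the feature names, the baseline, and the class PortableFeatureSelection are all assumptions chosen for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

public class PortableFeatureSelection {
    /** Illustrative feature names; the real JIT tracks its own processor feature set. */
    public enum CpuFeature { SSE2, SSE4_2, AVX2, AVX512F }

    // Assumed minimum feature set for this sketch.
    static final Set<CpuFeature> PORTABLE_BASELINE = EnumSet.of(CpuFeature.SSE2);

    /**
     * Returns the features that generated code may rely on. While a checkpoint could
     * still be taken, only baseline features are used; after restore, all host features are.
     */
    public static Set<CpuFeature> featuresForCodegen(Set<CpuFeature> host,
                                                     boolean checkpointStillPossible) {
        EnumSet<CpuFeature> usable = EnumSet.noneOf(CpuFeature.class);
        usable.addAll(host);
        if (checkpointStillPossible) {
            usable.retainAll(PORTABLE_BASELINE);
        }
        return usable;
    }

    public static void main(String[] args) {
        Set<CpuFeature> host = EnumSet.of(CpuFeature.SSE2, CpuFeature.AVX2);
        System.out.println("before checkpoint: " + featuresForCodegen(host, true));
        System.out.println("after restore:     " + featuresForCodegen(host, false));
    }
}
```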

This behaviour is enabled by the -XX:+CRIURestoreNonPortableMode option, which is turned on by default. In order to generate portable JIT compiled code even after the restore, one simply provides the -XX:-CRIURestoreNonPortableMode option, but we feel this is unlikely to be used unless there is an intent to generate a subsequent CRIU checkpoint after the process is restored once. At runtime, if one needs to determine whether a checkpoint can be taken, the CRIUSupport.isCheckpointAllowed() API returns true if it is possible. This can be used in cases where multiple checkpoints may be desired. Otherwise, if CRIUSupport.isCRIUSupportEnabled() returns true, the ability to take one checkpoint is guaranteed.

Conclusion

I hope this blog post provided some useful technical details about the different aspects of OpenJ9’s CRIU support. In the future, there are a host of areas that we intend to work on (perhaps a good topic for another blog post!) to improve the OpenJ9 CRIU support to make it even easier to run your applications successfully. We encourage you to try out this exciting new capability in OpenJ9 to obtain fast startup times using CRIU for your Java applications and if you have any questions or wish to share your experience, please contact us and we would love to hear from you!
