OpenJ9 CRIU Support: A look under the hood (part II)

Introduction

In this blog post, we continue to look under the hood of OpenJ9’s CRIU support by examining some ways in which a Java application and the JVM can cooperate to ensure a successful restore with expected behavior. This is a continuation of OpenJ9 CRIU Support: A look under the hood where we started to get into the implementation details of OpenJ9’s CRIU support. Here, we discuss the Hook method architecture/API that we introduced as part of the CRIU support as well as several compensations that we needed to add to class library methods in order to preserve expected behaviors for Java applications. 

The Hook method architecture/API allows a user to register the sequence in which (hook) methods ought to be run before taking a checkpoint and upon restore. The notion of hook methods is not new or specific to the CRIU support in OpenJ9; it refers to the general notion of callback functions that can be invoked in response to specific events. For the CRIU support feature, there are hooks that are implemented internally in the JVM, and there is also an API provided to the Java application to register hook methods to be run at specific program points. In particular, the two points of interest are a) pre-checkpoint and b) post-restore, i.e., the points at which actions must be taken to facilitate a successful checkpoint and to get the expected outcome when one attempts a CRIU restore. The APIs to register hook methods, named registerPreCheckpointHook(…) and registerPostRestoreHook(…), take in a Runnable argument that can be passed in by application code (see code here).

One interesting notion is the “priority” that can be specified for each hook as it is registered; higher-priority hooks run last pre-checkpoint and first post-restore. Hooks added by the JVM have higher priority than hooks added by the application, so JVM hooks run after all the application-level hooks on the checkpoint side and before them on the restore side. A future feature might allow the application to specify its priority and ordering amongst the various Java hooks that it registers (the higher the priority passed in, the earlier it gets run). The Java application hooks run only when a single application thread is active, which gives a notion similar to “hooks that ought to run in single-threaded mode” (except that this has no control over JVM threads such as those for GC and JIT compilation).
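The ordering rule can be illustrated with a small self-contained model. This is only a sketch: HookOrderingSketch, its two-argument register methods and the priority values 0 and 100 are hypothetical names chosen for illustration, not the OpenJ9 API. It shows higher-priority (JVM) hooks running last before the checkpoint and first after the restore.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the ordering rule only; this is NOT the OpenJ9 API.
class HookOrderingSketch {
    // A hook is a Runnable tagged with a priority.
    private static final class Hook {
        final int priority;
        final Runnable action;

        Hook(int priority, Runnable action) {
            this.priority = priority;
            this.action = action;
        }
    }

    private final List<Hook> preCheckpoint = new ArrayList<>();
    private final List<Hook> postRestore = new ArrayList<>();

    void registerPreCheckpointHook(int priority, Runnable action) {
        preCheckpoint.add(new Hook(priority, action));
    }

    void registerPostRestoreHook(int priority, Runnable action) {
        postRestore.add(new Hook(priority, action));
    }

    // Before the checkpoint: lowest priority first, so the highest priority runs last.
    void runPreCheckpointHooks() {
        List<Hook> ordered = new ArrayList<>(preCheckpoint);
        ordered.sort((a, b) -> Integer.compare(a.priority, b.priority));
        for (Hook h : ordered) {
            h.action.run();
        }
    }

    // After the restore: highest priority first.
    void runPostRestoreHooks() {
        List<Hook> ordered = new ArrayList<>(postRestore);
        ordered.sort((a, b) -> Integer.compare(b.priority, a.priority));
        for (Hook h : ordered) {
            h.action.run();
        }
    }

    // Registers one "application" and one "JVM" hook per phase and returns the run order.
    static List<String> demo() {
        int appPriority = 0;   // assumed value, for illustration only
        int jvmPriority = 100; // assumed value, for illustration only
        List<String> log = new ArrayList<>();
        HookOrderingSketch hooks = new HookOrderingSketch();
        hooks.registerPreCheckpointHook(jvmPriority, () -> log.add("jvm pre-checkpoint"));
        hooks.registerPreCheckpointHook(appPriority, () -> log.add("app pre-checkpoint"));
        hooks.registerPostRestoreHook(appPriority, () -> log.add("app post-restore"));
        hooks.registerPostRestoreHook(jvmPriority, () -> log.add("jvm post-restore"));
        hooks.runPreCheckpointHooks();
        // ... the checkpoint and the later restore would happen here ...
        hooks.runPostRestoreHooks();
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this model the application hook runs before the JVM hook on the checkpoint side and after it on the restore side, regardless of registration order.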

Compensation

At a conceptual level, the notion of checkpointing a Java application process and restoring it means that there are two “environments” in the picture. In fact, it is expected that the “checkpoint environment” is different from the “restore environment” for the simple reason that we believe CRIU checkpoints would likely be generated in an environment where application container images are being built, whereas the restore environment is probably the one where the application container images get deployed. This means the JVM has to contend with two kinds of complications that did not exist when running in conventional JVM mode: differences between the checkpoint and restore environments, and behaviors that may be undesirable if we naively restore repeatedly from the same CRIU checkpoint. OpenJ9 performs compensations for several such environmental differences, and we will next elaborate on the areas that we have considered at the time of writing this blog post (with a clear expectation that we will add more compensations in the future). 

Time APIs

Java’s critical time APIs are System.currentTimeMillis() and System.nanoTime() but they differ in that the former has a notion of absolute time that has passed since the start of the epoch (Jan 1, 1970 UTC) whereas the latter can be used to measure elapsed time and is not related to any other notion of absolute time. OpenJ9 adjusts the elapsed time returned by System.nanoTime() to subtract out the time taken between when the checkpoint was generated and when the restore was done (since the process was essentially paused in that time period). This allows application code that depends on timers, etc., to function as expected, rather than experience the consequences of the unpredictable (and potentially large) amount of time that may pass in deploying a container with a CRIU checkpoint in it. Note that the result of System.currentTimeMillis() cannot be adjusted in the same way because it represents a more absolute notion of time. Uses of System.currentTimeMillis() were modified to use System.nanoTime() in several cases in the Java class library (JCL) to minimize unexpected behaviour as part of the CRIU support. Usage of System.currentTimeMillis() to measure time in a relative context, e.g., by calling the method more than once and subtracting, should work fine in application code if these calls do not straddle the boundary between taking a checkpoint and doing the restore; otherwise, it may be better to change such application code to use System.nanoTime(). Several other APIs in the JCL have a notion of elapsed time, e.g., Object.wait(), Thread.sleep(), Unsafe.park(), and these have also been compensated in a similar manner to System.nanoTime().
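The System.nanoTime() adjustment can be pictured with a toy clock. This is a minimal sketch under stated assumptions: CompensatedClockSketch and its method names are hypothetical, the real compensation happens inside the JVM, and the fake clock assumes the raw time source keeps counting across the frozen interval.

```java
import java.util.function.LongSupplier;

// Hypothetical model of the compensation idea; the real adjustment is done inside the JVM.
class CompensatedClockSketch {
    private final LongSupplier rawNanos; // underlying monotonic clock
    private long checkpointNanos;        // raw reading taken just before the checkpoint
    private long pausedNanos;            // total time spent frozen so far

    CompensatedClockSketch(LongSupplier rawNanos) {
        this.rawNanos = rawNanos;
    }

    // Would be called from a pre-checkpoint hook.
    void beforeCheckpoint() {
        checkpointNanos = rawNanos.getAsLong();
    }

    // Would be called from a post-restore hook.
    void afterRestore() {
        pausedNanos += rawNanos.getAsLong() - checkpointNanos;
    }

    // Elapsed-time reading with the frozen interval subtracted out.
    long nanoTime() {
        return rawNanos.getAsLong() - pausedNanos;
    }

    // Simulates a 60 s gap between checkpoint and restore using a fake clock.
    // Returns {compensated elapsed ns, uncompensated elapsed ns}.
    static long[] demo() {
        long[] fake = {0L};
        CompensatedClockSketch clock = new CompensatedClockSketch(() -> fake[0]);
        long start = clock.nanoTime();
        fake[0] += 2_000_000_000L;   // 2 s of work before the checkpoint
        clock.beforeCheckpoint();
        fake[0] += 60_000_000_000L;  // 60 s frozen in the checkpoint image
        clock.afterRestore();
        fake[0] += 3_000_000_000L;   // 3 s of work after the restore
        long compensated = clock.nanoTime() - start;
        long uncompensated = fake[0] - start;
        return new long[] {compensated, uncompensated};
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println("compensated elapsed ns:   " + r[0]);
        System.out.println("uncompensated elapsed ns: " + r[1]);
    }
}
```

With the compensation, the application observes only the 5 seconds it actually ran; without it, the 60 second gap leaks into every elapsed-time measurement that straddles the checkpoint.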

Random / SecureRandom Class Instances

These classes may get instantiated before the CRIU checkpoint gets generated and if so, affect the randomness guarantees the user expects. In particular, if the same checkpoint is used to restore multiple times, then we may see the same seed get used to generate the random number sequence in all those restored processes leading to suboptimal randomness (identical numbers in the worst case). OpenJ9 keeps track of all such objects on the Java heap before the checkpoint is taken (via walking the Java heap in a JVM hook) and patches in a unique seed value after the restore in each process. This means that the restored Java processes all generate different random number sequences. 
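The re-seeding idea can be sketched in plain Java. This is an illustration only: ReseedOnRestoreSketch is a hypothetical class, and it tracks generators in an explicit list, whereas the JVM discovers them by walking the heap. Two generators created with the same seed stand in for two processes restored from the same checkpoint.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical model: the JVM finds Random objects by walking the heap,
// whereas this sketch simply tracks them in an explicit list.
class ReseedOnRestoreSketch {
    private final List<Random> tracked = new ArrayList<>();

    Random track(Random r) {
        tracked.add(r);
        return r;
    }

    // Would run in a post-restore hook: give every tracked generator a fresh seed.
    void afterRestore() {
        SecureRandom entropy = new SecureRandom();
        for (Random r : tracked) {
            r.setSeed(entropy.nextLong());
        }
    }

    // Returns {values equal before re-seeding, values equal after re-seeding}.
    static boolean[] demo() {
        ReseedOnRestoreSketch restoreA = new ReseedOnRestoreSketch();
        ReseedOnRestoreSketch restoreB = new ReseedOnRestoreSketch();
        // Identical state, as if both came out of the same checkpoint image.
        Random a = restoreA.track(new Random(42L));
        Random b = restoreB.track(new Random(42L));

        // Without compensation the two "restored" generators stay in lockstep.
        boolean sameBefore = a.nextLong() == b.nextLong();

        // With compensation each restore patches in its own seed.
        restoreA.afterRestore();
        restoreB.afterRestore();
        boolean sameAfter = a.nextLong() == b.nextLong();
        return new boolean[] {sameBefore, sameAfter};
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("identical before re-seeding: " + r[0]);
        System.out.println("identical after re-seeding:  " + r[1]);
    }
}
```

After re-seeding, the two sequences diverge except in the astronomically unlikely case that both restores draw equivalent seeds.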

Environment Variables

CRIU essentially freezes the environment variable space as it was when the process was checkpointed and, as a consequence, the process cannot access any of the environment variables that are set on the restore side. This can be quite limiting since environment variables are a common way to specify information that is only known at deployment time, e.g., the identity of the database or tracing server. OpenJ9 addresses this problem by allowing the user to specify new environment variable values in a file set by CRIUSupport.registerRestoreEnvFile(), which the JVM reads after the process gets restored. OpenJ9 partitions the set of environment variables into a mutable set and an immutable set, and allows the user to specify which environment variables are to be considered mutable. This means that OpenJ9 can update the environment variable representation that it maintains on the Java heap for mutable environment variables, while throwing an exception if immutable environment variables are changed at restore time. New environment variables that were not set at checkpoint time are easier to handle, in that they are instantiated and set after the restore based on the values in the file. One of the next areas that we intend to work on is the ability to specify JVM environment variables on the restore side, which would allow more control of JVM functionality post restore (e.g., enabling JIT or GC logging to better diagnose problems). We are also exploring what it might take for CRIU itself to expose the environment variables on the restore side, using a new API.
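The mutable/immutable rule can be modeled as a merge of two maps. This is a sketch under assumptions: RestoreEnvSketch and its methods are hypothetical, the exception type is a placeholder (the text above does not say which exception OpenJ9 throws), and the parser only handles simple single-line KEY=VALUE entries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the mutable/immutable rule; names are illustrative only.
class RestoreEnvSketch {
    // Parses simple KEY=VALUE lines, such as those written by `env > envFile`.
    static Map<String, String> parse(String fileContents) {
        Map<String, String> vars = new HashMap<>();
        for (String line : fileContents.split("\n")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                vars.put(line.substring(0, eq), line.substring(eq + 1));
            }
        }
        return vars;
    }

    // Returns the environment the application would see after the restore.
    static Map<String, String> applyOnRestore(Map<String, String> atCheckpoint,
                                              Map<String, String> fromFile,
                                              Set<String> mutableNames) {
        Map<String, String> result = new HashMap<>(atCheckpoint);
        for (Map.Entry<String, String> e : fromFile.entrySet()) {
            String name = e.getKey();
            String old = atCheckpoint.get(name);
            boolean isNew = (old == null);
            boolean changed = !isNew && !old.equals(e.getValue());
            if (changed && !mutableNames.contains(name)) {
                // Placeholder exception type, chosen for this sketch.
                throw new IllegalStateException("immutable variable changed: " + name);
            }
            // New variables and changed mutable variables are both accepted.
            result.put(name, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> atCheckpoint = Map.of("DB_HOST", "build-db", "HOME", "/root");
        Map<String, String> fromFile =
                parse("DB_HOST=prod-db\nHOME=/root\nNEW_ENV_VAR=does the JVM detect this\n");
        System.out.println(applyOnRestore(atCheckpoint, fromFile, Set.of("DB_HOST")));
    }
}
```

Here DB_HOST is declared mutable and picks up its deployment-time value, NEW_ENV_VAR is simply added, and the same file would be rejected if DB_HOST were not in the mutable set.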

Examples

This section shows the following examples (please follow the links to obtain and try them out yourself):

  • Timer Example
  • Random Example
  • Environment Variables Example

Ensure that the prerequisites listed below are satisfied before trying them out.

Prerequisites

  • The InstantOnStartupGuide repository cloned, and current working directory changed to InstantOnStartupGuide.
  • If running an Ubuntu 22.04 host, ensure that the kernel is upgraded to the latest version, as the latest updates fix an issue that prevented successful checkpoint/restore.

Timer Example

In this section we will look at some examples that demonstrate the need for JVM compensations. To begin, run the following commands to build a container image with libcriu and a JDK with OpenJ9 CRIU Support installed.  

docker build -f Containerfiles/Containerfile.ubuntu20.privileged -t instantondemo:ub20 .
docker run --privileged -it instantondemo:ub20

The first example we will look at is called TimerEvents. You can take a look with:

vi TimerEvents.java

This application will create 10 timers, each one with a period of 10 seconds. Every 10 seconds each timer will print the message “Event fired!”. The timers start 1 second after each other, meaning that you should see a message every second.

javac -cp criunatives.jar:. TimerEvents.java 
java -cp criunatives.jar:. TimerEvents

You should have seen:

Event fired!
Event fired!
Event fired!
Event fired!
Event fired!
Event fired!

Now we are going to take a checkpoint halfway and resume it. Uncomment the following line:

// Utils.checkPointJVM("checkpointData"); 

Next compile it and create the checkpoint directory.

javac -cp criunatives.jar:. TimerEvents.java
mkdir checkpointData

We are first going to try this example without OpenJ9 CRIU Support. We will use a simple native library that just calls the libcriu functions. Run the following to start the application. 

java -cp criunatives.jar:. -DuseCRIUNativesLib=true  -Djava.library.path=/instantOnDemo TimerEvents

You should see

Event fired!
Event fired!
CRIU is not enabled: To enable criu support, please run java with the `-XX:+EnableCRIUSupport` option.
Using CRIU natives libary instead
Killed

Next, wait a few seconds then restore it with the following command

criu restore -D ./checkpointData --shell-job -v4 --log-file=restore.log

You will notice that the messages are not shown one second after another; 10 messages are shown at a time. This occurs because the standard java.util.Timer API was not designed with checkpoint/restore in mind. So it operates as though time is continuous and uninterrupted; this is a safe assumption to make in a normal case, but with checkpoint/restore this can lead to subtle errors that deviate from intended program behaviour.
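The burst can be reproduced with a small deadline model. This is only a sketch: TimerBurstSketch is hypothetical, it does not use java.util.Timer itself, and the 2.5 second checkpoint time and 30 second gap are assumed values. Each timer keeps a next deadline, and the number of overdue timers is compared for a clock that includes the frozen interval and a clock that excludes it.

```java
// Hypothetical model of deadline-based timers; it does not use java.util.Timer itself.
class TimerBurstSketch {
    // Number of timers whose next deadline is at or before the given clock reading.
    static int overdue(long[] deadlinesMillis, long nowMillis) {
        int count = 0;
        for (long d : deadlinesMillis) {
            if (d <= nowMillis) {
                count++;
            }
        }
        return count;
    }

    // Returns {timers overdue without compensation, timers overdue with compensation}.
    static int[] demo() {
        long periodMillis = 10_000;
        long[] deadlines = new long[10];
        for (int i = 0; i < deadlines.length; i++) {
            deadlines[i] = (i + 1) * 1_000L; // timers start 1 s apart
        }

        // Run normally until t = 2.5 s: the first two timers fire and are rescheduled.
        long checkpointAt = 2_500; // assumed checkpoint time
        for (int i = 0; i < deadlines.length; i++) {
            if (deadlines[i] <= checkpointAt) {
                deadlines[i] += periodMillis;
            }
        }

        long frozenMillis = 30_000; // assumed gap between checkpoint and restore

        // Uncompensated clock: the gap counts as elapsed time.
        int withoutCompensation = overdue(deadlines, checkpointAt + frozenMillis);
        // Compensated clock: time resumes from where it stopped.
        int withCompensation = overdue(deadlines, checkpointAt);
        return new int[] {withoutCompensation, withCompensation};
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("overdue on restore without compensation: " + r[0]);
        System.out.println("overdue on restore with compensation:    " + r[1]);
    }
}
```

In this model every timer is already past its deadline when the uncompensated process wakes up, so all of them fire together, whereas with a compensated clock none is overdue and the one-per-second rhythm continues.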

Next, we will try this example again with OpenJ9 CRIU Support. Run the following command

java -cp criunatives.jar:. -XX:+EnableCRIUSupport TimerEvents

Then restore with

criu restore -D ./checkpointData --shell-job -v4 --log-file=restore.log

You’ll see that with the CRIU Support the application behaves in the same manner as it would have had there not been a checkpoint.

Random Example

In this example we will look at a simple application, RandomValues, that generates random numbers. Take a look with

vi RandomValues.java

Compile and run it with

javac -cp criunatives.jar:. RandomValues.java
java -cp criunatives.jar:. RandomValues

You should see something like

Generate random numbers
random val: 771202271
random val: 395926121
random val: 139212100
random val: -335143831
random val: 948482858
random val: 2037630392

If you run it again, the numbers should be different each time. Next we will try this example again except this time a checkpoint will be taken after the first random value is printed. Uncomment the following line

//Utils.checkPointJVM("checkpointData");

And compile it again with

javac -cp criunatives.jar:. RandomValues.java

Now, we will first run the application without CRIU Support (make sure the checkpointData directory exists first)

java -cp criunatives.jar:. -DuseCRIUNativesLib=true  -Djava.library.path=/instantOnDemo RandomValues

And restore with

criu restore -D ./checkpointData --shell-job -v4 --log-file=restore.log

If you restore multiple times you’ll notice that the random numbers are identical. This could be a security vulnerability if there are algorithms that depend on random values being unique each time.

Let’s try this again with CRIU Support. Enter the following to create a checkpoint

java -cp criunatives.jar:. -XX:+EnableCRIUSupport RandomValues 

And restore with

criu restore -D ./checkpointData --shell-job -v4 --log-file=restore.log

This time you’ll notice that the random values are unique each time.

Environment Variables Example

In this example we are going to take a look at the Environment Variables API in the JDK. Let’s start by taking a look at the EnvironmentVariables application with

vi EnvironmentVariables.java

This application will simply print out the environment variables. Try it out with the following commands

javac -cp criunatives.jar:. EnvironmentVariables.java
java -cp criunatives.jar:. EnvironmentVariables

You can confirm that the environment variables displayed are correct by entering the following in your terminal

env

Next, let’s try this out with checkpoint and restore. Uncomment the following line in EnvironmentVariables.java

//Utils.checkPointJVM("checkpointData", "envFile");

Next we will run it without CRIU Support. Build and run the application (make sure checkpointData directory exists)

javac -cp criunatives.jar:. EnvironmentVariables.java
java -cp criunatives.jar:. -DuseCRIUNativesLib=true  -Djava.library.path=/instantOnDemo  EnvironmentVariables

Now that the checkpoint image is created, add a new environment variable

export NEW_ENV_VAR="does the JVM detect this"

Next restore the checkpointed image with

criu restore -D ./checkpointData --shell-job -v4 --log-file=restore.log

You’ll notice that the new environment variable is not printed, even though it appears if you type env in the terminal. This is because the standard behaviour of the JVM is to read in the environment variables once at startup and place them into an immutable collection.

Next we will see how one can update environment variables with CRIU Support. First, delete the new environment variable and take a checkpoint with CRIU Support enabled.

unset NEW_ENV_VAR
java -cp criunatives.jar:. -XX:+EnableCRIUSupport EnvironmentVariables

In this example we specified a file "envFile" as an argument to the checkpoint method. This registers an environment variables file with the JVM, via CRIUSupport.registerRestoreEnvFile(), that will be read upon restore.

Create a new file called “envFile” and add the new environment variables to it with

export NEW_ENV_VAR="does the JVM detect this"
env > envFile

Now restore the image with

criu restore -D ./checkpointData --shell-job -v4 --log-file=restore.log

You should see all the environment variables including the new one.

Conclusion

I hope this blog post was useful in understanding the purpose and design of checkpoint restore hooks, and the various compensations that are done in the JCL to provide a seamless end user experience with CRIU. If you have any questions, suggestions for compensations or APIs that we should add, or just wish to share your experience, please contact us; we would love to hear from you!

8 Replies to “OpenJ9 CRIU Support: A look under the hood (part II)”

  1. I don’t think your approach to patch instances of java.util.Random in the heap and change their seed on restore is conformant with the Java SE specification which explicitly states: “If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers. In order to guarantee this property, particular algorithms are specified for the class Random. Java implementations must use all the algorithms shown here for the class Random, for the sake of absolute portability of Java code. (see https://docs.oracle.com/javase/8/docs/api/java/util/Random.html)”. If you change the seed upon resume, you will violate this requirement.

    1. You’ve raised a good point. The automatic re-seeding of java.util.Random instances is important to ensure uniqueness of the restored JVM for security purposes. We have chosen to perform these compensations automatically in order to reduce the burden for users who want to port their application to checkpoint/restore. We believe that for the majority of cases, re-seeding will be the correct choice. However, if a user’s program relies on the seed being stable before and after checkpoint then this behaviour could be an issue. We have additional APIs in the works to give users more control on what will be automatically compensated and what should be left alone.

    2. Interesting point, Volker, thanks for mentioning it. If I understand what you’re worried about, it’s multiple Random objects in the same process that were specifically initialized with the same seed value so they intentionally operate in an “entangled” way no matter where they’re referenced in the JVM. One can still, of course, imagine cases where one wants that sequence to be reseeded (consistently) in the restored process or not. For Random objects initialized with a particular seed, it’s probably a more reasonable default to not reseed rather than our current default to reseed. But this is (yet another) one of those tricky corner cases where, as Tobi mentions, there need to be ways for users to override the default approach when it does not match their intent.

  2. Tobi, Mark, thanks a lot for your quick replies!

    Just to be clear, I think that in the vast majority of cases your change wouldn’t do any harm (and probably even be beneficial for the application). The problem is only that java.util.Random is specified to return, for every single seed, a predefined stream of (pseudo-random) numbers (as defined in Donald Knuth, The Art of Computer Programming, Volume 2, Section 3.2.1.). This means that if somebody creates a Random object with the “Random(long seed)” constructor, it is predefined what the first, tenth, etc. random number will be. After a resume, this requirement will be violated by your changes.

    In the same way many applications will benefit from your changes (because they expect “real” random numbers) others might depend on the java.util.Random specification (e.g. tests which use “random” numbers but want to be reproducible). This is why your change may break some applications (and violates the specification) even though it might be useful for others.

    1. Agreed, no one-size-fits-all solution will work. It’s even possible to have an application that may require an early checkpoint because it desires both the seed to be different for every JVM and for its post checkpoint pseudo random distribution to match its pre checkpoint pseudo random distribution in some verifiable way (maybe the earlier random numbers are stored away somewhere). The only solution in that case would be to checkpoint earlier (i.e. before the Random object is created) which may even be so early that it’s not really worth taking a checkpoint. It depends how much code depends on the Random object. A very similar kind of issue arises for native images.

      One key question will be whether many real applications fall into this kind of category (easy to achieve disappointment on that kind of question, though). I also think this may be a spot where the Java SE specification might be able to tolerate some adjustment for “practical purposes” 🙂 , but I haven’t put any thought yet into what that change would need to look like. That would be a discussion that’s probably more appropriate in an OpenJDK forum (CRaC or Leyden projects maybe?).

    1. In short, yes. However, this will most likely require changes in Spring Boot to coordinate the checkpoint and restore via pre-checkpoint/post-restore hooks in order for it to work successfully. One needs to ensure that external dependencies (file handles, socket connections, etc.) are closed pre-checkpoint and re-opened post-restore. This has been tried and tested with OpenLiberty (https://openliberty.io/blog/2022/09/29/instant-on-beta.html) and has shown good results.
