Exploring JITServer on the new Linux on IBM z16 Platform

With the recent 0.32 release of Eclipse OpenJ9, the JITServer technology was officially released and made available for the Linux on IBM z System/LinuxONE platforms. This release also coincides with the recently launched IBM z16 technology. This blog will provide an overview of the JITServer technology and demonstrate some of its key benefits using the latest hardware available on the IBM z System platform.


The JITServer technology decouples the Just-In-Time Compiler (JIT) from the Java Virtual Machine (JVM), allowing the JIT to function as a standalone tool. JITServer’s goal is to offload the JIT compilation workload from the client JVM to a JITServer, so that the client JVM/application can reap the benefits of JIT compiled methods without having to perform Java method compilation work locally. You can learn more about OpenJ9’s JITServer technology from our previous blog posts and from the OpenJ9 docs.

This blog post will demonstrate the key benefits of using JITServer using the AcmeAir sample application as a benchmark. The AcmeAir application models a Jakarta EE application for a fictitious airline that provides several APIs and microservices, such as flight bookings backed by thousands of database entries. You can learn more about AcmeAir here.

How to get started with JITServer

Launching a JITServer on a server machine is straightforward. All you need to do is run the following command:

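OpenJ9 JDK builds that include JITServer ship a `jitserver` launcher alongside `java` in the JDK's bin directory, so (assuming that directory is on your PATH) starting a server is as simple as:

```shell
# Start a JITServer instance; it listens for client JVMs on its default port.
jitserver
```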

This will launch the JITServer on your machine. In order to instruct your Java application (i.e. the client JVM) to use a JITServer, you must provide the following JVM options when launching your Java application:

-XX:+UseJITServer -XX:JITServerAddress=<JITServer server name>
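For example, a client launch might look like the following (the hostname and application jar here are placeholders, not values from our setup):

```shell
# Placeholder server hostname and application jar; substitute your own.
java -XX:+UseJITServer -XX:JITServerAddress=jitserver.example.com -jar app.jar
```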

Note: If the JITServer process is running on the same machine as the Java application, the `-XX:JITServerAddress` option can be omitted. Moreover, if the JITServer process runs in a Docker container on the same bridge network as the Java application, you can use the container name as the server's address.
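As a sketch of that container scenario (the image and network names below are hypothetical), the server container's name doubles as its address:

```shell
# Hypothetical image and network names, for illustration only.
docker network create jit-net
docker run -d --name jitserver --network jit-net my-openj9-image jitserver
docker run --name myapp --network jit-net my-app-image \
  java -XX:+UseJITServer -XX:JITServerAddress=jitserver -jar app.jar
```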

Our benchmark setup

The benchmark we will be using is AcmeAir, an application built on Open Liberty that simulates an airline reservation system. The AcmeAir application is able to handle billions of API calls per day and can be deployed to public clouds, so it serves as a real-world example of a modern, distributed Java application.

To simulate a multi-node environment, the AcmeAir app, MongoDB database, and JMeter load driver for the benchmark will be running in separate Docker containers. The JITServer instance will also be running in its own container for experiments where JITServer is enabled. 

We will be using the new IBM z16 system with RHEL 8.2 as our benchmarking platform.

For each scenario, we will perform both a cold and a warm run in order to demonstrate the performance gains provided by OpenJ9’s Shared Class Cache (SCC) facility. The SCC improves startup performance by storing the classes an application needs in a shared cache; the first run, which populates the cache, is colloquially referred to as the cold run. On subsequent runs the application starts faster because many of the classes it needs are already available; such a run is the warm run. You can learn more about the SCC facility and its benefits here. Unless noted otherwise, the results displayed in this blog post were measured from warm runs.
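As a minimal sketch, enabling the SCC is a matter of adding `-Xshareclasses` to the command line (the cache name and application jar below are arbitrary placeholders):

```shell
# Cold run: creates and populates a shared class cache named "acmeair".
java -Xshareclasses:name=acmeair -jar app.jar
# Warm run: the identical command now starts faster by reusing the cache.
java -Xshareclasses:name=acmeair -jar app.jar
```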

For each scenario, we will limit the resources available to the container running AcmeAir/Liberty Server and compare the results when running with and without JITServer.

Experiment #1

To begin with, we performed a test run without any specific machine resource limits. Our baseline scenario for all experiments is the AcmeAir application without JITServer. We will be comparing the baseline case with our preferred scenario, in which we enable JITServer.

The following graph compares the results of two test runs – one run with JITServer enabled, and the other without:

Figure 1. AcmeAir Throughput OpenJ9 vs OpenJ9 w/ JITServer

We can see that both scenarios yielded similar throughput levels, and the time taken to reach steady state throughput (i.e. to “ramp up”) is about the same. The average steady state throughput (measured from 60s to 600s) is 127,444 requests/second for the baseline OpenJ9 runs, versus 125,811 requests/second for the runs with JITServer enabled: essentially the same level of throughput.

So what’s the benefit of running an app in this environment with JITServer? Let’s take a look at the amount of CPU time the AcmeAir application’s JVM has spent performing JIT Compilations.

              OpenJ9        OpenJ9 + JITServer   % compilation CPU time saved w/ JITServer
Cold run      80,678 ms     13,181 ms            83%
Warm run      53,477 ms      4,572 ms            91%

Table 1. JIT compilation related CPU time saved using JITServer

Note: The above data can be found in a diagnostic file when running your app with the following JVM command line option: -Xjit:verbose={compilePerformance},vlog=vlog.
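As a sketch of how one might total that compilation time from the verbose logs, here is a small awk one-liner; the assumption (not confirmed by the source) is that each compilation entry carries a `time=<n>us` field, which is then summed across the log files:

```shell
# Sum per-compilation times from OpenJ9 verbose logs and report milliseconds.
# Assumes log entries contain a "time=<n>us" field (an assumption about the
# compilePerformance output format).
awk 'match($0, /time=[0-9]+us/) {
       sum += substr($0, RSTART + 5, RLENGTH - 7)   # strip "time=" and "us"
     }
     END { printf "total JIT compilation time: %.1f ms\n", sum / 1000 }' vlog.*
```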

For the OpenJ9 cold run, the local JVM spent ~81 seconds performing JIT compilations. In contrast, the OpenJ9 + JITServer run spent only ~13 seconds doing JIT compilations locally inside the AcmeAir container, because most of the JIT compilation workload is offloaded to a remote JITServer instance and the client JVM only incurs the networking costs. Thus, the application JVM spent 83% less CPU time performing JIT compilations when JITServer was enabled compared to our baseline case.

For the warm run, the savings are even greater: 91%. Again, we achieved a similar level of application throughput as our baseline case, but saved a significant amount of JIT compilation related CPU time by offloading that overhead to the JITServer.

Now that we have seen how much processing time can be saved by using JITServer, let’s restrict the compute resources available to the AcmeAir container in order to see the benefits JITServer is able to provide in resource-constrained environments.

Experiment #2

For our second experiment, we will limit the number of CPUs available to the AcmeAir container to 1 (i.e. the Docker container will be run with the following additional command-line option: --cpus=1). A graph comparing the warm run results for this experiment is displayed below.
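In Docker terms, the constraint looks like the following (the image and container names are hypothetical):

```shell
# Hypothetical image name; pin the AcmeAir/Liberty container to a single CPU.
docker run --cpus=1 --name acmeair my-acmeair-image
```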

Figure 2. AcmeAir Throughput OpenJ9 vs OpenJ9 w/ JITServer (--cpus=1)

Figure 2 shows that in both scenarios we are able to reach a steady state throughput of ~8500 requests/second. However, when JITServer is available to offload compilations, we can ramp up and reach steady state throughput significantly faster. 

If we look closely at the time axis of the chart, we can see that the JITServer enabled run exceeds a throughput of 8000 requests/second only 42 seconds after the application starts (marked by the green dot on the graph), while the baseline OpenJ9 run takes 84 seconds to pass the same 8000 requests/second milestone (marked by the red dot). Therefore, we can conclude that in an environment where AcmeAir containers are restricted to 1 CPU, enabling JITServer allows us to ramp up to steady state throughput 50% faster.

Experiment #3

For our third experiment, the AcmeAir container will be limited to 1 CPU and 175MiB of memory. The results of the warm run are displayed below.

Figure 3. AcmeAir Throughput OpenJ9 vs OpenJ9 w/ JITServer (--cpus=1, -m 175m)

As we can clearly see, limiting both CPU and memory has a significant impact on throughput for the baseline OpenJ9 run, which reaches a steady state throughput of only 4,592 requests/second.

When the application JVM makes use of JITServer to offload JIT compilations, performance improves significantly, reaching a steady state throughput of 6,539 requests/second. This corresponds to an improvement of 42% over the baseline case!

The reason for this gap in throughput is that in low memory environments, OpenJ9 does not have enough memory to perform JIT compilations while simultaneously executing the application, resulting in many compilation failures. With the addition of JITServer, we are able to circumvent this restriction by offloading JIT compilations to a less constrained environment.


In this blog post we discussed how the JITServer technology decouples the JIT from the rest of the JVM and allows JIT compilations to take place remotely. First, in our experiments we observed that it reduced JIT compilation CPU time in the application JVM by up to ~91%. Second, reducing the number of CPUs available to the AcmeAir container to 1 demonstrated that JITServer enabled applications can ramp up approximately 50% faster. Finally, when reducing the available CPUs to 1 and container memory to 175MiB, we saw that the JITServer enabled runs improved average steady state throughput by 42% in this highly resource-constrained environment. These performance benefits could allow you to use smaller containers to launch Java services, and thus pack more instances onto the same physical resources. This in turn may make your Java deployments more cost effective.

The JITServer technology is available today on Linux on IBM Z, Linux on Power (little endian), and x86-64 Linux platforms. Try it today!

Many thanks to Joran Siu, Marius Pirvu and Marc Beyerle for their contributions and feedback on this post.
