Overview
A previous blog post described how to deploy unprivileged containers that restore an application that checkpointed itself using OpenJ9 CRIU Support on Kubernetes. This post goes over how to build a simple Spring Boot Application, how to configure the application to use OpenJ9 CRIU Support to checkpoint itself, and how to use Knative Serving to create a serverless deployment. This post also goes over how the application with OpenJ9 CRIU Support performs compared to a traditional JVM as well as when compiled natively using GraalVM Native Image in a serverless deployment.
Prerequisites
- A system running Linux kernel version 5.9 or later (unless the required support was backported to an earlier kernel version), for the reason described below:
  - The kernel needs to support the CAP_CHECKPOINT_RESTORE capability. Knative does not support hostPath or running privileged containers; therefore, in order to do an unprivileged restore, CRIU needs to use the clone3 syscall. This means having to run on at least Ubuntu 22.04 or RHEL 9.
- The latest version of podman for the Linux distribution; at the time of this blog post, docker does not yet support CAP_CHECKPOINT_RESTORE (though this support seems to be present in the development stream of the docker project).
Since podman does not use a daemon with root authority, the user who launches the container must have the authority to grant it the necessary Linux capabilities. This blog assumes the minikube and podman commands are run as the root user.
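As a quick sanity check of the kernel prerequisite, the version comparison can be sketched as a small shell function. This is only a minimal sketch: the function name kernel_at_least is made up for illustration, and a version check cannot detect distributions that backported the needed support to an older kernel.

```shell
# Succeeds (exit status 0) if the kernel release is at least MAJOR.MINOR.
# Usage: kernel_at_least MAJOR MINOR [RELEASE]
# RELEASE defaults to the output of `uname -r`.
# kernel_at_least is a hypothetical helper name, not part of any tool.
kernel_at_least() {
    want_major="$1"
    want_minor="$2"
    release="${3:-$(uname -r)}"
    # "5.15.0-48-generic" -> major "5"
    have_major="${release%%.*}"
    # "5.15.0-48-generic" -> "15.0-48-generic" -> minor "15"
    rest="${release#*.}"
    have_minor="${rest%%[!0-9]*}"
    if [ "$have_major" -gt "$want_major" ]; then
        return 0
    fi
    [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}
```

For example, kernel_at_least 5 9 && echo "kernel is new enough" checks the running kernel.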
Setup
This blog uses an Ubuntu 22.04 host machine. Ensure the kernel is fully up to date, as a recent update fixed an issue that was preventing successful restore. YMMV on other OSes (though RHEL 9 should work based on our experience).
1. Update the OS
sudo apt-get update
sudo apt-get dist-upgrade
reboot
2. Install CRI-O as per the documentation
OS=xUbuntu_22.04
VERSION=1.24
echo "deb [signed-by=/usr/share/keyrings/libcontainers-archive-keyring.gpg] https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/$OS/ /" > /etc/apt/sources.list.d/devel:kubic:libcontainers:stable.list
echo "deb [signed-by=/usr/share/keyrings/libcontainers-crio-archive-keyring.gpg] https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/$VERSION/$OS/ /" > /etc/apt/sources.list.d/devel:kubic:libcontainers:stable:cri-o:$VERSION.list
mkdir -p /usr/share/keyrings
curl -L https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/$OS/Release.key | gpg --dearmor -o /usr/share/keyrings/libcontainers-archive-keyring.gpg
curl -L https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/$VERSION/$OS/Release.key | gpg --dearmor -o /usr/share/keyrings/libcontainers-crio-archive-keyring.gpg
apt-get update
apt-get install cri-o cri-o-runc
unset OS
unset VERSION
3. Install minikube as per the documentation
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
4. Build the Spring Boot Applications
git clone https://github.com/ibmruntimes/InstantOnStartupGuide.git
cd InstantOnStartupGuide/Knative
./buildAll.sh
If you run
podman images
you should see four built images:
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/instanton-ol-spring-demo-restorerun latest 124a29eee9bc 2 minutes ago 1.66 GB
localhost/instanton-spring-demo-restorerun latest 6daec909749a 4 minutes ago 1.55 GB
localhost/nativeimage-spring-demo latest 1573c47eba0a 6 minutes ago 1.35 GB
localhost/jvm-spring-demo latest 63212bcadf2e 12 minutes ago 757 MB
5. Finally, tag the images appropriately and push them to some container registry, for example:
podman tag localhost/jvm-spring-demo docker.io/<dockerhub userid>/jvm-spring-demo
podman tag localhost/nativeimage-spring-demo docker.io/<dockerhub userid>/nativeimage-spring-demo
podman tag localhost/instanton-spring-demo-restorerun docker.io/<dockerhub userid>/instanton-spring-demo-restorerun
podman tag localhost/instanton-ol-spring-demo-restorerun docker.io/<dockerhub userid>/instanton-ol-spring-demo-restorerun
podman push docker.io/<dockerhub userid>/jvm-spring-demo
podman push docker.io/<dockerhub userid>/nativeimage-spring-demo
podman push docker.io/<dockerhub userid>/instanton-spring-demo-restorerun
podman push docker.io/<dockerhub userid>/instanton-ol-spring-demo-restorerun
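The eight commands above follow one pattern per image, so they can also be written as a loop. The following is a minimal sketch, not part of the repository: the function name tag_and_push_all and the PODMAN override variable are assumptions introduced here so the commands can be previewed before they are run.

```shell
# PODMAN can be overridden (for example PODMAN=echo) to print the
# commands instead of running them. This variable is an assumption of
# this sketch, not something podman itself reads.
PODMAN="${PODMAN:-podman}"

# Tags each locally built demo image for the given registry namespace
# and pushes it.
# Usage: tag_and_push_all docker.io/<dockerhub userid>
tag_and_push_all() {
    namespace="$1"
    for image in jvm-spring-demo nativeimage-spring-demo \
                 instanton-spring-demo-restorerun \
                 instanton-ol-spring-demo-restorerun; do
        "$PODMAN" tag "localhost/$image" "$namespace/$image" || return 1
        "$PODMAN" push "$namespace/$image" || return 1
    done
}
```

Running PODMAN=echo tag_and_push_all docker.io/<dockerhub userid> first shows what would be executed.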
Details
The buildAll.sh script builds the following application:
HelloWorldController.java
package com.example.helloworld;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HelloWorldController {
    @GetMapping("/")
    public String helloworld() {
        return "Hello World!";
    }
}
HelloWorldApplication.java
package com.example.helloworld;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class HelloWorldApplication {
    public static void main(String[] args) {
        SpringApplication.run(HelloWorldApplication.class, args);
    }
}
to run in four different configurations:
- JVM: The jvm-spring-demo image runs the application using an IBM Semeru Java 17 build.
- Native Image: The nativeimage-spring-demo image runs the application compiled with Native Image. This configuration was included to see how a CRIU solution compares to a Closed World Native solution.
- OpenJ9 CRIU Support: The instanton-spring-demo-restorerun image runs the application from a restore point of our choosing.
- Open Liberty InstantOn: The instanton-ol-spring-demo-restorerun image runs the Spring Boot application in Open Liberty using Open Liberty’s InstantOn feature and the applications checkpoint phase.
The application simply returns the string Hello World! when the / endpoint is hit. I used Spring Initializr to generate the Maven projects. All four configurations use Spring Boot 2.7.4 with Java 17 and the Spring Web dependency. The Native Image configuration also uses the Spring Native dependency. All four configurations also modify the application.properties file to use port 9080. The following sections describe the four configurations in more detail.
JVM
./buildAll.sh invokes ./springboot/jvm/build.sh; this simply builds ./springboot/jvm/Dockerfile which:
- Downloads a Semeru JDK17 build.
- Copies the helloworld directory into the container.
- Runs ./mvnw -DskipTests package.
Native Image
./buildAll.sh invokes ./springboot/nativeimage/build.sh which builds ./springboot/nativeimage/Dockerfile which:
- Downloads and installs native-image.
- Copies the helloworld directory into the container.
- Runs ./mvnw -Pnative -DskipTests package.
OpenJ9 CRIU Support
The OpenJ9 CRIU Support configuration is more involved. First of all, the project needs an additional Java file:
MyApplicationListener.java
package com.example.helloworld;

import java.nio.file.Paths;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.ApplicationListener;
import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;
import org.eclipse.openj9.criu.CRIUSupport;

@Component
class MyApplicationListener implements ApplicationListener<ApplicationReadyEvent> {
    @Override
    public void onApplicationEvent(ApplicationReadyEvent event) {
        String path = "checkpointData";
        if (CRIUSupport.isCRIUSupportEnabled()) {
            new CRIUSupport(Paths.get(path))
                .setLeaveRunning(false)
                .setShellJob(true)
                .setFileLocks(true)
                .setLogLevel(4)
                .setLogFile("logs")
                .setUnprivileged(true)
                .checkpointJVM();
        } else {
            System.err.println("CRIU is not enabled: " + CRIUSupport.getErrorMessage());
        }
    }
}
This is because we need a point in the application's lifecycle at which to take the checkpoint, and from which to later restore. Ideally, the checkpoint would be taken once the application is reported as ready but before it does any stateful work (such as opening ports), but I could not find such a place without modifying the Spring code itself. If there is a better place to invoke checkpointJVM, please do let me know as I am not a Spring expert.
./buildAll.sh invokes ./springboot/instanton/build.sh which
- Builds ./springboot/instanton/Dockerfile which
  - Similar to the Unprivileged OpenJ9 CRIU Support post, derives from icr.io/appcafe/open-liberty:beta-instanton as it contains the custom built CRIU binary that has support for unprivileged restore.
  - Downloads the Semeru JDK17 Early Access build with OpenJ9 CRIU support.
  - Copies the helloworld directory into the container.
  - Runs ./mvnw -DskipTests package.
  - Copies the scripts in ./springboot/instanton/scripts into the container.
  - Sets the entry point to run entry-point.sh; this script either runs checkpoint.sh or restore.sh depending on whether the checkpoint has already been created.
- Runs the newly built image which will generate a checkpoint.
- Commits the container (which contains the checkpoint) to create a new image.
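The decision that entry-point.sh makes can be sketched roughly as follows. This is not the script from the repository: the function name choose_action is hypothetical, and the sketch assumes that "the checkpoint has already been created" can be detected by a non-empty checkpoint directory (checkpointData is the path passed to CRIUSupport in the listener above).

```shell
# Prints "restore" if the checkpoint directory exists and contains
# files, otherwise prints "checkpoint".
# choose_action is a hypothetical helper; the real entry-point.sh may
# detect an existing checkpoint differently.
choose_action() {
    checkpoint_dir="$1"
    if [ -d "$checkpoint_dir" ] && [ -n "$(ls -A "$checkpoint_dir" 2>/dev/null)" ]; then
        echo restore
    else
        echo checkpoint
    fi
}

# Possible dispatch, mirroring the description of entry-point.sh:
#
#   case "$(choose_action checkpointData)" in
#       restore)    exec ./restore.sh ;;
#       checkpoint) exec ./checkpoint.sh ;;
#   esac
```

With this shape, the same image behaves differently on its first run (no checkpoint yet) and on every run after the container has been committed with the checkpoint inside it.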
It is worth mentioning why the configuration directory is named instanton: while “OpenJ9 CRIU Support” is the name of the capability at the OpenJ9 project, we’re using the IBM Semeru JDK build (open edition), which uses the OpenJ9 JVM under the covers.
Open Liberty InstantOn
Open Liberty can run a Spring Boot application. Additionally, Open Liberty has the InstantOn feature (which uses OpenJ9 CRIU Support to do the actual checkpoint/restore). Therefore, this configuration means we do not have to worry about where to checkpoint in the actual Spring Boot application, and instead leave it to Open Liberty to handle the checkpoint/restore operation. However, it is important to note that there was some work done in the Liberty framework to make it possible to configure/compensate after restore; the Spring framework would likely need to add similar support if more complex applications are to be run using CRIU support.
Firstly, the application has to be modified to extend org.springframework.boot.web.servlet.support.SpringBootServletInitializer so that it can be packaged as a WAR. We also need the following server.xml configuration for Open Liberty:
<?xml version="1.0" encoding="UTF-8"?>
<server description="new server">
    <!-- Enable features -->
    <featureManager>
        <feature>servlet-4.0</feature>
    </featureManager>
    <!-- To access this server from a remote client add a host attribute to the following element, e.g. host="*" -->
    <httpEndpoint id="defaultHttpEndpoint"
                  httpPort="9080"
                  httpsPort="9443" />
    <webApplication location="helloworld-0.0.1-SNAPSHOT.war" contextRoot="/" />
    <!-- Default SSL configuration enables trust for default certificates from the Java runtime -->
    <ssl id="defaultSSLConfig" trustDefaultCerts="true" />
</server>
./buildAll.sh invokes ./springboot/instanton-ol/build.sh which
- Builds ./springboot/instanton-ol/Dockerfile which
  - Derives from icr.io/appcafe/open-liberty:beta-instanton as it is the Open Liberty Beta build that has the InstantOn feature.
  - Downloads the Semeru JDK17 Early Access build with OpenJ9 CRIU support.
  - Copies the helloworld directory into the container.
  - Runs ./mvnw -DskipTests package.
  - Copies the server.xml file to /config and the newly generated WAR file to /config/apps.
  - Runs configure.sh so that Open Liberty can run the application.
- Runs the newly built image which will generate a checkpoint.
- Commits the container (which contains the checkpoint) to create a new image.
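The last two steps (run the image to generate a checkpoint, then commit the container) can be sketched as below. This is a rough sketch rather than the repository's build.sh: the function name, the PODMAN override variable, and the final cleanup step are assumptions, and the Linux capability options that the checkpoint run needs are deliberately left out because they are not shown in this post.

```shell
# PODMAN can be overridden (for example PODMAN=echo) to print the
# commands instead of running them; this is an assumption of the sketch.
PODMAN="${PODMAN:-podman}"

# Usage: checkpoint_and_commit BASE_IMAGE CONTAINER_NAME FINAL_IMAGE
# Hypothetical helper; capability-related options are omitted on purpose.
checkpoint_and_commit() {
    base_image="$1"
    container="$2"
    final_image="$3"
    # Run the freshly built image; its entry point takes the checkpoint.
    # The exit status is ignored here on the assumption that the process
    # is stopped as part of checkpointing.
    "$PODMAN" run --name "$container" "$base_image" || true
    # Commit the stopped container, checkpoint files included, as a new image.
    "$PODMAN" commit "$container" "$final_image" || return 1
    # Cleanup of the intermediate container (an addition of this sketch).
    "$PODMAN" rm "$container"
}
```

The committed image is the one that ends up with the restorerun suffix in the image list shown earlier.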
Deploy
With the application images built, they can now be deployed on Kubernetes as a Serverless deployment using Knative Serving.
1. Modify all of the following files
./springboot/jvm/jvm-spring-demo.yaml
./springboot/nativeimage/nativeimage-spring-demo.yaml
./springboot/instanton/instanton-spring-demo-restorerun.yaml
./springboot/instanton-ol/instanton-ol-spring-demo-restorerun.yaml
by replacing <TAG> with the appropriate image name pushed to the container registry in the Setup section above. Everything below assumes the user is root.
2. Configure minikube to use Knative Serving. Start minikube:
minikube start --container-runtime=cri-o --driver=podman --force
3. Use an alias for brevity
alias kubectl="minikube kubectl --"
4. Wait until the cluster is fully initialized; run
kubectl get pods -A
and verify that all pods are in the Running state and Ready.
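The manual check can also be scripted. The following is a minimal sketch that reads the pod list on standard input; the function name all_pods_ready is made up, and it assumes the usual column order of kubectl get pods -A with --no-headers (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Reads `kubectl get pods -A --no-headers` output on stdin.
# Succeeds only if at least one pod is listed and every pod has
# STATUS "Running" and a READY column of the form N/N.
# all_pods_ready is a hypothetical helper name.
all_pods_ready() {
    awk '
        {
            split($3, ready, "/")
            if ($4 != "Running" || ready[1] != ready[2]) bad++
            seen++
        }
        END { exit !(seen > 0 && bad == 0) }
    '
}
```

Since shell aliases are normally not expanded inside scripts, a polling loop would call minikube directly, for example: until minikube kubectl -- get pods -A --no-headers | all_pods_ready; do sleep 5; done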
5. Configure Knative
./knative-deploy.sh
This will install the Knative Serving pods, as well as the Kourier network layer. It also applies the ./knative-config.yaml configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-features
  namespace: knative-serving
data:
  kubernetes.podspec-securitycontext: "enabled"
  kubernetes.containerspec-addcapabilities: "enabled"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  scale-to-zero-grace-period: "10s"
  stable-window: "6s"
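The config-features entries allow a Knative Service to set a pod security context and to add Linux capabilities to its container, which the restore images need. The config-autoscaler entries shorten the scale-to-zero grace period and the stable window so that idle services scale to zero quickly.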
6. Wait until all knative-serving and kourier-system pods are up and running by checking the output of
kubectl get pods -A
7. Add the previously built Spring Boot applications as Knative Services
./knative-services.sh
This will pull the images specified in the .yaml files described at the start of this section. It also uses kubectl port-forward to forward traffic from local port 8080 to the network layer’s port 80. It may appear that the kubectl port-forward command is running in the foreground but pressing the enter/return key will bring the terminal prompt back.
8. Wait until the applications are scaled to zero (i.e., no running pods) by checking the output of
kubectl get pods
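Waiting for scale-to-zero can be scripted in the same spirit. This is a minimal sketch with a made-up function name; it assumes that when no pods exist, nothing but whitespace reaches standard output.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and succeeds
# when no pod is listed, i.e. the input contains no non-blank text.
# scaled_to_zero is a hypothetical helper name.
scaled_to_zero() {
    ! grep -q '[^[:space:]]'
}
```

A polling loop could then look like: until minikube kubectl -- get pods --no-headers 2>/dev/null | scaled_to_zero; do sleep 2; done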
Compare
A script is provided to compare the four applications.
./compareknative.sh
This script invokes each of the four services by first waiting until the cluster has scaled all deployments to zero, and then invoking curl to hit the localhost:8080 endpoint. On my machine the output is
Invoking JVM
Handling connection for 8080
Hello World!
Establish Connection: 0.000137s
Total: 5.887448s
Waiting until pods are scaled to zero
Pods are scaled to zero
Invoking Native Image
Handling connection for 8080
Hello World!
Establish Connection: 0.000166s
Total: 1.517027s
Waiting until pods are scaled to zero
Pods are scaled to zero
Invoking InstantOn
Handling connection for 8080
Hello World!
Establish Connection: 0.000506s
Total: 1.676074s
Waiting until pods are scaled to zero
Pods are scaled to zero
Invoking InstantOn on OL
Handling connection for 8080
Hello World!
Establish Connection: 0.000128s
Total: 1.529405s
Waiting until pods are scaled to zero
Pods are scaled to zero
As you can see, the application with OpenJ9 CRIU Support is significantly faster than without, and it is in the same ballpark as a natively compiled application. Additionally, running the Spring Boot application using Open Liberty’s InstantOn feature is also in the same ballpark.
Now, it may seem surprising that the Native Image application takes as long as it does. However, there are a few reasons for this:
- There is an “inherent Knative overhead”; because the system was scaled to zero, when curl sends a request to localhost:8080, Knative has to start a container, which in turn starts the application.
- The time measured is not the time it took for the application to start, but the time it took for curl to receive the response from an application that was scaled to zero.
- The results are based on a quick sniff test, and the machine I used was not the most performant one. Your measurements could easily vary but I suspect the fundamental conclusion won’t.
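A measurement of this kind can be taken with curl's own timers. The following is a sketch of one way to produce output in the same shape as shown above, not the contents of compareknative.sh: the function name timed_request is hypothetical, and how the real script selects which of the four services answers on localhost:8080 is not shown here.

```shell
# Requests URL and prints the response body followed by curl's timing
# for establishing the connection and for the whole transfer.
# timed_request is a hypothetical helper name.
timed_request() {
    url="$1"
    curl -s -w '\nEstablish Connection: %{time_connect}s\nTotal: %{time_total}s\n' "$url"
}
```

Because time_total runs until the response has been received, a request that hits a service scaled to zero includes the time Knative needs to start the container and the time the application needs to become responsive.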
To get a sense of the overhead of a Knative Serverless application, a second script is provided to simply measure the time to first response outside a Knative environment:
./comparelocal.sh
This script first runs ./curlloop.sh which runs curl in a loop until localhost:9080 returns an HTTP code of 200. It then starts the application. The time outputted comes from date +"%s.%N" which is the time in seconds and nanoseconds since Unix Epoch. On this machine, it takes approximately 450ms for native image, 600-700ms for CRIU support, and 5.5 seconds for normal JVM mode. Thus, comparing with the prior numbers shown when scaling out from zero in Knative, it seems like the “inherent Knative overhead” is, on average, about a second on this machine.
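The overhead estimate is simply the Knative total minus the local time to first response. As a small worked example using the numbers reported in this post (the 0.650 value is an assumed midpoint of the 600-700ms range, and overhead is a helper name made up for this sketch):

```shell
# Prints TOTAL - LOCAL in seconds, rounded to milliseconds.
# overhead is a hypothetical helper name.
overhead() {
    awk -v total="$1" -v local_time="$2" \
        'BEGIN { printf "%.3f\n", total - local_time }'
}

overhead 5.887448 5.5     # JVM
overhead 1.517027 0.450   # Native Image
overhead 1.676074 0.650   # OpenJ9 CRIU Support (assumed midpoint)
```

This prints 0.387, 1.067 and 1.026. The local times are themselves approximate, so these differences are only indicative. No local time is given for the Open Liberty configuration, so it is left out.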
Conclusion
This blog post described how to create a (Knative) Serverless deployment of a Spring Boot application using OpenJ9 CRIU Support to significantly improve the time to first response. We also observed that the “inherent Knative overhead” of scaling out from zero was approximately one second in our environment. This Knative overhead is dwarfed by the JVM’s startup time when running our Spring Boot application. However, the time it takes to scale out from zero is much smaller with OpenJ9 CRIU Support (Native Image is also in the same ballpark), thus making it more feasible for serverless workloads.