Note: This post is out of date; see Deploying on OCP 4.13 With OpenJ9 CRIU Support for the most up-to-date instructions.
The previous blog post showed how to restore an application that checkpointed itself using OpenJ9 CRIU Support in an unprivileged container. This blog post goes over how to deploy these containers in Kubernetes (K8s) and OpenShift Container Platform (OCP).
Prerequisites
- A kernel that supports the CAP_CHECKPOINT_RESTORE Linux capability. This capability was introduced in kernel version 5.9 but has been backported to the RHEL kernel versions used in RHEL 8.6.
- CRI-O configured to use crun or runc.
  - If using runc, the version needs to be 1.1.3 or higher to have the recent fix which enables mounting /proc/sys/kernel/ns_last_pid.
- A container image with the checkpointed application.
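Because the capability has been backported, the kernel version number alone does not settle the first prerequisite. A minimal sketch of an alternative check follows. It assumes that CAP_CHECKPOINT_RESTORE is capability number 40 (its value in linux/capability.h) and that the node exposes /proc/sys/kernel/cap_last_cap. The function name kernel_has_checkpoint_restore is hypothetical and not part of any tool mentioned in this post.

```shell
# CAP_CHECKPOINT_RESTORE is capability number 40 in <linux/capability.h>.
CAP_CHECKPOINT_RESTORE_NR=40

# Succeeds when the highest capability number known to the kernel is >= 40.
# The optional argument (a file path) exists only so the function can be
# exercised against a sample file instead of the real /proc entry.
kernel_has_checkpoint_restore() {
  local file="${1:-/proc/sys/kernel/cap_last_cap}"
  local last_cap
  last_cap="$(cat "$file")" || return 2
  [ "$last_cap" -ge "$CAP_CHECKPOINT_RESTORE_NR" ]
}

# Usage on a node:
#   kernel_has_checkpoint_restore && echo "CAP_CHECKPOINT_RESTORE is available"
```

This only shows that the kernel knows the capability. Whether a container is actually granted it still depends on the deployment configuration.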
Kubernetes
Clone the InstantOnStartupGuide repo
git clone https://github.com/ibmruntimes/InstantOnStartupGuide.git
cd InstantOnStartupGuide
Use an alias for brevity.
alias kubectl-criu='kubectl --namespace=criu'
In line with K8s best practices, create the criu namespace so that the deployment can be appropriately scoped.
kubectl create -f YAMLs/k8s/criunamespace.yaml
In line with K8s best practices, create a service account in the criu namespace that will be used to run the deployment.
kubectl-criu apply -f YAMLs/k8s/criusvcacct.yaml
Launch the deployment. Note that the image field in my-app-criu.yaml should be updated with the name of the container image with the checkpointed application.
kubectl-criu apply -f YAMLs/common/my-app-criu.yaml
Inspect the logs to view the application output.
kubectl-criu get pods
kubectl-criu logs <POD_NAME_FROM_PREVIOUS_CMD>
If you chose to try out the image built from this previous blog, you can view the output by doing
kubectl-criu exec <POD_NAME_FROM_PREVIOUS_CMD> -- tail -f out
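The pod name can also be looked up by label instead of being copied by hand. The sketch below assumes the pod template carries the label name: my-app-criu, as the template shown later in this post does. The helper name first_pod_name is hypothetical. Aliases are not expanded in scripts by default, so the sketch calls kubectl directly rather than kubectl-criu.

```shell
# Print the name of the first pod matching a label selector in a namespace.
# kubectl reports an error if no pod matches the selector.
first_pod_name() {
  local namespace="$1" selector="$2"
  kubectl --namespace="$namespace" get pods \
    --selector="$selector" \
    --output=jsonpath='{.items[0].metadata.name}'
}

# Usage:
#   pod="$(first_pod_name criu name=my-app-criu)"
#   kubectl --namespace=criu logs "$pod"
```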
If you inspect the my-app-criu.yaml deployment file, you will see similarities to the podman run command in the previous blog.
containers:
- name: my-app-criu
  image: <TAG>
  imagePullPolicy: Always
  volumeMounts:
  - mountPath: /proc/sys/kernel/ns_last_pid
    name: ns-last-pid-mount
  securityContext:
    capabilities:
      add: [ "CHECKPOINT_RESTORE", "NET_ADMIN", "SYS_PTRACE" ]
volumes:
- name: ns-last-pid-mount
  hostPath:
    path: /proc/sys/kernel/ns_last_pid
    type: File
There is still a need to specify the Linux capabilities in the Security Context, and /proc/sys/kernel/ns_last_pid needs to be specified as a Volume Mount if using runc or a kernel that does not have the clone3 system call. However, there is no need to specify the seccomp profile because it is unconfined by default.
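To confirm that the added capabilities reached the container, the capability masks in /proc/<pid>/status can be decoded. The following is a minimal sketch, assuming the capability numbers from linux/capability.h (NET_ADMIN is 12, SYS_PTRACE is 19, CHECKPOINT_RESTORE is 40) and that the image ships awk. The helper name mask_has_cap is hypothetical.

```shell
# Capability numbers from <linux/capability.h>.
CAP_NET_ADMIN=12
CAP_SYS_PTRACE=19
CAP_CHECKPOINT_RESTORE=40

# Succeeds when bit number $2 is set in the hexadecimal capability mask $1
# (the format of the CapBnd/CapPrm/CapEff lines in /proc/<pid>/status).
mask_has_cap() {
  local mask="$1" bit="$2"
  [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]
}

# Usage inside the container (CapBnd is the bounding set):
#   mask="$(awk '/^CapBnd:/ { print $2 }' /proc/self/status)"
#   mask_has_cap "$mask" "$CAP_CHECKPOINT_RESTORE" && echo "CHECKPOINT_RESTORE present"
```

The bounding set is used in the usage example because the effective set of a non-root process can be empty even when the capability was added to the container.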
OpenShift Container Platform
Clone the InstantOnStartupGuide repo
git clone https://github.com/ibmruntimes/InstantOnStartupGuide.git
cd InstantOnStartupGuide
In line with OCP best practices, create a new project so that the deployment can be appropriately scoped.
oc new-project criu
oc project criu
In line with OCP best practices, create a new service account to run the deployment.
oc create sa criusvcacct
Create the appropriate Security Context Constraint (SCC) to allow a restore to occur with minimal privileges. This SCC is based on the restricted SCC. Additionally, create a new Role that uses this SCC.
oc apply -f YAMLs/ocp/scc-cap-cr.yaml
oc apply -f YAMLs/ocp/role-custom-scc-cap-cr-my-app-criu.yaml
Create a new Role Binding to bind the Role to the Service Account.
oc apply -f YAMLs/ocp/rolebinding-criusvcacct-my-app-criu.yaml
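Before launching the deployment, it can help to confirm that the Role and Role Binding took effect. The sketch below asks the API server whether the service account may use a given SCC via oc auth can-i. The helper names are hypothetical, and <SCC_NAME> is a placeholder for the name defined in YAMLs/ocp/scc-cap-cr.yaml, which is not reproduced in this post.

```shell
# Build the user name that Kubernetes assigns to a service account.
service_account_user() {
  local namespace="$1" name="$2"
  printf 'system:serviceaccount:%s:%s' "$namespace" "$name"
}

# Ask the API server whether the service account may "use" the named SCC.
# oc prints "yes" or "no".
can_use_scc() {
  local namespace="$1" name="$2" scc="$3"
  oc auth can-i use "securitycontextconstraints/$scc" \
    --namespace="$namespace" \
    --as="$(service_account_user "$namespace" "$name")"
}

# Usage:
#   can_use_scc criu criusvcacct <SCC_NAME>
```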
Launch the deployment. Note that the image field in my-app-criu.yaml should be updated with the name of the container image with the checkpointed application.
oc apply -f YAMLs/common/my-app-criu.yaml
Inspect the logs to view the application output.
oc get pods
oc logs <POD_NAME_FROM_PREVIOUS_CMD>
If you chose to try out the image built from this previous blog, you can view the output by doing
oc exec <POD_NAME_FROM_PREVIOUS_CMD> -- tail -f out
At the time of writing, Red Hat CoreOS (4.11) does not have the necessary version of runc to allow mounting /proc/sys/kernel/ns_last_pid. There is a way to work around this, but it is not recommended on anything other than a sandbox environment. For each worker node:
- Download the latest runc binary.
- Create a bind mount by doing
sudo mount --bind /Path/to/latest/runc /usr/bin/runc
This gets around the fact that /usr/bin/runc can’t be updated because it is mounted as read-only.
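After the bind mount, it is worth checking that the node now reports a runc version of at least 1.1.3. A minimal sketch follows, assuming that the first line of runc --version has the form "runc version X.Y.Z" and that sort -V from GNU coreutils is available. The helper names runc_version and version_ge are hypothetical.

```shell
# Extract the version from `runc --version` output read on stdin.
runc_version() {
  awk '/^runc version/ { print $3; exit }'
}

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on version sort (`sort -V`) to order the two strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a worker node:
#   version_ge "$(runc --version | runc_version)" 1.1.3 && echo "runc is new enough"
```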
Privileged
Running as privileged means deploying the container in the most permissive mode possible. This is generally not recommended as it significantly weakens the security of a container. However, for completeness, the following briefly outlines deploying in privileged mode.
K8s
Deploying a privileged container in K8s is relatively straightforward. The template in the deployment file YAMLs/common/my-app-criu.yaml should be updated to be
template:
  metadata:
    labels:
      name: my-app-criu
  spec:
    serviceAccount: criusvcacct
    serviceAccountName: criusvcacct
    containers:
    - name: my-app-criu
      image: <TAG>
      imagePullPolicy: Always
      securityContext:
        privileged: true
The main differences are the removal of the Volume Mount and the use of privileged: true in the securityContext field.
OCP
Deploying a privileged container in OCP is a little more involved. First, the resourceNames field in the YAMLs/ocp/role-custom-scc-cap-cr-my-app-criu.yaml configuration should be updated to
resourceNames:
- privileged
Next, YAMLs/common/my-app-criu.yaml should be updated with the changes described in the K8s section above.
Serverless
To learn about how to deploy serverless applications that use OpenJ9 CRIU Support, check out Serverless Deployment with OpenJ9 CRIU Support.