An easier approach to building Kubernetes Operators
26 Aug 2021
Using Operator-Framework SDK on top of Kubebuilder
Kubernetes Operators are quite complex to implement from scratch. They are commonly used to automate the management of processes inside and outside of Kubernetes, and they do so through calls that recur regularly while the Operator is queued in a control loop.
Today, we will outline an easy way to build an Operator using the Operator-Framework and its SDK, which is based on Kubebuilder. We describe how to install and set up a template Operator project that can be built and deployed into a local Kubernetes cluster. In a later article, we will use that template to implement a real use case that can be deployed and run in a production environment.
You can find all the source code accompanying this article on GitHub.
Before we can build our first Operator in Go, we need to prepare our local environment by installing KinD as a Kubernetes cluster and running a Docker registry locally, where we can push and pull our Controller images.
Our KinD configuration defines some essential parameters for our cluster, as well as the containerd plugin configuration that tells the kubelet where to pull Docker images from.
Set up KinD Cluster
Run Docker registry
Our Docker registry is run as a container and connected to the KinD network.
Finally, we need a ConfigMap to let Pods such as our Controller know where to find our Docker registry:
Operators in Kubernetes are simply Pods located in the data plane that act as Controllers for Custom Resources (CRs). They follow the Kubernetes control loop principle. Operators are clients of the Kubernetes API server.
Kubernetes Operators can be implemented programmatically in any language that provides libraries for communicating with the Kubernetes API, declaratively with YAML to automate simple tasks where the kubectl command is not enough, or simply as Bash scripts.
There are two Operator-Frameworks for developing Operators in Python or Go. We will focus here on the Go-based Operator-Framework because Go is the language Kubernetes itself is implemented in and it is the most widely used for Operators. Furthermore, using Go brings more developer benefits, such as community support and existing frameworks and SDKs.
In addition to Go, this SDK also provides declarative ways to define Operators, and the framework includes the Operator Lifecycle Manager (OLM), which provides a declarative way to install, manage and upgrade Operators and their dependencies in a cluster.
The Operator Framework SDK defines three types of Operators (Go, Ansible and Helm), but we will confine ourselves to the Go-based Operator to implement a Kubernetes-native automation.
Let’s create and initialise our project and let the framework bring all components into place. You need to replace domain and repo parameters with your own if you want to publish your Operator.
The --domain flag is very important: it is used as the suffix for all API groups. The --repo flag, in contrast, sets the Go module path, which is used as the prefix for Go package names.
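To make the role of the domain concrete, here is a minimal sketch of how a Custom Resource's apiVersion string is composed. The helper apiVersion is hypothetical and written only for illustration, and example.com is a placeholder domain rather than the one used in this article. The group core and the version v1alpha1 are the ones this article uses when it creates its API.

```go
package main

import "fmt"

// apiVersion is a hypothetical helper that composes the apiVersion of a
// Custom Resource as "<group>.<domain>/<version>". The domain is appended
// to the group name, which is why it ends up as the suffix of every API group.
func apiVersion(group, domain, version string) string {
	return fmt.Sprintf("%s.%s/%s", group, domain, version)
}

func main() {
	// "example.com" is a placeholder; replace it with your own domain.
	fmt.Println(apiVersion("core", "example.com", "v1alpha1"))
	// prints "core.example.com/v1alpha1"
}
```

Every additional API group created in the same project would share the same domain suffix, so it is worth choosing it carefully before initialising the project.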
So far, operator-sdk has created our project structure, a Dockerfile for our Controller, a Makefile with predefined commands to build and deploy, and a main.go file, which acts as the entry point. We can also see a config/ folder, which contains all the YAML manifests an Operator needs to work as expected. This includes a ServiceAccount definition, Role and RoleBinding objects and other RBAC-related objects, a Deployment and a Service object for the Controller-Manager, as well as some resources for monitoring and metrics. All of these resources are managed via Kustomize.
Let’s now create the first API, Type and Controller of version v1alpha1 grouped in core with a Custom Resource Definition (CRD) named Template.
There are two more important folders under our project root:
The api folder includes our type definition v1alpha1/template_types.go, which corresponds to our CRD, and v1alpha1/groupversion_info.go, which defines our Scheme and GroupVersion.
Each of the api files is generated inside a subfolder named after the version of our API. Our Controller can be found in controllers/template_controller.go, alongside the test suite for our Operator. We can also find a first generated sample of our CR, based on the Template API type we created, under config/samples/core_v1alpha1_template.yaml.
For a basic implementation, only three files are important for now. We define our Custom Resource (CR) Template object in api/v1alpha1/template_types.go. Let’s have a closer look at it.
The TemplateSpec struct defines our object attributes, which describe our CR itself. The Status of our object, which is later compared and changed by our Controller, is defined in the TemplateStatus struct.
The Template struct follows the common schema definition for Kubernetes objects and references the metadata and a JSON representation of our CR. The init() function acts like a constructor and registers our CR before it can be used.
The corresponding CRD then looks like the following example:
Every time we add an attribute to TemplateSpec, we also need to add it in our CRD if it's declared as mandatory.
The Template Controller under controllers/template_controller.go implements the logic of our Operator.
We inject the API client object as well as the Scheme definition into our Reconciler struct. The Controller has only two functions. The SetupWithManager function registers our Controller with the Controller-Manager so that it controls our Template CR. The Reconcile function itself is the actual logic we have to implement. For now, nothing is happening here, but the Operator is already complete and can be built and deployed.
Build and Deploy
We will use the Makefile intensively for all of our Build, Run and Deployment activities. To show all the defined targets, we can simply run:
It’s possible to run the operator controller on our host and outside of Kubernetes. Make sure you have already set your $KUBECONFIG environment variable to the KinD generated config file.
Before running the controller, the CRD must be deployed to Kubernetes.
Now we can run the Controller locally and see what's going on there.
But we have not yet defined or deployed any CR to our cluster. Let’s do that in a second console. We will use our sample for now:
We can find our Custom Resource now in our Cluster:
But nothing happened in our Controller. Why is that? Let’s go back to it and get a better understanding of the Reconcile function there. We can see that this function returns an empty Result struct from the sigs.k8s.io/controller-runtime package. What does it look like?
We did not define Requeue, so it is false by default. That means our Operator is triggered only once, during the initial run, and never queued again. Let’s change that by setting Requeue: true and adding some logging output, then restart our controller.
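The effect of the Requeue flag can be sketched with the standard library alone. The Result struct below is redeclared locally with the two fields that the controller-runtime Result exposes (Requeue and RequeueAfter). The runLoop function is a hypothetical, heavily simplified stand-in for the work queue: the real one is rate limited and event driven, which this sketch does not attempt to model.

```go
package main

import (
	"fmt"
	"time"
)

// Result is a local copy of the shape of the controller-runtime Result.
// RequeueAfter is listed for completeness and is not used by this sketch.
type Result struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// runLoop is a hypothetical stand-in for the work queue. It calls reconcile
// and keeps calling it for as long as the returned Result asks for a
// requeue, stopping after maxRuns calls so that the sketch terminates.
func runLoop(reconcile func() Result, maxRuns int) int {
	runs := 0
	for runs < maxRuns {
		runs++
		if res := reconcile(); !res.Requeue {
			break
		}
	}
	return runs
}

func main() {
	// The zero value of Result has Requeue == false: one run only.
	once := func() Result { return Result{} }
	// With Requeue set, the function is called again and again.
	again := func() Result { return Result{Requeue: true} }

	fmt.Println(runLoop(once, 3))  // prints 1
	fmt.Println(runLoop(again, 3)) // prints 3
}
```

The zero value matters here: returning an empty Result is the way a Reconcile function signals that it is done until the next event arrives.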
Perfect! Now we see the output repeated again and again, because the Controller is requeued for the control loop each time. But we have not yet touched our CR or compared its status with any desired value.
In the last step of this part, we will add a new field max of type int32 to our CRD and set it to 5. We also need a field current of type int32 to hold the current value. Our Controller should then increase the current value on each loop run until it reaches our max value.
Now let’s deploy the new version of our CRD and CR to our cluster and run our controller again.
What we can see here is that Status().Update() automatically requeues our CR object. The Status.Current attribute is increased on each loop up to Spec.Max. We use Status().Update() instead of updating the Template object itself so that only the status is written and spec fields are not accidentally overwritten.
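The counting behaviour described above can be sketched without a cluster. The types below are reduced to the two fields discussed here, and reconcileOnce is a hypothetical helper, not the article's Reconcile function. Its return value stands in for the status update that, in the real Controller, leads to the next loop run.

```go
package main

import "fmt"

// TemplateSpec carries the desired upper bound.
type TemplateSpec struct {
	Max int32 `json:"max"`
}

// TemplateStatus carries the value the Controller has reached so far.
type TemplateStatus struct {
	Current int32 `json:"current"`
}

// Template is reduced to spec and status for this sketch.
type Template struct {
	Spec   TemplateSpec
	Status TemplateStatus
}

// reconcileOnce is a hypothetical helper that performs one loop run. It
// increases Status.Current by one while it is below Spec.Max and reports
// whether the status changed, that is, whether another run would follow.
func reconcileOnce(tpl *Template) bool {
	if tpl.Status.Current >= tpl.Spec.Max {
		return false // desired state reached, nothing to update
	}
	tpl.Status.Current++
	return true
}

func main() {
	tpl := &Template{Spec: TemplateSpec{Max: 5}}
	runs := 0
	for reconcileOnce(tpl) {
		runs++
		fmt.Printf("run %d: current=%d\n", runs, tpl.Status.Current)
	}
	fmt.Println(runs, tpl.Status.Current) // prints "5 5"
}
```

Once current equals max, the helper makes no further change, which corresponds to the point where the real Controller stops updating the status and the loop comes to rest.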
We have now created a complete Operator that can be run and deployed. In a later post, we will implement a real use case and show the nature of an Operator and how it could affect our Kubernetes cluster.