The second post in our two-part series on building a Kubernetes Operator implements a use case with real value for the security of our Kubernetes cluster.

In the first instalment of this two-post series, we described how to install and configure our environment to implement a simple Kubernetes operator skeleton using the Operator-Framework SDK. We continue our work today by implementing a real use case in which we use Clair as a backend for indexing and scanning the container images of our deployed Pods. Newly discovered Common Vulnerabilities and Exposures (CVEs) will trigger a notifier app via webhooks, which notifies us on Slack.

Requirements

As described in part 1, our environment needs to be installed and configured. This includes kinD as our local Kubernetes cluster, a local container registry and the Operator-Framework SDK itself. We assume that the kinD cluster and the container registry are up and running.

Slack

For our use case, we need a Slack endpoint to send notifications to: a Slack channel that receives the notifications and a registered Slack app that accepts webhook messages. We won’t go into how to create a Slack channel and a webhook-enabled Slack app here; just keep the Slack webhook URL at hand for later use.

Install Clair

For the sake of simplicity, we will run Clair and its dependencies in our cluster in combo mode. Make sure the KUBECONFIG environment variable points to the right Kubernetes config file.

Clair needs a config file that covers all of its components: the indexer, the matcher and the notifier.

Clair stores manifests, vulnerabilities and notifications in a database.

Combo mode keeps the indexer, matcher and notifier in one place. Feel free to deploy each service in a separate Pod instead.

Clair Operator

Unlike our part 1 Operator, our Clair-Operator will not define a scheduling resource. We will instead create a controller that watches for Pod change events, such as CREATE or UPDATE, and triggers our reconciler function for the affected Pod. Handling DELETE events is more difficult because the Pod reference is lost once the Pod is gone. Our Operator needs to register container images with Clair using container manifests and let Clair create vulnerability reports for them.

Create Operator

First, we initialise our project and create an API in group core and version v1alpha1 as well as our sample Scanner CR, as we did in part 1.

Set up controller

As described above, we need a Pod watcher to trigger our reconciler function. As we know from part 1, the SetupWithManager() function in controllers/scanner_controller.go builds our controller and registers it with the manager. The main() function calls it before setting up a signal handler and starting the controller manager.

Custom Resource definition

Let’s have a look at our Custom Resource, in particular the ScannerSpec type defined in api/v1alpha1/scanner_types.go.

The idea behind the Backend field is to allow different scanner backends. In our sample, we will only have a Clair-typed Scanner resource. The same goes for the Notifier field: we only provide a Slack notifier, but the field is included to show the flexibility of our Scanner type. ClairBaseUrl and SlackWebhookUrl are self-explanatory and point to the corresponding endpoints.
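To make the four fields concrete, here is a minimal sketch of how such a spec type could look. The field names come from the description above; the JSON tag names, the accepted values "clair" and "slack", the validate() helper and the example URLs are assumptions for illustration and not necessarily what the original Operator uses.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ScannerSpec sketches the desired state of a Scanner resource.
// Field names follow the article; the JSON tags are assumed.
type ScannerSpec struct {
	// Backend selects the scanner backend (only "clair" in this sketch).
	Backend string `json:"backend"`
	// Notifier selects the notification channel (only "slack" in this sketch).
	Notifier string `json:"notifier"`
	// ClairBaseUrl is the base URL of the Clair HTTP API.
	ClairBaseUrl string `json:"clairBaseUrl"`
	// SlackWebhookUrl is the Slack incoming-webhook URL.
	SlackWebhookUrl string `json:"slackWebhookUrl"`
}

// validate is a hypothetical helper that rejects specs this sketch cannot handle.
func (s ScannerSpec) validate() error {
	if s.Backend != "clair" {
		return fmt.Errorf("unsupported backend %q", s.Backend)
	}
	if s.Notifier != "slack" {
		return fmt.Errorf("unsupported notifier %q", s.Notifier)
	}
	if s.ClairBaseUrl == "" || s.SlackWebhookUrl == "" {
		return errors.New("clairBaseUrl and slackWebhookUrl are required")
	}
	return nil
}

func main() {
	// Placeholder endpoints, not real ones.
	raw := []byte(`{
		"backend": "clair",
		"notifier": "slack",
		"clairBaseUrl": "http://clair:8080",
		"slackWebhookUrl": "https://hooks.slack.com/services/EXAMPLE"
	}`)
	var spec ScannerSpec
	if err := json.Unmarshal(raw, &spec); err != nil {
		panic(err)
	}
	fmt.Println("backend:", spec.Backend, "valid:", spec.validate() == nil)
}
```

In a real Operator, validation like this would more likely live in kubebuilder markers or an admission webhook; a plain method keeps the sketch short.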

Scanner Reconciler

We are now ready to implement our operator logic, which will be done in the Reconcile() function of our ScannerReconciler type.

First, we need to define variables for a list of Scanners and exactly one Pod: the Pod that raised the event our Manager was watching for.

Because the reconcile request refers to a specific Pod and not to a Scanner object, we have to list the Scanner objects in the request’s namespace and store them in our scanners variable.

If no Scanner CRs are defined for the specified namespace, we can stop here and return to the Manager.

If a Scanner was found, we fetch the Pod referenced by the request’s name and namespace and store it in our pod variable.

As described before, handling DELETE events is more difficult: we need to add a custom finalizer to the current Pod and remove it again once our cleanup has run.
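The add-and-remove logic around a finalizer can be sketched with plain string slices, independent of any Kubernetes client. The finalizer name scanner.example.com/finalizer and all helper names below are hypothetical; in a real controller, the controllerutil package of controller-runtime offers AddFinalizer, RemoveFinalizer and ContainsFinalizer for the same purpose.

```go
package main

import "fmt"

// scannerFinalizer is a hypothetical finalizer name used only in this sketch.
const scannerFinalizer = "scanner.example.com/finalizer"

// hasFinalizer reports whether name is present in finalizers.
func hasFinalizer(finalizers []string, name string) bool {
	for _, f := range finalizers {
		if f == name {
			return true
		}
	}
	return false
}

// addFinalizer returns a copy of finalizers that contains name exactly once.
func addFinalizer(finalizers []string, name string) []string {
	out := append([]string{}, finalizers...)
	if !hasFinalizer(out, name) {
		out = append(out, name)
	}
	return out
}

// removeFinalizer returns a copy of finalizers without name.
func removeFinalizer(finalizers []string, name string) []string {
	out := make([]string, 0, len(finalizers))
	for _, f := range finalizers {
		if f != name {
			out = append(out, f)
		}
	}
	return out
}

// reconcileFinalizers returns the finalizer list the Pod should carry next
// and whether the cleanup step has to run before that list is written back.
// beingDeleted corresponds to a non-zero deletion timestamp on the Pod.
func reconcileFinalizers(finalizers []string, beingDeleted bool) ([]string, bool) {
	if !beingDeleted {
		return addFinalizer(finalizers, scannerFinalizer), false
	}
	if hasFinalizer(finalizers, scannerFinalizer) {
		return removeFinalizer(finalizers, scannerFinalizer), true
	}
	return finalizers, false
}

func main() {
	live, cleanup := reconcileFinalizers(nil, false)
	fmt.Println(live, cleanup)

	gone, cleanup := reconcileFinalizers(live, true)
	fmt.Println(gone, cleanup)
}
```

While the finalizer is present, the API server keeps the Pod object around after a delete request, which is what gives the reconciler a chance to still read the Pod during cleanup.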

The finalizePod() function is the place for cleanup logic for Clair’s index database, and it informs us on Slack. At the time of writing, the Clair API does not provide a delete endpoint.

We don’t need the Pod itself, only its referenced container images. So let’s iterate through all the containers. Iterating through InitContainers is not implemented here but can be done in the same way.
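As a rough sketch of that iteration, the following code collects every distinct image of a Pod. The container and podSpec types are minimal stand-ins for the corresponding corev1 types, and uniqueImages is a hypothetical helper. Unlike the Operator described here, the sketch also walks the init containers to show that they can be handled the same way.

```go
package main

import "fmt"

// container is a minimal stand-in for corev1.Container.
type container struct {
	Name  string
	Image string
}

// podSpec is a minimal stand-in for corev1.PodSpec.
type podSpec struct {
	InitContainers []container
	Containers     []container
}

// uniqueImages returns every distinct image reference of the Pod in
// first-seen order, init containers first.
func uniqueImages(spec podSpec) []string {
	seen := make(map[string]bool)
	var images []string

	all := append([]container{}, spec.InitContainers...)
	all = append(all, spec.Containers...)

	for _, c := range all {
		if c.Image == "" || seen[c.Image] {
			continue // skip empty references and images we already collected
		}
		seen[c.Image] = true
		images = append(images, c.Image)
	}
	return images
}

func main() {
	spec := podSpec{
		InitContainers: []container{{Name: "init", Image: "busybox:1.36"}},
		Containers: []container{
			{Name: "web", Image: "nginx:1.25"},
			{Name: "sidecar", Image: "nginx:1.25"},
		},
	}
	for _, image := range uniqueImages(spec) {
		fmt.Println(image)
	}
}
```

Deduplicating up front avoids indexing the same image twice when several containers of a Pod share it.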

First, let’s extract the manifest definition to pass it to Clair’s indexing endpoint.

The docker.Inspect() function of our docker module, which returns a struct of type claircore.Manifest, still needs to be implemented. You can find out how to do this in github.com/quay/clair. Inspect() just extracts the image name and repository to connect to the registry storing that image, and fetches the digests and layers to build the claircore.Manifest type.
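The first step of such an Inspect() function, splitting an image reference into registry, repository and tag or digest, might look like the sketch below. It is a simplified parser written for this article, not the one used by Clair or Docker: imageRef and parseImage are hypothetical names, the defaults (docker.io, library/, latest) follow the usual Docker conventions, and edge cases such as a bare host:port reference are not handled. For production use, an established reference-parsing library is the safer choice.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// imageRef holds the parts of an image reference (hypothetical type).
type imageRef struct {
	Registry   string
	Repository string
	Tag        string
	Digest     string
}

// parseImage splits a reference such as "localhost:5000/team/app:v1".
func parseImage(ref string) (imageRef, error) {
	if ref == "" {
		return imageRef{}, errors.New("empty image reference")
	}
	var out imageRef
	name := ref

	// A digest follows "@", e.g. "repo@sha256:...".
	if i := strings.Index(name, "@"); i >= 0 {
		out.Digest = name[i+1:]
		name = name[:i]
	}
	// A tag is a colon after the last slash; an earlier colon is a registry port.
	if i := strings.LastIndex(name, ":"); i > strings.LastIndex(name, "/") {
		out.Tag = name[i+1:]
		name = name[:i]
	}
	// The first path component is a registry if it looks like a host name.
	first, rest, hasSlash := strings.Cut(name, "/")
	if hasSlash && (strings.ContainsAny(first, ".:") || first == "localhost") {
		out.Registry, out.Repository = first, rest
	} else {
		out.Registry, out.Repository = "docker.io", name
		if !hasSlash {
			out.Repository = "library/" + name
		}
	}
	if out.Repository == "" || strings.HasSuffix(out.Repository, "/") {
		return imageRef{}, errors.New("image reference has no repository")
	}
	if out.Tag == "" && out.Digest == "" {
		out.Tag = "latest"
	}
	return out, nil
}

func main() {
	for _, ref := range []string{"nginx", "localhost:5000/scanner/notifier:v1"} {
		parsed, err := parseImage(ref)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %+v\n", ref, parsed)
	}
}
```

With registry and repository known, the remaining work of Inspect() is to ask the registry for the image manifest and map its digest and layers onto claircore.Manifest.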

To avoid reindexing already processed Pods, we can set annotations on them and filter on those. We use Patch() here instead of Update() so that we don’t have to fetch the Pod again and can continue working with the already initialised Pod instance.
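One way to picture the annotation check and the patch body is the sketch below. The annotation prefix indexed.scanner.example.com/ and both helper names are hypothetical, and the sketch assumes a JSON merge patch that records the indexed manifest digest per container; the original Operator may use a different key or patch type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// annotationPrefix is a hypothetical annotation key prefix; the container
// name is appended to it to form the full key.
const annotationPrefix = "indexed.scanner.example.com/"

// alreadyIndexed reports whether the Pod annotations record that the given
// container was indexed with exactly this manifest digest.
func alreadyIndexed(annotations map[string]string, containerName, digest string) bool {
	return digest != "" && annotations[annotationPrefix+containerName] == digest
}

// annotationPatch builds a JSON merge patch that sets the annotation for one
// container and leaves every other field of the Pod untouched.
func annotationPatch(containerName, digest string) ([]byte, error) {
	patch := map[string]any{
		"metadata": map[string]any{
			"annotations": map[string]string{
				annotationPrefix + containerName: digest,
			},
		},
	}
	return json.Marshal(patch)
}

func main() {
	annotations := map[string]string{}
	fmt.Println("indexed before:", alreadyIndexed(annotations, "web", "sha256:abc"))

	patch, err := annotationPatch("web", "sha256:abc")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

Because a merge patch only names the fields it changes, it does not overwrite concurrent modifications to other parts of the Pod, which is a further reason to prefer it over a full update here.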

Now we can call Clair’s index API endpoint, using the ClairBaseUrl from the current Scanner CR.
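A minimal sketch of building that request with the standard library follows. It assumes the Clair v4 indexer path /indexer/api/v1/index_report and a manifest body with hash and layers fields; the manifest and layer types are simplified stand-ins for claircore.Manifest, and newIndexRequest is a hypothetical helper. Check the path and payload against the API documentation of the Clair version you deploy.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// layer is a simplified stand-in for claircore.Layer.
type layer struct {
	Hash    string              `json:"hash"`
	URI     string              `json:"uri"`
	Headers map[string][]string `json:"headers,omitempty"`
}

// manifest is a simplified stand-in for claircore.Manifest.
type manifest struct {
	Hash   string  `json:"hash"`
	Layers []layer `json:"layers"`
}

// indexReportPath is the indexer endpoint this sketch assumes (Clair v4).
const indexReportPath = "/indexer/api/v1/index_report"

// newIndexRequest builds the POST request that submits a manifest for indexing.
func newIndexRequest(clairBaseURL string, m manifest) (*http.Request, error) {
	body, err := json.Marshal(m)
	if err != nil {
		return nil, fmt.Errorf("encode manifest: %w", err)
	}
	url := strings.TrimRight(clairBaseURL, "/") + indexReportPath
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build index request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder digest and layer URI, for illustration only.
	m := manifest{
		Hash: "sha256:0000000000000000000000000000000000000000000000000000000000000000",
		Layers: []layer{{
			Hash: "sha256:1111111111111111111111111111111111111111111111111111111111111111",
			URI:  "http://registry:5000/v2/app/blobs/sha256:1111",
		}},
	}
	req, err := newIndexRequest("http://clair:8080/", m)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// Sending it would be: resp, err := http.DefaultClient.Do(req)
}
```

Separating request construction from sending keeps the reconciler easy to test, since the request can be inspected without a running Clair instance.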

Once the container image is indexed, we can request the vulnerability report endpoint.
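To illustrate what can be done with such a report, the sketch below builds the report URL and counts vulnerabilities per severity. It assumes the Clair v4 matcher path /matcher/api/v1/vulnerability_report/{digest} and decodes only a small subset of the response (manifest_hash, vulnerabilities and normalized_severity); the types and helper names are hypothetical and the sample report is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// vulnerability holds the subset of fields this sketch reads.
type vulnerability struct {
	Name               string `json:"name"`
	NormalizedSeverity string `json:"normalized_severity"`
}

// vulnerabilityReport holds the subset of the report this sketch reads.
type vulnerabilityReport struct {
	ManifestHash    string                   `json:"manifest_hash"`
	Vulnerabilities map[string]vulnerability `json:"vulnerabilities"`
}

// vulnerabilityReportPath is the matcher endpoint this sketch assumes (Clair v4).
const vulnerabilityReportPath = "/matcher/api/v1/vulnerability_report/"

// reportURL returns the URL of the vulnerability report for a manifest digest.
func reportURL(clairBaseURL, digest string) string {
	return strings.TrimRight(clairBaseURL, "/") + vulnerabilityReportPath + digest
}

// severityCounts decodes a report and counts its vulnerabilities per severity.
func severityCounts(raw []byte) (map[string]int, error) {
	var report vulnerabilityReport
	if err := json.Unmarshal(raw, &report); err != nil {
		return nil, fmt.Errorf("decode vulnerability report: %w", err)
	}
	counts := make(map[string]int)
	for _, v := range report.Vulnerabilities {
		counts[v.NormalizedSeverity]++
	}
	return counts, nil
}

func main() {
	// Made-up sample data in the assumed response shape.
	sample := []byte(`{
		"manifest_hash": "sha256:abc",
		"vulnerabilities": {
			"1": {"name": "CVE-0000-0001", "normalized_severity": "High"},
			"2": {"name": "CVE-0000-0002", "normalized_severity": "Low"},
			"3": {"name": "CVE-0000-0003", "normalized_severity": "High"}
		}
	}`)
	counts, err := severityCounts(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(reportURL("http://clair:8080", "sha256:abc"))
	fmt.Println("High:", counts["High"], "Low:", counts["Low"])
}
```

A per-severity summary like this is a compact payload for a chat notification, where a full report would be too long to read.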

Slack notifications

We have three types of Slack notifications to handle. First, each time a new container image is indexed (because a Pod was created or updated), we will notify Slack using webhooks.
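Slack incoming webhooks accept a JSON body with a text field, so the first notification type can be sketched as follows. The message wording and the helper names indexedMessage and newSlackRequest are hypothetical, and the webhook URL is a placeholder standing in for SlackWebhookUrl from the Scanner CR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// slackMessage is the minimal payload of a Slack incoming webhook.
type slackMessage struct {
	Text string `json:"text"`
}

// indexedMessage builds the payload announcing a newly indexed image.
func indexedMessage(namespace, pod, image string) ([]byte, error) {
	msg := slackMessage{
		Text: fmt.Sprintf("Indexed image %s of Pod %s/%s", image, namespace, pod),
	}
	return json.Marshal(msg)
}

// newSlackRequest builds the POST request that delivers a payload to Slack.
func newSlackRequest(webhookURL string, payload []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, webhookURL, bytes.NewReader(payload))
	if err != nil {
		return nil, fmt.Errorf("build slack request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	payload, err := indexedMessage("default", "web", "nginx:1.25")
	if err != nil {
		panic(err)
	}
	// Placeholder URL; the real one comes from the Scanner CR.
	req, err := newSlackRequest("https://hooks.slack.com/services/EXAMPLE", payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, string(payload))
	// Sending it would be: resp, err := http.DefaultClient.Do(req)
}
```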

We will also send notifications about vulnerability reports for the newly indexed images.

The third type of notification is not a direct part of the Operator. Clair itself can send notification webhooks for newly discovered vulnerabilities in already indexed container images. The payload of that webhook is fixed and cannot be customised via config parameters. So, for simplicity, let’s add a small web server written in Go and run it in our Kubernetes cluster to listen for these webhooks and transform them into Slack-compliant webhooks.
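The core of such a notifier can be sketched as an HTTP handler that decodes Clair’s webhook and hands a Slack payload to a forwarding function. The sketch assumes that Clair posts a JSON body with notification_id and callback fields; toSlack, handler and the forward callback are hypothetical names, and main() exercises the handler with an in-memory request instead of starting a real server.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// clairWebhook is the payload this sketch assumes Clair's notifier sends.
type clairWebhook struct {
	NotificationID string `json:"notification_id"`
	Callback       string `json:"callback"`
}

// toSlack converts a Clair webhook body into a Slack webhook payload.
func toSlack(body []byte) ([]byte, error) {
	var hook clairWebhook
	if err := json.Unmarshal(body, &hook); err != nil {
		return nil, fmt.Errorf("decode clair webhook: %w", err)
	}
	if hook.NotificationID == "" {
		return nil, errors.New("missing notification_id")
	}
	text := fmt.Sprintf("Clair notification %s is ready: %s", hook.NotificationID, hook.Callback)
	return json.Marshal(map[string]string{"text": text})
}

// handler transforms each incoming webhook and passes the result to forward,
// which in a real deployment would POST it to the Slack webhook URL.
func handler(forward func(payload []byte) error) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST only", http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		payload, err := toSlack(body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if err := forward(payload); err != nil {
			http.Error(w, "forwarding failed", http.StatusBadGateway)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
}

func main() {
	var sent string
	h := handler(func(p []byte) error { sent = string(p); return nil })

	// Illustrative webhook body; the ID and callback URL are made up.
	body := `{"notification_id":"1234","callback":"http://clair:8080/notifier/api/v1/notification/1234"}`
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/", strings.NewReader(body)))

	fmt.Println(rec.Code, sent)
	// A real server would run: http.ListenAndServe(":8080", h)
}
```

Injecting the forwarding function keeps the Slack webhook URL out of the handler, so it can be read from the environment (for example from a Kubernetes Secret) when the server starts.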

We need to build a container image for it using the docker CLI.

Let’s deploy our notifier app into our Kubernetes cluster. We use a Secret for our Slack webhook URL.

What we’ve achieved

We have a running Clair scanner and a notifier for Slack. Our Scanner CR is namespace-scoped and will be triggered every time a Pod is created, updated or deleted within that Namespace. Furthermore, we annotate already processed Pods so that they are not reprocessed unless their container images change.

The Operator builds a container manifest from the image’s registry and asks Clair to index it and create a first vulnerability report. To inform users about newly created reports, we send a message to a Slack webhook. When Clair discovers new vulnerabilities for running, already indexed container images, it emits events that our notifier receives and transforms into Slack webhooks. We did not handle multiple Scanners in the same Namespace, nor did we cover a fine-grained, structured CRD for different Scanner types and backends.
