Clair tutorial: analyzing a Docker Image
In a previous article, we described how to build a Docker Registry in Kubernetes.
Today we look at Clair – a tool that does static analysis of vulnerabilities in a docker image.
What is Clair?
Clair is a popular open source vulnerability scanning solution for docker images made by CoreOS.
Clair is also integrated with quay.io public docker registry and Quay Enterprise, both products of CoreOS.
As a source of vulnerabilities, it uses CVE (Common Vulnerabilities and Exposures) data sources like NIST NVD as well as security bug trackers of the specific Linux distributions supported by Clair.
Clair has no web UI or command-line tool of its own, so the only way to work with it is through its REST API or a third-party CLI tool.
Clair is distributed as a docker image:
docker pull quay.io/coreos/clair
It can also be compiled from the source code here: https://github.com/coreos/clair.
How Clair works
Clair scans docker images by doing static analysis, which means it analyzes images without a need to run their docker container.
A docker image is composed of 1+n layers (also called intermediate images) and each layer is stored in a docker registry as a tar file blob.
Give Clair an HTTP URL to an image layer tar file and it analyzes it. To analyze an entire docker image, we need to give Clair all of its layers.
Clair exposes several API endpoints, documented here: https://coreos.com/clair/docs/latest/api_v1.html
We use two of them:
- POST /layers – push a docker image layer to Clair for analysis
- GET /layers/:name – retrieve information about a docker layer, including the vulnerabilities found in it
Clair also has protobuf API v3, but we will be using API v1 as we want to communicate with Clair via HTTP.
Architecture and Clair components
Clair is composed of 2 components and depends on a PostgreSQL database:
- REST API server
- CVE Updater, which takes care of updating the vulnerability database from a list of CVE data sources
- PostgreSQL 9.4+, which stores the vulnerability database and the analysis results of uploaded docker image layers
Local setup of Clair
We won’t go into detail about how to deploy Clair as it isn’t the focus of this post.
Instead, let’s focus on how Clair works. For that, we run the official docker-compose setup on our local machine.
We need to create just 2 files:
- clair_config/config.yaml
- docker-compose.yaml
Add the password=password parameter to the PostgreSQL connection string in clair_config/config.yaml.
I also recommend replacing the default postgres:latest docker image with arminc/clair-db, which already contains the CVE database. Otherwise, we need to wait roughly 30 minutes for the database to be downloaded.
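Let’s start our local Clair with docker-compose, for example by running docker-compose up -d in the directory that contains docker-compose.yaml.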
Analyzing a docker image in a few steps
In the rest of the post, we use the docker registry V2 built in the previous article with the same pseudo domain name registry.mydomain.com and same Basic Auth credentials admin:admin123.
First, we need to have an image in our registry to be able to perform an analysis of it.
So let’s pick, for example, the debian:9.5 image and push it to our registry.
docker pull debian:9.5
docker tag debian:9.5 registry.mydomain.com/debian:9.5
docker login https://registry.mydomain.com -u admin -p admin123
docker push registry.mydomain.com/debian:9.5
With the image in the registry, let’s get the HTTP URLs of its layers in tar format. To do that, we need to get its manifest:
curl -u admin:admin123 https://registry.mydomain.com/v2/debian/manifests/9.5
In the JSON response, the fsLayers field lists all the image layers, each identified by its blobSum.
A docker image typically also contains one or more empty layers generated by Dockerfile commands like CMD, EXPOSE etc., which don’t modify the content of the image, so we don’t need to scan them.
Because these layers are empty, they always have the same blobSum, so we can easily identify them and exclude them from Clair analysis.
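As an illustration, the filtering described above can be sketched in Python. This is only a minimal sketch, not part of the original setup: the function name scannable_blobsums is hypothetical, the EMPTY_LAYER constant is the digest commonly reported for the empty layer in schema 1 manifests (check it against the repeated blobSum in your own manifest), and the reversal assumes that fsLayers lists the top-most layer first.

```python
import json

# Assumption: this is the digest commonly reported for the empty layer in
# schema 1 manifests; compare it with the repeated blobSum in your manifest.
EMPTY_LAYER = "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"


def scannable_blobsums(manifest_json, empty_blobsum=EMPTY_LAYER):
    """Return the non-empty layer blobSums, base layer first, without repeats."""
    manifest = json.loads(manifest_json)
    blobsums = [layer["blobSum"] for layer in manifest["fsLayers"]]
    result = []
    # Assumption: fsLayers lists the top-most layer first, so walk it in
    # reverse to end up with the base layer at index 0.
    for blobsum in reversed(blobsums):
        if blobsum != empty_blobsum and blobsum not in result:
            result.append(blobsum)
    return result


# Hypothetical manifest excerpt shaped like the debian:9.5 response.
sample_manifest = json.dumps({
    "fsLayers": [
        {"blobSum": EMPTY_LAYER},
        {"blobSum": "sha256:05d1a5232b461a4b35424129580054caa878cd56f100e34282510bd4b4082e4d"},
    ]
})
print(scannable_blobsums(sample_manifest))
```

Returning the layers base first is a deliberate choice: it is the order in which they would later have to be pushed to Clair when an image has more than one scannable layer.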
We therefore want to analyze only the second layer, the one with blobSum sha256:05d1a5232b461a4b35424129580054caa878cd56f100e34282510bd4b4082e4d. To get its content, we need to call a different API endpoint:
curl -u admin:admin123 https://registry.mydomain.com/v2/debian/blobs/sha256:05d1a5232b461a4b35424129580054caa878cd56f100e34282510bd4b4082e4d
In response, we get a tar file representing the delta of the layer. That is what we need to provide Clair to let it do its analysis.
Now let’s tell Clair to analyze our docker image, which is composed of only one “scannable” layer.
To do that, we call the API endpoint POST http://localhost:6060/v1/layers, which requires a few parameters:
- Name – we can use the blobSum of the layer. Keep in mind that the name has to be unique, so we have to avoid pushing empty layers, as they all have the same blobSum. That’s another reason why we excluded them from the analysis.
- Path – URL to the tar file of the layer
- Headers – has to contain an Authorization header so that Clair can reach the Docker Registry. In our case, it’s Basic Auth. (We can generate a Basic Auth token like this: echo -n "admin:admin123" | base64)
- Format – Clair supports both Docker and Rkt formats of container images.
- ParentName – this field is optional and has to be used if we want to analyze a docker image with more than one layer. In that case, we need to push the layers in the right order, each one referencing its parent layer; otherwise, Clair will not be able to provide results for the entire docker image.
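The parameters above map onto a JSON request body. The following Python sketch shows one way to assemble it, assuming the API v1 body wraps the fields in a top-level Layer object; the helper names basic_auth_header and layer_payload are hypothetical, and nothing is sent over the network.

```python
import base64
import json


def basic_auth_header(user, password):
    """Build the value of a Basic Auth Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def layer_payload(blobsum, registry, repository, auth_header, parent_name=None):
    """Build the JSON body for Clair's POST /v1/layers endpoint."""
    layer = {
        "Name": blobsum,
        "Path": f"{registry}/v2/{repository}/blobs/{blobsum}",
        "Headers": {"Authorization": auth_header},
        "Format": "Docker",
    }
    # ParentName is only needed when the image has more than one layer.
    if parent_name is not None:
        layer["ParentName"] = parent_name
    return {"Layer": layer}


payload = layer_payload(
    "sha256:05d1a5232b461a4b35424129580054caa878cd56f100e34282510bd4b4082e4d",
    "https://registry.mydomain.com",
    "debian",
    basic_auth_header("admin", "admin123"),
)
print(json.dumps(payload, indent=2))
```

For an image with several scannable layers, the same helper would be called once per layer, base layer first, passing the Name of the previously pushed layer as parent_name.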
Let Clair do its magic:
"Authorization": "Basic YWRtaW46YWRtaW4xMjM="
When the analysis has finished, we can immediately retrieve the vulnerabilities found in the layer by calling the GET /layers/:name endpoint with the name of the layer we pushed:
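As a rough sketch, the lookup URL can be built like this in Python. It assumes that the local Clair API is reachable over plain HTTP on port 6060 and that the features and vulnerabilities query flags of API v1 are used to include the findings in the response; the function name layer_report_url is hypothetical.

```python
def layer_report_url(clair_url, layer_name):
    """Build the Clair API v1 URL that reports on a previously pushed layer."""
    # The features and vulnerabilities query flags ask Clair to include the
    # detected features and the vulnerabilities affecting them.
    base = clair_url.rstrip("/")
    return f"{base}/v1/layers/{layer_name}?features&vulnerabilities"


url = layer_report_url(
    "http://localhost:6060",  # assumption: local Clair API over plain HTTP
    "sha256:05d1a5232b461a4b35424129580054caa878cd56f100e34282510bd4b4082e4d",
)
print(url)
```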
In response, we get a list of all features present in a filesystem which could indicate a vulnerability.
One of the features in our tested debian:9.5 docker image is, for example, systemd, for which Clair found the vulnerabilities listed below:
"Description":"systemd v233 and earlier fails to safely parse usernames starting with a numeric digit (e.g. \"0day\"), running the service in question with root privileges rather than the user intended.",
"Description":"systemd, when updating file permissions, allows local users to change the permissions and SELinux security contexts for arbitrary files via a symlink attack on unspecified files.",
"Description":"systemd-tmpfiles in systemd before 237 attempts to support ownership/permission changes on hardlinked files even if the fs.protected_hardlinks sysctl is turned off, which allows local users to bypass intended access restrictions via vectors involving a hard link to a file for which the user lacks write access, as demonstrated by changing the ownership of the /etc/passwd file.",
"Description":"In systemd prior to 234 a race condition exists between .mount and .automount units such that automount requests from kernel may not be serviced by systemd resulting in kernel holding the mountpoint and any processes that try to use said mount will hang. A race condition like this may lead to denial of service, until mount points are unmounted.",
"Description":"systemd-tmpfiles in systemd through 237 mishandles symlinks present in non-terminal path components, which allows local users to obtain ownership of arbitrary files via vectors involving creation of a directory and a file under that directory, and later replacing that directory with a symlink. This occurs even if the fs.protected_symlinks sysctl is turned on.",
What we must pay attention to are 2 fields:
- Severity – tells you how critical the vulnerability is
- FixedBy – tells you which newer version of the feature fixes the vulnerability. The field doesn’t exist if there isn’t a fix yet.
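To make these two fields easier to read, the response can be flattened into one row per vulnerability. The Python sketch below assumes the API v1 response nests Vulnerabilities under each entry of Layer.Features; the function name list_vulnerabilities is hypothetical, and the sample data uses placeholder names and versions rather than real scan results.

```python
def list_vulnerabilities(report):
    """Flatten a GET /layers/:name response into one row per vulnerability."""
    rows = []
    for feature in report["Layer"].get("Features", []):
        for vuln in feature.get("Vulnerabilities", []):
            rows.append({
                "feature": feature["Name"],
                "installed": feature.get("Version"),
                "cve": vuln["Name"],
                "severity": vuln.get("Severity", "Unknown"),
                "fixed_by": vuln.get("FixedBy"),  # None when no fix exists yet
            })
    return rows


# Hypothetical response excerpt; names and versions are placeholders.
sample_report = {
    "Layer": {
        "Name": "example-layer",
        "Features": [
            {
                "Name": "examplepkg",
                "Version": "1.0",
                "Vulnerabilities": [
                    {"Name": "CVE-0000-0001", "Severity": "High", "FixedBy": "1.1"},
                    {"Name": "CVE-0000-0002", "Severity": "Low"},
                ],
            },
            {"Name": "cleanpkg", "Version": "2.0"},
        ],
    }
}

for row in list_vulnerabilities(sample_report):
    fix = row["fixed_by"] or "no fix yet"
    print(f'{row["feature"]} {row["installed"]}: {row["cve"]} [{row["severity"]}] -> {fix}')
```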
We have learned how to use Clair without using any third-party client which helped us to understand how Clair works.
Now with this in mind, we can look at popular third-party clients like klar, clairctl, reg and many others that simplify the work with Clair. With them, you don’t need to care about layers as you provide only a docker image name and the client does the rest for you.
So what was the reason for writing this article, you might ask? Well, let me cite one paragraph from the official Clair documentation:
“Clair can be integrated directly into a container registry such that the registry is responsible for interacting with Clair on behalf of the user. This type of setup avoids the manual scanning of images and creates a sensible location to which Clair’s vulnerability notifications can be propagated. The registry can also be used for authorization to avoid sharing vulnerability information about images to which one might not have access.”
And that’s something that we will look at in the next article.