Concepts and strategies for the security monitoring of Kubernetes clusters
In this blog post, we give an overview of security monitoring strategies for Kubernetes clusters and how they have changed over time. We focus on logging and on how the way of connecting multiple log sources has changed between a classic infrastructure and Kubernetes.
Later, we demonstrate a more performant approach, using eBPF (extended Berkeley Packet Filter) as a source of information about what happens inside our cluster.
You can find all the source code accompanying this article at github.com/nearform/k8s-security-monitoring.
The classic way
If you come from a classic infrastructure, you have probably watched log files written to the file system by your network components and by your applications themselves. The huge range of tools and techniques available to watch files and process log events includes Beats from Elastic, log shippers such as rsyslog, and aggregators like fluent-bit or fluentd.
All these tools can preprocess your log entries and forward them via different protocols to different targets for further use. Besides regular log files and in-memory journals, you can also watch kernel syscalls via auditd, then process and ship them in a similar way.
The Kubernetes way
The approach in Kubernetes is slightly different because of its design and, of course, the availability of more modern tools and techniques. Kubernetes is an abstraction of the underlying infrastructure built on containers, cgroups and namespaces. Best practice for application logs is to write to standard output rather than to files, so that the container runtime can handle them and add metadata, for example through the Docker logging driver. We can use almost the same tools as in a classic infrastructure, including Beats and fluentd, and either deploy them manually on each node or run them as a DaemonSet in Kubernetes.
All these tools already support Kubernetes. Alternatively, we can use a Docker logging driver to forward logs directly to external log storage solutions like Amazon CloudWatch or third-party tools like Splunk or Graylog.
However, you will often use a managed Kubernetes cluster, which means you will not have access to the master nodes (also known as the control plane). We can’t grab logs from there in this way, and we are completely dependent on Kubernetes features and on the provider of our cluster.
Therefore, it is nice to know that Kubernetes provides auditing for its API server.
We’ll now show you a solution where we watch logs from different sources, including audit logs from Kubernetes, and we send them to Grafana using Loki (a multi-tenant log aggregation system) and Promtail (an agent which ships local logs to a Loki instance).
Promtail, running on each node, sends logs to a Loki instance
Let’s start by installing these useful tools in our cluster. Make sure your KUBECONFIG environment variable is set correctly.
Running Promtail often hits the limits that the underlying system configuration sets for concurrently open file descriptors, so we need to increase them.
Promtail runs as a DaemonSet on each cluster node and watches the mounted log files of each container in each pod.
Loki itself provides a push API endpoint to receive log events in a fixed format. Once we add Loki as a data source in Grafana, we can query the common log files of the Kubernetes components and filter them by labels.
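To make the fixed format of the push endpoint more concrete, here is a minimal Python sketch that builds a request body in the JSON shape Loki accepts: a list of streams, each with a label set and a list of timestamp/line pairs. The helper name and the labels are hypothetical and chosen only for illustration; in the setup described here, Promtail builds and sends these requests for us.

```python
import json
import time


def build_loki_push_payload(labels, lines, timestamp_ns=None):
    """Build a JSON-serialisable body in the shape Loki's push endpoint expects.

    labels: dict of label name -> value identifying the log stream.
    lines: iterable of log lines (strings).
    timestamp_ns: optional Unix timestamp in nanoseconds; defaults to now.
    """
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    # Each entry is [timestamp, line]; the nanosecond timestamp is sent as a string.
    values = [[str(timestamp_ns + offset), line] for offset, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


payload = build_loki_push_payload(
    {"job": "demo", "namespace": "kube-system"},  # hypothetical labels
    ["first line", "second line"],
    timestamp_ns=1_700_000_000_000_000_000,
)
body = json.dumps(payload)
```

The resulting body could be sent with an HTTP POST to the /loki/api/v1/push path of a Loki instance using the Content-Type application/json; the sketch deliberately stops before any network call.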
Now that we know how we want to monitor our cluster, we need to enable important logs required for proper security monitoring.
For now, let’s enable DNS logging by adding the log plugin to the CoreDNS config file, so that we can see and analyse suspicious DNS queries.
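Once DNS logging is enabled, the query lines can also be inspected programmatically. The following Python sketch parses a line in what we assume to be the default output layout of the CoreDNS log plugin and flags names with an unusually long label. The function names, the sample lines and the threshold of 40 characters are hypothetical, and a custom log format would need a different pattern.

```python
import re

# Assumed layout of the CoreDNS "log" plugin's default output.
COREDNS_LINE = re.compile(
    r'\[INFO\] (?P<client>\S+):(?P<port>\d+) - (?P<id>\d+) '
    r'"(?P<qtype>\S+) (?P<qclass>\S+) (?P<name>\S+) (?P<proto>\S+) '
    r'(?P<size>\d+) (?P<do>\S+) (?P<bufsize>\d+)" '
    r'(?P<rcode>\S+) (?P<flags>\S+) (?P<rsize>\d+) (?P<duration>[\d.]+)s'
)


def parse_query(line):
    """Return the fields of a CoreDNS query log line as a dict, or None if it does not match."""
    match = COREDNS_LINE.search(line)
    return match.groupdict() if match else None


def has_long_label(query, max_label_length=40):
    """Flag names containing a very long label, a pattern sometimes seen in DNS tunnelling.

    The default threshold is an arbitrary starting point, not a recommendation.
    """
    labels = query["name"].rstrip(".").split(".")
    return any(len(label) > max_label_length for label in labels)


# Hypothetical sample lines for illustration.
normal = '[INFO] 10.244.0.5:50759 - 29008 "A IN example.org. udp 41 false 4096" NOERROR qr,rd,ra 68 0.037990251s'
odd = '[INFO] 10.244.0.7:41000 - 1234 "TXT IN ' + "a" * 50 + '.example.org. udp 90 false 512" NOERROR qr,rd,ra 120 0.002s'

for line in (normal, odd):
    query = parse_query(line)
    print(query["client"], query["name"][:30], has_long_label(query))
```

A heuristic like this is only a first filter; in practice you would express the same idea as a query or alert rule in Grafana rather than in a separate script.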
We also want to know what’s going on in the control plane. If we run a managed cluster (like EKS on AWS), we need to consult the provider’s documentation to find out how to enable audit logging.
In our case, we use a kinD cluster, so we have to define a Policy object and pass it to the API server as a command-line argument. For kinD, that needs to be done at cluster creation time and cannot be changed later.
The example Policy document defines rules for which events are logged and at which level. Kubernetes writes JSON-formatted audit logs to the file defined by --audit-log-path, but it can also forward them to an external API endpoint.
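The rules of an audit Policy are evaluated in order, and the first rule that matches a request decides its audit level (None, Metadata, Request or RequestResponse). The Python sketch below imitates that first-match behaviour in a deliberately simplified form: it only looks at the verbs and resources fields of a rule, and both the function name and the sample rules are hypothetical rather than taken from the accompanying repository.

```python
def audit_level(rules, verb, resource, group=""):
    """Return the level of the first rule matching the request, or "None" if no rule matches.

    Simplified model: only the 'verbs' and 'resources' fields of a rule are considered.
    """
    for rule in rules:
        verbs = rule.get("verbs")
        if verbs and verb not in verbs:
            continue  # the rule is restricted to other verbs
        resources = rule.get("resources")
        if resources and not any(
            entry.get("group", "") == group
            and (not entry.get("resources") or resource in entry["resources"])
            for entry in resources
        ):
            continue  # the rule is restricted to other resources
        return rule["level"]
    return "None"


# Hypothetical rule list mirroring the structure of a Policy's "rules" field.
rules = [
    {"level": "None", "verbs": ["watch"], "resources": [{"group": "", "resources": ["endpoints"]}]},
    {"level": "Metadata", "resources": [{"group": "", "resources": ["secrets", "configmaps"]}]},
    {"level": "RequestResponse", "resources": [{"group": "", "resources": ["pods"]}]},
    {"level": "Metadata"},  # catch-all for everything else
]

print(audit_level(rules, "get", "secrets"))
print(audit_level(rules, "create", "pods"))
```

Because the first match wins, narrow rules (for example, silencing noisy watch requests) have to come before broad catch-all rules.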
Promtail does not know how and where to read audit logs from by default. We need to mount the audit.log file from each host into the container and define a scrape config to watch it.
If you use the Helm chart for Promtail, as we did here, it is easy to add extra mounts and volumes and to define an additional scrape config for audit.log. Let’s add it to our promtail.values.yaml file and upgrade Promtail.
If we visit http://localhost:3000/explore again and select Loki as the data source, we will see /var/log/kubernetes/audit.log as a filename label, under which we can now monitor Kubernetes-internal layer 7 network traffic in the form of request/response events. We can also use it to detect suspicious activities and define alerts for them.
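As an illustration of what detecting suspicious activities can mean for audit logs, here is a small Python sketch that reads audit events (one JSON object per line) and keeps those that touch Secrets or were rejected with HTTP 401 or 403. The selection criteria, the function name and the sample events are assumptions made for this example; in the setup described here, the same idea would normally be expressed as a Loki query or a Grafana alert.

```python
import json


def suspicious_audit_events(lines):
    """Yield audit events that touch secrets or were denied (HTTP 401 or 403)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        resource = (event.get("objectRef") or {}).get("resource")
        code = (event.get("responseStatus") or {}).get("code")
        if resource == "secrets" or code in (401, 403):
            yield event


# Hypothetical, shortened audit events for illustration.
sample_lines = [
    json.dumps({"kind": "Event", "apiVersion": "audit.k8s.io/v1", "verb": "get",
                "user": {"username": "system:serviceaccount:default:demo"},
                "objectRef": {"resource": "secrets", "namespace": "default", "name": "db-password"},
                "responseStatus": {"code": 200}}),
    json.dumps({"kind": "Event", "apiVersion": "audit.k8s.io/v1", "verb": "list",
                "user": {"username": "alice"},
                "objectRef": {"resource": "pods", "namespace": "default"},
                "responseStatus": {"code": 200}}),
    json.dumps({"kind": "Event", "apiVersion": "audit.k8s.io/v1", "verb": "delete",
                "user": {"username": "bob"},
                "objectRef": {"resource": "pods", "namespace": "kube-system", "name": "coredns"},
                "responseStatus": {"code": 403}}),
    "not a json line",
]

flagged = list(suspicious_audit_events(sample_lines))
for event in flagged:
    print(event["user"]["username"], event["verb"], event["objectRef"]["resource"])
```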
Besides container logs in general, DNS logs in particular and audit logs from Kubernetes itself, we are also interested in the host’s system logs, including journal logs and sometimes logs from rsyslog.
We run Promtail as a container, so we need to mount the /run/log/journal folder from the host and add a scrape config to watch it via the journal plugin.
We then need to upgrade Promtail using our extended values.yaml for Helm.
Voilà! Now we can see a lot more in Grafana if we explore Loki sources for systemd. We also see a new job name systemd-journal for our aggregated journal logs from our hosts.
Layer 3 and 4 network logs
Finally, we also want to see network logs for layers 3 and 4 of the ISO/OSI reference model. Before we can do that, we need to install a different CNI plugin for Kubernetes that lets us define Network Policies with logging enabled for individual rules. Let’s install Calico for now, because it provides what we need.
If you run kinD, you need to recreate your cluster and disable the default CNI in your cluster config first.
Installing Calico is straightforward. We simply download a compact yaml file and deploy it to our cluster. For a production environment, consult the Calico documentation for a more detailed method.
Now we can use Calico’s CRDs to define network policies. We will use a global one to enable logging in the first place. In a production cluster, this will produce a huge number of log entries that you will need to deal with.
By default, Linux disables Netfilter logging to the kernel log from network namespaces other than the host’s for security reasons, but we can enable it. Be careful doing that in production environments.
We use kinD to demonstrate logging in Kubernetes, so we need some additional configuration before we can watch Netfilter logs. We also need to use rsyslog on our kinD cluster nodes because Netfilter kernel logs are only available in the journal of the original host and not inside containers.
Rsyslog will create a /var/log/kern.log file in each kinD node container, which we need to mount into Promtail in order to watch it. Let’s also add a scrape config for it:
Don’t forget to upgrade your Helm deployment and check whether the new kernel logs, including your Network Policy logs, are shown inside Grafana.
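The Netfilter entries that end up in kern.log consist of a prefix followed by KEY=VALUE fields such as SRC, DST, PROTO and DPT. As a rough illustration, the Python sketch below extracts those fields from a single line. It assumes the prefix calico-packet, which is our understanding of Calico’s default for logged rules; the function name and the sample line are hypothetical.

```python
import re

KEY_VALUE = re.compile(r"(\w+)=(\S*)")


def parse_netfilter_line(line, prefix="calico-packet"):
    """Extract KEY=VALUE fields from a Netfilter log line that carries the given prefix.

    Returns None for lines without the prefix. If a key appears twice, the last value wins.
    """
    if prefix not in line:
        return None
    _, _, rest = line.partition(prefix)
    return dict(KEY_VALUE.findall(rest))


# Hypothetical kernel log line for illustration.
sample = (
    "Jan 10 12:00:00 kind-worker kernel: [12345.678901] calico-packet: "
    "IN=cali0a1b2c3d4e5 OUT= MAC=aa:bb:cc:dd:ee:ff:11:22:33:44:55:66:08:00 "
    "SRC=10.244.1.5 DST=10.96.0.1 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=54321 DF "
    "PROTO=TCP SPT=41000 DPT=443 WINDOW=64240 RES=0x00 SYN URGP=0"
)

fields = parse_netfilter_line(sample)
print(fields["SRC"], "->", fields["DST"], fields["PROTO"], fields["DPT"])
```

Flags without a value, such as DF or SYN, are ignored by this sketch; a stricter parser would collect them separately.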
The eBPF way
Since kernel version 3.18, Linux has provided the extended Berkeley Packet Filter (eBPF). This technology gives us direct access to the kernel at runtime and is essential for high-performance security observability.
Many tools and techniques have been created around eBPF, including a complete CNI named Cilium. We will give a quick overview of a tool called Falco, which monitors kernel syscalls and alerts if suspicious activities are detected. Cilium itself includes many performant features to control network traffic and observe your cluster, but its complexity goes beyond the scope of this article.
Falco comes with a Helm chart and with falco-exporter, which provides a metrics endpoint for Prometheus, along with a predefined Grafana dashboard we can import. We need to enable Falco’s gRPC output to let falco-exporter connect to it.
If you use a kinD cluster, you need to change your cluster config and mount your host’s /dev folder for Falco into your control plane node at /host/dev.
Now Falco provides syscall events on its gRPC endpoint to falco-exporter for metrics aggregation. Let’s give Prometheus permission for the falco namespace and have it scrape the metrics endpoint via a ServiceMonitor custom resource.
Logging and observability play an important role for our applications running in Kubernetes.
In this article, we discussed why logging to files in containers/pods is not the right choice, and we suggested solutions for logging different layers of our cluster.
We always advise that you do your own research because the solutions suggested here may not be feasible for your application or organisation, but it is important to know some of the tools that can be used with this new approach to logging.