
Security monitoring in Kubernetes

Concepts and strategies for the security monitoring of Kubernetes clusters

In this blog post, we give an overview of security monitoring strategies for Kubernetes clusters and how they have changed over time. We focus on logging and on how the way of collecting and connecting multiple log sources differs between a classic infrastructure and Kubernetes.

Later, we demonstrate a more performant approach that uses eBPF (extended Berkeley Packet Filter) as a source of insight into what happens inside our cluster.

You can find all the source code accompanying this article at github.com/nearform/k8s-security-monitoring.

The classic way

If you come from a classic infrastructure, you have probably watched log files written to the file system by your network components and by the applications themselves. The huge range of tools and techniques available to watch files and process log events includes Beats from Elastic, log shippers such as rsyslog, and aggregators like fluent-bit or fluentd.

All these tools can preprocess your log entries and forward them via different protocols to different targets for further use. Besides regular log files and in-memory journals, you can also watch for kernel syscalls via auditd and process and ship them in a similar way.
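
For illustration, a couple of auditd rules like the following (a minimal sketch; the rule keys are our own naming) record every execve call and changes to sensitive files, and the resulting events can then be shipped like any other log source:

Plain Text
# Log every execve syscall on 64-bit systems under the key "exec_log"
-a always,exit -F arch=b64 -S execve -k exec_log
# Watch /etc/passwd for writes and attribute changes under the key "passwd_changes"
-w /etc/passwd -p wa -k passwd_changes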

The Kubernetes way

The approach in Kubernetes is slightly different because of its design and, of course, the existence of modernised tools and techniques. Kubernetes is just an abstraction of the underlying infrastructure using containers, cgroups and namespaces, but best practice for application logs is to write to standard output rather than to files, letting the container runtime handle them and add meta information, thanks to the Docker logging driver. We can use almost the same tools as in a classic infrastructure, including Beats and fluentd, and either deploy them manually on each node or use a DaemonSet in Kubernetes.

All these tools already support Kubernetes. Alternatively, we can use a Docker logging driver to forward logs directly to external log storage solutions like AWS CloudWatch or third-party tools like Splunk or Graylog.
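
As an example, a node-level /etc/docker/daemon.json along these lines (a sketch; the region and log group names are placeholders) would make the Docker daemon ship all container output straight to CloudWatch Logs:

Plain Text
{
  "log-driver": "awslogs",
  "log-opts": {
    "awslogs-region": "eu-west-1",
    "awslogs-group": "k8s-container-logs"
  }
}

Keep in mind that bypassing the node's JSON log files this way also means kubectl logs no longer works for those containers.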

However, you will often use a managed Kubernetes cluster, which means you will not have access to the master nodes (the control plane). We can't grab logs from there in this way, and we are completely dependent on Kubernetes features and on the provider of our cluster.

Therefore, it is nice to know that Kubernetes provides auditing for its API server.

We’ll now show you a solution where we watch logs from different sources, including audit logs from Kubernetes, and we send them to Grafana using Loki (a multi-tenant log aggregation system) and Promtail (an agent which ships local logs to a Loki instance).

Promtail, running on each node, sends logs to a Loki instance

Let’s start by installing these useful tools in our cluster. Make sure your KUBECONFIG environment variable is set correctly.

Plain Text
git clone https://github.com/prometheus-operator/kube-prometheus.git
kubectl apply -f ./kube-prometheus/manifests/setup
kubectl apply -f ./kube-prometheus/manifests/
helm repo add grafana https://grafana.github.io/helm-charts
helm upgrade -i -n monitoring loki grafana/loki
helm upgrade -i -n monitoring promtail grafana/promtail \
  --set config.lokiAddress=http://loki.monitoring:3100/loki/api/v1/push

Watching this many files often hits the limits of your underlying system configuration for inotify instances and watches, which we need to increase.

Plain Text
sudo sysctl -w fs.inotify.max_user_instances=512
sudo sysctl -w fs.inotify.max_user_watches=524288

Promtail runs as a DaemonSet on each cluster node and watches the mounted log files of every container in every pod.

Loki itself provides a push API endpoint that receives log events in a fixed format. If we add Loki as a data source in Grafana, we can query the common log files of the Kubernetes components and filter them by labels.
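
For example, LogQL queries like these (the label names come from the default Promtail scrape config and may differ in your setup) surface errors per namespace or the log rate of a single component:

Plain Text
# All logs from the kube-system namespace containing "error"
{namespace="kube-system"} |= "error"
# Per-minute log rate of the CoreDNS containers
rate({container="coredns"}[1m])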

Now that we know how we want to monitor our cluster, we need to enable important logs required for proper security monitoring.

DNS logs

For now, let’s enable DNS logging by adding the log plugin to the CoreDNS config file, so we are able to analyse queries and spot suspicious DNS activity.

Plain Text
cat <<EOF | kubectl -n kube-system apply -f -
kind: ConfigMap
apiVersion: v1
metadata:
  name: coredns
data:
  Corefile: |
    .:53 {
        log
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
EOF
kubectl -n kube-system delete pods --selector=k8s-app=kube-dns
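
Once the CoreDNS pods restart with the log plugin enabled, each query shows up in Loki and can be filtered with LogQL. A query along these lines (label names depend on your Promtail scrape config, and the domain patterns are only placeholders) helps to spot lookups of domains you would not expect inside the cluster:

Plain Text
# CoreDNS query logs, filtered for lookups of suspicious domains
{namespace="kube-system", container="coredns"} |= " IN " |~ "(?i)(pastebin|ngrok)"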

Audit logs

We also want to know what’s going on at the control plane. If we run a managed cluster (like EKS on AWS), we need to consult the provider's documentation to find out how to enable audit logging.

In our case, we use a kinD cluster, so we have to define a Policy object and pass it to our API server as a command-line argument. For kinD, that needs to be done at cluster creation time and cannot be changed later.

Plain Text
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: ./audit/
    containerPath: /etc/kubernetes/audit
    readOnly: false
    selinuxRelabel: false
    propagation: None
kubeadmConfigPatches:
- |
  apiVersion: kubeadm.k8s.io/v1beta2
  kind: ClusterConfiguration
  metadata:
    name: config
  apiServer:
    extraArgs:
      audit-log-path: /etc/kubernetes/audit/audit.log
      audit-policy-file: /etc/kubernetes/audit/audit-policy.yaml
    extraVolumes:
    - name: audit-policy
      hostPath: /etc/kubernetes/audit
      mountPath: /etc/kubernetes/audit
      readOnly: false
      pathType: "DirectoryOrCreate"

The example Policy document defines rules for what should be logged and at which level. By default, Kubernetes writes JSON-formatted logs to the file defined by --audit-log-path, but it can also forward events to an external API endpoint.
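
As an illustration, a minimal audit-policy.yaml could look like the following (a sketch only; a real policy is usually far more granular, and the official Kubernetes audit documentation contains a full example):

Plain Text
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Do not log requests to health and version endpoints
  - level: None
    nonResourceURLs: ["/healthz*", "/version"]
  # Log secrets and configmaps only at Metadata level to avoid leaking their contents
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Log everything else including the request body
  - level: Request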

Promtail does not know how and where to read audit logs by default. We need to mount the audit.log file from each host into the container and define a scrape config to watch it.

If you use the Helm chart for Promtail, as we did here, it is easy to add extra mounts and volumes as well as an additional scrape config for audit.log. Let’s add them to our promtail.values.yaml file, as sketched below, and upgrade Promtail.
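
A sketch of those additions (the mount mirrors the kinD extraMounts defined above, and the job label is our own choice):

Plain Text
...
extraVolumes:
  - name: audit
    hostPath:
      path: /etc/kubernetes/audit
...
extraVolumeMounts:
  - name: audit
    mountPath: /var/log/kubernetes
    readOnly: true
...
config:
  snippets:
    scrapeConfigs: |
      - job_name: kubernetes-audit
        static_configs:
          - targets:
              - localhost
            labels:
              job: audit
              __path__: /var/log/kubernetes/audit.log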

Plain Text
helm upgrade -i -n monitoring -f promtail.values.yaml promtail grafana/promtail

If we visit http://localhost:3000/explore again and select Loki as the data source, we will see /var/log/kubernetes/audit.log as a filename label, under which we can now monitor Kubernetes' internal layer 7 traffic in the form of request/response events. Not only that, but we can also use it to detect suspicious activity and define alerts for it.

Journal logs

Besides container logs in general, DNS logs in particular and audit logs from Kubernetes itself, we are also interested in the host's system logs, whether they come from journald or sometimes from rsyslog.

We run Promtail as a container, so we need to mount the /run/log/journal folder from the host and add a scrape config that reads it via Promtail's journal support.

Plain Text
...
extraVolumes:
  - name: journal
    hostPath:
      path: /run/log/journal
...
extraVolumeMounts:
  - name: journal
    mountPath: /run/log/journal
    readOnly: true
...
config:
  snippets:
    scrapeConfigs: |
      - job_name: systemd-journal
        journal:
          labels:
            cluster: ops-tools1
            job: systemd-journal
          path: /run/log/journal
        relabel_configs:
        - source_labels:
          - __journal__systemd_unit
          target_label: systemd_unit
        - source_labels:
          - __journal__hostname
          target_label: nodename
        - source_labels:
          - __journal_syslog_identifier
          target_label: syslog_identifier

We need to upgrade Promtail by using our extended values.yaml for Helm.

Plain Text
helm upgrade -i -n monitoring -f promtail.values.yaml promtail grafana/promtail

Voilà! Now we can see a lot more in Grafana when we explore the Loki data source: a new job label, systemd-journal, holds the aggregated journal logs from our hosts.
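
For example, a query like this (the systemd_unit label comes from the relabel rules above) narrows the journal down to kubelet errors:

Plain Text
{job="systemd-journal", systemd_unit="kubelet.service"} |= "error"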

Layer 3 and 4 network logs

Finally, we also want to see logs related to our network at layers 3 and 4 of the ISO/OSI reference model. But before we do that, we need to install another CNI for Kubernetes whose Network Policies allow us to enable logging for individual rules. Let’s install Calico, because it provides what we need. If you run kinD, you need to recreate your cluster and disable the default CNI in your cluster config first.

Plain Text
...
networking:
  disableDefaultCNI: true
...

Installing Calico is straightforward. We simply download a compact yaml file and deploy it to our cluster. For a production environment, consult the Calico documentation for a more detailed method.

Plain Text
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml

Now we can use Calico’s CRDs to define network policies. We will use a global one that logs everything, just to enable logging in the first place. In a production cluster, a policy like this produces a huge number of log entries, so it needs to be scoped more carefully.

Plain Text
cat <<EOF | kubectl -n kube-system apply -f -
apiVersion: crd.projectcalico.org/v1
kind: GlobalNetworkPolicy
metadata:
  name: log-all
spec:
  selector: all()
  types:
    - Ingress
    - Egress
  ingress:
    - action: Log
    - action: Allow
  egress:
    - action: Log
    - action: Allow
EOF

For security reasons, Linux disables Netfilter logging to journald from network namespaces other than the initial one, but we can enable it. Be careful doing that in production environments.

Plain Text
sudo su
echo 1 > /proc/sys/net/netfilter/nf_log_all_netns

We use kinD to demonstrate logging in Kubernetes, so we need some additional configuration before we can watch Netfilter logs. We also need to run rsyslog on our kinD cluster nodes, because Netfilter kernel logs only show up in the host's journal and not inside the node containers.

Plain Text
docker exec -it secsvc-control-plane /bin/bash
root@secsvc-control-plane:/ apt-get update \
&& apt-get install -y rsyslog \
&& systemctl start rsyslog

Rsyslog will create a /var/log/kern.log file inside our Kubernetes node container, which we need to mount into Promtail in order to watch it. Let’s also add a scrape config for it:

Plain Text
...
extraVolumes:
  - name: kern
    hostPath:
      path: /var/log/kern.log
...
extraVolumeMounts:
  - name: kern
    mountPath: /var/log/kern.log
    readOnly: true
...
config:
  snippets:
    scrapeConfigs: |
      - job_name: kernel
        pipeline_stages:
        static_configs:
          - targets:
              - localhost
            labels:
              job: kernellog
              __path__: /var/log/kern.log

Don’t forget to upgrade your Helm deployment and check if the new kernel logs — including your Network Policy logs — are shown inside Grafana.
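
A query along these lines (a sketch; the prefix of Calico's kernel log lines is controlled by Felix's logPrefix setting, calico-packet by default, so check your version) pulls out the entries generated by our Log rules:

Plain Text
# Kernel log entries produced by the GlobalNetworkPolicy Log rules
{job="kernellog"} |= "calico-packet"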

The eBPF way

Since kernel version 3.18, Linux has provided the extended Berkeley Packet Filter ( eBPF ). This technology gives us direct access to the kernel at runtime and enables the kind of high-performance observability that security monitoring needs. Many tools and techniques have been created around eBPF, including a complete CNI named Cilium . We will give a quick overview of a tool called Falco that monitors kernel syscalls and alerts when suspicious activity is detected. Cilium itself includes many performant features to control network traffic and observe your cluster, but its complexity goes beyond the scope of this article.

Falco

Falco comes with a Helm chart, and falco-exporter provides a metrics endpoint for Prometheus, including a predefined Grafana dashboard we can import. We need to enable Falco's gRPC output to let falco-exporter connect to it.

Plain Text
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm upgrade -i falco -n falco --create-namespace falcosecurity/falco \
  --set falco.grpc.enabled=true \
  --set falco.grpcOutput.enabled=true
helm upgrade -i falco-exporter -n falco falcosecurity/falco-exporter \
  --set serviceMonitor.enabled=true \
  --set grafanaDashboard.enabled=true \
  --set grafanaDashboard.namespace=monitoring
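
The Falco chart also exposes a customRules value for shipping your own detection rules alongside the default ruleset. A minimal sketch, passed via an extra values file (the rule below is purely illustrative; Falco's bundled rules already cover similar cases):

Plain Text
# falco-rules.values.yaml -- apply with: helm upgrade --reuse-values -i falco -n falco -f falco-rules.values.yaml falcosecurity/falco
customRules:
  custom-rules.yaml: |-
    - rule: Shell spawned in a container
      desc: Detect an interactive shell started inside a container
      condition: evt.type = execve and container.id != host and proc.name in (bash, sh)
      output: "Shell in container (user=%user.name container=%container.name command=%proc.cmdline)"
      priority: WARNING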

If you use a kinD cluster, you need to change your cluster config and mount your host’s /dev folder into your control-plane node at /host/dev so that Falco can access it.

Plain Text
...
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /dev/
    containerPath: /host/dev
    readOnly: true
    selinuxRelabel: false
    propagation: None
...

Now Falco is providing syscall events on its gRPC endpoint to falco-exporter for metrics aggregation. Let’s grant Prometheus access to the falco namespace and scrape the metrics endpoint via a ServiceMonitor CRD.

Plain Text
cat <<EOF | kubectl apply -f -
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: falco-exporter
  namespace: falco
spec:
  endpoints:
  - interval: 3s
    port: "metrics"
  selector:
    matchLabels:
      app.kubernetes.io/name: falco-exporter
  namespaceSelector: {}
EOF
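
With kube-prometheus, Prometheus runs under the prometheus-k8s service account in the monitoring namespace and only scrapes namespaces it has explicit RBAC access to. A Role and RoleBinding along these lines (a sketch, assuming the default kube-prometheus setup) grant it access to the falco namespace:

Plain Text
cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: falco
rules:
- apiGroups: [""]
  resources: ["services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: falco
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
EOF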

Wrapping up

Logging and observability play an important role for our applications running in Kubernetes.

In this article, we discussed why logging to files in containers/pods is not the right choice, and we suggested solutions for logging different layers of our cluster.

We always advise that you do your own research because the solutions suggested here may not be feasible for your application or organisation, but it is important to know some of the tools that can be used with this new approach to logging.
