Enriching NDR Logs With Context

Editor’s note: This is the latest in a series of posts where we explore topics such as network monitoring in Kubernetes, using sidecars to sniff and tunnel traffic, show a real-world example of detecting malicious traffic between containers, and more! Please subscribe to the blog, or come back for more each week.

In this post, we show how enriching Zeek® logs with cloud and container context makes it much faster to tie interesting activity to the container or cloud asset involved.

In cloud or container environments, layer 3 networking is abstracted away from the higher-level tasks of running workloads or presenting data. Because of this abstraction, when Zeek logs are collected in cloud or container network environments, attributing a network flow to an actual workload or application is difficult. A SOC analyst would need to know which instance, host, pod, container, etc. held the IP address seen in the logs at the exact time the log entry was created. In most cloud environments, this simply is not tracked.

Let’s take the example of a workload running in a Kubernetes environment. Kubernetes handles the workload by distributing the compute resources and services needed to support it across the available workers. Now, let’s imagine that one of the pods is compromised by the Hildegard malware. When the SOC team identifies cryptomining-related traffic leaving the environment, they look in the Zeek logs to determine which resource is being used for mining. The analyst sees a Kubernetes private IP address but has no way to isolate the responsible resource, because Kubernetes itself abstracts those details away. In such a scenario, an analyst would need to contact the infrastructure team and have them trace the IP back to a given pod, worker host, and location… but the clock is ticking!

Enter cloud and container context enrichment of Zeek logs. When point-in-time data such as worker node IP, pod name, or namespace is linked to the connections logged by Zeek, the IP address churn that is common in workload orchestration environments becomes far less important. Now users can quickly attribute log data to a specific pod on a specific worker within a namespace. Using this type of data to close an evidence gap is a vital part of maintaining complete visibility across network environments.

[Figure: enriched Zeek conn.log entry showing the orig_pod and resp_pod fields]

In the log snippet above, taken from an enriched Zeek conn.log, we can see the new attribution fields that start with orig_pod or resp_pod. From this new data we can instantly identify where the traffic originated, in this case a kube-proxy, and where the data was sent: a Kubernetes pod named websvr-784466db5f-rv25w, located on worker node2 within the default namespace. A SOC analyst examining this log can now quickly attribute the entry to a specific service, business function, location, etc.
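The point-in-time linking described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual enrichment implementation: the `PodIpHistory` class, the `PodAssignment` record, and the output field names (`resp_pod_name` and so on) are hypothetical and only echo the orig_pod/resp_pod prefixes mentioned above, and the IP addresses, timestamps, `uid`, and the `batch-job-1` pod are made up. The `ts`, `uid`, `id.orig_h`, and `id.resp_h` keys are standard Zeek conn.log fields.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PodAssignment:
    """Hypothetical record: which pod held an IP, starting at start_ts."""
    start_ts: float
    pod_name: str
    namespace: str
    node: str


class PodIpHistory:
    """Hypothetical per-IP timeline of pod assignments."""

    def __init__(self) -> None:
        self._by_ip: Dict[str, List[PodAssignment]] = {}

    def record(self, ip: str, assignment: PodAssignment) -> None:
        timeline = self._by_ip.setdefault(ip, [])
        timeline.append(assignment)
        timeline.sort(key=lambda a: a.start_ts)

    def lookup(self, ip: str, ts: float) -> Optional[PodAssignment]:
        """Return the most recent assignment of `ip` at or before `ts`."""
        timeline = self._by_ip.get(ip, [])
        idx = bisect_right([a.start_ts for a in timeline], ts)
        return timeline[idx - 1] if idx else None


def enrich_conn(entry: dict, history: PodIpHistory) -> dict:
    """Copy a conn.log entry and add pod context for each endpoint."""
    enriched = dict(entry)
    for side in ("orig", "resp"):
        hit = history.lookup(entry[f"id.{side}_h"], entry["ts"])
        if hit is not None:
            # Field names below are illustrative assumptions.
            enriched[f"{side}_pod_name"] = hit.pod_name
            enriched[f"{side}_pod_namespace"] = hit.namespace
            enriched[f"{side}_pod_node"] = hit.node
    return enriched


# The same IP is reused by two different pods over time (made-up data).
history = PodIpHistory()
history.record("10.42.0.7", PodAssignment(100.0, "batch-job-1", "default", "node1"))
history.record("10.42.0.7", PodAssignment(200.0, "websvr-784466db5f-rv25w", "default", "node2"))

entry = {"ts": 250.0, "uid": "CExample1", "id.orig_h": "192.0.2.10", "id.resp_h": "10.42.0.7"}
enriched = enrich_conn(entry, history)
```

Because the lookup is keyed on both IP and timestamp, the same address resolves to different pods depending on when the connection was logged. The sketch does not track when an assignment ends, so a real system would also need to record pod deletions.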

With this cloud and container context added to the Zeek logs, SOC analysts can see exactly which system is in play without relying solely on IP addresses. The type of enrichment data that can be added is limited only by the accessibility of the source data from Zeek itself. In the case above, the logs were enriched with data taken directly from the Kubernetes API, so any data available from that API could be added, provided it can be attributed to the IP address of the pod.
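As a rough sketch of what "attributable to the IP address of the pod" might look like in practice, the function below indexes a Kubernetes pod list by pod IP. It assumes the JSON has already been fetched (for example from the API's pod list endpoint) and parsed into a dictionary. The `metadata.name`, `metadata.namespace`, `spec.nodeName`, and `status.podIP` fields are part of the Kubernetes Pod object; the `pods_by_ip` helper, the IP address, and the `pending-job` pod are hypothetical.

```python
from typing import Dict


def pods_by_ip(pod_list: dict) -> Dict[str, dict]:
    """Index a parsed Kubernetes PodList by pod IP (hypothetical helper)."""
    index: Dict[str, dict] = {}
    for item in pod_list.get("items", []):
        ip = item.get("status", {}).get("podIP")
        if not ip:
            continue  # no IP assigned yet, so nothing to attribute
        index[ip] = {
            "name": item["metadata"]["name"],
            "namespace": item["metadata"]["namespace"],
            "node": item.get("spec", {}).get("nodeName"),
        }
    return index


# Trimmed sample payload; the IP address and the pending pod are made up.
sample = {
    "kind": "PodList",
    "items": [
        {
            "metadata": {"name": "websvr-784466db5f-rv25w", "namespace": "default"},
            "spec": {"nodeName": "node2"},
            "status": {"podIP": "10.42.1.15"},
        },
        {
            "metadata": {"name": "pending-job", "namespace": "default"},
            "spec": {},
            "status": {},
        },
    ],
}

context = pods_by_ip(sample)
```

Pods that run with host networking share their node's IP address, so a real implementation would need a rule for resolving those collisions rather than letting the last pod win as this sketch does.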

To take things a step further, analysts can pivot into other logs quickly using traditional Zeek ID linking methods, and the additional Kubernetes context is easily seen.

[Figure: pivoting from the enriched conn.log into other Zeek logs with Kubernetes context]
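The pivot relies on the `uid` field that Zeek writes to conn.log and to the protocol logs for the same connection. A minimal sketch of that join, assuming JSON-formatted logs already loaded as dictionaries, might look like this; the `pivot_with_context` helper, the sample entries, and the `resp_pod_name` field name are hypothetical.

```python
from typing import Dict, Iterable, List

# Prefixes of the attribution fields described in this post.
POD_PREFIXES = ("orig_pod", "resp_pod")


def pivot_with_context(conn_entries: Iterable[dict],
                       other_entries: Iterable[dict]) -> List[dict]:
    """Attach pod context from conn.log to entries sharing the same uid."""
    context_by_uid: Dict[str, dict] = {}
    for conn in conn_entries:
        context_by_uid[conn["uid"]] = {
            key: value for key, value in conn.items()
            if key.startswith(POD_PREFIXES)
        }

    joined: List[dict] = []
    for entry in other_entries:
        ctx = context_by_uid.get(entry.get("uid"))
        if ctx is not None:
            joined.append({**entry, **ctx})
    return joined


# Made-up sample data for illustration only.
conn_log = [
    {"uid": "CExample1", "id.resp_h": "10.42.1.15",
     "resp_pod_name": "websvr-784466db5f-rv25w"},
]
http_log = [
    {"uid": "CExample1", "method": "GET", "uri": "/index.html"},
    {"uid": "COther2", "method": "GET", "uri": "/health"},
]

related = pivot_with_context(conn_log, http_log)
```

Only the HTTP entry that shares a `uid` with the enriched connection is returned, and it now carries the pod name alongside the request details.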

Without this new enrichment data, a SOC analyst would need to give the timestamp and IP address to the infrastructure team for them to query the Kubernetes logs to determine which pod was responsible for the data. This lookup could waste valuable time needed to stop an attack or exfiltration of data. Adding quickly actionable details as evidence in logs could mean the difference between an attacker being successful and not.

This attribution method can be carried to other environments as well, such as AWS. Below we show an enrichment example where the instance details from AWS are attached to a conn.log entry, allowing for quick pivoting from tools such as GuardDuty. Having the AWS EC2 instance ID inside the Zeek conn.log allows quick pivoting to other logs and data without the need to query AWS to get the instance IP.

[Figure: conn.log entry enriched with AWS instance details, used to pivot from GuardDuty]

With cloud and Kubernetes migrations, traditional network identifiers such as IP addresses become less meaningful. To combat that, adding attribution details from the environment where the compute resource lives not only restores some of that meaning, it greatly amplifies it. In our next blog we will investigate the different ways you can turn this enrichment capability on in your Corelight sensor environment.

By Stan Kiefer, Corelight Security Product Researcher