Examining aspects of encrypted traffic through Zeek logs

February 18, 2019 by Richard Bejtlich

In my last post I introduced the idea that analysis of encrypted HTTP traffic requires different analytical models. If you wish to preserve the encryption (and not inspect it via a middlebox), you have to abandon direct inspection of HTTP payloads to identify normal, malicious, and suspicious activity.

In this post I will use Zeek logs to demonstrate alternative ways to analyze encrypted HTTP traffic. The goal is to reduce a sea of uncertainty to a subset of activity worth investigating. If we can resolve the issue with Zeek data, wonderful. If we cannot, at least we have decided where we need to apply additional investigation, perhaps by applying intelligence, or host-based log data, or other resources.

Because we are talking about encryption woes, I start with Zeek’s x509.log. X.509 is a standard that defines the format of public key certificates. These certificates are an important element of the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) encryption used with HTTPS traffic.

In the following example I want to profile the algorithms used to sign x509 certificates.
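One way to build such a profile is sketched below in Python. It assumes Zeek is writing JSON logs (one record per line) and that the signature algorithm sits in the `certificate.sig_alg` field of x509.log. The function name and the sample records are invented placeholders, not data from the environment discussed in this post.

```python
import json
from collections import Counter


def profile_sig_algs(lines):
    """Count signature algorithms across JSON-formatted x509.log lines."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        counts[record.get("certificate.sig_alg", "(missing)")] += 1
    # Most frequent first, similar to `sort | uniq -c | sort -nr` in a shell.
    return counts.most_common()


# Invented placeholder records, NOT data from the logs discussed in this post.
sample_lines = (
    ['{"id": "Fa", "certificate.sig_alg": "sha256WithRSAEncryption"}'] * 3
    + ['{"id": "Fb", "certificate.sig_alg": "ecdsa-with-SHA256"}'] * 2
    + ['{"id": "Fc", "certificate.sig_alg": "sha1WithRSAEncryption"}']
)

for alg, count in profile_sig_algs(sample_lines):
    print(f"{count:4d} {alg}")
```

Sorting by frequency puts the rare algorithms at the bottom of the list, which is usually where the entries worth a second look end up.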

The last result is worrisome. I would prefer not to see any certificates signed by the SHA1 algorithm in use in my environment. As explained by Mozilla, SHA1 suffers many problems that render it unsuitable in modern environments. Is this perhaps suspicious or malicious? I could imagine a scenario where an intruder doesn’t worry about signing collisions, because his malware doesn’t care about being ranked lower by Google’s web page search algorithms.

Next I search for Zeek x509.log entries with the SHA1 algorithm.
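A search like this can be approximated with a small filter. The sketch assumes JSON-formatted x509.log lines with `id`, `certificate.issuer`, and `certificate.sig_alg` fields. The exact algorithm string (`sha1WithRSAEncryption`) is an assumption, as is the second record, which exists only to show a non-match.

```python
import json


def find_sha1_certs(lines):
    """Return id, issuer, and algorithm for certificates signed with SHA1."""
    matches = []
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        sig_alg = record.get("certificate.sig_alg", "")
        if "sha1" in sig_alg.lower():
            matches.append({
                "id": record.get("id"),
                "issuer": record.get("certificate.issuer"),
                "sig_alg": sig_alg,
            })
    return matches


sample_lines = [
    # File ID and issuer are quoted from this post; sig_alg is an assumed value.
    json.dumps({
        "id": "FTGvvp4TC5GHCel6ad",
        "certificate.issuer": "CN=http.l.root-servers.org,OU=LROOT,O=ICANN,L=New Taipei,C=TW",
        "certificate.sig_alg": "sha1WithRSAEncryption",
    }),
    # Invented record that should not match.
    json.dumps({
        "id": "FexamplePlaceholder",
        "certificate.issuer": "CN=Example CA",
        "certificate.sig_alg": "sha256WithRSAEncryption",
    }),
]

for cert in find_sha1_certs(sample_lines):
    print(cert["id"], cert["sig_alg"], cert["issuer"])
```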

I collect several bits of important information here, in addition to a specific log containing a match. First, I get a file identifier, FTGvvp4TC5GHCel6ad, which I will leverage shortly. This file identifier uniquely identifies the x509 certificate that Zeek observed during an encrypted session. Second, I see this certificate was issued by CN=http.l.root-servers.org,OU=LROOT,O=ICANN,L=New Taipei,C=TW. I do not know if that is a problem in and of itself. I also note that the certificate appears to have been issued in late 2018, which is odd given the warnings against using SHA1 for x509 certificates.

Using the file ID, I begin looking for other Zeek log entries. This demonstrates the real power of Zeek logs: they can be linked by entries like the file ID. I will examine each in turn as they appear. (Note that I could have also searched other data sets for the certificate identifier, or turned to sources outside my logs for more information on it.)

Above we have a files.log entry.

This was generated by Zeek in the process of tracking the encrypted session and writing the x509.log. This is a key log because it provides the connection ID, CJDF553HmA2WdUq1Af, which we can use to look for additional Zeek logs. The files.log also contains the source and destination IP addresses, but we will rely on linked logs for more information on the session.

Next we have the ssl.log.

This log entry offers details on the nature of the encryption used in the session of interest. We have the same IP addresses seen earlier, as well as ports. Again, I will turn to these later. Note momentarily the last two bolded entries, for the ja3 and ja3s fields. I will return to those shortly as well. The most important part of this log, for immediate use once we finish the results of this search, is the uid of CJDF553HmA2WdUq1Af. This is a connection identifier that we will search for shortly.

The last log is the x509.log again. I show it here to demonstrate that searching for the file ID results in the three types of logs just shown — files.log, ssl.log, and x509.log. In order of logical creation, they would be listed as ssl.log, x509.log, and files.log.
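The file ID pivot described above behaves like a grep across every log. A rough Python equivalent follows, assuming JSON-formatted logs in which files.log carries `fuid` and `conn_uids`, ssl.log carries `uid` and `cert_chain_fuids`, and x509.log carries `id`. The helper names are hypothetical, and the records are trimmed to the linking fields only.

```python
import json


def contains_identifier(value, identifier):
    """True if a field equals the identifier or is a list containing it."""
    if isinstance(value, list):
        return identifier in value
    return value == identifier


def search_logs(logs, identifier):
    """logs maps a log name to JSON lines; return (log_name, record) hits."""
    hits = []
    for name, lines in logs.items():
        for line in lines:
            record = json.loads(line)
            if any(contains_identifier(v, identifier) for v in record.values()):
                hits.append((name, record))
    return hits


# Identifiers are quoted from this post; every other detail is omitted.
FILE_ID = "FTGvvp4TC5GHCel6ad"
CONN_ID = "CJDF553HmA2WdUq1Af"
logs = {
    "conn.log": [json.dumps({"uid": CONN_ID})],
    "files.log": [json.dumps({"fuid": FILE_ID, "conn_uids": [CONN_ID]})],
    "ssl.log": [json.dumps({"uid": CONN_ID, "cert_chain_fuids": [FILE_ID]})],
    "x509.log": [json.dumps({"id": FILE_ID})],
}

for name, record in search_logs(logs, FILE_ID):
    print(name, record)
```

Searching for the file ID skips conn.log here, because that log only knows the connection ID. That is why the next pivot uses the uid instead.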

Returning to the results of the ssl.log, you will remember we found a connection ID. Let’s search for it and see what we find. Again, I will show one entry at a time and explain the pertinent aspects.

The conn.log is sort of the “top level” Zeek log.

Zeek creates conn.log entries for “connections,” whether they are connection-oriented (like TCP) or connectionless (like UDP). This entry shows us flow details about the connection, like the source IP (10.10.40.48), the destination IP (199.7.83.80), and the source and destination ports (36780 and 443), along with the IP protocol (TCP).

I have slightly reordered these three results in order to group and skip them. The conn-summary log is basically a repeat (in this instance) of the conn.log, and we have already seen the files.log and ssl.log. Let’s continue our interpretation with the next unique result.

Above we see the notice.log. Zeek generated this entry for the connection of interest because the server presented a self-signed certificate. By itself, this does not tell us if the event is normal, suspicious, or malicious, but it is still unwanted.

If we wanted to think about these logs as a chain, I would order them thusly (ignoring the conn-summary.log as it is a “meta” log in most cases).

conn.log, ssl.log, x509.log, files.log, notice.log
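That ordering can also be walked programmatically. The sketch below starts from a connection ID and follows the links in the order just listed, assuming the records are already parsed into dictionaries and that the linking fields are `uid` (conn.log, ssl.log, notice.log), `cert_chain_fuids` (ssl.log), `id` (x509.log), and `fuid` (files.log). The function name is hypothetical, and the sample records carry only values quoted in this post.

```python
def build_chain(uid, conn, ssl, x509, files, notice):
    """Return (log_name, record) pairs linked to one connection, in chain order."""
    chain = [("conn.log", r) for r in conn if r.get("uid") == uid]
    ssl_hits = [r for r in ssl if r.get("uid") == uid]
    chain += [("ssl.log", r) for r in ssl_hits]
    # x509.log has no uid of its own; it is reached through the file IDs in ssl.log.
    fuids = {f for r in ssl_hits for f in r.get("cert_chain_fuids", [])}
    chain += [("x509.log", r) for r in x509 if r.get("id") in fuids]
    chain += [("files.log", r) for r in files if r.get("fuid") in fuids]
    chain += [("notice.log", r) for r in notice if r.get("uid") == uid]
    return chain


UID = "CJDF553HmA2WdUq1Af"
FUID = "FTGvvp4TC5GHCel6ad"
chain = build_chain(
    UID,
    conn=[{"uid": UID, "id.orig_h": "10.10.40.48", "id.orig_p": 36780,
           "id.resp_h": "199.7.83.80", "id.resp_p": 443, "proto": "tcp"}],
    ssl=[{"uid": UID, "cert_chain_fuids": [FUID]}],
    x509=[{"id": FUID}],
    files=[{"fuid": FUID, "conn_uids": [UID]}],
    notice=[{"uid": UID}],
)
for name, record in chain:
    print(name, record)
```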

Let’s pivot on two items of interest from the ssl.log, the ja3 and ja3s entries. JA3 refers to a wonderful addition to the Zeek code base, donated by engineers from Salesforce.com. JA3 fingerprints connections based on aspects of the client and server TLS connections. A ja3 entry reflects the client and a ja3s entry reflects the server. For our ssl.log, we had these elements:

First we will look for the ja3 client fingerprint. What systems are offering the same sort of aspects of a TLS session to their servers? I omitted the server we already looked at in the following results, and showed only new information.
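As background on what a ja3 value encodes: the published JA3 method joins five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, and elliptic curve point formats) as decimal numbers, skips GREASE values, and takes the MD5 of the resulting string. The sketch below follows that description. The function names are hypothetical and the input values are illustrative, not taken from the session in this post.

```python
import hashlib


def is_grease(value):
    """GREASE values look like 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the comma-separated string that JA3 hashes."""
    def join(values, drop_grease=True):
        kept = [v for v in values if not (drop_grease and is_grease(v))]
        return "-".join(str(v) for v in kept)

    return ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats, drop_grease=False),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


# Illustrative ClientHello values only.
hello = (769, [47, 53, 5, 10], [0, 10, 11], [23, 24, 25], [0])
print(ja3_string(*hello))
print(ja3_hash(*hello))
```

The server-side ja3s value is built the same way from the ServerHello, using only the version, the single selected cipher, and the extensions.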

It looks like our host of interest, 10.10.40.48, is the only system on our network in play, but we have found two other servers to whom 10.10.40.48 communicates — 193.0.6.139 and 193.0.6.158. We could pivot on those IP addresses if we so chose. Note the new ja3s values also.
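This kind of fingerprint pivot can be sketched as a filter over ssl.log that groups clients by server. It assumes parsed ssl.log records with `ja3`, `ja3s`, `id.orig_h`, and `id.resp_h` fields. The fingerprint strings are placeholders, because the real hash values are not reproduced here; the IP addresses are the ones named in this post.

```python
def servers_for_fingerprint(ssl_records, field, value, exclude=()):
    """Map each server IP to the set of clients seen with this fingerprint."""
    seen = {}
    for record in ssl_records:
        if record.get(field) != value:
            continue
        server = record.get("id.resp_h")
        if server in exclude:
            continue  # skip servers already examined
        seen.setdefault(server, set()).add(record.get("id.orig_h"))
    return seen


CLIENT_FP = "client-fingerprint-placeholder"  # stands in for the real ja3 hash
ssl_records = [
    {"id.orig_h": "10.10.40.48", "id.resp_h": "199.7.83.80",
     "ja3": CLIENT_FP, "ja3s": "server-fp-a"},
    {"id.orig_h": "10.10.40.48", "id.resp_h": "193.0.6.139",
     "ja3": CLIENT_FP, "ja3s": "server-fp-b"},
    {"id.orig_h": "10.10.40.48", "id.resp_h": "193.0.6.158",
     "ja3": CLIENT_FP, "ja3s": "server-fp-c"},
]

result = servers_for_fingerprint(
    ssl_records, "ja3", CLIENT_FP, exclude={"199.7.83.80"}
)
for server, clients in result.items():
    print(server, sorted(clients))
```

Passing `"ja3s"` as the field and a server fingerprint as the value turns the same function into the server-side pivot.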

Now let’s look at the server side to see if any other servers offer similar TLS connection aspects to their clients. We grep for the ja3s value from the earlier ssl.log.

How interesting! It appears we have three Apple iTunes servers which use the same TLS connection aspects as those accepting connections from 10.10.40.48, and we have three unique clients connecting to each of them. This is likely normal, but interesting nevertheless. Remember that if we wanted to pivot off these results, we could pick one session and search for the connection ID. In the following example I look for the connection ID of the first of the last three results (which was bolded).

As you can see, Zeek provides a wealth of identifier-linked logs to make it possible to pull on various threads.

In this example, I was not able to determine the nature of the usage of the SHA1 certificate signing algorithm from within the Zeek logs themselves.

However, the Zeek logs provided information that I could use to do additional investigation. I have the source and destination IP addresses as well as information about the encryption certificates in play. At the very least, I have found a way to focus a microscope on a problem; I’m not stuck wondering where I should look for problems.

For example, I could simply choose to look at other Zeek logs for the odd host in question, 10.10.40.48. What other protocols does it use? To whom does it connect, and how? The Zeek dns.log could be specifically interesting. Perhaps we will turn to those in the next blog entry.
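A first pass at those questions could be a tally of the host's conn.log entries by protocol, service, and destination. The sketch assumes parsed conn.log records with `id.orig_h`, `id.resp_h`, `id.resp_p`, `proto`, and `service` fields. Only the first sample record reflects the session in this post (its `service` value is an assumption); the others are invented placeholders.

```python
from collections import Counter


def profile_host(conn_records, host):
    """Count (proto, service, server, port) tuples for connections from host."""
    peers = Counter()
    for record in conn_records:
        if record.get("id.orig_h") != host:
            continue
        key = (
            record.get("proto"),
            record.get("service"),
            record.get("id.resp_h"),
            record.get("id.resp_p"),
        )
        peers[key] += 1
    return peers


conn_records = [
    # From this post, except for the assumed service label.
    {"id.orig_h": "10.10.40.48", "id.resp_h": "199.7.83.80",
     "id.resp_p": 443, "proto": "tcp", "service": "ssl"},
    # Invented placeholders (192.0.2.0/24 is a documentation range).
    {"id.orig_h": "10.10.40.48", "id.resp_h": "192.0.2.53",
     "id.resp_p": 53, "proto": "udp", "service": "dns"},
    {"id.orig_h": "10.10.40.48", "id.resp_h": "192.0.2.53",
     "id.resp_p": 53, "proto": "udp", "service": "dns"},
    {"id.orig_h": "10.10.40.99", "id.resp_h": "192.0.2.80",
     "id.resp_p": 80, "proto": "tcp", "service": "http"},
]

for key, count in profile_host(conn_records, "10.10.40.48").most_common():
    print(count, key)
```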

This concept of using network-level data in the face of encryption to identify issues of interest is my main point, and I hope you enjoyed the review of Zeek logs along the way!

Corelight Bright Ideas Blog
