April 23, 2019 by Gregory Bell
Last week, a candidate for a senior role at Corelight explained his motivation for joining the company this way: “the world is standardizing on Zeek.”
And it’s true. The Zeek network security monitoring platform, created by leading researcher and Corelight co-founder Vern Paxson, is having its moment. Thousands of organizations worldwide have deployed this powerful open-source software (formerly called Bro). If you register for SANS training on incident response or threat hunting, you will likely learn these topics through the lens of Zeek data. Corelight’s Richard Bejtlich has written the book on network security monitoring, and he frequently blogs and tweets about Zeek. A number of companies now use Zeek under the hood in their products, as we do at Corelight – where we also steward development of the open-source codebase. In the past six months, two European Zeek workshops have quickly sold out.
So why is the world standardizing on Zeek at this moment?
I think there are several reasons. First, data science is currently transforming information security. Better outcomes require better data, highly structured and analytically rich, as Corelight Chief Product Officer Brian Dye explains. Second, the threat landscape remains daunting. Dwell time for attacks is decreasing, but it’s still measured in months. Organizations with highly skilled adversaries (national labs, federal agencies) were the first to adopt Zeek, and for good reason. Today, global enterprises face similar risks as they confront persistent threats to intellectual property, reputation, and financial assets. They need better tools, and Zeek is the definitive power tool for network traffic analysis. And finally, organizations simply cannot afford to keep legacy data types (such as PCAP) for months at a time. They need compact but information-rich data, optimized specifically for security investigation and analytics.
The adoption curve for Zeek is now so strong that some vendors are riding the wave of popularity by assembling whatever data they can provide into something they call ‘Zeek format’. This is a significant endorsement of the standardization trend I’m describing, even if their data might lack much of what Zeek provides.
Let’s be clear, though: calling Zeek a ‘data format’ is like calling the Taj Mahal a ‘building with 120 rooms’ – factually true, but missing the point.
Zeek is much more than a data format. First and foremost, it’s a vibrant open source community. Anyone can contribute code, and many have. Innovation is ‘permissionless’, as the saying goes. The rich, interlinked structure of Zeek data is the result of ongoing dialog between Zeek developers and working incident responders. That data structure evolves over time as network protocols are created and updated, better defensive techniques emerge, and better ideas find their way into the code. Data from Zeek is not like a fixed IEEE standard. Instead, it reflects current consensus about what blue teams need, and it will continue to get better over time.
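To make the “interlinked structure” concrete, here is a minimal Python sketch of how an analyst might join two Zeek logs on their shared connection identifier. It assumes the default tab-separated log layout with a `#fields` header line and a `uid` column in both conn.log and http.log. The sample records, addresses, and the helper names `parse_zeek_tsv` and `join_by_uid` are invented for illustration and are not part of Zeek itself.

```python
from collections import defaultdict

# Made-up sample logs in Zeek's default TSV layout (assumption: tab separator).
CONN_LOG = "\n".join([
    "#separator \\x09",
    "#unset_field\t-",
    "#path\tconn",
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tid.resp_p\tproto\tservice",
    "1555977600.000000\tCAbc123\t10.0.0.5\t192.0.2.10\t80\ttcp\thttp",
    "1555977601.000000\tCDef456\t10.0.0.6\t192.0.2.53\t53\tudp\t-",
])

HTTP_LOG = "\n".join([
    "#separator \\x09",
    "#unset_field\t-",
    "#path\thttp",
    "#fields\tts\tuid\thost\turi",
    "1555977600.100000\tCAbc123\texample.com\t/index.html",
])


def parse_zeek_tsv(text):
    """Parse a Zeek TSV log into a list of dicts keyed by the #fields header."""
    fields = []
    unset = "-"
    rows = []
    for line in text.splitlines():
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split("\t")
            if parts[0] == "#fields":
                fields = parts[1:]
            elif parts[0] == "#unset_field" and len(parts) > 1:
                unset = parts[1]
            continue  # other header lines are ignored in this sketch
        values = line.split("\t")
        # Map unset values (by default "-") to None.
        rows.append({k: (None if v == unset else v)
                     for k, v in zip(fields, values)})
    return rows


def join_by_uid(conn_rows, http_rows):
    """Attach each HTTP request to the connection that carried it."""
    http_by_uid = defaultdict(list)
    for row in http_rows:
        http_by_uid[row["uid"]].append(row)
    joined = []
    for conn in conn_rows:
        for http in http_by_uid.get(conn["uid"], []):
            joined.append({
                "uid": conn["uid"],
                "orig_h": conn["id.orig_h"],
                "resp_h": conn["id.resp_h"],
                "host": http["host"],
                "uri": http["uri"],
            })
    return joined


if __name__ == "__main__":
    for record in join_by_uid(parse_zeek_tsv(CONN_LOG), parse_zeek_tsv(HTTP_LOG)):
        print(record)
```

The same pivot on `uid` should work for other protocol logs that carry the field, which is a large part of what makes the data more than a flat format.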
Second, Zeek is extensible in every respect. Recently one Corelight customer, a government agency, asked if our product could do something open-source Zeek doesn’t do by default – preserve and pass through VLAN tags in a number of logs. We could implement this quickly because Zeek was designed, from the beginning, with extensibility in mind. As a Zeek user, you too have the same kind of control. Don’t like the default format of a given log? That’s easy to change. Want to add new fields that aren’t in the default logs? No problem. Need a new protocol parser? You can write your own, or you can find collaborators in the community. Want to incorporate external intelligence feeds into Zeek data? Of course you can. Interested in file extraction and analysis? There’s a powerful framework for that already. Want Zeek to talk to your network equipment using native CLI or OpenFlow? There’s a powerful framework for that, too. Because Zeek was created by people obsessed with openness and flexibility, the only limit is your imagination.
Third, Zeek is an analytics platform. I’ve always thought this is the really special part. After you’ve customized Zeek data to suit your needs (if you want to), integrated intel feeds, enabled file extraction… you can write scripts and packages that consume Zeek ‘events,’ and do truly interesting work. Or, if you don’t want to learn to write them yourself using community resources, you can try one of the freely available packages contributed by community members. In fact, a new Zeek Package Manager makes installing and managing them a whole lot easier.
Let me provide two examples of the power of the Zeek package ecosystem. Just a couple of weeks ago, MITRE generously released a set of scripts designed to implement detections based on their well-known ATT&CK framework (itself becoming a community standard). I’m certain these scripts will be widely deployed, improved by the community, and updated as ATT&CK evolves to reflect the changing threat landscape. Another good example is the set of tools contributed by Salesforce.com for analyzing encrypted data flows: JA3, JA3S, and HASSH. The community has created apps for other purposes too, mostly focused on using Zeek’s greatest strength (behavioral detection) in the service of network defense. In a nutshell, the Zeek community is in the process of crowd-sourcing a library of powerful behavioral detections, for the benefit of all defenders. And that’s pretty cool!
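For readers curious about how a fingerprint like JA3 works, the following Python sketch shows the general idea as publicly documented for JA3: five fields from the TLS ClientHello (version, cipher suites, extensions, elliptic curves, point formats) are written as decimal values, joined with dashes and commas, and hashed with MD5, with GREASE placeholder values left out. The real package runs as a Zeek script. The function names and input values here are hypothetical and only illustrate the technique.

```python
import hashlib

# GREASE placeholder values (0x0a0a, 0x1a1a, ..., 0xfafa), skipped when
# building the fingerprint string.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def _join(values, drop_grease=True):
    """Dash-join decimal values, optionally skipping GREASE entries."""
    kept = [v for v in values if not (drop_grease and v in GREASE)]
    return "-".join(str(v) for v in kept)


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the comma-separated string that the fingerprint hash is taken over."""
    return ",".join([
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        _join(point_formats, drop_grease=False),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Return the MD5 hex digest of the fingerprint string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Illustrative ClientHello values, not taken from a real capture.
    args = (769, [0x0A0A, 47, 53], [0, 10, 11], [23, 24], [0])
    print(ja3_string(*args))
    print(ja3_hash(*args))
```

Because the hash depends only on how a client builds its handshake, two connections from the same client software tend to share a fingerprint even though the payload is encrypted. That is the kind of behavioral signal these packages surface.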
I hope I’ve explained some of the reasons the world is getting excited about Zeek. This excitement is not just about a data format… it’s also about the openness and creativity that comes from a community of like-minded developers and defenders.
If you don’t already participate in the Zeek community, I invite you to join the mailing list, follow Zeek and Corelight on Twitter, monitor community blog posts, read the docs, run the software at home, ask questions, and generally get involved. And hold the date for ZeekWeek 2019: October 8-10, in Seattle! I’d love to see you there.