Last week’s RSA announcements included a pair of new entrants into the SIEM space, Google Chronicle’s Backstory and Microsoft’s Azure Sentinel. While the entry of larger players into the SIEM space is an eyebrow-raiser on its own, in conjunction with the existing competitive fray it is pretty amazing. The good news is that this level of competitive intensity is a very good thing for customers and defenders. That said, it is worth looking at the main angles of innovation that are playing out across all the form factors (on-prem, MSSP, and SaaS) … and the elephant in the room that goes with them:
- Ecosystem: Under the “decentralized innovation” theme, Splunk (and more recently PANW) have focused on creating a range of complementary analytics solutions to help get the most out of the aggregated data (and acquiring a few of them as well).
- Analytics: Most notably, players like Exabeam have put a real premium on novel analytics focused on key IR issues. In the more recent announcements, Google Chronicle touts not only their internally developed analytics but also a real focus on query speed (the phrase “coffee break query” is unfortunately an industry term at this point).
- Pricing: While Splunk had tremendous success with their “buy as you go” pricing model, it is no secret that it is a struggle for infosec budgets now. Elastic and Humio offer structurally different alternatives, and Chronicle’s Backstory announcement squarely focused on this issue as well.
What’s missing in this discussion? The DATA ITSELF. As any data scientist will tell you, the best tools in the world are accelerated (or limited!) by the data. Furthermore, getting the data “right” is the most time-consuming part of many data-intensive projects … and the SOC is one big data analysis project. In talking to customers, I’ve seen three key trends that underscore how important the data is to the success of defenders using any of these technologies:
- Security Data Science teams: In the large enterprise, we are seeing true data science teams (in many cases seeded with folks from other internal data science efforts) being staffed for security. This helps defenders up their game, and use the same spread of analytics tools that the attackers are taking advantage of already.
- Career Development: Even when full data science teams aren’t being staffed, I see defenders taking their own steps through classes – most commonly in Python analytics frameworks like scikit-learn or TensorFlow. For the same reasons as above, this is a great step and an unqualified positive for both the career of the individual and the defensive posture of the organization.
- Post-processing: As defenders use those data analytics skills, they often work to improve, augment, or customize the data in their environment. This often starts by getting *really* good at data joins in their SIEM, but can extend to tools like Kafka Streams or full ETL-style post-processing environments.
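As a rough illustration of the post-processing trend above, the sketch below augments log records with a derived field. The record layout, the field names (`src_ip`, `dest_ip`), and the `direction` label are all assumptions made for this example, not a format taken from any particular SIEM.

```python
import ipaddress

def is_internal(ip: str) -> bool:
    """Return True if Python's ipaddress module classifies the address as private."""
    return ipaddress.ip_address(ip).is_private

def add_direction(record: dict) -> dict:
    """Return a copy of the record with a hypothetical 'direction' field added."""
    src_internal = is_internal(record["src_ip"])
    dst_internal = is_internal(record["dest_ip"])
    if src_internal and dst_internal:
        direction = "internal"
    elif src_internal:
        direction = "outbound"
    elif dst_internal:
        direction = "inbound"
    else:
        direction = "external"
    return {**record, "direction": direction}

# Made-up sample records for illustration only.
records = [
    {"src_ip": "10.0.0.5", "dest_ip": "93.184.216.34"},
    {"src_ip": "10.0.0.5", "dest_ip": "10.0.0.9"},
]
enriched = [add_direction(r) for r in records]
```

The same kind of function could sit inside a streaming or ETL job; the point is only that a small, well-understood enrichment step can make downstream queries simpler.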
All three of these often result in teams looking for an alternative to the “by-product data” they have today. What does that mean? Most of the logs in the SOC were never meant for large-scale security analytics … they are operational or alerting logs from a protection or detection technology. This search for better data often leads defenders and data scientists to Corelight (based on the open source Zeek project, formerly known as Bro), because it has:
- Security Effectiveness: Because Corelight is deployed passively behind a TAP, you benefit from a fast and non-disruptive deployment that gives very broad environment coverage. Just as importantly, the data itself is highly compact so organizations can cost-effectively keep data for years of coverage, not weeks or months.
- Native structure: Speaking of which, Corelight’s data spans dozens of protocols, and results from 20 years of evolution focused on the needs of incident responders and threat hunters – enabling great insight into everything from behavioral movement to encrypted traffic to extracted files. The key is that it is all linked with a common identifier, allowing both analysts and machines to deterministically connect what used to be isolated pools of insight (breaking a historical SIEM problem of “raw vs. normalized format” tradeoffs). This saves people time (less chair swiveling!) and dramatically streamlines data science and analytics work.
- Extensibility: Zeek was built from the start to be changed and improved, both to create new data types and derive new insights from the data already there. Defenders throughout the open source community already take advantage of this, as (a) fixing the data is often far easier at the source than with heroic post-processing and (b) without the right incoming signal, no amount of post-processing will ever succeed.
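To make the common-identifier point concrete, here is a minimal sketch of linking connection records to HTTP records through a shared ID. Zeek logs carry a per-connection `uid` field; the sample records, their values, and the helper name `link_by_uid` are made up for illustration.

```python
from collections import defaultdict

# Made-up sample records; real logs carry many more fields.
conn_log = [
    {"uid": "CaBc12", "id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.7", "duration": 1.2},
    {"uid": "CxYz34", "id.orig_h": "10.0.0.8", "id.resp_h": "198.51.100.2", "duration": 0.3},
]
http_log = [
    {"uid": "CaBc12", "method": "GET", "host": "example.com", "uri": "/a"},
    {"uid": "CaBc12", "method": "POST", "host": "example.com", "uri": "/b"},
]

def link_by_uid(conns, others):
    """Attach every record in `others` to the connection that shares its uid."""
    by_uid = defaultdict(list)
    for rec in others:
        by_uid[rec["uid"]].append(rec)
    # .get() avoids creating empty entries for connections with no match.
    return [{**c, "http": by_uid.get(c["uid"], [])} for c in conns]

linked = link_by_uid(conn_log, http_log)
```

Because the join key is exact, no fuzzy matching on timestamps or address pairs is needed, which is what makes the connection deterministic for both analysts and automated pipelines.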
In the end, the increased competition in the SIEM space (including next-generation SIEM) is a great thing for people and organizations charged with defending networks and information, and we at Corelight are happy to partner with all of them. No matter which technology you are using today (or considering tomorrow) for your SOC to remediate critical security-related outcomes, come check out Corelight. Getting the right data from the start accelerates almost everything in your IR process, from tools to people. That’s why we believe Corelight is your next best move in security. Put succinctly in the words of one of our customers, “If I didn’t have this data I wouldn’t sleep well at night. I like to sleep well at night.”