Episode 7 - Practical AI for Zeek, MITRE, and Security Docs
Welcome to Corelight Defenders. I'm Richard Bejtlich, strategist and author in residence at Corelight. In each episode, we explore insights from the front lines of NDR, network detection and response. Today, I am speaking with Dr. Keith Jones, principal security researcher at Corelight. Welcome, Keith. Hey, thanks for having me. I am really pleased to have you on the podcast, Keith. You've been a friend of mine now for over 20 years, back to when we were incident response consultants at Foundstone. We also wrote a book on digital forensics along with
Curtis Rose. And just taking a look at your bio, I was reminded that you are a legitimate licensed private pilot, which I am so impressed by. I was in the Air Force, but never flew anything on my own. I couldn't even handle a glider. So I'm really happy to talk with you today. Thanks. Yeah, I haven't done private pilot flying in a while, but I actually went and got my drone certificate, so I've been able to do that. "No nonsense" is one of the phrases that comes to mind when I think of you. So when I saw that you had been doing a lot of blogging about using artificial intelligence, I knew I had to ask you about this, because you're not gonna buy into the hype. The most recent topic you talked about was using LLMs for, I guess, cleaning up documentation for some of the code you were writing. Could you tell me a little bit about that? ChatGPT and its kind came out, and I saw the hype on it. I originally was like, "Ah, it's not gonna be all that great." And then everybody's like, "You gotta try it. You gotta try it." So
I finally jumped on it, and I was like, "Hey, this is kind of neat." And then I started asking it more and more complex questions, and I was more and more surprised at its ability to answer things that you couldn't just answer with a Python script. I code a lot, so when I've coded in the past with just plain old Python, it's very binary,
I guess, in a way, where you're parsing things and it's either a true or a false. And once I figured out that you can use LLMs to sort of give you the fuzzy middle, meaning you can kind of ask it questions and get concepts and things back rather than just "this thing exists" or "this doesn't exist," I was like, "All right, hold on a minute." It's kind of like you have a person, like an assistant, where you can go to it and say, "Hey, look at this data and tell me if you see something like this." You can kind of be vague with it, which is something you couldn't do well with standard coding, like a Python script. The topic you brought up about the formatting: one of the blog articles that I wrote for Corelight a few weeks back was, let's say you're a technical person like myself, and you hate writing documentation, like myself, and you write some great programs, right? And you gotta make a README or documentation like this, and you write it out, and you look at it and you're like, "Man, I wrote a bunch of it in first person, I switched tense from past to present," and all the stuff you can imagine when you're writing.
Well, what I did is I figured out how to automate a large language model to sort of give you a virtual editor, and I attached the model to Microsoft's style guide. I'm saying "style guide" very generically, meaning it's not just for source code, it's for general writing. I wrote some code, walked up to the LLM, and said, "Take this stuff that I wrote that might not be all that great, and apply Microsoft's style guide to it," and it comes back and says, "All right, well, here's all your violations." And I said, "Hey, why don't you try to fix my violations?" Hm. And then it does, and then basically I just keep iterating: it says how many violations are left, and maybe you started with 60 and now you're down to hopefully 20, and then you do the same thing again and hopefully you're down to 10 or five, and at the end you're down to one or zero if you iterated enough.
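A rough sketch of that loop in Python, assuming a local Ollama model and a plain-text copy of the style guidance; the file names, model name, and prompt wording here are illustrative stand-ins, not the exact tool from the blog post:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint
MODEL = "llama3.1"                                   # any model you have pulled locally

def ask(prompt: str) -> str:
    """Send one prompt to the local model and return its reply."""
    resp = requests.post(
        OLLAMA_URL, json={"model": MODEL, "prompt": prompt, "stream": False}
    )
    resp.raise_for_status()
    return resp.json()["response"]

style_guide = open("style_guide.txt").read()  # your copy of the writing guidance
draft = open("README.md").read()              # the document being cleaned up

for round_num in range(1, 6):  # a few passes is usually enough for the count to drop
    violations = ask(
        f"Using this style guide:\n{style_guide}\n\n"
        f"List every violation you find in this document:\n{draft}"
    )
    print(f"Round {round_num} violations:\n{violations}\n")
    draft = ask(
        "Rewrite the document to fix these violations and change nothing else.\n"
        f"Violations:\n{violations}\n\nDocument:\n{draft}"
    )

open("README.cleaned.md", "w").write(draft)
```

The loop simply repeats the "find violations, then fix them" exchange until the count stops dropping, which mirrors the 60-to-20-to-zero progression described above.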
And I use LLMs all the time now to write READMEs. That's sometimes the hardest thing to do, writing a README for an open source tool that you wrote. Using something like Cursor, I can just say, "Hey, look at my code and give me a README to start with," and then I can edit it to my liking after that. Now, I know you taught yourself all of this stuff years ago. You've been at Corelight now for over six years. I'm not sure what experience you had with Zeek as a coder prior to that, but now you're doing a lot. On the research team, you're developing new capabilities that use Zeek. What's been your experience with how well these LLMs can generate Zeek script? It varies. So I do a lot of coding in Python to do data analysis, and I do a lot of coding in Zeek for network detections, but I also do a lot of coding in Spicy, which sits just under Zeek and lets you develop new protocol analyzers.
So for instance, if Zeek doesn't support MQTT, you could actually write an analyzer in Spicy and all of a sudden Zeek will be able to understand MQTT. So I've been using LLMs for over a year, and I can remember back at the beginning, I would go to something like ChatGPT, give it Zeek code, and ask it questions, and it did all right. If I gave it some Zeek code and said, "Can you tell me what this does?" or asked a question about it, it seemed to be able to understand the Zeek code you gave it.
Hm. Where I was running into a lot of problems, and it wasn't just ChatGPT, it was the offline models and so forth, is that Zeek is probably, I would say, not as well known as Python. So I think the training data reflects that in a lot of the models. And so I would have some very simple Zeek questions I would ask a model just to get an idea of whether it could understand
Zeek or not, and one of them was, "Could you write code to tell me when the domain corelight.com shows up in SSL?" Hm. Sounds simple to do, but I get varying answers. Anything from, "Hey, you're gonna use this really cool event and it'll give me all this great output," and you read it and you're like, "Wow, that sounds really convincing." And then you go to the Zeek documentation and you're like, "This event doesn't even exist." Right. So it's just making up stuff for you.
And that's kind of how it was at the beginning with me, where I was like, "Oh man, it's really good at Python, but some of the Zeek code, it really murders." And then in Cursor, where I code now, it deals with Zeek really well, surprisingly well. So are you providing it with some Zeek scripts, and you tell it, "Here's a Zeek script that does this, try to understand it, and now I want to create a new one that does something else"? Or does it already pretty well understand the format? It just understands it. So if I'm in there and I say...
If I ask the same question, "Tell me when corelight.com goes across in SSL," it'll do things like, "Oh, I know that there's this event such and such, and I need to do a regular expression across the server name that comes across in that event."
And it'll construct it. A lot of times, these LLMs, since they're trained on very, very large datasets, they come up with solutions that I never would have thought of. Where I'll be like, "Hey, I want to iterate over this thing." And it's like, "Here's a way to iterate over it." And I was like, "I've never done that before. That's really sweet." Mm-hmm.
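For listeners who want to run the same experiment, here is a small Python sketch of that workflow against a local Ollama model: ask the SNI question, print the answer, then pull out every event handler the model used so each name can be checked against the Zeek documentation, since invented event names are the usual tell for a hallucinated answer. The endpoint is Ollama's default; the model name and the regular expression are illustrative.

```python
import re
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

question = (
    "Write a Zeek script that reports when the domain corelight.com "
    "appears in the SNI of an SSL/TLS connection."
)

resp = requests.post(
    OLLAMA_URL,
    json={"model": "llama3.1", "prompt": question, "stream": False},
)
resp.raise_for_status()
answer = resp.json()["response"]
print(answer)

# List every event handler in the generated script so a human can confirm
# each one actually exists at docs.zeek.org before trusting the answer.
for event_name in sorted(set(re.findall(r"event\s+([\w:]+)\s*\(", answer))):
    print(f"verify in the Zeek docs: {event_name}")
```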
So the LLM has a hand in a lot of the coding I do, anywhere from "generate me something from scratch," to "understand something that I wrote," to "please fix something that I wrote." One of the research projects that I'm working on right now, and unfortunately I don't have any open source code that I can point you to, but I can talk about it a little bit, is this: I have logic, okay? Imagine a Zeek package, right? And I want to know what MITRE techniques this package detects. Hmm. It sounds easy, and a year ago when I started working on MITRE technique stuff, I was like, "Oh, this shouldn't be that hard." And I started doing it as a human, where I picked up MITRE's definition, then I looked at our source code, and I would come up with my own analysis and either say, "Yep, it detects it," or, "Nope, it doesn't."
And there are times when I would do that, but then also go to something like Gemini and ask it to tell me what it thinks. And it would come back with connections to MITRE. What it would do is, I would feed it the actual MITRE wording, the definition that MITRE has on their website for each technique, and then I would give it the source code. Because it would read both of those, and it has knowledge of general security stuff, it was picking out things that I wouldn't have thought to put into the analysis, and they were totally valid. But I don't just take the output and go, "It's right." That's a pitfall I think a lot of people fall into, where they just go to an LLM, ask a question, and go, "Well, this thing must be right." And
I try in every way possible, when I generate data, to go through it as a human. Because let's say I generate 10 files. I might have one or two files that I'm not happy with, where the output might hallucinate, or it may get caught on some detail that I just don't want it to focus on.
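A minimal sketch of that kind of pipeline, assuming a local Ollama model, a directory of MITRE ATT&CK technique descriptions saved as text files, and a Zeek package to assess; the paths, model name, and prompt wording are hypothetical stand-ins, not the internal research code:

```python
from pathlib import Path
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint
MODEL = "llama3.1"                                   # illustrative model choice

# Hypothetical inputs: one text file per ATT&CK technique containing MITRE's
# own wording, plus the source of the Zeek package being assessed.
technique_dir = Path("attack_definitions")           # e.g. T1071.001.txt, T1571.txt, ...
package_source = Path("my_package/main.zeek").read_text()

out_dir = Path("analysis")
out_dir.mkdir(exist_ok=True)

for definition_file in sorted(technique_dir.glob("*.txt")):
    technique_id = definition_file.stem
    prompt = (
        f"MITRE ATT&CK technique {technique_id} is defined as follows:\n"
        f"{definition_file.read_text()}\n\n"
        f"Here is a Zeek package:\n{package_source}\n\n"
        "Does this package detect the technique? Answer yes or no, then cite "
        "the specific parts of the code that support your answer."
    )
    resp = requests.post(
        OLLAMA_URL, json={"model": MODEL, "prompt": prompt, "stream": False}
    )
    resp.raise_for_status()
    # One file per technique, so a human can read every answer and catch
    # hallucinations before anything is treated as a real mapping.
    (out_dir / f"{technique_id}.txt").write_text(resp.json()["response"])
```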
So, what I typically do is I'll use LLMs to do all the heavy lifting for me, and my output will be a bunch of files, and then I'll open up a text editor and go read every single one just to make sure a hallucination hasn't slipped in there. Let's say there's a listener who hasn't tried this yet. What would be the easiest on-ramp? If you haven't played with it, go to something like Gemini, which is probably free for you if you have a Google account, and play with Gemini Pro. That's the model; you'll see it up at the top, or you can use the drop-down. There's a Pro and there's a Flash. The Pro model is smarter and takes longer to think through your problems, and it's good for things like writing source code. So you could go to Gemini and say, "Hey, write me some Zeek code that detects corelight.com in an SSL connection," and see what it supplies. If you're developing and you wanna branch out and learn or deal with other languages, I recommend using something like Cursor, because there have been libraries, for instance, where I don't understand the full library, but I kind of understand it, and then I go to Cursor and say, "I want to use this library to do X," and it'll just do it for you. If that's the type of thing you're doing with coding, I recommend Cursor. Let's say you have some sensitive data you don't want going to any of these online services. I recommend setting up something like Ollama on your computer, and if you have 32 to
64 gigs of RAM, in that range, you should be able to run some of the smaller models, and you can play with it there. And Ollama has a GUI front end called Open WebUI. What it does is it runs on top of
Ollama and gives you kind of a ChatGPT-looking interface where you can drag and drop files, and you can make knowledge bases in there. For instance, one of the other blog articles that
I wrote for Corelight was, I took all of our Corelight sensor documentation, which ended up being over a thousand pages, I put it into Open WebUI and it indexed it, and then I could ask it questions like, "What packages detect DNS anomalies?"
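Open WebUI handles the chunking, indexing, and retrieval itself, but for a rough idea of what that kind of documentation question looks like without any index at all, a crude version is just stuffing an excerpt of the docs into the prompt for a local model. The file name and model name below are illustrative assumptions, not part of the setup described above:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

# A crude, index-free stand-in for what a knowledge base does: paste a chunk
# of documentation directly into the prompt and ask the question against it.
docs_excerpt = open("sensor_docs_excerpt.txt").read()  # hypothetical local excerpt

prompt = (
    f"Here is an excerpt from our sensor documentation:\n{docs_excerpt}\n\n"
    "Based only on this text, what packages detect DNS anomalies? "
    "If the text does not say, answer that it does not say."
)

resp = requests.post(
    OLLAMA_URL,
    json={"model": "llama3.1", "prompt": prompt, "stream": False},
)
resp.raise_for_status()
print(resp.json()["response"])
```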
And Open WebUI will go through, pull the pages of our documentation, and give you an LLM-augmented answer for it. Do you use a GPU, like an NVIDIA GPU that's gonna provide you with some VRAM to run these, or are you using CPU only?
So I wish I had an NVIDIA. I have Apple stuff, so it's not the same. Apple has unified memory, which is normal memory plus the GPU memory. So if you have an Apple laptop with 32 gigs, you can theoretically get up to 32 gigs worth of GPU or just regular
RAM; it's shared. You have your operating system in there, and that'll be several gigs, so it's not gonna be quite 32. Let's say it's like 21 to 25 gigs left over. Ollama will go, "I'm on an Apple machine, which has this weird unified GPU memory," and it switches into that GPU mode. It's faster than CPU, but it's not an NVIDIA. It's like you see people on YouTube and they're like, "I'm gonna generate this thing with Stable Diffusion on my NVIDIA card," and it takes seconds. Usually Apple is more in the range of maybe minutes if something takes seconds on NVIDIA. But if you did it with CPU, it would be maybe close to an hour. Yeah. So you do get a benefit with that hardware, but it's not gonna be the same jump you'd get going from CPU to an NVIDIA on an Intel machine. Yeah, that's interesting, right? One of my kids is in college, well, both of my kids are in college, but the older one has been working in a biology lab, and it turns out her laptop was the only one among the students there that had an NVIDIA GPU. It's just a 3050, but it was enough to run the local model that they needed to run. So she was asking me these questions about how to set up different Python stuff, and it got well beyond what I was able to do, but one of her classmates was able to help, and lo and behold, they could run the model locally on the laptop. Here's this laptop that she bought to run, you know, Sims 4 or whatever, and it ended up doing some work for school. Yeah. When I,
I was there with her. About a year ago when I needed a GPU, we had to look around, and GPU cards were scarce and really expensive, so I ended up going with the Mac mini with 64 gigs of RAM, which isn't the fastest, but it has so much RAM in it that I can push models into it that I can't put on my laptop, which... Mm-hmm. ... is pretty cool. So instead of using the 20-some-odd gigs that are free on my laptop,
I'm able to put in like 50+ gigs worth of model. Wow. Yeah. I feel like Corelight needs to get us some of those little, what are those, DGX Sparks or whatever NVIDIA's putting out now?
Yeah, they're starting to make the rounds. I think those would be fun to play with. Cool. Well, Keith, this has been a really interesting conversation. I am definitely gonna have you back, because I've basically scratched the surface of the notes that I had here that
I wanted to talk to you about. So thank you very much for being a guest on the... What are we now? I used to say the Corelight Podcast, and now I think we are the Network Defenders, with the NDR at the end. So, thank you, Keith. Yeah, thanks for having me. And I'll give you a link that you can put in the podcast notes if you want. I've published, open source, a bunch of LLM tools that I've used for different things. For instance, that Microsoft formatting guide we talked about earlier? I have that as open source, so you can actually use it. I'll give you the link to the Corelight repo on GitHub, and you can include it, and that'll hopefully give somebody something to start with. Yeah, that sounds good. I'll make sure that goes into the notes. And readers, or listeners, if you just go to the Corelight blog and look for Keith J. Jones, it's all there. You can search by his name. That's one of the ways I found a lot of Keith's stuff. Cool. Thank you, Keith. Hey, thank you for having me. Thank you for joining us on the Network Defenders podcast, sponsored by Corelight. We will see you on the network.
You've been listening to Corelight Defenders. To stay informed with expert intelligence on today's cybersecurity challenges, please subscribe to ensure you never miss an episode. We'll see you on the network.