Running DeepSeek AI privately using open-source software

Introduction

Zeek® is a powerful open-source network analysis tool that allows users to monitor traffic and detect malicious activity. Users can write packages to detect cybersecurity events, such as this GitHub repo, which detects command-and-control (C2) traffic from AgentTesla (a well-known malware family).

Automating summarization and documentation using AI is often helpful when analyzing Zeek packages. Instead of relying on external cloud-based services like the DeepSeek app, which poses potential privacy risks, we can run the deepseek-r1 large language model (LLM) locally on our machine using Ollama. This article demonstrates how to summarize Zeek package contents using Ollama and Open WebUI privately.

Why run DeepSeek locally?

While cloud-based AI models provide convenience, they introduce serious privacy concerns, especially when handling sensitive data like Zeek network monitoring scripts. If you analyze your closed-source Zeek scripts using the DeepSeek AI app, you may be exposing your intellectual property and detection techniques to potential adversaries.

Several reports have highlighted the privacy risks and potential data leaks associated with cloud-based AI applications like DeepSeek:

"DeepSeek App Transmits Sensitive User and Device Data Without Encryption" – This article highlights how DeepSeek's iOS app sends sensitive data over the internet without encryption, making it vulnerable to interception and manipulation.
"Experts Flag Security, Privacy Risks in DeepSeek AI App" – Security experts have identified serious privacy concerns, including extensive data collection and potential device fingerprinting, which could lead to user deanonymization.
"Wiz Research Uncovers Exposed DeepSeek Database Leaking Sensitive Information" – Researchers discovered a publicly accessible database belonging to DeepSeek that contained chat history, secret keys, and other sensitive information, and exposed user data at scale.

Running deepseek-r1 locally with Ollama offers several advantages:

Enhanced security: No data is sent to the cloud, which eliminates exposure to potential breaches.
Full control: You control the model version, parameters, and data, ensuring your information remains private.
Offline capability: AI summarization remains available even in air-gapped environments, making it ideal for secure workflows.

Setting up Ollama and Open WebUI for local processing

Ollama provides an easy-to-use way to run large language models locally. Open WebUI offers a user-friendly interface for interacting with these models. Follow these steps to set up a secure, local AI-powered Zeek summarization system:

1. Install Ollama

You can use the following command line to install Ollama:

curl -fsSL https://ollama.ai/install.sh | sh

Once installed, start the Ollama service (if systemd has not already started it automatically):

ollama serve

If you use a Mac, you can install Ollama via Homebrew instead:

brew install ollama
brew services start ollama

2. Install Open WebUI

Next, we need to install Open WebUI, a web-based front end for Ollama:

pip install open-webui

You can start open-webui with:

open-webui serve

Open WebUI will be available at http://localhost:8080. Create your default admin account and log in.

3. Download the deepseek-r1 model locally

Next, we will download the deepseek-r1 model into Ollama with the following command:

ollama pull deepseek-r1:14b

The 14b tag selects the 14-billion-parameter variant of the model. Once the download completes, all processing happens on your machine inside Ollama, and no prompts are sent to external servers.
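To confirm that the model is in place without leaving the machine, we can query Ollama's local REST API. The following is a minimal sketch, assuming Ollama is listening on its default address (http://localhost:11434) and that its /api/tags endpoint lists downloaded models under a "models" key. The helper names are illustrative and not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def model_names(tags: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m.get("name", "") for m in tags.get("models", [])]


def has_model(tags: dict, wanted: str) -> bool:
    """Return True if the wanted model tag is in the local model list."""
    return wanted in model_names(tags)


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """Ask the local Ollama server which models it has downloaded."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def main() -> None:
    """Report whether the model pulled above is available locally."""
    tags = fetch_tags()
    print("deepseek-r1:14b available:", has_model(tags, "deepseek-r1:14b"))
```

Calling main() while Ollama is running should report whether deepseek-r1:14b is present. The request goes to localhost only.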

Summarize the Zeek Code

Now select the deepseek-r1 model in Open WebUI and enter the following prompt, which includes the source code of the AgentTesla detector:

    
     Summarize this Zeek (fka Bro) package:

```
signature agenttesla-ftp-data {
    ip-proto == tcp
    payload /^Time:.*<br>User Name:.*<br>Computer Name:.*/
    eval AgentTesla::agenttesla_ftp_match    
}

signature agenttesla-generic {
    ip-proto == tcp
    payload /.+Time:.*<br>User Name:.*<br>Computer Name:.*/
    eval AgentTesla::agenttesla_match
}

signature agenttesla-http {
    ip-proto == tcp
    payload /^POST .*\x0d\x0a\x0d\x0ap=([A-Za-z0-9\/]|%2B){4}{75,}((([A-Za-z0-9\/]|%2B){3}=)|(([A-Za-z0-9\/]|%2B){2}==))?/
    eval AgentTesla::agenttesla_http_match    
}
```
and

```
module AgentTesla;

export {
	## The notice when AgentTesla C2 is observed.
	redef enum Notice::Type += { C2_Traffic_Observed };
}

# Common logging function
function logit(c: connection, over_what: string, data: string) 
	{
	local msg = fmt("Potential AgentTesla C2 %swith payload in the sub field.", over_what);

	# Do not suppress notices.
	NOTICE([$note=AgentTesla::C2_Traffic_Observed, $msg=msg, $sub=data,
		$conn=c]);
	}

# Signature match function for FTP
function agenttesla_ftp_match(state: signature_state, data: string): bool &is_used
	{
	logit(state$conn, "over FTP data ", data);

	return T;
	}

# Signature match function for SMTP/Generic
function agenttesla_match(state: signature_state, data: string): bool &is_used
	{
	logit(state$conn, "", data);

	return T;
	}

# Signature match function for HTTP
function agenttesla_http_match(state: signature_state, data: string): bool &is_used
	{
	logit(state$conn, "over HTTP ", data);

	return T;
	}
```

The deepseek-r1 model provides the following output:

    
     <think>
Okay, so I'm looking at this Zeek script to summarize. There are a couple of parts here: some signatures and a module.

First, there are three signatures defined:
1. agenttesla-ftp-data: Looks for TCP traffic where the payload starts with "Time:", followed by "User Name:" and "Computer Name:". It uses a regex pattern to match this.
2. agenttesla-generic: Similar to the FTP one but with a more flexible regex, matching any payload that includes those fields.
3. agenttesla-http: Looks for HTTP POST requests with specific patterns in the payload, including encoded characters like %2B.

Then there's the module AgentTesla which exports a custom notice type C2_Traffic_Observed. It has functions to log potential C2 traffic:
- logit: Creates a message and logs it with the specified note.
- Three match functions (ftp_match, generic_match, http_match) that call logit with appropriate messages indicating the protocol.

Overall, the package is designed to detect AgentTesla's C2 communication by monitoring for specific patterns across different protocols. When a pattern matches, it logs a notice about potential malicious activity.

I think I understand how each part works individually, but I need to make sure I connect them in my summary to show how they all work together to detect AgentTesla.
</think>

### Summary of Zeek (fka Bro) Package for Detecting AgentTesla C2 Traffic:

The provided Zeek package is designed to identify potential Command and Control (C2) traffic associated with the AgentTesla malware. It consists of three signature definitions and a module that handles logging and detection logic.

#### Key Components:
1. **Signatures**:
   - **agenttesla-ftp-data**: Monitors TCP traffic for specific FTP data payloads containing "Time:", "User Name:", and "Computer Name:" fields separated by `<br>` tags.
   - **agenttesla-generic**: A more flexible signature that detects any TCP payload containing the same C2-related fields, using a broader regex pattern.
   - **agenttesla-http**: Detects HTTP POST requests with payloads matching known patterns of AgentTesla's C2 communication, including encoded characters.

2. **Module (AgentTesla)**:
   - Defines a custom notice type `C2_Traffic_Observed` to log potential malicious activity.
   - Includes logging functions (`logit`, and protocol-specific match functions: `agenttesla_ftp_match`, `agenttesla_match`, `agenttesla_http_match`) that trigger notices when suspicious traffic is detected.

#### Functionality:
- The package employs a multi-protocol approach (FTP, generic TCP, HTTP) to detect various forms of C2 communication.
- When a matching pattern is identified, the corresponding function logs a notice with details about the detected activity.

### Summary:
The Zeek package provides comprehensive detection mechanisms for AgentTesla's C2 traffic by analyzing different types of network communications. It effectively combines protocol-specific signatures with logging functions to alert on potential malicious activities.

This model begins with <think> … </think> tags to document its thought process. The content after these tags is the model’s summary of the AgentTesla detector.
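If the output is processed by a script rather than read in Open WebUI, the reasoning can be separated from the summary. The following is a minimal sketch, assuming the output contains at most one think block delimited exactly as shown above. The function name split_reasoning is illustrative.

```python
import re

# Matches a <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def split_reasoning(output: str) -> tuple[str, str]:
    """Return (reasoning, summary) from raw model output."""
    match = THINK_BLOCK.search(output)
    if match is None:
        # No think block: treat the whole output as the summary.
        return "", output.strip()
    # Drop the opening and closing tags from the matched block.
    reasoning = match.group(0)[len("<think>"):-len("</think>")].strip()
    # Everything outside the think block is the summary.
    summary = (output[:match.start()] + output[match.end():]).strip()
    return reasoning, summary
```

Keeping the reasoning rather than discarding it can be useful when reviewing why the model described a signature the way it did.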

You can see that the model accurately determined that there were three Zeek signatures to look for AgentTesla C2 in FTP, SMTP (generic), and HTTP. Furthermore, the model’s output highlights the phrases the detector looks for, such as "Time:", "User Name:", and "Computer Name:". The model also sees that the package fires a Zeek notice when potential AgentTesla C2 is discovered.

Security considerations:

In this scenario, we analyzed an open-source AgentTesla detector, but what if the package were closed source? If the detection technique is proprietary, querying the DeepSeek app would send the package content to DeepSeek's operators. There is no guarantee that they would not share the data or use it to train newer models. Worse, if the DeepSeek API were compromised, you could be sending your Zeek detection logic directly to adversaries.

By running Ollama and Open WebUI locally, we keep our detection logic on our own computer, significantly reducing the risk of data leaks and unauthorized exposure.
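For scripted use, the same idea can be enforced in code. The following is a minimal sketch, assuming Ollama's default address (http://localhost:11434) and its /api/generate endpoint with a non-streaming request. The helper names and the loopback check are illustrative additions, not part of Ollama or Open WebUI.

```python
import ipaddress
import json
import urllib.request
from urllib.parse import urlparse


def is_loopback(url: str) -> bool:
    """Return True only if the URL's host is localhost or a loopback IP."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # Any other hostname is treated as remote.


def build_prompt(sources: list[str]) -> str:
    """Join package sources into one prompt, mirroring the prompt above."""
    header = "Summarize this Zeek (fka Bro) package:\n\n"
    return header + "\n\nand\n\n".join(sources)


def summarize(sources: list[str],
              base_url: str = "http://localhost:11434",
              model: str = "deepseek-r1:14b") -> str:
    """Send the prompt to a local Ollama server and return its reply."""
    if not is_loopback(base_url):
        raise ValueError(f"refusing to send detection logic to {base_url}")
    body = json.dumps({"model": model,
                       "prompt": build_prompt(sources),
                       "stream": False}).encode()
    req = urllib.request.Request(f"{base_url}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

Here sources would be the contents of the package's signature and script files, read from disk. The check is a guard against a misconfigured URL, not a substitute for network-level controls.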

Conclusion

The risks of using cloud-based AI applications like DeepSeek for cybersecurity-related code analysis are too significant to ignore. By running deepseek-r1 locally with Ollama and Open WebUI, security analysts maintain control over their data, reduce privacy risks, and ensure their sensitive information is not exposed to third parties. This method provides a secure, efficient, and privacy-preserving way to analyze Zeek scripts while eliminating reliance on untrusted external AI services.

Postscript: If you are interested in monitoring your network with the AgentTesla detector, it is already installed on Corelight sensors. You can also use zkg to install it into your open-source Zeek installation!
