How Metadata Enables FINRA Data Archiving Compliance | Corelight

Written by Richard Chitamitre | Mar 7, 2025 7:58:25 PM

The financial industry is known for its rigorous and sometimes quirky data retention requirements that can challenge even the most seasoned security expert.

For example, FINRA Rule 4511 requires members to "preserve for a period of at least six years those FINRA books and records for which there is no specified period under the FINRA rules or applicable Exchange Act rules."

Keeping six years of records: That's no small feat. But it's certainly doable.

This is the true story of how one firm used Zeek®, the open-source network monitoring system that forms the backbone of Corelight’s enterprise Open NDR Platform, to address its data archiving challenge. The firm used Zeek network data logs to assist in the capture and preservation of all business-related electronic communications, including emails, instant messages, and chat logs, ensuring compliance with the rule's recordkeeping mandate. This action gave them the ability to monitor activity in these systems. It also confirmed which data was retained, simplifying auditing and facilitating its ongoing compliance with FINRA Rule 3110. As a residual benefit, the firm improved its process of retaining security logs from involved systems and network monitors.

NetFlow and PCAP costs tapped out the budget

The firm had an existing NetFlow and PCAP system and considered expanding its use to capture book and record transactions. They had a pretty good idea of the answer beforehand, but did their due diligence and calculated the costs of expanding the existing platform:

  • Daily logs: At 10 Gbps sustained over an 8-hour day, the total raw traffic would equal 36 TB. Expansion would raise the rate by an order of magnitude, to 100 Gbps, which would equate to 360 TB of raw data captured on disk per day (these numbers are worst-case scenarios for sustained traffic).

  • Storage cost: For around $100,000 they could cover a week of storage with a backup, but scaling to a full year would cost well over $5 million, not including additional expenditures for rack space, power, and cooling for numerous storage appliances.
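The figures above can be reproduced with a short calculation. This is only a sketch: the function name is illustrative, decimal units (1 TB = 1,000 GB) are assumed, and the yearly cost assumes the quoted weekly price scales linearly, which the firm's actual quote may not have done.

```python
def raw_capture_tb(rate_gbps: float, hours: float) -> float:
    """Raw traffic volume in decimal terabytes for a sustained rate."""
    seconds = hours * 3600
    gigabits = rate_gbps * seconds
    gigabytes = gigabits / 8   # 8 bits per byte
    return gigabytes / 1000    # 1 TB = 1000 GB (decimal units)


current_tb = raw_capture_tb(10, 8)    # existing 10 Gbps platform
expanded_tb = raw_capture_tb(100, 8)  # expanded 100 Gbps platform

# Assumption: cost scales linearly from roughly $100,000 per week of storage.
WEEKLY_COST_USD = 100_000
yearly_cost_usd = WEEKLY_COST_USD * 52

print(f"{current_tb:.0f} TB/day, {expanded_tb:.0f} TB/day, ${yearly_cost_usd:,}/year")
```

Under these assumptions the result is 36 TB and 360 TB per day, and about $5.2 million for a year of storage, before rack space, power, and cooling.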

When the firm spec'd out the platform, it anticipated being able to retain seven days of live PCAP and NetFlow before copying and archiving some PCAP data over to a NAS for application analysis. In practice, it managed only four days of live retention before the data was overwritten. The PCAP system also wasn't capturing all network traffic: It captured only traffic on certain ports.

So, in order to expand the limited capture to just a full month’s worth of live record retention, the upgrade was going to cost over $7 million. This solution was too expensive, unsustainable, and out of the question.

Metadata for the win

Fortunately, the firm had personnel with operational experience implementing Zeek. Using Zeek and the detailed metadata it logs, the firm bridged the gap in its network security monitoring platform. With it, the firm could have 60 days’ worth of network traffic indexed live.

Here is what the Zeek solution’s numbers looked like by comparison:

  • Daily logs: For an 8-hour day with sustained traffic of 3 Gbps, the total data would amount to approximately 10.8 TB/day (considering the worst-case scenario). The Zeek logs generated from this traffic measure about 1/100th the size but contain 100 times more information than NetFlow.

  • Final data footprint: 10.8 TB/day of traffic resulted in 108 GB per day of Zeek logs. For a month of approximately 23 business days, the firm would be looking at ~2.484 TB/month, and for a full 30 days, ~3.24 TB/month (these numbers are for live indexed data).
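The Zeek numbers follow from the same arithmetic. The sketch below assumes decimal units and treats the "about 1/100th" log-to-traffic ratio as an exact factor of 100; the function and variable names are illustrative only.

```python
def daily_traffic_gb(rate_gbps: float, hours: float) -> float:
    """Raw traffic in decimal gigabytes for a sustained rate."""
    return rate_gbps * hours * 3600 / 8  # gigabits -> gigabytes


# Assumption: Zeek logs are about 1/100th the size of the raw traffic.
REDUCTION_FACTOR = 100

traffic_gb = daily_traffic_gb(3, 8)            # 3 Gbps over an 8-hour day
zeek_gb_per_day = traffic_gb / REDUCTION_FACTOR

business_month_tb = zeek_gb_per_day * 23 / 1000  # ~23 business days
full_month_tb = zeek_gb_per_day * 30 / 1000      # full 30 days

print(f"{traffic_gb / 1000:.1f} TB/day raw, {zeek_gb_per_day:.0f} GB/day of logs")
print(f"{business_month_tb:.3f} TB per business month, {full_month_tb:.2f} TB per 30 days")
```

Changing `REDUCTION_FACTOR` shows how sensitive the live-index footprint is to the actual log-to-traffic ratio, which varies with the traffic mix.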

And here’s how the firm met its long-term data archival goals and complied with FINRA’s six-year lookback requirement:

  • Archiving these logs would reduce their size on disk by approximately 80%: 108 GB/day * 0.2 = 21.6 GB/day. Storage for a year would be 21.6 GB * 365 = ~7.884 TB. For 10 years of storage, far more than the 6 years FINRA requires, they would need only ~80 TB of disk storage.
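The archive sizing can be checked the same way, starting from the 108 GB per day of Zeek logs. This is a sketch with illustrative names; it assumes decimal units and treats the approximately 80% reduction as exact.

```python
DAILY_LOG_GB = 108        # live Zeek logs per day, from the figures above
ARCHIVE_REDUCTION = 0.80  # assumption: archives are ~80% smaller on disk

archived_gb_per_day = DAILY_LOG_GB * (1 - ARCHIVE_REDUCTION)
yearly_tb = archived_gb_per_day * 365 / 1000
six_year_tb = yearly_tb * 6   # FINRA's six-year retention period
ten_year_tb = yearly_tb * 10  # the article's ten-year estimate

print(f"{archived_gb_per_day:.1f} GB/day archived")
print(f"{yearly_tb:.3f} TB/year, {six_year_tb:.1f} TB for 6 years, {ten_year_tb:.1f} TB for 10 years")
```

Under these assumptions, six years of archives come to roughly 47 TB, and ten years to just under 80 TB.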

Problem solved.

Why is having all this data so important anyway?

It isn't feasible for a security expert to threat hunt or respond to incidents with a limited amount of PCAP covering only a few ports and protocols. When defenders are protecting networks from exploitation, breaches, and data loss, they require detailed evidence that builds a comprehensive story of what occurred. When a detection for a new zero-day is released, you can detect it going forward. But if you don't have historical data to check against the indicators of compromise (IoCs), you can't be sure your network hasn’t already been compromised.
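As a rough illustration of that kind of lookback, the sketch below scans connection records for known-bad IP addresses. It is not the firm's tooling. It assumes the Zeek conn logs were written as JSON, one record per line, and it uses the standard conn.log field names `id.orig_h` and `id.resp_h`. The function name, sample records, and IoC address (taken from a documentation-only range) are hypothetical.

```python
import json
from typing import Iterable, Iterator


def find_ioc_hits(lines: Iterable[str], bad_ips: set[str]) -> Iterator[dict]:
    """Yield conn records whose originator or responder IP is a known IoC."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("id.orig_h") in bad_ips or record.get("id.resp_h") in bad_ips:
            yield record


# Hypothetical sample records standing in for an archived conn log.
sample_log = [
    '{"ts": 1700000000.0, "uid": "C1", "id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.7", "id.resp_p": 443}',
    '{"ts": 1700000050.0, "uid": "C2", "id.orig_h": "10.0.0.9", "id.resp_h": "198.51.100.20", "id.resp_p": 53}',
]

hits = list(find_ioc_hits(sample_log, {"203.0.113.7"}))
for hit in hits:
    print(hit["ts"], hit["uid"], hit["id.orig_h"], "->", hit["id.resp_h"])
```

Because the function accepts any iterable of lines, the same code could be pointed at a compressed archive opened with `gzip.open(path, "rt")`, where `path` is whatever location the archives actually live in.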

Scenarios like these form a compelling case for security experts in the financial industry to leverage the capabilities of Corelight’s Open NDR Platform. With its Smart PCAP technology, Corelight enables long-term data storage and retrospective investigation and simplifies the process of complying with regulatory mandates.

It also provides security teams and forensic analysts with invaluable lookback capabilities that assist them in monitoring and defending their networks.

Corelight does not provide legal or compliance advice. You are responsible for making your own assessment of whether your use of the Corelight offerings meets applicable legal and regulatory requirements.