November 10, 2021 by Corelight Labs Team
On its surface, CVE-2021-42292 doesn’t look like the kind of vulnerability that a network-based tool can find reliably. Marked by Microsoft as a local file format vulnerability, security veterans would expect that between encryption and encoding, there would be a million different ways to evade network detection with a weaponized exploit.
Digging a bit deeper, however, this specific vulnerability presents an opportunity to be detected through a behavior you might not even realize is present in Excel: its ability to download other files directly, without user interaction. Specifically, this exploit relies on Excel fetching a second Excel file, which then contains the malicious code.
So how do you detect that across the network? It boils down to monitoring User-Agents and MIME types.
The first thing the Corelight Labs team looked at while investigating this issue was whether Excel even has a sufficiently identifiable User-Agent string - the HTTP field used to identify the software making the request. While there are many variants depending on specific versions, the theme of either the combination of “Microsoft Office” plus “Excel,” or “Mozilla” plus “ms-office,” emerged quickly - and was validated with plenty of real-world examples when we began digging in through our Polaris research systems.
MIME types - declarations by the web server about the type of content coming down - are thankfully a much more standardized affair. There’s an official list maintained by IANA. Several versions of
application/vnd.ms-excel are obvious there; we’re also looking for versions of
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, which Mozilla’s list of common MIME types calls out as being in use for .XLSX files.
The important question, once we had established that it was possible to detect Excel downloading another Excel file, is whether that happens frequently enough in normal operations that exploits would be lost in the noise. Going back through the past 30 days’ worth of data across the Polaris systems, we looked at all of the times we found Excel downloading files, which happened thousands of times per day at some of our large sites, typically with plain text, HTML, or images. The good news was that Excel downloading Excel was exceptionally rare: it happened at most in 0.03% of such downloads, with some sites showing no sign of it at all.
One frequent false positive that we did see during our investigation was SharePoint. Since that application helpfully declares itself with a header of “MICROSOFTSHAREPOINTTEAMSERVICES,” however, it’s simple to exclude those cases. Advanced SOCs concerned about this vector could easily remove this check, however, and rely on post-processing to filter out legitimate usage of that tool.
We have published Zeek code to detect the potential exploitation of this vulnerability on Github.
The main source code file can be understood as follows:
Finally, when every HTTP session has ended, lines 48-58 check the flags that we added to the associated HTTP::Info record to see if our detection flags have been set, and a Notice is generated if a possible exploit of CVE-2021-42292 has been detected.
In addition to the above Zeek script, we’re releasing equivalent Suricata rules in the same Github repository. This allows users to be flexible about how they deploy this detection, both for Corelight customers and open source users.
Since Suricata is not session-oriented like Zeek, but is instead packet stream oriented, we need signatures for both the request and response packets in order to create detection. Since Suricata is also fastest when it has unique strings to feed its fast pattern matching engine, we break up the two possible User-Agent strings into a pair of rules:
alert http $HOME_NET any -> any any (msg:"CORELIGHT MS Excel User-Agent flowbit set 1"; flow:established,to_server; http.user_agent; content:"Microsoft Office"; nocase; content:"Excel"; nocase; distance:0; pcre:"/^Microsoft Office.*?Excel/Vi"; flowbits:set,CORELIGHT.excel.ua; flowbits:noalert; classtype:misc-activity; sid:3000011; rev:1; metadata:created_at_2021_11_10, udpated_at 2021_11_10;)
alert http $HOME_NET any -> any any (msg:"CORELIGHT MS Excel User-Agent flowbit set 2"; flow:established,to_server; http.user_agent; content:"Mozilla"; nocase; content:"ms-office"; nocase; distance:0; pcre:"/^Mozilla.*?ms-office/Vi"; flowbits:set,CORELIGHT.excel.ua; flowbits:noalert; classtype:misc-activity; sid:3000012; rev:1; metadata:created_at_2021_11_10, udpated_at 2021_11_10;)
The key piece to note here is the flowbit clauses. By setting a bit here - and then telling Suricata to not generate an alert for these rules - they become a prerequisite for the alert that we can check in our following rules:
alert http any any -> $HOME_NET any (msg:"CORELIGHT MS Excel downloading Excel - potential exploit (CVE-2021-42292)"; flow:established,to_client; flowbits:isset,CORELIGHT.excel.ua; http.content_type; content:"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"; nocase; http.header; content:!"MICROSOFTSHAREPOINTTEAMSERVICES"; nocase; reference:cve,2021-42292; classtype:attempted-user; sid:3000013; rev:1; metadata:created_at_2021_11_10, udpated_at 2021_11_10;)
alert http any any -> $HOME_NET any (msg:"CORELIGHT MS Excel downloading Excel - potential exploit (CVE-2021-42292)"; flow:established,to_client; flowbits:isset,CORELIGHT.excel.ua; http.content_type; content:"application/vnd.ms-excel"; nocase; http.header; content:!"MICROSOFTSHAREPOINTTEAMSERVICES"; nocase; flowbits:unset,CORELIGHT.excel.ua; reference:cve,2021-42292; classtype:attempted-user; sid:3000014; rev:1; metadata:created_at_2021_11_10, udpated_at 2021_11_10;)
These rules check for the previously set flowbits, and then validate that the appropriate MIME type has come down; the negated content match, denoted by the “!” in front of the string, rules out SharePoint services per the false positive detailed above.
Detecting the exploit this way is not going to be completely free of false positives, but with all of the context of the transaction provided by Zeek, weeding out the noise should be straightforward. An Excel file being loaded from a corporate Intranet site or another department of a university is much less interesting, for example, than one being loaded from a rarely-accessed domain, especially if that domain is suspicious. Given the near impossibility of detecting this exploit directly over the network, secondary detection combined with smart filtering is the best way for defenders to ensure that they can see this attack.
By Corelight Labs Team