Uncover Unseen Malware Samples With No Distribute Scanners
By Daniel Hatheway and Diana Granger on June 14, 2018
Scope Note: In January 2018, Recorded Future began harvesting data from certain “no distribute scanners.” These scanners analyze files in the same way that mainstream multiscanners do, but without distributing submitted samples to antivirus engines. This piece highlights key findings from our analysis of data collected from January 1, 2018, to May 18, 2018, within the Recorded Future platform. We also walk through a use case to illustrate how real data from these sources can provide actionable intelligence. While the samples submitted to no distribute scanners and multiscanners overlap to some extent, as do their submission dates, some samples are seen by only one type of scanner. We believe that metadata from this source can assist in proactive intrusion detection.
Security researchers use multiscanners on a daily basis to run malware samples against multiple antivirus engines, as well as to hunt for similar samples, additional indicators, and the threat actors submitting these samples. If specific criteria are met, multiscanners distribute the sample to participating antivirus companies, potentially leading to more antivirus engines detecting that sample as malicious. No distribute scanners do not distribute analyzed samples to antivirus companies and are therefore appealing to threat actors. Actors receive feedback that allows them to fine-tune their product without running the risk of the malware being shared with defenders. Recorded Future metadata collected from no distribute scanners can be used to proactively alert on a file hash, research threat actors, and investigate malware, in some cases more than a month before the data is collected and distributed by traditional multiscanners.
- Threat actors are encouraging their community to use no distribute scanners over the traditional multiscanners, allowing them to increase their samples’ effectiveness by testing against antivirus engines without the risk of the samples being shared with antivirus companies and researchers.
- Seventy-five percent of the samples submitted to no distribute scanners in Recorded Future’s dataset have never been seen on traditional multiscanners.
- Alerting on data from no distribute scanners can provide useful metadata as part of a proactive defense strategy.
No Distribute Sites
Security researchers regularly use multiscanners to run potentially malicious samples against multiple antivirus engines. Most multiscanners distribute samples to all participating antivirus engines when specific criteria are met and then sell sample access to security researchers. Because of this, threat actors have been migrating away from standard multiscanners that allow paid access to the data, distribute samples, or make the free analysis results easy to obtain.
Multiscanners that do not distribute samples to antivirus companies are referred to as “no distribute scanners.” They are particularly appealing to threat actors because, in theory, the feedback they provide lets actors refine their malware to lower its detection rate or avoid detection altogether. Threat actors also tend to cite no distribute scan results on criminal forums to prove to potential buyers that their “product” is truly fully undetectable (FUD).
Due to their wealth of proactive information and ties to potential threat actors, no distribute scanners have been targeted by law enforcement and security researchers for some time. Consequently, many of these services have a short life span.
Recorded Future Collection Process
Recorded Future collects from many top-tier criminal forums, dark web markets, and other communities where threat actors interact, including URLs for no distribute scan results that are shared on these sites. Those URLs can be used to collect the metadata about the shared malware sample. This metadata will vary based on the data presented by the no distribute site, but generally speaking it will contain at least the following:
- MD5 Hash
- File Size
- Scan Date
- Scan Results
- Scan Detection Names
Custom harvesters are required for each site because every no distribute site is different. Recorded Future harvests from the most frequently used and active no distribute scanners, and adds harvesters for new no distribute scanners as they are identified.
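To make the metadata fields listed above concrete, the following is a minimal sketch of how one harvested scan result might be represented. The `ScanRecord` class, its field names, and the engine names are hypothetical and chosen for illustration; they are not Recorded Future's actual schema. The sample values are placeholders (the hash is simply the MD5 of an empty file).

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Tuple

MD5_RE = re.compile(r"[0-9a-f]{32}")


@dataclass(frozen=True)
class ScanRecord:
    """One no distribute scan result (hypothetical schema, for illustration)."""

    md5: str                                # MD5 hash of the scanned file
    file_size: int                          # file size in bytes
    scan_date: date                         # date the scan was run
    detections: Dict[str, Optional[str]]    # engine -> detection name, None if clean

    def __post_init__(self) -> None:
        # Normalize the hash so later comparisons are case-insensitive.
        normalized = self.md5.strip().lower()
        if not MD5_RE.fullmatch(normalized):
            raise ValueError(f"not a valid MD5 hash: {self.md5!r}")
        object.__setattr__(self, "md5", normalized)

    @property
    def detection_ratio(self) -> Tuple[int, int]:
        """Return (engines that flagged the file, engines that scanned it)."""
        flagged = sum(1 for name in self.detections.values() if name)
        return flagged, len(self.detections)


# Placeholder values only: not a real harvested record.
record = ScanRecord(
    md5="D41D8CD98F00B204E9800998ECF8427E",
    file_size=0,
    scan_date=date(2018, 3, 1),
    detections={"EngineA": None, "EngineB": "Trojan.Generic", "EngineC": None},
)
print(record.md5, record.detection_ratio)
```

Normalizing the hash at construction time is a design choice that keeps later lookups simple, since different sites may present hashes in different letter case.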
By the Numbers
The following statistics were derived from metadata collected by Recorded Future from no distribute sites between January 1, 2018, and May 18, 2018.
Beginning in January, we collected all of the unique MD5 hashes of samples submitted to no distribute scanners. Of these, only 25 percent can be found on at least one traditional multiscanner, while the remaining 75 percent have never been seen. Of the 25 percent detected by multiscanners, 45 percent were first seen by a no distribute scanner and 55 percent were first seen by a traditional multiscanner.
Note: It is important to point out that some of these hashes are from malware builders, meaning that when the threat actor changes his or her command and control infrastructure, a new unique hash will be created. This could account for a portion of the 75 percent of samples not seen by multiscanners.
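The percentages above can be combined to express each group as a share of all collected hashes. The short calculation below is a derivation from the figures stated in this section, not an additional statistic reported from the dataset; the variable names are illustrative.

```python
from fractions import Fraction

# Shares stated in this section, relative to all unique MD5 hashes collected.
seen_on_multiscanner = Fraction(25, 100)
never_seen_on_multiscanner = Fraction(75, 100)

# Split of the overlapping 25 percent by where the sample appeared first.
first_on_no_distribute = Fraction(45, 100)
first_on_multiscanner = Fraction(55, 100)

overlap_nds_first = seen_on_multiscanner * first_on_no_distribute
overlap_ms_first = seen_on_multiscanner * first_on_multiscanner

# Hashes seen only, or first, on a no distribute scanner.
nds_first_or_only = never_seen_on_multiscanner + overlap_nds_first


def pct(share: Fraction) -> str:
    """Format a share of the whole as a percentage string."""
    return f"{float(share * 100):.2f}%"


print("First seen on a no distribute scanner, later on a multiscanner:", pct(overlap_nds_first))
print("First seen on a traditional multiscanner:", pct(overlap_ms_first))
print("Seen only or first on a no distribute scanner:", pct(nds_first_or_only))
```

Under these figures, roughly 11 percent of all collected hashes appeared on a no distribute scanner before reaching a multiscanner, and about 86 percent appeared there either first or exclusively. The caveat about malware builders in the note above applies to this derived share as well.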
As mentioned earlier, threat actors often rely on no distribute scanners as a way to ensure that malware they have created or acquired will go undetected by antivirus software. Vendors will usually include a link to the scan results in advertisements for their malware as proof of the detection rate. Recorded Future’s harvesters allow indexing and alerting on the hashes available from these no distribute scanner links. This metadata can reveal additional details about a malware variant that were previously unavailable from forum posts alone.
Tracking a Threat Actor and His or Her Malware
As an example, Insikt Group examined activity from kent9876, one of the top contributing actors and a member of multiple dark web forums. Below is part of an advertisement by kent9876 for cryptocurrency mining malware Goldig Miner:
The above post contains several links to no distribute scan results from VirusCheckMate and Run4Me for different components of Goldig Miner. The actor provides results for the same files from multiple scanning sites in order to capture verdicts from as many antivirus engines as possible, as multiscanners differ on which antivirus engines they encompass. The advertisement was first posted on December 18, 2017, and the last known forum post by kent9876 about Goldig Miner was on February 9, 2018. While this was the end of forum mentions of Goldig Miner, the cryptocurrency miner resurfaced on March 1, 2018, when its files were reanalyzed on VirusCheckMate and Run4Me with the file hash of b7b203a5455a34a109cec61b44d10b38.
As the timeline above shows, the first analysis of a sample related to Goldig Miner by traditional multiscanners occurred on April 11, 2018, over a month after being analyzed by the no distribute scanners. The fact that the samples continue to be submitted to multiscanners for analysis indicates that the malware is likely still in use and thus could be worth monitoring.
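The lead time in this example can be checked with simple date arithmetic. The sketch below uses only the two dates given above; the variable names are illustrative.

```python
from datetime import date

# Dates taken from the Goldig Miner timeline described above.
no_distribute_scan = date(2018, 3, 1)         # reanalysis on VirusCheckMate and Run4Me
first_multiscanner_scan = date(2018, 4, 11)   # first analysis by traditional multiscanners

# Subtracting two dates yields a timedelta; .days gives the gap in whole days.
lead_time = first_multiscanner_scan - no_distribute_scan
print(f"Lead time: {lead_time.days} days")
```

The difference is 41 days, which is the head start a defender alerting on the no distribute hash would have had in this case.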
SIEM and Alerting
Alerting on hashes from no distribute sources can provide a proactive approach when monitoring your network. In most cases, this involves importing the feed into a SIEM (security information and event management) system. The feed from this data source is relatively small and thus manageable. Additionally, most of these hashes originate from criminal forums and are consequently worth monitoring.
In the case of Goldig Miner, an organization utilizing this feed would have been alerted to the cryptocurrency miner before most security controls detected it. This would allow its incident response team to address the issue without waiting for a detection from traditional security controls.
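As a rough illustration of the matching step a SIEM rule would perform, the sketch below compares file hashes observed on endpoints against a watchlist built from the feed. The feed layout (one MD5 per line, with `#` comments), the event field names `host` and `file_md5`, and the host names are all assumptions made for this example; real SIEM products expose their own import and correlation mechanisms. The watchlist entry is the Goldig Miner hash cited earlier in this article.

```python
from typing import Dict, Iterable, List, Set


def normalize_md5(value: str) -> str:
    """Lowercase and trim a hash so comparisons ignore formatting differences."""
    return value.strip().lower()


def build_watchlist(feed_lines: Iterable[str]) -> Set[str]:
    """Build a set of hashes from feed lines, skipping blanks and '#' comments."""
    watchlist = set()
    for line in feed_lines:
        line = line.strip()
        if line and not line.startswith("#"):
            watchlist.add(normalize_md5(line))
    return watchlist


def match_events(events: Iterable[Dict[str, str]], watchlist: Set[str]) -> List[Dict[str, str]]:
    """Return the events whose file hash appears in the watchlist."""
    return [e for e in events if normalize_md5(e.get("file_md5", "")) in watchlist]


# Assumed feed format: one MD5 hash per line.
feed = [
    "# hashes harvested from no distribute scan links",
    "b7b203a5455a34a109cec61b44d10b38",
]

# Hypothetical endpoint events; host names and the second hash are placeholders.
events = [
    {"host": "workstation-01", "file_md5": "B7B203A5455A34A109CEC61B44D10B38"},
    {"host": "workstation-02", "file_md5": "d41d8cd98f00b204e9800998ecf8427e"},
]

alerts = match_events(events, build_watchlist(feed))
for alert in alerts:
    print(f"ALERT: {alert['host']} observed watchlisted hash {alert['file_md5']}")
```

Because the feed is relatively small, holding it in an in-memory set as shown here is a reasonable starting point; set membership checks stay fast even as hashes are added.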
With the increasing availability of malicious code to threat actors, it is to our advantage to mine their data sources to give defenders every possible chance to be proactive. Much like the dark web and underground forums from which Recorded Future harvests data, no distribute scanners present an opportunity to do just that; we know that threat actors are sharing and selling samples within their online communities and that they are using sources like no distribute scanners to validate their samples.
We feel that this particular source (hashes from no distribute sites that originated on criminal forums) can allow our customers to be more proactive when identifying threats or researching threat actors. Recorded Future will continue to harvest full metadata from its current collection of no distribute scanners and will add more in the future to take full advantage of this opportunity to mine threat actors’ data sources and give customers a way to proactively alert on this data.