What Lies Beneath: Protecting Your Data From Dark Web Denizens

February 6, 2018 • The Recorded Future Team

Editor’s Note: The following blog post is a summary of a presentation from RFUN 2017 featuring First Data’s Christopher Mascaro and Recorded Future’s Andrei Barysevich.

Key Takeaways

  • The dark web does not refer to a discrete part of the internet, but rather to websites that can only be accessed through browsers that provide encryption and anonymity, such as Tor.
  • Dark web traffic is the sort of content that demands anonymity, like stolen financial records, drugs, child pornography, or politically sensitive information.
  • The sale of stolen personally identifiable information (PII) is a growing industry on the dark web. Breaches of information from major organizations happen regularly, as evidenced recently by the Equifax breach.
  • Incident response requires visibility into the dark web and human-machine teaming to sort through data smartly and quickly.
  • When you know you have a breach, the best action is to simply invalidate the data.

Your life is probably for sale on the internet, but don’t go rushing out to buy it back. That’s one of the points that Christopher Mascaro of First Data and Andrei Barysevich of Recorded Future made during their presentation at Recorded Future’s recent annual user conference. In their talk, they discussed what exactly the parts of the internet referred to as the dark web are, and looked at the details of how compromised financial information gets there and goes up for sale. They also emphasized that quick responses to data breaches take both a strong human element and a good degree of automation in your organization. Finally, they said that even if you manage to track down your data for sale, the best thing to do is not to buy it all back before someone else does, but to invalidate that data on your end.

What Is the Dark Web?

There are some common misconceptions about what the dark web actually is and how it relates to the other parts of the internet. The part of the web that we are most familiar with is sometimes referred to as the “surface web,” which includes all the information that is indexed by search engines. If you can get to it through a Google search, it’s part of the surface web. Although this includes most of the websites that people visit regularly — Facebook, Wikipedia, YouTube, most news websites, and so on — according to some estimates, the surface web only comprises about four percent of the internet in terms of data.

The majority of information on the internet is not indexed by search engines, but can only be accessed through secure logins or paywalls. This “deep web,” which makes up about 90 percent of the data on the internet, comprises information like medical records, subscription information, legal documents, scientific and academic reports, financial records, government files, and other kinds of databases that are confidential or otherwise secure.

The dark web is what lies beneath, making up the remaining six percent of data on the internet. Websites on the dark web are only accessible through certain browsers, like Tor, that provide encryption and anonymity. Much of the content on the dark web involves illegal activity — for example, drug trafficking or the distribution of child pornography — but it also provides a safe venue for people who are engaging in less questionable activity but still require anonymity, like those engaged in political protests or private communications. Exploits and vulnerabilities in programs are also frequently discussed and exchanged on the dark web.

Barysevich notes that a frequent misconception about the dark web is that it is a holistic and discrete part of the internet, and that getting access to it is “like one key that opens every single door” where “you can find any information, which is neatly organized, available to anyone.” This is not the case. The dark web is an ecosystem divided into categories defined by language, skill set, experience, and reputation, and gaining access to one part of it will not necessarily grant access to another part, just as subscribing to one news outlet online will not thereby provide you with a subscription to all the news on the internet. Further, Mascaro says, communication is typically conducted through encrypted messengers, making it nearly impossible for any organization short of a major government to effectively intercept those communications and leaving an incomplete picture of what information is exchanged on the dark web.

Identities for Sale

A top commodity on the marketplaces of the dark web is stolen personally identifiable information (PII for short), which refers to any data that can be used to identify, contact, or locate a person. Think social security numbers or usernames and passwords. Access to sites that sell PII typically requires a paid subscription, and a seller’s reputation plays a major role in determining prices for stolen PII, with the biggest factor being how consistently usable (or how “fresh”) the information sold is.

One of the most commonly trafficked PIIs on the dark web is credit card information, which is often stolen in bulk from banks or corporations and then sold for anywhere between $5 and $20 per number. During the recent Equifax breach, for example, more than 200,000 credit card numbers were stolen.

Shifting Tactics

Mascaro notes that in recent times, the nature of breaches has been shifting toward targeting e-commerce transactions. Until recently, many attacks involved malware targeting point-of-sale (POS) systems. POS terminals are simply computerized cash registers: the computerization makes it easier to track inventory and customer information, but it also leaves the terminals vulnerable to cyberattacks. POS systems have become less desirable targets for hackers, however, following the introduction of EMV (Europay, Mastercard, and Visa) chip technology, which is now a globally used standard for financial security. This chip technology generates a unique transaction code each time the card is used, making transactions far more secure than those that rely on the magnetic stripe on credit cards. The information held in those mag stripes does not change, making counterfeiting a relatively simple task on old cards that lack EMV chips.
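The difference between unchanging stripe data and a per-transaction code can be illustrated with a toy sketch. This is not how EMV cryptograms are actually computed: the HMAC construction, the counter, the key, and the function names below are all assumptions chosen only to show why a captured code is of little use for a second transaction.

```python
import hashlib
import hmac


def static_stripe_data(card_number: str) -> str:
    """Magnetic stripe model: the same data is presented on every swipe."""
    return card_number


def chip_transaction_code(card_key: bytes, counter: int, amount_cents: int) -> str:
    """Toy per-transaction code: an HMAC over a counter and the amount.

    Hypothetical stand-in only; real EMV cards use different algorithms and inputs.
    """
    message = f"{counter}:{amount_cents}".encode()
    return hmac.new(card_key, message, hashlib.sha256).hexdigest()[:16]


# Hypothetical secret that never leaves the chip.
card_key = b"hypothetical-secret-stored-in-chip"

# Two purchases of the same amount with the same card.
swipes = [static_stripe_data("4111111111111111") for _ in range(2)]
codes = [chip_transaction_code(card_key, counter, 2500) for counter in (1, 2)]

print("stripe data identical:", swipes[0] == swipes[1])
print("chip codes identical:", codes[0] == codes[1])
```

In this sketch the stripe data is identical across both purchases, so a copy of it can be replayed, while the two chip codes differ because the counter advances with each use.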

The introduction of EMV chips means the easier targets are now the databases that organizations compile through e-commerce. Rather than insert a chip, online shoppers input their name, address, date of birth, the credit card number, and so on, providing vast amounts of personally identifiable information. Malware on those sites can redirect the users without their knowledge or simply take the information as it is transferred.

Mascaro says that the exchange of these kinds of credentials, which allow for the takeover of a financial account, is the biggest thing that his team has begun to see in recent years. With that information, a malicious actor might simply change the withdrawal limits on an account or create a new card and walk away with thousands or millions of dollars in a weekend.

It’s worth noting the size of the breaches — they’re actually getting smaller in many cases, says Mascaro. In the past, breaches might have led to tens or hundreds of millions of accounts becoming compromised. But those large breaches, though potentially lucrative, are also far more noticeable. The latest breaches target smaller companies, or just parts of a company’s data, making it harder to identify the source of the breach. Smaller merchants — for example, doctor’s offices or hospitals — often also have data with a higher validity, meaning that it will be able to be used for fraudulent purposes more consistently, and thus command a higher price on the dark web.

Mascaro also notes that there is a growing number of what he calls “integrator breaches,” which occur at the level of installation of POS systems and other forms of payment. That lets malicious actors gather information from a variety of merchants and have a smaller but more diverse batch of data, again, making it harder to identify the source of a breach.

Defensive Measures

In discussing how organizations can defend against their data being stolen and trafficked on the dark web, Mascaro and Barysevich kept things sober. The sheer number of hours needed to actually review the data collected each day means that some level of automation is required; Barysevich noted that Recorded Future had collected over 200,000 events from its sources on the dark web in the previous 24 hours alone. But it is impossible, and undesirable, to entirely remove the human element from this process.

Getting alerts about data found on the dark web, of course, means that the data is already out there — stolen. The basic lines of defense have failed, and the task now is to resolve the problem as quickly as possible. That may imply that automation will be of the greatest value in terms of responding quickly, but Barysevich says that “when it comes to human-derived intelligence, when we need to talk to the bad guy, when we need to evaluate certain events … there is no other way around but to actually engage real human beings and to evaluate the problem.”

The goal, then, Barysevich says, is neither to manually review every single alert out of those thousands that come in each day — that would be a task of impossibly large scale — nor to automate the process entirely, but to “create smart alerts … to lower this number to a fairly manageable level.”
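One simple way to picture a “smart alert” is as a filter that passes along only the collected events that mention something the organization cares about. The sketch below is a minimal illustration under that assumption; the `Event` type, the watchlist terms, and the sample events are all hypothetical, and the talk does not describe how Recorded Future implements alerting.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str  # where the event was collected (hypothetical labels)
    text: str    # the collected post or listing text


# Hypothetical watchlist: a brand name and a card-number prefix of interest.
WATCHLIST = ("examplebank", "411111")


def smart_alerts(events, watchlist=WATCHLIST):
    """Return only the events whose text mentions a watchlist term (case-insensitive)."""
    terms = [term.lower() for term in watchlist]
    return [e for e in events if any(term in e.text.lower() for term in terms)]


events = [
    Event("forum-a", "Selling fresh dumps, BIN 411111, high validity"),
    Event("forum-b", "Looking for a logo designer"),
    Event("market-c", "ExampleBank logins with email access"),
]

alerts = smart_alerts(events)
print(f"{len(alerts)} of {len(events)} events need human review")
```

The automated step only narrows the volume; deciding whether each remaining alert reflects a real breach is still left to an analyst.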

That means having a presence on the dark web as well — not necessarily interacting directly with criminals, but having some idea of what is going on in those spaces. What data is being traded? What vulnerabilities are being discussed? Gathering that information before it is exploited provides a critical head start on incident response.

There is a delicate balance at play here, however. One thing you should not do, Barysevich notes, is buy back all of your stolen information. According to him, when an organization has the resources to detect a breach and then find the stolen information being peddled in some corner of the dark web, clients will sometimes ask whether it is feasible to simply buy that information back instead of letting it fall into the hands of other malicious actors. This may well be feasible from an economic perspective: a bank or other large organization might believe it is better to spend a few million dollars right then and there than to deal with the potentially much greater repercussions of letting that information slip out into the world. But doing so actually disturbs the balance, creating an even greater incentive for criminals to steal more information. Individual buyers on the dark web might typically purchase something on the scale of a few dozen accounts in one go. Buy several thousand or a million at once, though, and the sellers will know they’re onto something good.

The best response is to work as quickly as possible to invalidate the data. What is abundantly clear is that the sale of personal data is a thriving economy which, because of the encrypted nature and anonymity of the dark web, and its reliance on cryptocurrencies like Bitcoin, is difficult to shut down by directly targeting the buyers and sellers. The line of weakness, then, is the goods themselves. As mentioned previously, sellers depend on a good reputation to get repeat customers. Ruining the product by changing credentials, passwords, and so on destroys that reputation, and doing so effectively means acting as quickly as possible, before that data is bought and used for malicious purposes.
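As a rough illustration of what invalidating the data might look like for payment cards, the sketch below cancels every active card whose number appears in a leaked batch. The record layout, the status values, and the function name are assumptions made for the example, not a description of any issuer’s actual process.

```python
def invalidate_leaked_cards(active_cards: dict, leaked_numbers) -> list:
    """Mark each active card found in the leak as cancelled and return those numbers."""
    cancelled = []
    for number in leaked_numbers:
        card = active_cards.get(number)
        if card is not None and card["status"] == "active":
            # Hypothetical status meaning: card is dead, a replacement is queued.
            card["status"] = "cancelled_reissue"
            cancelled.append(number)
    return cancelled


# Hypothetical issuer records keyed by card number.
active_cards = {
    "4111111111111111": {"holder": "A. Customer", "status": "active"},
    "5500000000000004": {"holder": "B. Customer", "status": "active"},
}

# Numbers seen for sale; the last one does not belong to this issuer.
leaked = ["4111111111111111", "340000000000009"]

cancelled = invalidate_leaked_cards(active_cards, leaked)
print("cancelled:", cancelled)
```

Once the matching cards are cancelled, the batch being sold no longer works for buyers, which is what undermines the seller’s reputation.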