Chasing Risky Internet Business

August 28, 2017 • Amanda McKeon

As security professionals, we’re relied upon to protect our networks from malicious traffic. But what’s the best strategy for determining the most likely sources of risky traffic? Is it safe to assume that traffic from certain countries is more suspicious than others, or that some hosting infrastructures are more likely to be compromised? With a growing consensus that IP blocklists are rapidly becoming obsolete, a more sophisticated approach is needed.

Our guest today is Dr. Bill Ladd, chief data scientist at Recorded Future. He’s the author of the report, “From Chasing Risk Lists to ASN Policies: Large-Scale Analysis of Risky Internet Activity.” The report takes a data-driven look at a variety of ways to determine risky ASNs and IP addresses. In this episode Bill Ladd gives us an overview of his team’s research and findings.

This podcast was produced in partnership with the CyberWire and Pratt Street Media, LLC.

For those of you who’d prefer to read, here’s the transcript:

This is Recorded Future, inside threat intelligence for cybersecurity.

Dave Bittner:

Hello everyone, I’m Dave Bittner from the CyberWire. Thanks for joining us for episode 21 of the Recorded Future podcast. As security professionals, we’re relied upon to protect our networks from malicious traffic, but what’s the best strategy for determining the most likely sources of that traffic? With a growing consensus that IP blocklists are rapidly becoming obsolete, a more sophisticated approach is needed.

Our guest today is Bill Ladd, chief data scientist at Recorded Future. He’s the author of the report “From Chasing Risk Lists to ASN Policies: Large-Scale Analysis of Risky Internet Activity.” The report takes a data-driven look at a variety of ways to determine risky ASNs and IP addresses, and in this episode, Bill Ladd is going to give us an overview of his team’s research and findings. Stay with us.

Bill Ladd:

With IP addresses, if you’re blindly blocking them then you may or may not know why they were blocked. If you use kind of a simple external risk list, they can be a little bit of a black box. In contrast, if you’re going to block something, the more you know about it, the more confident you are that you’re doing a good block. One of the issues can be that you can have multiple domains resolving to the same IP. One of them may be malicious, but other ones may be fine. The more knowledge you have, and the more context you have, then the easier it is to make a decision about a blocking rule.

Dave Bittner:

Take us through the research. Describe to us what you’ve done here.

Bill Ladd:

At a high level, what we did was we looked at the 4 million or so IPs that we had some level of risk for. Those levels of risk ranged from actively being a C2 — or a command and control server — for malware, towards being infrastructure that is misconfigured and available for open proxy exploits, so they can be used by threat actors in a variety of ways. What we did is, we mapped those to the ASNs that they were involved in, they were part of, and then the countries that were associated with those ASNs to look at how the data aggregated into a way that we can make some higher-level analyses. If people aren’t familiar with what an ASN is …

Dave Bittner:

Yeah.

Bill Ladd:

The internet is basically organized into blocks of IP addresses, and those blocks of IP addresses are managed by different organizations. AT&T manages an ASN, or perhaps multiple ASNs. Companies you’ve never heard of manage ASNs. The military — the U.S. military — manages an ASN. Basically, there are organizations that manage individual blocks of IP addresses. What you find is that different ASNs, or autonomous systems, or different organizations that are managing blocks of IPs, manage those with different levels of control, in terms of what’s allowed to operate on those ASNs. Some companies are really good at making sure that there’s very little illegitimate traffic on their ASNs and there’s very little risky behavior that emerges out of some ASNs.

Other ASNs do a much poorer job of controlling risky behavior. Essentially, an ASN is an aggregation of IP addresses managed by a single organization. Given that you have common organizational management of those IP addresses, you can make some assessments about the riskiness, or the validity, or the integrity of all of the IP addresses associated with an ASN based on historical activity for the IP addresses associated with the ASN. The ASN is really a high-level way to organize the internet, and we can make assessments about IP addresses from an ASN based on the history of IP addresses from the companies that are managing those ASNs.
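The mapping step Bill describes (risky IPs rolled up to ASNs, then to countries) can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the report: the prefix table, the `lookup_asn` and `aggregate` helpers, and the country codes are all hypothetical. The prefixes are documentation ranges, the AS numbers come from the range reserved for documentation, and "XA"/"XB" are placeholder country codes.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-ASN table (assumption, not real routing data).
# Prefixes are documentation ranges; AS numbers are documentation-only;
# "XA" and "XB" are placeholder country codes.
PREFIX_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), 64496, "XA"),
    (ipaddress.ip_network("198.51.100.0/24"), 64497, "XA"),
    (ipaddress.ip_network("203.0.113.0/24"), 64498, "XB"),
]

def lookup_asn(ip):
    """Return (asn, country) for the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [(net.prefixlen, asn, cc) for net, asn, cc in PREFIX_TABLE if addr in net]
    if not matches:
        return None
    _, asn, cc = max(matches)  # longest prefix wins
    return asn, cc

def aggregate(risky_ips):
    """Count risky IPs per ASN and per country; skip IPs with no known ASN."""
    by_asn, by_country = Counter(), Counter()
    for ip in risky_ips:
        hit = lookup_asn(ip)
        if hit is None:
            continue
        asn, cc = hit
        by_asn[asn] += 1
        by_country[cc] += 1
    return by_asn, by_country

# Made-up sample of risky IPs; the last one matches no prefix in the table.
risky = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.9", "10.0.0.1"]
asn_counts, country_counts = aggregate(risky)
```

A real analysis would replace the hand-written table with routing or registry data, but the roll-up logic stays the same.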

The first step that we did, we looked at the countries that they aggregated into. For example, we found that the largest number of risky IPs came out of China, which wasn’t a surprise. Then, we went from looking at countries to the ASNs. There are a lot of interesting inferences that came out of the country analysis. It was not surprising to see that China was at the top. It was surprising to me that Russia wasn’t right behind. That Brazil had more risky IPs associated with it than Russia and Ukraine did, for example, as Latin America in general is higher. It was intriguing.

Dave Bittner:

And a high number in the United States.

Bill Ladd:

Absolutely. What was interesting about that was, when we looked at the individual ASNs, in China, the riskiness was really kind of primarily at the top two largest ASN regions — Chinanet and the China Backbone. In the U.S., it tended to be much more spread out across large, but not quite as large, ASNs.

Dave Bittner:

Yeah, why do you think that was?

Bill Ladd:

I think in general, the internet is much more centralized and tightly controlled in China. The number of ASNs is much, much smaller. The big ones are really big. The fact that we only had 560 or so ASNs in China versus 16,000 in the U.S. allows much more variability of ASN management in the U.S., and it’s much more centralized in China. The riskiness is much more spread out in the U.S. across a wider number of ASNs. In China, it’s much more restricted, much more controlled into the largest ASNs which are primarily managed by the Chinese government.

Dave Bittner:

Beyond the country level, you dug in deeper than that.

Bill Ladd:

Absolutely. I think the reason that we did that is that the country-level findings are useful, but not necessarily directly actionable. You can see that there’s a large amount of maliciousness perhaps coming out of China, but you’re not necessarily going to block all of China. What we started then to do is to look at the ASNs where most of the risk was coming from, to start to explore that behavior. You know, we looked at that a couple of different ways. We looked at the ASNs that had the largest total number of risky IPs and those were the Chinanet ones. That went down. We looked at the top U.S. ones in there and they were fairly low. Top European ones a little bit higher. Again, the ASNs with the largest number of risky IPs tended to be fairly large.

We took a second look at the ASNs with the highest percentage of risky behavior associated with the IPs. That started to become fairly interesting from an operational standpoint. You see a collection of ASNs where, essentially, every single IP or the majority of IPs have been identified as risky in the past. Now you move from a position of looking at country level, or a large level, where you’re getting an understanding of the kind of landscape to some specific ASNs, which seem to be endemically compromised.

Dave Bittner:

The notion there being, that if an ASN is 100% risky, then it’s probably just safe to block that ASN?

Bill Ladd:

Exactly. The ASNs that have those high levels of risk also tend to be fairly small. They’re not the Chinanet or AT&T with 100,000,000 IP addresses. They tend to be hundreds, or thousands, or tens of thousands of IP addresses. The risk of legitimate traffic coming from those is also much lower. It becomes a nice data-driven way to create a set of ASNs that you can evaluate for blocking. Now, it becomes kind of an operational task. Okay, these are the top 200, for example, ASNs in terms of the percentage riskiness. Which of these are safe to completely block?
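The ranking idea above (sort ASNs by the percentage of their IPs that have been risky, then hand a short list to an operator) might look like the following sketch. The `AsnStats` class, the `blocking_candidates` helper, the size cap, and all numbers are illustrative assumptions. Only the "top 200" default echoes the example from the conversation.

```python
from dataclasses import dataclass

@dataclass
class AsnStats:
    asn: int
    total_ips: int   # addresses managed by the ASN
    risky_ips: int   # addresses with some history of risky behavior

    @property
    def percent_risky(self):
        return 100.0 * self.risky_ips / self.total_ips if self.total_ips else 0.0

def blocking_candidates(stats, top_n=200, max_size=100_000):
    """Rank smaller ASNs by percent riskiness and return the top_n for review.

    max_size is an assumed cutoff: very large ASNs carry too much
    legitimate traffic to be sensible candidates for a full block.
    """
    small = [s for s in stats if 0 < s.total_ips <= max_size]
    ranked = sorted(small, key=lambda s: s.percent_risky, reverse=True)
    return ranked[:top_n]

# Made-up figures using documentation-only AS numbers.
stats = [
    AsnStats(64496, 100_000_000, 400_000),  # huge ASN, low percentage
    AsnStats(64497, 2_048, 1_900),          # small, mostly risky
    AsnStats(64498, 512, 512),              # small, every IP risky
    AsnStats(64499, 65_536, 700),           # mid-sized, about 1% risky
]
candidates = blocking_candidates(stats, top_n=2)
```

The output is a review queue, not a blocklist: as Bill says, deciding which of these are safe to block completely is still an operational task.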

Dave Bittner:

Then you also looked at the risk of ASNs based on their command and control association.

Bill Ladd:

We looked at that because we wanted to look at the ASNs that were associated with the riskiest level of internet activity. The fact that certain ASNs are prone to be used for, say, botnets or scanning, is interesting and certainly worth blocking, but the highest priority of things that you want to protect your network against are the command and control centers that malware on your system is trying to communicate to. We took a look at the ASNs associated with command and control. What surprised me a little bit there initially, in looking for data, was that I think the top three were U.S.-based.

Dave Bittner:

Right.

Bill Ladd:

As we thought through that, we realized that what was driving that was, if you’re going to go to the effort to set up a command and control infrastructure, you want the network to look as harmless as possible when traffic is going back and forth. If you can find U.S.-based infrastructure to set up your command and control servers on, then traffic back and forth between a company’s infrastructure and a U.S.-based, but compromised, ASN is going to more likely pass through various levels of network security that you might have in place. It’s going to look less strange than if you’re talking to Uzbekistan. I think what we saw there, right, is we see that different types of risky behavior essentially are located in different geographical regions and different ASNs.

Thinking about the kinds of behavior that you want to protect your network against, and looking at where that type of behavior has come from in the past, allows you to take an approach towards setting up controls and protecting your network appropriately.

Dave Bittner:

In terms of these C2 servers that are set up in the United States, what are we talking about here? What’s the process for someone who wants to set something like that up and are they pretty much running without fear of, for example, law enforcement coming and shutting them down?

Bill Ladd:

You know, it’s always a possibility. I think what we see here is, there are several hosting infrastructures where it’s been safe to be malicious for a number of years. When I was looking at the East Site, for example, or Psyches, people have been talking about malicious activity coming out of those ASNs for a good five or ten years. They continue to operate, and so, I think that if you’re a threat actor, you understand where you can set up infrastructure that you’re likely to be permitted to continue to run. The company offering the infrastructure that you’re setting up on isn’t going to threaten you.

In contrast, if the organization operating the ASN has better control over the type of traffic, or better monitoring over the type of traffic coming in and out of the ASN, it’s going to be more restrictive. I think the threat actors are basically looking at the places where they’ve been allowed to operate in the past and continuing to operate out of those locations. I think that’s actually one of the key insights for me in this research.

I can track IPs that are currently illustrating some type of risky activity and perhaps block against those, but if I understand that a large amount of that risky activity comes from select areas of the internet, preemptively putting blocks in against those areas is both safe, because they tend to be fairly small, so the risk of blocking legitimate traffic is small, and prudent, because they’ve been seen to be malicious for years and years.

Dave Bittner:

You also looked at IP addresses that were hardcoded into malware, and you analyzed the associated ASNs with those. What did that yield for you?

Bill Ladd:

The thing that was interesting there is, the IP addresses that were hardcoded tended to shift back towards China. When we looked at IP addresses associated with C2, a lot of those are generated via behavioral analysis of the malware file. What typically happens is, the malware file contains a domain. It reaches out to that domain when it’s detonated in a sandbox. The domain can resolve to different IP addresses at different points in time. So this command and control service can be moved around, because the malware is starting with the domain, resolving that domain to its current IP address, then connecting to that.

Those were the ones we saw primarily. They were much, much more predominantly situated in the U.S., because the traffic would look innocent and because the threat actors can move the infrastructure from IP address to IP address as an individual IP address gets put on current threat lists. In contrast, if you’re going to go to the trouble to hardcode your IP addresses in the malware, you want to make sure that you have pretty good control over whether those IP addresses are going to stay up and running. It doesn’t help you if you point your malware to a hardcoded IP address and that IP address shows up on blocklists, and all of a sudden, you’re shut down.

What we found is, with the hardcoded ones, they tended to be more centralized. Certainly, a larger percentage in China, where presumably the threat actors have more control and more confidence that they can keep their infrastructure up and running on specific IP addresses. That was kind of a different level of operational insight into how threat actors are operating based on the type of malware they’re creating. Are they creating malware that reaches out via domain, or malware that reaches out via hardcoded IP address?
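To make the domain versus hardcoded IP distinction concrete, here is a small sketch of how an analyst might split C2 indicators into the two groups and tally where each group lands. It rests on assumptions: each indicator is taken to be a pair of the value found in the sample and the IP the sample contacted in a sandbox, and the `indicator_kind` and `tally_by_kind` helpers and the country lookup are hypothetical. All addresses are documentation ranges and the country codes are placeholders.

```python
import ipaddress
from collections import Counter

def indicator_kind(value):
    """Classify a C2 indicator as a hardcoded IP or a domain name."""
    try:
        ipaddress.ip_address(value)
        return "hardcoded_ip"
    except ValueError:
        return "domain"

def tally_by_kind(indicators, ip_to_country):
    """Count contacted countries separately for each kind of indicator."""
    tallies = {"hardcoded_ip": Counter(), "domain": Counter()}
    for value, contacted_ip in indicators:
        country = ip_to_country.get(contacted_ip, "unknown")
        tallies[indicator_kind(value)][country] += 1
    return tallies

# Made-up sandbox observations: (indicator in the sample, IP contacted).
indicators = [
    ("c2.example.com", "192.0.2.10"),
    ("update.example.net", "192.0.2.44"),
    ("203.0.113.9", "203.0.113.9"),
]
# Hypothetical IP-to-country lookup with placeholder country codes.
ip_to_country = {"192.0.2.10": "XA", "192.0.2.44": "XA", "203.0.113.9": "XB"}
tallies = tally_by_kind(indicators, ip_to_country)
```

Comparing the two tallies is one simple way to see the kind of geographic shift described above, where the hardcoded group concentrates in different places than the domain-based group.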

Dave Bittner:

One of the things that the report highlights is testing the percent risk based on ASN blocking. Can you take us through that?

Bill Ladd:

Sure. So, what we did, we verified that certain ASNs had high levels of riskiness. Then the question is, okay, if you block those, does it make a difference, right? Is that different from blocking just the IPs that you currently know are risky?

Dave Bittner:

Right.

Bill Ladd:

What we did was, we basically created a set of the ASNs that had one percent or more risky behavior, and looked at new IPs that emerged with risk content after a certain point in time. Basically, we took our risk list at, I think it was June 15, and then we looked at the 30 days following. We discovered that indeed, if you take this set of high-risk ASNs based on historical riskiness of IPs, new risky IPs continue to emerge out of those ASNs at a reasonably good clip. Depending on where you want to draw the line, you can block 10,000 to 100,000 new IPs that are about to become risky, based on an ASN-level block, as opposed to looking for existing bad behavior, which is typically how IP blocking works.
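The test Bill outlines is essentially a backtest: pick high-risk ASNs from a snapshot of the risk list, then count how many IPs that turn risky afterwards would already have been covered by an ASN-level block. A minimal sketch follows, under stated assumptions: the `high_risk_asns` and `preempted` helpers, the lookup tables, and the sample data are hypothetical, and only the one percent threshold comes from the conversation.

```python
def high_risk_asns(total_ips, risky_snapshot, ip_to_asn, threshold=0.01):
    """Return ASNs whose share of risky IPs in the snapshot meets the threshold."""
    risky_per_asn = {}
    for ip in risky_snapshot:
        asn = ip_to_asn.get(ip)
        if asn is not None:
            risky_per_asn[asn] = risky_per_asn.get(asn, 0) + 1
    return {
        asn for asn, n in risky_per_asn.items()
        if total_ips.get(asn, 0) and n / total_ips[asn] >= threshold
    }

def preempted(new_risky, risky_snapshot, ip_to_asn, blocked_asns):
    """New risky IPs, unknown at snapshot time, that an ASN block would cover."""
    return {
        ip for ip in new_risky
        if ip not in risky_snapshot and ip_to_asn.get(ip) in blocked_asns
    }

# Made-up data: documentation addresses and documentation-only AS numbers.
total_ips = {64496: 1000, 64497: 200}
ip_to_asn = {
    "192.0.2.1": 64497, "192.0.2.2": 64497, "192.0.2.3": 64497, "192.0.2.4": 64497,
    "198.51.100.1": 64496, "198.51.100.9": 64496,
}
snapshot = {"192.0.2.1", "192.0.2.2", "192.0.2.4", "198.51.100.1"}  # risk list on day 0
later = {"192.0.2.1", "192.0.2.3", "198.51.100.9"}                  # risky in the next 30 days

blocked = high_risk_asns(total_ips, snapshot, ip_to_asn)  # 64497 is 1.5% risky, 64496 is 0.1%
covered = preempted(later, snapshot, ip_to_asn, blocked)
```

Here `covered` holds the IPs that an IP-level list built on day 0 would have missed but an ASN-level block would have caught, which is the quantity the 10,000 to 100,000 figure refers to.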

Dave Bittner:

In terms of the lessons learned, and in terms of advice to people who will be looking to this research to help inform how they’re going to protect themselves, what are the take homes?

Bill Ladd:

I think there’s two levels of take homes. I think one is kind of general situational awareness of, where is the risky behavior on the internet? China’s not a surprise. Latin America was a little bit of a surprise to me. The fact that the major ASN in Venezuela had such a high overall level of riskiness. I think there’s situational awareness of, as we’re looking at locations, security operators have some preconceived notions about where risk content is coming from. I think this can educate some of those, as well. I think operationally, the key is that we can use a data-driven approach to identify candidate ASNs for blocking, and that we can create relatively manageable lists to step through and evaluate and say, yeah, these are the ones that I want to block.

There are a couple of different ways that you can do that based on the behavior that you’re interested in. You can look at percent riskiness overall. You can look at ASNs that have a large amount of C2 behavior, and a network operator can step through that list and create verdicts on whether or not they want to block that set of ASNs. To me, it’s clear that if you go through a collection of ASNs that you’ve looked at and judged as having no legitimate business value to you and as being an ongoing source of risky activity, risky internet activity is going to continue to emerge from those IPs, and you will have protected yourself from it by putting blocks in place.

I think the key is, how do you get ahead of risky activity on the internet? At any given point in time, there are hundreds of thousands, or millions, of IPs that are having some type of risky behavior. You’ve got various types of monitoring that are trying to tell you that this IP is associated with this risk, and so on and so forth, and by looking at risky ASNs, you can do some preemptive blocking of those areas directly. Then there’s having a data-driven approach to step through which ASNs we should evaluate. I think that’s really the key. There are 57,000 or so ASNs at any given point in time, so how do you prioritize the process by which you evaluate and block those? Looking at the historical risk is a really straightforward way to do that.

What we saw is that if you take that approach, it does work. New IPs that were previously unknown to be risky emerge out of those bad ASNs as new threats, and if you put those blocks in place, it can help protect your network.

Dave Bittner:

Our thanks to Bill Ladd for joining us.

You can find the report “From Chasing Risk Lists to ASN Policies: Large-Scale Analysis of Risky Internet Activity” on the Recorded Future blog page at recordedfuture.com/blog.

Don’t forget to sign up for the Recorded Future Cyber Daily email, where every day you’ll receive the top results for trending technical indicators that are crossing the web, cyber news, targeted industries, threat actors, exploited vulnerabilities, malware, suspicious IP addresses, and much more. You can find that at recordedfuture.com/intel.

Be sure to save the date for RFUN, the sixth annual threat intelligence conference coming up in October in Washington, D.C. Attendees will gain valuable insight into threat intelligence best practices by hearing from industry luminaries, peers, and Recorded Future experts. Details are at recordedfuture.com/rfun.

We hope you’ve enjoyed the show and that you’ll subscribe and help spread the word among your colleagues and online. The Recorded Future podcast team includes Coordinating Producer Amanda McKeon and Executive Producer Greg Barrette. The show is produced by Pratt Street Media, with Editor John Petrik, Executive Producer Peter Kilpe, and I’m Dave Bittner.

Thanks for listening.
