Fears of GDPR-Triggered Spam So Far Unfounded

September 17, 2018 • Amanda McKeon

Chances are you’re familiar with GDPR, the European Union’s General Data Protection Regulation. It went into full effect back in May of this year, with the goal of improving the privacy and security of European citizens in particular, but the global community overall as well.

One of the impacts of GDPR was that it made the WHOIS database private. WHOIS is the searchable online directory of domain name registrations, and some security researchers had concerns that spammers might take advantage of this anonymity to increase their registration rate of domain names, making it easier for them to send out their spam.

Allan Liska is a senior security architect at Recorded Future, and he analyzed several months’ worth of data on spam rates to see if the expected uptick came to pass. Allan wasn’t alone on this project — he had assistance from his son, Bruce, who interned at Recorded Future this past summer and co-authored the report. We’ll hear from Bruce as well.

This podcast was produced in partnership with the CyberWire.

For those of you who’d prefer to read, here’s the transcript:

This is Recorded Future, inside threat intelligence for cybersecurity.

Dave Bittner:

Hello everyone, and thanks for joining us for episode 74 of the Recorded Future podcast. I’m Dave Bittner from the CyberWire.

Chances are you’re familiar with GDPR, the European Union’s General Data Protection Regulation. It went into full effect back in May of this year, with the goal of improving the privacy and security of European citizens in particular, but the global community overall as well.

One of the impacts of GDPR was that it made the WHOIS database private. WHOIS is the searchable online directory of domain name registrations, and some security researchers had concerns that spammers might take advantage of this anonymity to increase their registration rate of domain names, making it easier for them to send out their spam.

Allan Liska is a senior security architect at Recorded Future, and he analyzed several months’ worth of data on spam rates to see if the expected uptick came to pass. Allan wasn’t alone on this project — he had assistance from his son, Bruce, who interned at Recorded Future this past summer and co-authored the report. We’ll hear from Bruce as well. Stay with us.

Allan Liska:

It was interesting. There was a lot of concern in the security community that when GDPR was enacted in Europe, it was going to lead to a flood of new spam campaigns that would essentially work undeterred, or be undeterred, because of the fact that WHOIS privacy is now the default standard. Some researchers rely heavily on WHOIS data to do analysis of domain names and make connections between domain names and tie them to campaigns. With that information now being private by default, there was concern that that would lead to a flood of spam, and so we wanted to see what the truth on the ground was — whether or not there had been a huge uptick in spam.

Dave Bittner:

Let’s just back up a little bit here for folks who might not be completely up on it. Can you give us a rundown of what exactly the WHOIS database is, how it came to be, and how it works?

Allan Liska:

Sure. WHOIS has been around, I think, pretty much since the start of domain registration. Basically, it’s a database of who registered a domain, with contact information, phone numbers, and email addresses, as well as what registrar was used and what name servers that domain is tied to. All of that is available through a command-line tool, where basically, you type in, “Who is recordedfuture.com?” and you get all the information about Recorded Future.

Over the years, there have been things that domain registrars have put in place to prevent some of that information from being seen, such as domain privacy. Domain privacy means that you don’t know who registered a domain, what their phone number is, what their contact information is. You still have the day it was registered, who the registrar was, and what the name servers associated to that domain are.

Then there are other things. Some domain registries enforce WHOIS collection more effectively than others. So there are some domains (.com, .net, .org) that have fairly robust WHOIS infrastructures, but there are other domains (.fun, .men, some of the new generic top-level domains) that don’t have good WHOIS infrastructure, and a lot of the data in there, even before GDPR was put in place, wasn’t very good. It wasn’t very reliable.
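For readers who want to try the lookup Allan describes, here is a minimal sketch in Python of a WHOIS query over TCP port 43 (the protocol is specified in RFC 3912). It is not part of the original conversation. The default server shown is the registry WHOIS server for .com; the helper names `whois_query` and `parse_fields` are illustrative, and the exact field names returned vary by registry and registrar.

```python
import socket


def whois_query(domain: str, server: str = "whois.verisign-grs.com", port: int = 43) -> str:
    """Send one WHOIS query: the domain plus CRLF, then read until the server closes."""
    with socket.create_connection((server, port), timeout=10) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection, response is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_fields(response: str) -> dict[str, list[str]]:
    """Collect 'Key: Value' lines. Keys can repeat, e.g. several name servers."""
    fields: dict[str, list[str]] = {}
    for line in response.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields.setdefault(key.strip(), []).append(value.strip())
    return fields


if __name__ == "__main__":
    # Requires network access; output depends on the live registry record.
    record = parse_fields(whois_query("recordedfuture.com"))
    for key in ("Registrar", "Creation Date", "Name Server"):
        print(key, "->", record.get(key, []))
```

With domain privacy or GDPR redaction in place, the registrar, registration date, and name servers typically still appear, while the registrant's contact fields are withheld or replaced with placeholder text.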

Dave Bittner:

What was the rationale for enabling the option of keeping the WHOIS information private?

Allan Liska:

GDPR, in general, is a privacy law that was enacted in Europe. I’m sure any security professional out there, when GDPR first went live or in the months leading up to it, got a whole lot of emails talking about GDPR and so on, so they’re probably familiar with that. This was not something that was specifically laid out in GDPR, but one of the side effects is that, by default, consumer information has to be private. WHOIS information has always been public, so you could contact the domain owner and do what you need to do to resolve a dispute, or if there’s a problem with the domain, or if you’re a security researcher and you’re trying to determine if the domain is malicious or not. That’s always been public by default, but because of GDPR in general, for domain registrants in Europe, that domain information now had to be set to private.

Most registrars adopted the attitude of, “Well, if I’m doing it for users in Europe, I’m just going to do it for everybody, because I don’t want to take a chance that somebody used a wrong address or something like that and I accidentally didn’t make their information private. So I’m just going to make everybody’s information private.”

Dave Bittner:

I see. So, the fear was, with that privacy setting, what would the spammers be able to do?

Allan Liska:

Clearly, some security researchers relied fairly heavily on WHOIS information to track down spammers. A lot of spammers don’t have good operational security when they’re registering the domain, so they’ll either use their real information, and then you can tie all those spam domains together and automatically block those domains, or they’ll reuse fake information. They’ll use “Joe Smith” over and over again, and they’ll use the same fake phone number, and so on. Again, you can tie that information together and say, “Okay, this is a bad domain, because this person registered it. Every time this person with this WHOIS information has registered a domain, it’s turned out to be bad, so we’ve gone ahead and blocked it now.” We can be proactive in blocking these bad domains.
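The kind of linking Allan describes can be sketched in a few lines of Python. This is an illustration only, not the tooling any researcher or Recorded Future actually uses: the record layout (`registrant_name`, `registrant_phone`), the placeholder string in `REDACTED`, and all sample domains are hypothetical.

```python
# Placeholder values treated as "no data" (hypothetical; real wording varies).
REDACTED = {"redacted for privacy"}


def fingerprint(record: dict):
    """Normalise the registrant fields that tend to be reused across registrations."""
    name = (record.get("registrant_name") or "").strip().lower()
    if name in REDACTED:
        name = ""
    phone = "".join(ch for ch in (record.get("registrant_phone") or "") if ch.isdigit())
    if not name and not phone:
        return None  # nothing to link on, e.g. a privacy-protected record
    return (name, phone)


def build_blocklist(known_bad: list) -> set:
    """Fingerprints of registrants whose earlier domains turned out to be bad."""
    return {fp for fp in map(fingerprint, known_bad) if fp is not None}


def flag_new_domains(new_records: list, bad_fingerprints: set) -> list:
    """Return newly registered domains whose registrant matches a known-bad fingerprint."""
    flagged = []
    for record in new_records:
        fp = fingerprint(record)
        if fp is not None and fp in bad_fingerprints:
            flagged.append(record["domain"])
    return flagged


known_bad = [
    {"domain": "cheap-pills.example", "registrant_name": "Joe Smith",
     "registrant_phone": "+1 555-0100"},
]
new_records = [
    {"domain": "win-prizes.example", "registrant_name": "joe smith ",
     "registrant_phone": "+1 (555) 0100"},
    {"domain": "hidden.example", "registrant_name": "REDACTED FOR PRIVACY",
     "registrant_phone": ""},
    {"domain": "bakery.example", "registrant_name": "Ann Lee",
     "registrant_phone": "+1 555-0199"},
]
print(flag_new_domains(new_records, build_blocklist(known_bad)))
```

The redacted record produces no fingerprint at all, which is the concern in a nutshell: once registrant fields are private by default, this style of proactive matching has nothing to match on.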

Dave Bittner:

You decided to take a look at what exactly the results were when it came to email. What did you find?

Allan Liska:

We looked at data, basically, in the month leading up to GDPR and then the month after GDPR, and extended it a little bit longer to see whether or not there were any immediate trends. Obviously, full disclosure — a month is not a whole lot of time to really calculate what is a trend and what isn’t a trend, but we wanted to see if there were any kind of initial patterns that we could determine.

The first thing we did was, we accessed a public database. Cisco maintains a database of spam emails and spam volumes, specifically. We wanted to see whether or not there had been any trend upward since GDPR, and there wasn’t. It turned out that there was actually a trend downward, which is normal for the summer months. However, the trend downward was a little bit deeper than what we would normally expect in the summer months. In the summer months, spam campaigns tend to fall off, but we actually saw it fall off more deeply than it normally does. So we thought that was kind of interesting.

Then, what we did was, we queried Spamhaus. For people not familiar, Spamhaus is an organization that tracks spam and spam domains, et cetera, and creates what they call real-time blacklists that allow subscribers to Spamhaus to automatically block these domains and make sure they don’t get any mail from these bad domains being sent to their users. Spamhaus maintains a list of the top-level domains that have the worst percentage of spam. What I mean by that is, because .com is the biggest domain, it has the most number of bad domains, but it actually has a relatively small percentage of bad domains. It may have 100,000 bad domains, but there are 10 million .com domains, so it has a relatively low percentage.

What we wanted to look at is the ones that are really bad. So domains like .fun and .men are two that we specifically mentioned in the article that had a relatively high percentage of spam. Somewhere in the neighborhood of 70 or 80 percent of domains registered in those top-level domains are used for bad purposes, whether that’s spam, phishing, other types of malware, et cetera. Because Spamhaus maintains that list, we wanted to look to see whether or not there had been an uptick in registrations in those top-level domains, because people were already using them. So, we thought, if they were getting ready to launch new campaigns, they would register more domains in these bad top-level domains. I’ll let Bruce explain what he found.
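The distinction between the largest number of bad domains and the worst percentage of bad domains can be made concrete with a small Python sketch. The .com figures are the round numbers Allan uses for illustration above; the .men counts are hypothetical, chosen only to land in the 70 to 80 percent range he mentions. None of these are measured values from the report or from Spamhaus.

```python
def bad_rate(bad: int, total: int) -> float:
    """Share of a top-level domain's registrations that are flagged as bad."""
    return bad / total if total else 0.0


tlds = {
    ".com": {"bad": 100_000, "total": 10_000_000},  # illustrative figures from the conversation
    ".men": {"bad": 7_500, "total": 10_000},        # hypothetical counts giving 75 percent
}


def rank_by_count(stats: dict) -> list:
    """Order TLDs by the raw number of bad domains, largest first."""
    return sorted(stats, key=lambda tld: stats[tld]["bad"], reverse=True)


def rank_by_rate(stats: dict) -> list:
    """Order TLDs by the percentage of bad domains, largest first."""
    return sorted(
        stats,
        key=lambda tld: bad_rate(stats[tld]["bad"], stats[tld]["total"]),
        reverse=True,
    )


for tld, s in tlds.items():
    print(f"{tld}: {s['bad']:,} bad of {s['total']:,} ({bad_rate(s['bad'], s['total']):.0%})")
print("by count:", rank_by_count(tlds))
print("by rate: ", rank_by_rate(tlds))
```

Ranked by count, .com comes first; ranked by rate, .men does. The second ordering is the one that matters when looking for top-level domains that spammers favor.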

Dave Bittner:

Before we jump in, I want to take a moment to introduce Bruce. Allan, you had a special assistant this summer to help you dig through some of this data here. Bruce, can you introduce yourself?

Bruce Liska:

Hi, I’m Bruce. I worked at Recorded Future this past summer doing data analytics for the professional services team.

Dave Bittner:

Most importantly, your relationship to Allan is?

Bruce Liska:

He’s my father.

Dave Bittner:

All right, that’s what we were going for. Let’s dig through some of the things you found here, Bruce. Take us through what you discovered.

Bruce Liska:

Overall, like my dad said, there was a downward trend in not all, but a lot of the domains — a lot of the .coms and all that. Well, .com actually went up to about 51 percent. Before, it was like, 40-something percent. But most of the other .nets and all that, they all went down. Most of them went down by a lot.

Allan Liska:

As a percentage of total domains registered.

Bruce Liska:

Yeah, as a percentage.

Dave Bittner:

Let’s go through some of the general email volume statistics here. Bruce, if you could go through some of this with us … This was interesting to me. I think, for most of us, email is one of those things that’s just constant. It’s there. But I don’t know that many of us really consider the amount of volume that flows through email and the percentage of it that’s spam. Can you run through some of these numbers that you all included in your research?

Bruce Liska:

Yeah, about 433, almost 434 billion messages are sent through email every day, or Gmail every day, and spam accounts for about 370 billion of those messages, which is about 85 percent of all emails sent.

Dave Bittner:

I think that’s remarkable that 85 percent of all email is spam.

Bruce Liska:

Yeah.

Dave Bittner:

You say that you did see the volume of email fall off, but that was most likely a seasonal thing. What are your conclusions here? Based on the data that you’re seeing, were the fears of an outbreak of spam … Were they unfounded, or are we still sort of in a “wait and see” mode?

Bruce Liska:

Right now, we’re in a “wait and see” mode. It’s way too early to determine any trends. Instead, what we wanted to point out is, the initial concern was that there was going to be a huge flood of spam. It doesn’t seem to be happening, and people don’t seem to be preparing to launch a bunch of spam. Now, of course, things can obviously change and change rapidly.

Interestingly, there have been a couple of rebuttal blog posts that have been published since we delivered this. One of the assertions that I think people think we’re making, that we’re not making, is that this falloff in spam was directly caused by GDPR, and that’s not really the point that we’re making. All we’re observing is that there hasn’t been a big increase in spam since GDPR was enabled. We’re not attributing any falloff in new domain registrations or any falloff in spam that we’re seeing to the fact that GDPR was enabled, but simply that the jump hasn’t happened in the way some security researchers thought it would.

Dave Bittner:

Right. So from here, we stay tuned, and we see what’s going to happen over the next few months, I suppose.

Bruce Liska:

I agree.

Dave Bittner:

Bruce, to switch gears a little bit before we leave you guys today, can you share with us, what was the experience like? You interned at Recorded Future this summer, spent some time with your father along the way. How was that experience for you?

Bruce Liska:

I thought it was actually really fun. Normally, I’m used to being told what to do by my father, but since I didn’t directly work under him, we worked together, instead of him telling me exactly, “Do this and do that.”

Dave Bittner:

I see. Allan, how did that work out for you?

Allan Liska:

The challenge for me is that everybody liked him better than they like me, because they would give him an assignment … He did a whole lot of number crunching for a lot of different people. Bruce has aspirations to be a data scientist. He did a lot of number crunching for a lot of different people, and everybody was really impressed with how quickly he responded, how quickly he got the data that was needed and got the numbers back to them, and then could actually answer questions about the conclusions that were being drawn. It’s tough being the second-most popular member of a family in an office.

Dave Bittner:

Yes, yes. I used to work with my wife, so I am familiar with that, actually. Well, gentlemen, thanks for taking the time for us today. It’s an interesting report, “90 Days of GDPR: Minimal Impact on Spam and Domain Registration.” Bruce, I certainly wish you the best. Allan, how nice that Recorded Future provided this opportunity for both of you.

Allan Liska:

Yeah. I’m actually hoping that we can figure out a way to do more of these internships going forward. Bruce, obviously, was a special case because he’s got the data scientist background and he’s got the programming background, so he had the experience coming in. But I’m hoping that we can work with more high school kids going forward to give more of them an opportunity like this.

Dave Bittner:

Our thanks to Allan and Bruce Liska for joining us.

Their research is titled, “90 Days of GDPR: Minimal Impact on Spam and Domain Registration.” You can find it on the Recorded Future website in the blog section.

If you enjoyed this podcast, we hope you’ll take the time to rate it and leave a review on iTunes. It really does help people find the show.

Don’t forget to sign up for the Recorded Future Cyber Daily email, where every day you’ll receive the top results for trending technical indicators that are crossing the web, cyber news, targeted industries, threat actors, exploited vulnerabilities, malware, suspicious IP addresses, and much more. You can find that at recordedfuture.com/intel.

We hope you’ve enjoyed the show and that you’ll subscribe and help spread the word among your colleagues and online. The Recorded Future podcast team includes Coordinating Producer Amanda McKeon and Executive Producer Greg Barrette. The show is produced by Pratt Street Media, with Editor John Petrik, Executive Producer Peter Kilpe, and I’m Dave Bittner.

Thanks for listening.