Artificial Intelligence in Black and White

February 28, 2018 • Monica Todros

Editor’s Note: The following blog post is a summary of a presentation from RFUN 2017 featuring Staffan Truvé, CTO and co-founder of Recorded Future, and Chris Poulin, principal/director at Booz Allen Hamilton.

Key Takeaways

  • There are two sides to making technological advances using artificial intelligence — one for the white hats, and one for the black hats.
  • Artificial intelligence is built upon developing algorithms and using technology to scale up to a much larger volume than humans could ever handle.
  • Open source intelligence is now being used by bad guys for things like social engineering, allowing them to amass an enormous amount of information on you.
  • Artificial intelligence can not only advance an organization’s threat intelligence strategy, but also provide a factor of scale for attackers.

Artificial intelligence is about constantly trying to push the technology barrier — once you actually succeed, however, it can be challenging to find the next new territory.

There are a multitude of tricky questions to answer in dealing with artificial intelligence, so fortunately, there is no shortage of work in the field. Artificial intelligence is a complex contradiction in that it simultaneously deals with solving simple problems that are just repetitive, human tasks, while at the same time trying to push machines beyond human capability.

Staffan Truvé, CTO and co-founder of Recorded Future, recently shared his expertise at the company’s annual threat intelligence conference in D.C. He has spent the last 25 years working with startups and research, and is an authority on machine learning and artificial intelligence.

So, what is artificial intelligence, exactly? According to Truvé, artificial intelligence is “when a machine does something that, if a human had done it, we’d say it required intelligence to do.”

Truvé explains that while his definition is concise, it’s not really watertight for how artificial intelligence is actually being used today. For some, artificial intelligence is like magic, but at some point you figure out how to solve a certain problem, and when the magic disappears, you’re left with an algorithm and a computation.

What’s the Hype?

Everyone gets affected when there’s a cool new technology circulating out there and being talked about. Artificial intelligence can be challenging to understand, but that doesn’t mean it isn’t rising in popularity. As you can see below, more and more people are searching for terms related to the technology.

Google trends for the terms “artificial intelligence” and “machine learning.”

Mentions of artificial intelligence and machine learning in top-tier Russian forums.

The interest in artificial intelligence and machine learning goes beyond the general public — the terms are also discussed more frequently in top-tier Russian hacking forums.

The ‘White’ in Black and White

Now that we’ve seen the metrics, how can you actually use artificial intelligence in your threat intelligence strategy? There’s a whole list of ways, and a few examples include:

  • Pattern recognition
  • Anomaly detection
  • Natural language processing (NLP)
  • Image analysis — OCR and more
  • Analytics — predictive, descriptive, and prescriptive

Of the techniques available, pattern recognition and anomaly detection have been around for some time, with every antivirus company in the world using them in some way or another. One can say that these techniques are just about statistics, but they’re also very much related to the machine-learning algorithms being developed in artificial intelligence.
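
To make the statistical flavor of anomaly detection concrete, here is a minimal Python sketch that flags values lying far from the mean of a series. The function name find_anomalies, the threshold of two standard deviations, and the failed-login counts are all illustrative assumptions, not anything described in the presentation.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical hourly counts of failed logins on one host.
failed_logins = [12, 15, 11, 14, 13, 12, 16, 240, 14, 13]
anomalies = find_anomalies(failed_logins)
print(anomalies)  # [(7, 240)]
```

With only ten samples, a single outlier cannot reach a z-score much above 2.8, which is why this sketch uses a threshold of 2.0 rather than the more common 3.0. Production systems typically rely on more robust statistics or learned models.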

Natural language processing is a critical part of how Recorded Future works, Truvé says. To be able to take unstructured text in a multitude of languages and transform it into a uniform, language-independent representation is good artificial intelligence, but it’s hard to do. The same goes for image analysis and analytics; it’s a challenge to be able to ingest tons of data in various forms, analyze it, and present it in a comprehensible way.

Recorded Future uses natural language processing to search for clues in text. For example, in the reference below, the NLP algorithm is able to identify Anonymous as the attacker, Japan’s Nissan as the target, and DDoS as the method of attack.

Natural language processing in Recorded Future.

Our natural language processing technology covers everything from looking at an arbitrary web page and working out which portion of its text to extract and analyze, to performing different kinds of classification, such as entity recognition, parsing, event detection, and determining whether an event being discussed lies in the future or in the past.
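
As a rough illustration of the kind of structured output such extraction produces, the Python sketch below pulls an attacker, a target, and a method out of a sentence using small hand-written word lists. It is a toy, rules-only approach and not Recorded Future’s algorithm; the word lists, the extract_event function, and the sample sentence are all hypothetical.

```python
import re

# Hypothetical word lists (gazetteers), kept tiny for illustration.
THREAT_ACTORS = ["Anonymous"]
ORGANIZATIONS = ["Nissan"]
ATTACK_METHODS = ["DDoS", "phishing", "ransomware"]


def first_match(tokens, vocabulary):
    """Return the first vocabulary entry found in tokens, ignoring case."""
    canonical = {entry.lower(): entry for entry in vocabulary}
    for token in tokens:
        if token.lower() in canonical:
            return canonical[token.lower()]
    return None


def extract_event(sentence):
    """Map a sentence to attacker, target, and method via word-list lookups."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    return {
        "attacker": first_match(tokens, THREAT_ACTORS),
        "target": first_match(tokens, ORGANIZATIONS),
        "method": first_match(tokens, ATTACK_METHODS),
    }


# Hypothetical sentence, not the reference shown in the presentation.
event = extract_event("Anonymous launched a DDoS attack against Nissan.")
print(event)
```

A lookup like this breaks as soon as the wording, spelling, or language changes, which hints at why real pipelines need far more than fixed word lists.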

Machine Learning Rule-Based Algorithms

The processing pipeline of Recorded Future’s Threat Intelligence Machine uses machine learning and rule-based algorithms to transform unstructured information from open, technical, and dark web sources into actionable threat intelligence.

According to Truvé, the best way, engineering-wise, to get a robust system is to combine machine learning and good old-fashioned rule-based systems.

The ‘Black’ in Black and White

Chris Poulin, principal/director at Booz Allen Hamilton, joined Truvé to touch on the dark side of artificial intelligence. Poulin has 25 years of information technology and security experience, not only building and running nationally respected information security consulting firms over the years, but also managing hundreds of projects across all industries.

During his research into how attackers are going to start using artificial intelligence, Poulin discovered a relevant quote: “The other countries use hammers, too.” In essence, hammers are used to build things, but they can also be used to break things down. That’s how artificial intelligence is being used on the dark side right now, Poulin says.

The Dark Side

When it comes to analytics, Google can categorize and prioritize search queries. But the thing with that technology is, bad guys use it too — except they use it for black hat SEO. Google, now moving toward an artificial intelligence approach, will extract different features and prioritize based on that. The white team is winning on this one, Poulin says, but it’s just a matter of time until the black hats catch up.

For example, if you’ve ever put a Word document or PDF file up on a website, you know that there’s a lot of metadata. Open source intelligence is now being used by threat actors for things like social engineering, and they’re going through social media and public documents looking for that metadata. By grabbing all of that geolocation information from images, in addition to anything else they can find, threat actors can amass an enormous amount of information on you.

Social Media Scams

One prominent example of a malicious social-engineering campaign comes from a company called Endgame, which created a chatbot to write tweets that lure people into clicking on malicious links. They also had a person creating tweets with the same purpose, and though the artificial intelligence bot was able to create more tweets by volume, the human’s tweets performed better, obtaining a 38 percent click-through rate versus the chatbot’s 34 percent. By pure volume, however, the chatbot brought in 275 victims versus the human’s 49.
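
The scale effect in these figures can be made explicit with a little arithmetic. The Python sketch below assumes that the click-through rate equals victims divided by tweets sent, and that the quoted percentages are rounded; neither assumption is stated in the presentation, so the implied tweet counts are rough estimates only. The function name implied_attempts is hypothetical.

```python
def implied_attempts(victims, click_through_rate):
    """Estimate attempts, assuming click_through_rate = victims / attempts."""
    return round(victims / click_through_rate)


bot_attempts = implied_attempts(275, 0.34)    # chatbot figures from the text
human_attempts = implied_attempts(49, 0.38)   # human figures from the text

print(bot_attempts, human_attempts)           # 809 129
print(round(bot_attempts / human_attempts, 1))  # 6.3
```

Under those assumptions, the chatbot sent roughly six times as many tweets as the human, which more than made up for its slightly lower click-through rate.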

In addition to chatbots, more new uses of artificial intelligence have been appearing on social media. The use of social platforms like Facebook and Twitter to create political and social divisions is becoming more prevalent, as in the recent alleged manipulation of the U.S. presidential election by Russians.

The intent of this artificial intelligence is not necessarily to attack your networks, but rather to influence the discourse and how we act as a nation. If you can divide the people, you can take down the people, Poulin says. He also speculates that within five years or so, there are going to be artificial intelligence bots creating accounts, posting and cross-posting, and figuring out the best way to get people to read and click through on different types of social media posts.

Network Defenses

Something that’s a bit closer to home for everyone is how exactly black hats are attacking your networks. Poulin describes malware that was created using artificial intelligence to figure out how to bypass antivirus software. It repeatedly tested subtle modifications to a particular piece of malware until it had a high success rate at bypassing some of the next-generation antivirus products that themselves use machine learning to try to protect your systems.

This same kind of technology is being used to evade intrusion detection systems, and there’s also artificial intelligence that’s creating realistic-looking domains for command and control (C&C). A lot of the time, a domain generation algorithm will produce something like XYZ123, and it’s pretty clear that’s not a real domain. The attackers are looking for ways to use dictionaries so that they can create realistic DNS domains that you can’t just identify by looking at a log and saying, “That’s clearly not something that somebody’s clicking on, or going to,” Poulin explains.
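
To show why dictionary-based domains are harder to spot, the Python sketch below applies a deliberately naive lexical check: it flags a domain whose first label contains digits or very few vowels. The heuristic, the 0.2 vowel-ratio cutoff, the function name looks_machine_generated, and the sample domains (on the reserved .example TLD) are all assumptions made for illustration, not a technique attributed to Poulin or to any product.

```python
VOWELS = set("aeiou")


def looks_machine_generated(domain, min_vowel_ratio=0.2):
    """Naive check: digits or too few vowels in the first label look suspicious."""
    label = domain.lower().split(".")[0]
    letters = [ch for ch in label if ch.isalpha()]
    has_digit = any(ch.isdigit() for ch in label)
    if letters:
        vowel_ratio = sum(ch in VOWELS for ch in letters) / len(letters)
    else:
        vowel_ratio = 0.0
    return has_digit or vowel_ratio < min_vowel_ratio


# Hypothetical domains for illustration only.
print(looks_machine_generated("xyz123.example"))          # True
print(looks_machine_generated("qwrtpsdf.example"))        # True
print(looks_machine_generated("bluegardenmail.example"))  # False
```

The last domain is built from ordinary dictionary words, so it passes this check even though an algorithm could have generated it just as easily. That is the evasion Poulin describes.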

Poisoning models is another interesting aspect. A model that’s not particularly well-tuned can have its data poisoned, and an attacker can then do one of two things: produce false outputs from the model, or erode confidence in it so that people stop depending on it. If you poison the models and cause enough white noise, people stop looking at them.
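
A toy example can show how poisoned data leads to false outputs. The Python sketch below trains a one-dimensional classifier whose decision threshold is the midpoint between the average "benign" score and the average "malicious" score, then retrains it after an attacker injects high-scoring samples mislabeled as benign. The classifier, the scores, and the function names are hypothetical and far simpler than any real detection model.

```python
from statistics import mean


def train_threshold(samples):
    """Return the midpoint between the benign and malicious mean scores."""
    benign = [score for score, label in samples if label == "benign"]
    malicious = [score for score, label in samples if label == "malicious"]
    return (mean(benign) + mean(malicious)) / 2


def classify(score, threshold):
    """Label a score as malicious if it lies above the threshold."""
    return "malicious" if score > threshold else "benign"


# Hypothetical training data: (suspicion score, label).
clean = [(1.0, "benign"), (2.0, "benign"), (3.0, "benign"),
         (7.0, "malicious"), (8.0, "malicious"), (9.0, "malicious")]

# Poison: high-scoring samples that the attacker has labeled as benign.
poison = [(9.0, "benign")] * 6

clean_threshold = train_threshold(clean)
poisoned_threshold = train_threshold(clean + poison)

print(classify(7.0, clean_threshold))     # malicious
print(classify(7.0, poisoned_threshold))  # benign
```

The mislabeled samples drag the benign average upward, so the threshold rises and a score that was previously caught now slips through.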

Staying Ahead

In summary, artificial intelligence provides a factor of scale for attackers. Black hats are not only using it for things like reconnaissance, but they’re also using it to execute different attacks against you. And it’s not just one attack that succeeds; it’s a whole chain of events and exploits that makes an attacker successful.

There’s an expression Poulin shares that says, “you don’t have to swim faster than the shark, you just have to swim faster than your dive buddy.” That’s no longer true with artificial intelligence, since you’re still in trouble when there are two sharks, which means you’ve got to swim with three people. But when there’s a feeding frenzy, it’s game over.

That’s where artificial intelligence is now: it gives attackers an economy of scale, which means you are not protected just because you have better protections than somebody else.

If you’re interested in learning more about how artificial intelligence and machine learning can power your threat intelligence strategy to help defend against threats, download our free white paper, “4 Ways Machine Learning Is Powering Smarter Threat Intelligence.”