How Artificial Intelligence Is Shaping the Future of Open Source Intelligence

Posted: 9th January 2019

By: THE RECORDED FUTURE TEAM

Editor’s Note: The following blog post is a summary of a presentation from RFUN 2018 featuring OSINT subject matter expert Chief Warrant Officer 3 Nathan McKeldin of the U.S. Army.

Key Takeaways

The term artificial intelligence (AI) describes technologies that can make informed, non-random decisions algorithmically.
Open source intelligence (OSINT) is based exclusively on publicly available information such as the contents of the open web.
AI has many applications in OSINT, both for military and domestic purposes. In particular, it enables human analysts to collect, analyze, and interrogate data sets that would otherwise be too large to work through manually.
While AI is vital to the future of OSINT — and intelligence gathering in general — it can’t replace an analyst’s ability to determine the “so what” of intelligence outputs.

The term “artificial intelligence” (AI) has become ubiquitous in the security world. If you believe the sales brochures, practically every security product on the market has some level of AI functionality.

Many types of security solutions tout the benefits of AI, including threat intelligence. But what benefits does AI really bring to the intelligence gathering process?

Late last year, we held our seventh annual Recorded Future User Network (RFUN) conference in Washington, D.C. During the conference, attendees were treated to a presentation on AI applications for open source intelligence (OSINT) by subject matter expert CW3 Nathan McKeldin of the U.S. Army.

Over the past five years, Nathan has used OSINT extensively to inform operational planning, fuel counterintelligence investigations, and support tactical military units with insights, warnings, and targeting information. In his presentation, Nathan explained how AI fits into the intelligence gathering process and how we can expect it to evolve over the next two decades.

Demystifying 4 Key Terms

To get things started, Nathan took the audience through four key definitions.

1. Open Source Intelligence (OSINT)

Since OSINT is Nathan’s primary focus, that’s where things began. According to Public Law 109-163, Sec. 931, OSINT is:

Produced from publicly available information
Collected, exploited, and disseminated in a timely manner to an appropriate audience
Intended to address a specific intelligence requirement

The key phrase here is “publicly available.” OSINT is founded on information that’s intended for general public consumption, which means collecting it requires neither covert techniques nor forced entry.

2. Publicly Available Information

The Department of Defense Manual 5240.01 (August 2016) defines publicly available information as information that is:

Published or broadcast for public consumption
Available on request to the public
Accessible online or otherwise to the public
Available to the public by subscription or purchase
Able to be seen or heard by any casual observer
Made available at a meeting open to the public
Obtained by visiting any place or attending any event that is open to the public

Simply put, if a regular person can access a piece of information without doing anything illegal, you can reasonably classify it as “publicly available.”

3. Artificial Intelligence (AI)

Nathan defines AI as “cognitive reasoning determined through data-driven analysis and algorithmic functions.”

Put another way, an information system can be described as intelligent if it makes an informed, non-random decision algorithmically.

Common subcategories of AI include:

Machine vision
Machine learning
Natural language processing (NLP) and machine translation
Robotics
Purpose-driven and autonomous machines

However, AI is not the same thing as automation, which is where a system automatically responds to an expected input (or set of inputs) by producing a desired output. The automated doors at your local Walmart may be helpful, but they aren’t intelligent.

4. Machine-Aided Analysis

One of the simplest applications of AI in intelligence gathering is to increase the efficiency of a human analyst. Machine-aided analysis describes the application of certain tenets of AI to a data set to execute tasks a human is capable of, but at a much greater volume and velocity.

Put another way, machine-aided analysis could be described as an “easy button” that takes away the heavy lifting from time-consuming, analytical tasks. For example, it could be used to digest a large volume of documents and automatically produce an output of people, places, and things.
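
To make the “people, places, and things” idea concrete, here is a minimal Python sketch. It is not a tool Nathan presented: it assumes a tiny, hand-written lookup table (the hypothetical GAZETTEER below) and simply counts how often each known name appears across a batch of documents. Real systems would typically rely on trained NLP models rather than a fixed list.

```python
import re
from collections import Counter

# Hypothetical gazetteer: known names mapped to entity types (illustrative only).
GAZETTEER = {
    "nathan mckeldin": "PERSON",
    "washington": "PLACE",
    "u.s. army": "ORGANIZATION",
}

def extract_entities(documents):
    """Count gazetteer matches across a collection of documents."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for name, entity_type in GAZETTEER.items():
            # Require non-word characters on both sides so that
            # "washington" does not match inside "washingtonian".
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            hits = len(re.findall(pattern, lowered))
            if hits:
                counts[(name, entity_type)] += hits
    return counts

if __name__ == "__main__":
    docs = [
        "Nathan McKeldin spoke in Washington.",
        "The U.S. Army sent Nathan McKeldin to Washington, D.C.",
    ]
    for (name, entity_type), total in extract_entities(docs).most_common():
        print(f"{entity_type:<12} {name:<16} {total}")
```

The value is in the scale: the same loop runs unchanged over two documents or two million, which is exactly the kind of heavy lifting an analyst should not have to do by hand.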

How Does AI Fit Into War Fighting?

Once he’d covered the basic terminology of AI and OSINT, Nathan gave the audience an overview of how AI is being used to improve military operations.

As he explained, from a military perspective, there are five domains:

Land (Army)
Sea (Navy)
Air (Air Force)
Space
Cyber

Cyber Domains

Naturally, while each of these domains is important in its own right, all five domains are heavily interconnected. Airplanes fly, but they have to land on an airstrip or aircraft carrier. Troops are regularly moved around by air and sea. And, for obvious reasons, the cyber domain has become heavily intertwined with each of the other domains over the past few decades.

So where does AI fit into war fighting? Right at the center.

As AI has evolved, dozens of applications have been identified across each of the primary domains. Cyber is the most obvious candidate, since AI comes primarily from machines, networks, and computers connected to the internet, but there are also plenty of applications for AI in traditional military domains.

AI for the Land Domain

Over the next 10 to 20 years, we’ll see a huge increase in the use of AI technologies to improve army operations. Some of the most valuable advances will likely include:

The use of augmented reality devices with dialed-back stimuli to enable realistic, high-impact training
Autonomous route-clearing vehicles that can detonate mines and trip improvised explosive devices (IEDs)
“Amazon goes to Iraq” — Automated logistics processes that feed supply reports from the field to an autonomous warehouse system that can pull equipment such as MREs and bandages off the shelf, and deliver them to the front line via self-driving supply trucks

AI for the Air and Sea Domains

In addition to the use cases described above, AI will be used to control drones for a variety of military, foreign, and domestic purposes.

AI will also be used to control so-called “drone swarms” to overwhelm opposing forces. The AI involved would be similar to that used to control drones at the opening ceremony of the 2018 Winter Olympics in Pyeongchang, South Korea.

AI for the Space Domain

There are plenty of applications for AI in space, but perhaps the most obvious will be systems designed to help space-based assets avoid asteroids, and the development of kinetic strike vehicles.

Empowering the Intelligence Cycle

Once he’d covered the frontline military applications of AI, Nathan took a step back to focus on how AI can be used to enhance intelligence operations.

Consider the intelligence cycle. In simple terms, it’s a feedback loop which starts with a set of requirements, and ends when those requirements are met with an actionable intelligence product.

Intelligence Cycle

This cycle has been going on, in some capacity, for as long as wars have been going on, and up until recently, it was a heavily manual process. However, over the last 50 years, a number of programs and capacities have been developed to automate portions of the process.

For example, here are some of the ways that AI can fit into the intelligence cycle:

Artificial Intelligence Cycle

Note that in the first phase of the cycle, Nathan has highlighted machine-aided analysis as the tool set of choice. This is because it’s important to retain human involvement in the loop, particularly when the outputs of the process will be used to inform real-world operations. We’ll look at how machine-aided analysis is enhancing OSINT practices shortly.

Beyond this, however, there are a huge number of applications for intelligent technologies throughout the intelligence cycle. From automated data collection via AI-powered drones and sensors (in the real world) and web crawlers and spiders (in the cyber domain), right through to automated and semi-intelligent dissemination mechanisms to the intended recipient, AI will increasingly be used to enhance and accelerate the intelligence cycle, particularly in a military context.

In fact, AI even has a role to play in improving the intelligence cycle itself.

Intelligence Cycle Reimagined

In the diagram above, note the addition of machine learning to the center of the cycle. As Nathan explained, in the traditional intelligence cycle, humans provide the feedback loop to ensure the process is refined over time. In the future, machine learning will perform this function automatically and iteratively train collection and analysis algorithms by figuring out what’s working and what isn’t based on AI-fueled analysis of massive data sets.
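
The feedback idea can be illustrated with a deliberately simple Python sketch. This is an assumption-laden stand-in, not the machine learning Nathan described: it keeps one weight per collection source and nudges that weight toward the share of recent items judged useful (a basic moving-average update). The function name, the source names, and the learning_rate parameter are all hypothetical.

```python
def update_source_weights(weights, feedback, learning_rate=0.2):
    """Move each source's weight toward its recent hit rate.

    weights:  dict of source name -> current weight between 0 and 1
    feedback: dict of source name -> list of booleans (True = item was useful)
    """
    updated = dict(weights)
    for source, outcomes in feedback.items():
        if not outcomes:
            continue  # no feedback this cycle, so leave the weight alone
        hit_rate = sum(outcomes) / len(outcomes)
        previous = updated.get(source, 0.5)  # unseen sources start neutral
        updated[source] = previous + learning_rate * (hit_rate - previous)
    return updated

if __name__ == "__main__":
    weights = {"forum": 0.5, "news": 0.5}
    feedback = {"forum": [True, True, False, True], "news": [False, False]}
    print(update_source_weights(weights, feedback))
```

Run once per cycle, an update like this gradually steers collection toward sources that keep producing useful results and away from those that do not.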

‘Like Trying to Find a Needle in a Stack of Needles’

That brings us to OSINT, a clear candidate for AI enhancement. The single greatest problem faced by OSINT collectors is the sheer volume of data available.

To give the audience an idea of scale, Nathan went over some stats on current global internet usage pulled from Internet Live Stats. In 2018 alone, internet users around the world have:

Sent 67,105,618,987,773 emails
Searched Google 1,648,103,202,209 times
Published 1,557,976,566 blog posts
Sent 193,923,407,330 tweets
Watched 1,787,898,764,637 videos on YouTube
Uploaded 20,581,035,026 photos to Instagram

As Nathan put it: “Because of the massive amount of data available, working with OSINT can be like trying to find a needle in a stack of needles. It can be very hard to get down to that one particular needle you need to find.”

For this reason, applying AI and machine-aided analysis to OSINT has a whole host of benefits. For example:

Data Aggregation: Taking unstructured data from the internet and putting it into a structured environment so it becomes queryable, filterable, sortable, and digestible
Visualization: Using technology to compare aspects of a data set (e.g., geographic, temporal, and so on)
Reasoning: Looking at news stories and tracking their propagation across the internet to determine whether they are likely to be true or false (this will become particularly important as fake news, propaganda, and so-called “deep fakes” increasingly become an issue)
Automated Alerting and Reporting: Taking intelligence outputs and making them rapidly available to their intended audience, either as a direct intelligence product or as a resource for AI-powered queryable technologies, such as heads-up displays (HUDs)
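
As a rough illustration of the data aggregation step, the Python sketch below turns free-text posts into structured records that can be sorted and filtered. It assumes, purely for the example, that each post contains an ISO-style date (YYYY-MM-DD) and optional hashtags. The Record type and both function names are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import date

@dataclass
class Record:
    published: date
    hashtags: list
    text: str

DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
HASHTAG_PATTERN = re.compile(r"#(\w+)")

def structure(raw_posts):
    """Parse raw strings into Records, sorted by date; skip undated posts."""
    records = []
    for raw in raw_posts:
        match = DATE_PATTERN.search(raw)
        if match is None:
            continue
        year, month, day = (int(part) for part in match.groups())
        try:
            published = date(year, month, day)
        except ValueError:
            continue  # looked like a date but is not a valid calendar day
        tags = [tag.lower() for tag in HASHTAG_PATTERN.findall(raw)]
        records.append(Record(published, tags, raw))
    return sorted(records, key=lambda record: record.published)

def with_hashtag(records, tag):
    """Filter structured records by hashtag, ignoring case."""
    return [record for record in records if tag.lower() in record.hashtags]

if __name__ == "__main__":
    posts = [
        "Posted 2018-11-02: rally downtown #Protest #city",
        "no date here #protest",
        "2018-03-15 quiet day #weather",
    ]
    for record in with_hashtag(structure(posts), "protest"):
        print(record.published, record.hashtags)
```

Once the data is in this shape, the other benefits in the list (visualization, alerting, and so on) become ordinary queries over structured fields rather than searches through raw text.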

Although effusive about the power and value of intelligent technologies to enhance OSINT processes, Nathan was quick to point out that they will never be able to replace an analyst’s ability to determine the “so what?” of information, nor should they.

A (Hypothetical) Example of AI-Powered OSINT in Action

To help put everything into context, Nathan covered a hypothetical scenario of a military analyst using AI-powered OSINT to solve a real-world problem: tracking the activity of extremist groups.

The process could run as follows:

An analyst receives automated outputs from a set of AI processes designed to highlight trends and information likely to be of interest.
She notes the recurrence of a particular symbol associated with extremist activity and sets web crawlers to work finding other instances.
Crawlers collect information from social media, publicly available code repositories, English-language media, foreign-language media, and thousands of other sources.
The collected information is run through AI-powered systems that produce pre-defined outputs (like profiles of relevant actors and/or times and geographic locations associated with instances of the extremist symbol).
Intelligence is automatically prepared into a report which can be utilized by commanders or in-field operators to inform military action.
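
The steps above can be sketched end to end in a few lines of Python. Everything here is hypothetical: the COLLECTED list stands in for real crawler output, “XQ7” is a made-up symbol name, and the report format is invented for illustration. The point is only to show how collection, analysis, and reporting hand off to one another.

```python
from collections import Counter

# Hypothetical crawler output; a real system would pull this from live sources.
COLLECTED = [
    {"source": "forum", "location": "City A", "day": "2018-10-01",
     "text": "New banner shows the XQ7 mark"},
    {"source": "news", "location": "City B", "day": "2018-10-03",
     "text": "Local election results announced"},
    {"source": "social", "location": "City A", "day": "2018-10-04",
     "text": "xq7 graffiti spotted downtown"},
    {"source": "social", "location": "City C", "day": "2018-10-05",
     "text": "Photo of Xq7 flag at rally"},
]

def find_symbol_instances(items, symbol):
    """Analysis step: keep only items whose text mentions the symbol."""
    needle = symbol.lower()
    return [item for item in items if needle in item["text"].lower()]

def build_report(instances, symbol):
    """Reporting step: summarize when and where the symbol appeared."""
    by_location = Counter(item["location"] for item in instances)
    days = sorted(item["day"] for item in instances)
    lines = [f"Symbol '{symbol}': {len(instances)} instance(s)"]
    if days:
        lines.append(f"First seen {days[0]}, last seen {days[-1]}")
    for location, count in by_location.most_common():
        lines.append(f"{location}: {count}")
    return "\n".join(lines)

if __name__ == "__main__":
    instances = find_symbol_instances(COLLECTED, "XQ7")
    print(build_report(instances, "XQ7"))
```

Note what the sketch leaves out: deciding whether the pattern matters, and what to do about it, remains the analyst’s job.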

In this example, the entire process, from collection to dissemination, is facilitated by AI. Threat intelligence solutions like the Recorded Future® Platform are already being used to fulfill intelligence processes very much like the one Nathan described, both for military and cybersecurity purposes.

AI Possibilities for the Future of OSINT

To round things off, Nathan covered some of the ways AI could impact OSINT in the coming years.

From a positive perspective, AI could power personal assistant technologies with access to open source or classified databases — essentially like Siri with a security clearance. Imagine an in-field operator asking, “Hey Siri, when was the last time [extremist organization] was active in this area?” and receiving an immediate, accurate response.

Similarly, AI could power wearable technologies such as a smart contact lens that tracks eye motion, reads documents, discovers relationships, makes recommendations, provides analyses, and pushes everything to a HUD — think next-generation Google Glass built into a tactical visor.

Of course, not all applications of AI will be forces for good.

From an adversarial perspective, fake news and “deep fakes” — the use of AI to quickly create relatively convincing fake audio or video, like a clip of the president giving a speech he never actually gave — will continue to be a huge concern. We’ve already seen this to some extent with Russian interference and propaganda during the 2016 U.S. presidential campaign, and things are only going to become murkier. In particular, fake videos are getting better all the time, and are already becoming very difficult to identify.

Fortunately, as Nathan pointed out during the presentation, there will also be applications for AI to distinguish between genuine and fake media by analyzing signatures and tracking their propagation online.

Ultimately, Nathan’s sentiment was clear. AI has a huge role to play in the future of intelligence gathering, both for military and domestic organizations, and it will become increasingly mainstream as time moves on.