Guest Post: Improving Cyber Threat and Vulnerability Assessment With Web Intelligence
Since the dawn of mankind, we have been fascinated by what the future holds for us. This thirst has not been quenched even in today’s age of advanced science and technology. Predictions and prophecies by prognosticators like Nostradamus remain a subject of interest for many, but today Recorded Future is trying to use technology to better anticipate the future.
The concept behind Recorded Future is simple to understand (though difficult to implement) and is certainly not based on psychic visions. What the company does is gather content from the open web, filter it, analyze it, and generate predictive signals based on the refined information. Simply put, it is organizing the open source information from the web. For example, it may be useful to understand the social media discussions from a particular area when forecasting a regional election.
Recorded Future for Cyber Security
This kind of technology is very helpful in domains that require attention to and analysis of what is being discussed on the web and how information is linked. From the cyber security point of view, it is useful not only for reconnaissance on a target (the first phase of a penetration test) but also for learning about cyber attacks in the form of malware, APTs, DDoS attacks, etc.
Let’s take the example of Java 0-day vulnerabilities. Figure 1 shows the large number of mentions of the topic (Java 0-day) during the past year, especially between August 2012 and February 2013. This information could be crucial for any corporation using Java in its product development. Based on this graph, it could warn its customers to keep their Java versions updated and adopt other defense mechanisms to safeguard themselves. Similarly, this information helps end users understand the risks that a particular technology poses at any given time.
Figure 1. Java 0day: 1 Year – Trend
Another example is the trend related to “Ransomware”. Figure 2 shows a tree map of the topic “Ransomware”, describing which products, technologies, and organizations are mentioned in the context of the topic within the past year. This helps us to map and understand the reach of a particular topic.
Figure 2. Ransomware: 1 Year – Tree Map
Recorded Future has additional visualizations that represent information in ways suited to different types of analysis. It also lets users save an analysis and export it to formats such as PDF and CSV, which can be passed to other tools for further analysis.
So far, we have talked about what Recorded Future provides in its current form, which is pretty great, but let’s also discuss the future of Recorded Future (RF) as it relates to cyber security. We have seen some of its many useful features; what follows are additions that would make the platform more accurate and effective going forward:
Social networks and many open file sharing platforms provide a great amount of information. Apart from the visible content of a file, there is also metadata hidden in it, for example geolocation and other information (EXIF) in image files, or usernames in PDFs, DOCs, etc. This information is crucial (though often ignored) and can help in many different scenarios; for example, the metadata in a DOC file helped to locate the BTK killer (Dennis Rader). Adding metadata extraction to the platform would yield a wealth of information that would certainly help to generate better predictive signals.
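As a rough illustration of the kind of metadata extraction described above, the following Python sketch reads the author fields from a modern .docx file. It assumes the Office Open XML layout, in which a .docx is a ZIP archive whose core properties live in docProps/core.xml; legacy binary .doc files, PDFs, and image EXIF data would each need a different parser. The function name docx_author_metadata is hypothetical and is not part of Recorded Future.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def docx_author_metadata(source):
    """Return the creator and last-modified-by fields of a .docx file.

    `source` may be a path or a binary file-like object. Missing fields
    (or a missing core-properties part) are reported as None.
    """
    result = {"creator": None, "last_modified_by": None}
    with zipfile.ZipFile(source) as archive:
        try:
            root = ET.fromstring(archive.read("docProps/core.xml"))
        except KeyError:
            # The archive has no core-properties part at all.
            return result
    creator = root.find("dc:creator", NS)
    modifier = root.find("cp:lastModifiedBy", NS)
    if creator is not None:
        result["creator"] = creator.text
    if modifier is not None:
        result["last_modified_by"] = modifier.text
    return result
```

Calling docx_author_metadata on a document path (for example a hypothetical "report.docx") returns a small dictionary of names that could then be fed into further analysis.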
Maltego is an OSINT and forensics application that is very useful for information gathering. It is widely used in the IT industry, especially in infosec, to collect information about a target. A Maltego transform is a piece of code that takes some information as input and generates other, related information as output; for example, it might take a domain name as input and return an IP address based on the DNS response.
Integrating the platform with Maltego could be very helpful, as Maltego already provides a great deal of information and the integration would widen the platform’s scope. A transform can be written as a simple piece of code that is used locally or deployed on a transform distribution server (TDS). A recommended framework for writing transforms is Canari.
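The domain-to-IP example can be sketched in a few lines of Python. This is only the input-to-output mapping at the heart of such a transform, not a real Maltego transform: a real one would also wrap its results in Maltego’s entity format, which is omitted here. The function name domain_to_ips and the injectable resolver parameter are assumptions made for illustration; by default the lookup goes through the standard library’s socket.getaddrinfo.

```python
import socket


def domain_to_ips(domain, resolver=socket.getaddrinfo):
    """Transform-style lookup: domain name in, sorted unique IP addresses out.

    `resolver` defaults to socket.getaddrinfo and can be swapped for a stub,
    which keeps the function usable without network access.
    """
    try:
        records = resolver(domain, None)
    except socket.gaierror:
        # The name did not resolve; a transform would return no entities.
        return []
    # Each record is (family, type, proto, canonname, sockaddr),
    # and sockaddr[0] holds the address as a string.
    return sorted({record[4][0] for record in records})
```

Making the resolver a parameter is a design choice for this sketch: the same function can be pointed at live DNS or at recorded data.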
Shodan is a search engine for Internet-connected computers, which it indexes by software, geography, OS, IP address and more. Integrating Shodan with RF would help to generate useful information about a target.
The web is vast, and most of us have only touched the tip of the iceberg. There is a huge amount of information out there in the form of the hidden web, or Darkweb. The Darkweb is simply the webspace inaccessible to web search engines, for example the Onion domain websites. Most of the time these websites sit behind some kind of mechanism that web spiders cannot get past, such as a proxy (Tor) or some form of authentication. This part of the web requires more manual intervention, but the amount of information out there is really worth it. One thing to keep in mind is that access to such places is mostly anonymous (for both user and provider), so many illegal activities take place there.
Metasearching is simply sending a query to several search engines and aggregating the results. Providing this feature would help users compare how the information provided by Recorded Future differs from (or resembles) the information provided by different search engines.
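The aggregation step of a metasearch can be sketched as follows. The engines here are hypothetical callables that take a query and return a ranked list of URLs; no real search engine API is being described. The merge rule, a sum of reciprocal ranks, is one simple choice made for illustration, and other rules would work just as well.

```python
def metasearch(query, engines):
    """Send `query` to every engine and merge the ranked result lists.

    `engines` maps an engine name to a callable returning a ranked list of
    URLs. Each URL earns 1/rank from every engine that returns it, so URLs
    that rank highly in several engines rise to the top.
    """
    merged = {}
    for name, search in engines.items():
        for rank, url in enumerate(search(query), start=1):
            entry = merged.setdefault(url, {"url": url, "engines": [], "score": 0.0})
            entry["engines"].append(name)
            entry["score"] += 1.0 / rank
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(merged.values(), key=lambda e: (-e["score"], e["url"]))
```

Because each merged entry records which engines returned it, the output also shows where the sources agree and where they differ.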
Recorded Future is great at gathering and analyzing data from the web, and its results would be more useful still if the platform also performed subject-specific sentiment analysis on the data it gathers. Tuning the existing sentiment analysis to the language of cyber security would be of great value, as it would help users understand the potentially subjective information behind the data.
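To make the idea of domain-specific tuning concrete, here is a toy lexicon-based scorer in Python. It says nothing about how Recorded Future’s sentiment analysis actually works; the word list and weights in SECURITY_LEXICON are invented for illustration, and a real system would need a far larger, validated vocabulary.

```python
import re

# Hypothetical weights: terms that signal risk score negative,
# terms that signal remediation score positive.
SECURITY_LEXICON = {
    "patched": 1,
    "mitigated": 1,
    "fixed": 1,
    "critical": -1,
    "exploit": -1,
    "0-day": -2,
    "breach": -2,
    "ransomware": -2,
}


def security_sentiment(text, lexicon=SECURITY_LEXICON):
    """Sum the lexicon weights of the tokens in `text` (unknown words count 0)."""
    # Keep hyphens inside tokens so that terms like "0-day" stay intact.
    tokens = re.findall(r"[a-z0-9-]+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)
```

A general-purpose lexicon would likely treat a word such as “patched” as neutral, whereas in a security context it is good news; swapping the lexicon argument is how this sketch models that difference.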
Since most people access the web today using smartphones, it would be great to have Android and iOS apps for Recorded Future.
Some of these features might already be in the pipeline, and Recorded Future’s ability to filter out noise, together with the different data visualization modes it provides, makes it truly stand out from the crowd. It will be very exciting to see how this predictive power can be used by cyber security analysts.