The Recorded Future Blog
Threat Intelligence and SIEM (Part 1) — Reactive Security
by Guillaume Dupont on January 19, 2016
This blog post is the first part of a series about reactive versus proactive security with security information and event management (SIEM) and threat intelligence (TI). In this post we present an overview of reactive SIEM: what it does, how it works, and its limitations. Please note that this post covers a significant amount of background information and detail, but does not cover how to deploy and configure a SIEM, which vendor to choose, or how to implement use cases.
Security information and event management (SIEM) is a solution that provides a bird’s eye view of an IT infrastructure. It fulfills two main objectives: (1) detecting security incidents in (near) real time, and (2) managing logs efficiently. These objectives were formerly known as security event management (SEM) and security information management (SIM), respectively, but the two functions have since merged into a single capability known as SIEM. From a high-level point of view, a SIEM collects information (e.g., logs, events, flows) from various devices on a network, correlates and analyzes the data to detect incidents and abnormal patterns of activity, and, finally, stores the information for later use (reporting, behavior profiling, etc.). When successfully deployed and configured, a SIEM helps organizations:
- Discover internal/external threats.
- Monitor (privileged) user activity and access to resources.
- Provide compliance reporting.
- Support incident response.
As previously mentioned, a SIEM gathers logs and events from a heterogeneous collection of data sources, which can be grouped into categories such as:
- Network devices (routers, switches, etc.)
- Security devices (IDS/IPS, firewalls, etc.)
- Servers (Web, mail, etc.)
For each device, a collector gathers and normalizes its information before forwarding the logs to the central engine, the heart of the SIEM, where correlation and analysis take place. Finally, the logs are stored in a database for a period of time determined by the organization’s retention policy. The typical architecture described above can be depicted as follows:
[Figure: typical SIEM architecture, with device collectors forwarding to the central engine and the log database]
Depending on the SIEM vendor, the term “collector” can be changed to “agent” or “connector.” Some vendors also offer “smart connectors” which automatically detect the type of device they are connected to, simply by recognizing the logs they receive. To remain as generic as possible, we will keep the term “collector.”
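To make the collector-to-engine flow more concrete, here is a minimal sketch in Python. It is not taken from any SIEM product: the class names (`Collector`, `CentralEngine`, `NormalizedEvent`), the field names, and the toy "denied" check are all our own assumptions, chosen only to show where normalization, analysis, and storage sit relative to each other.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class NormalizedEvent:
    """Common schema every collector emits (illustrative field names)."""
    timestamp: datetime
    source_type: str   # e.g. "firewall", "web_server"
    source_host: str
    message: str


class Collector:
    """Gathers raw records from one device and normalizes them."""

    def __init__(self, source_type: str, source_host: str):
        self.source_type = source_type
        self.source_host = source_host

    def normalize(self, raw_line: str) -> NormalizedEvent:
        # A real collector would parse device-specific fields here;
        # this sketch only trims the line and stamps it.
        return NormalizedEvent(
            timestamp=datetime.now(timezone.utc),
            source_type=self.source_type,
            source_host=self.source_host,
            message=raw_line.strip(),
        )


@dataclass
class CentralEngine:
    """Receives normalized events, analyzes them, and stores them."""
    store: list = field(default_factory=list)    # stand-in for the log database
    alerts: list = field(default_factory=list)

    def ingest(self, event: NormalizedEvent) -> None:
        # Placeholder analysis: flag any message mentioning a denied action.
        if "denied" in event.message.lower():
            self.alerts.append(event)
        # Every event is retained, whether or not it raised an alert.
        self.store.append(event)


engine = CentralEngine()
firewall = Collector("firewall", "fw01")
web = Collector("web_server", "www01")

engine.ingest(firewall.normalize("Connection DENIED from 203.0.113.7\n"))
engine.ingest(web.normalize("GET /index.html 200"))
```

In this sketch both events end up in `engine.store`, while only the firewall event lands in `engine.alerts`. A real deployment would also apply the retention policy to the store, which we leave out here.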
The Real Value of Logs, Events, and Flows
A SIEM system can make use of diverse information types. The primary type is log data, which typically serves several purposes such as debugging, system administration, and security audits. The most commonly used standard for logging is Syslog.1 The standard can vary by device, however. Most Web servers, for instance, use the Common Log Format2 or a proprietary format.
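As an illustration of what normalizing a Web server log involves, the following sketch parses one Common Log Format line into named fields. The regular expression, the `parse_clf` helper, and the sample line (which uses a documentation IP address and a made-up user) are our own assumptions, not code from any particular collector.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line):
    """Return a dict of fields for one CLF line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Month abbreviations ("Oct") are parsed using the current locale;
    # this sketch assumes an English or C locale.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A "-" in the size field means no bytes were returned.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


sample = ('192.0.2.10 - alice [10/Oct/2015:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326')
record = parse_clf(sample)
```

Once a line is broken down this way, fields such as `record["status"]` or `record["host"]` can be compared and correlated with data from other devices, which is not practical with the raw string alone.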
Another type of information that a SIEM can retrieve is events. Events are usually produced by security devices or controls such as IDPS or Identity and Access Management (IAM) systems. Examples include input validation failures (e.g., invalid parameter names or values, protocol violations), application errors, and system events (e.g., runtime errors, connectivity problems, performance issues). Events can be correlated with other information to add intelligence to log management.
Compared to an ordinary logger, a SIEM system can use various conditions to check whether certain events match a rule and, if they do, trigger an alarm. Consider a port scan: when a firewall receives a single packet on port 21, it may send the log “connection attempt to ftp service,” which is nothing dramatic. But when it receives packets whose destination ports sweep from 20 to 100 in less than two seconds, the resulting events sent to the SIEM can match the “port scan” rule, which triggers an alert to the security team: malicious activity may be under way!
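The port scan example can be sketched as a small sliding-window rule. This is only an illustration under our own assumptions: the `PortScanRule` class, the threshold of 20 distinct ports, and the requirement that events arrive in timestamp order are not taken from any vendor's rule engine.

```python
from collections import defaultdict, deque


class PortScanRule:
    """Alert when one source touches many distinct ports within a short window."""

    def __init__(self, min_ports=20, window_seconds=2.0):
        self.min_ports = min_ports
        self.window = window_seconds
        # src_ip -> deque of (timestamp, dst_port), oldest first
        self._recent = defaultdict(deque)

    def observe(self, timestamp, src_ip, dst_port):
        """Record one connection attempt; return True if the rule now matches."""
        events = self._recent[src_ip]
        events.append((timestamp, dst_port))
        # Drop attempts that have fallen out of the time window.
        while events and timestamp - events[0][0] > self.window:
            events.popleft()
        distinct_ports = {port for _, port in events}
        return len(distinct_ports) >= self.min_ports


rule = PortScanRule()

# A single connection attempt to the FTP port does not match the rule.
single = rule.observe(0.0, "203.0.113.7", 21)

# A sweep over ports 20..100, one packet every 10 ms, does.
sweep = [
    rule.observe(10.0 + i * 0.01, "198.51.100.9", port)
    for i, port in enumerate(range(20, 101))
]
scan_detected = any(sweep)
```

Note that the same sweep spread over several minutes would slip under this rule, which is one reason real deployments need tuning and usually combine several conditions.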
We can list four categories of conditions:
- Event Based: An IDS reports a signature X targeted at host Y, and a vulnerability scanner knows that Y is vulnerable, so an alert is triggered.
- Rule Based: If X + Y + Z then do A, or If X repeats more than three times in interval Y then do Z.
- Anomaly Based: If the traffic on port X exceeds the historical average by more than a chosen number of standard deviations, then trigger an alert (e.g., new worm, bot communicating with C&C).
- Risk Based: If attack type is destructive (e.g., buffer overflow versus SYN scan), and target is a critical asset (production server versus workstation), then trigger an alert.
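The event-based, anomaly-based, and risk-based conditions above can each be written as a small predicate. The sketch below is purely illustrative: the function names, the three-standard-deviation threshold, and the sample asset and attack names are assumptions on our part.

```python
from statistics import mean, stdev


def event_based_alert(signature, host, vulnerable_to):
    """Alert only if the scanner data says the host is vulnerable to this signature."""
    return signature in vulnerable_to.get(host, set())


def anomaly_based_alert(history, current, threshold=3.0):
    """Alert if current traffic exceeds the historical mean by more than
    `threshold` sample standard deviations."""
    mu = mean(history)
    sigma = stdev(history)   # needs at least two historical data points
    if sigma == 0:
        return current > mu
    return (current - mu) / sigma > threshold


# Hypothetical inventories; a real SIEM would pull these from asset management.
DESTRUCTIVE_ATTACKS = {"buffer_overflow"}
CRITICAL_ASSETS = {"prod-db-01"}


def risk_based_alert(attack_type, target_host):
    """Alert only for destructive attacks against critical assets."""
    return attack_type in DESTRUCTIVE_ATTACKS and target_host in CRITICAL_ASSETS


scanner_results = {"host-y": {"sig-x"}}
hourly_connections = [100, 110, 95, 105, 90, 100]

event_hit = event_based_alert("sig-x", "host-y", scanner_results)
normal_hour = anomaly_based_alert(hourly_connections, 102)
spike_hour = anomaly_based_alert(hourly_connections, 500)
risky = risk_based_alert("buffer_overflow", "prod-db-01")
benign = risk_based_alert("syn_scan", "workstation-17")
```

Here the spike to 500 connections and the buffer overflow against the production server raise alerts, while the ordinary hour and the SYN scan of a workstation do not.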
Some event standards have been proposed to improve interoperability and simplify the integration of devices. For example, ArcSight came up with the Common Event Format,3 IBM proposed the Log Event Extended Format (LEEF), and Splunk offers the Common Information Model (CIM).
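To give an idea of what such a standard looks like on the wire, here is a rough sketch that builds a Common Event Format line: a pipe-delimited header followed by key=value extension pairs. The `to_cef` helper and the vendor and product names are hypothetical, and the escaping is simplified; the CEF specification remains the authoritative reference.

```python
def _escape_header(value):
    # In header fields, backslashes and pipes must be escaped.
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value):
    # In extension values, backslashes and equals signs must be escaped.
    return str(value).replace("\\", "\\\\").replace("=", "\\=")


def to_cef(vendor, product, version, signature_id, name, severity, extension):
    """Build one line of the form
    CEF:0|Vendor|Product|Version|Signature ID|Name|Severity|key=value ..."""
    header = "|".join(
        _escape_header(part)
        for part in (vendor, product, version, signature_id, name, severity)
    )
    ext = " ".join(
        f"{key}={_escape_extension(value)}" for key, value in extension.items()
    )
    return f"CEF:0|{header}|{ext}"


line = to_cef(
    "ExampleVendor", "ExampleFW", "1.0", "100", "Port scan detected", 7,
    {"src": "198.51.100.9", "dst": "192.0.2.10", "dpt": 21},
)
```

Because every device emits the same header layout, the collector no longer needs a custom parser per product, which is the interoperability gain these standards aim for.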
Finally, the last type of data a SIEM can use is traffic flow,4 which provides a better overview of network activity. The two main standards are NetFlow (RFC 3954) and IPFIX (RFC 5101 and 5102). The limitation of both formats is that they provide no information above Layer 4 of the OSI model. To remedy this, some vendors offer application-aware flows, which help detect threats by analyzing packet content using Deep Packet Inspection (e.g., IBM’s QFlow).
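The Layer 4 limitation is easier to see with a toy example. The sketch below groups packets into flows keyed by the classic 5-tuple and keeps only counters. It is not the NetFlow or IPFIX export format; the `aggregate_flows` function and the packet dictionaries are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FlowStats:
    packets: int = 0
    bytes: int = 0


def aggregate_flows(packets):
    """Group packets by (src IP, dst IP, src port, dst port, protocol)."""
    flows = defaultdict(FlowStats)
    for pkt in packets:
        key = (pkt["src_ip"], pkt["dst_ip"],
               pkt["src_port"], pkt["dst_port"], pkt["protocol"])
        flows[key].packets += 1
        flows[key].bytes += pkt["length"]
        # The payload is never inspected: the flow record only knows
        # who talked to whom, on which ports, and how much.
    return dict(flows)


packets = [
    {"src_ip": "192.0.2.10", "dst_ip": "198.51.100.9", "src_port": 51000,
     "dst_port": 80, "protocol": "tcp", "length": 400, "payload": b"GET / ..."},
    {"src_ip": "192.0.2.10", "dst_ip": "198.51.100.9", "src_port": 51000,
     "dst_port": 80, "protocol": "tcp", "length": 600, "payload": b"POST /x ..."},
    {"src_ip": "192.0.2.10", "dst_ip": "198.51.100.9", "src_port": 51001,
     "dst_port": 53, "protocol": "udp", "length": 80, "payload": b"..."},
]
flows = aggregate_flows(packets)
```

The two HTTP packets collapse into a single flow with two packets and 1,000 bytes, and nothing in that record distinguishes a harmless GET from a malicious POST. That gap is what application-aware flows try to close.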
The central engine will correlate all of the gathered information by using diverse algorithms and data-mining techniques. These techniques will identify suspicious patterns and behaviors, and provide great help for intrusion detection and auditing.
Deploying a SIEM solution can be quite complex and expensive: the price of appliances, the time needed for configuration and tuning, and the expertise required for daily use and maintenance can discourage customers. After purchase and deployment, the recurring question is “now what do we do?” and enterprises tend to answer with a “monitor-and-respond” strategy.5 Using the SIEM in a signature-based defense approach, the security team (or the security operations center [SOC] team) monitors activity and regularly updates the security devices with signatures of known threats.
Upon detection, the team investigates the alert, escalates it to the Incident Response (IR) team for remediation (e.g., stopping the attack if it is still ongoing, re-imaging compromised systems) if they cannot resolve the issue themselves, and finally reports to the board. This overall process can take quite some time, and “time is money,” especially when it comes to security.
Moreover, the signature-based approach only protects against known threats. The anomaly-based approach, which focuses on detecting abnormal behaviors, should in theory help detect unknown threats, but in practice it significantly increases the false-positive rate. The direct consequence is that the security team spends more time investigating false positives, which increases the chance of missing true positives. A Ponemon Institute survey found that, on average, a company receives up to 17,000 alerts per week, but only 700 (roughly 4 percent) are investigated!
A recent Ponemon Institute study6 found that companies waste $1.27 million annually on average “responding to inaccurate and erroneous intelligence.” In addition, the rise of targeted attacks, so-called “advanced persistent threats” (APTs), makes it clear that this traditional reactive security posture is no longer sufficient. With shrinking security budgets,7 companies need to find new, efficient ways to defend themselves. The question we have to answer is: How do we get ahead of threats? It is time to become proactive.
In the next post we’ll define threat intelligence and see how to leverage its possibilities with SIEM.
1 R. Gerhards. The Syslog protocol, March 2009. Accessed: Jan 6th 2016.
2 W3C. Logging control in W3C httpd, July 1995. Accessed: Jan 6th 2016.
3 ArcSight Inc. Common Event Format. Technical report, 2010.
4 J. Quittek, T. Zseby, and B. Claise. Requirements for IP flow information export, October 2004. Accessed: Jan 6th 2016.
5 Jon Friedman and Mark Bouchard. Definitive Guide to Cyber Threat Intelligence. CyberEdge Press, 2015.
6 Ponemon Institute LLC. The cost of malware containment. Technical report, January 2015.
7 Black Hat. 2015 Black Hat Attendee Survey. July 2015.