ICS Is Serious Business (But There’s No Need to Panic)
Recently, there’s been a good bit of focus on industrial control systems (ICS), the systems that monitor and help keep our critical infrastructure running. The electrical grid tends to get the most attention, but ICS includes water, dams, communications systems, pipelines, natural gas, transportation, and other process control systems. As more and more of these systems get connected to the internet, they can make an attractive target for cybercriminals or state actors who are up to no good.
Our guest this week is Robert M. Lee. He’s CEO at Dragos, a company dedicated to the security of critical systems. Before Dragos he was in the U.S. Air Force, where he served as a cyber warfare operations officer in the U.S. Intelligence Community.
For those of you who’d prefer to read, here’s the transcript:
This is Recorded Future, inside threat intelligence for cybersecurity.
Hello everyone and thanks for joining us. I’m Dave Bittner from the CyberWire, and this is episode 34 of the Recorded Future podcast.
There seems to be a good bit of attention aimed at ICS, industrial control systems, lately. These are the systems that monitor and help keep our critical infrastructure running. The electrical grid tends to get the most attention, but ICS includes water, dams, communication systems, pipelines, natural gas, transportation, and other process control systems. As more and more of these systems get connected to the internet, they can make an attractive target for cybercriminals or state actors who are up to no good.
Our guest this week is Robert M. Lee. He’s CEO at Dragos, a company dedicated to securing critical systems. Before Dragos, he was in the U.S. Air Force, where he served as a cyber warfare operations officer in the U.S. intelligence community. Stay with us.
Robert M. Lee:
When we’re looking at ICS, or industrial control systems, it really is anything that is, obviously, industrial. So, sort of rugged and put into these hardened kind of environments, but more importantly, it’s about its ability to sense different variables from the environment, or take in inputs, and then be able to actuate or influence change into the environment. So it could be something as familiar looking as a Windows PC that has an HMI, or a human machine interface, a nice little graphical user interface that allows you to send commands to field equipment, or it could be the actual controller down in the field itself, which is a little rugged box, sometimes, little industrial-looking equipment that’s connected to sensors and actuators and turbines, and things like that. Even inside of industrial control systems, we would include all those little parts and widgets and things, as well. So really, it’s kind of this giant collection of all things related to the controlling and changing of the environment, or the physical process itself.
So, take us through the history of that. How did these systems come to be connected to the internet?
Robert M. Lee:
Sure. So, control systems have been around forever. I mean, think of some of the first beginning discussions of control systems. Everybody always cites the water clock, and going back to, like, Egyptian times, and things. The ability for the human race to modify their environment around them and detect and sort of automate some of that was where we started seeing control systems. Back before DARPA and ARPANET and all that stuff, we started seeing more interconnected communications and control systems. They weren’t, obviously, networked because we didn’t have that, but they were still building control systems and factory lines, and things that were leveraging them. It started, even, having this discussion back with Edison and power. Really, around the 80s is when you see really the modern control systems start coming into play, where we started really interconnecting these things. We have Modbus TCP, one of the original kind of network protocols, which was just taking the serial information that was communicating between the control systems and wrapping it up in a TCP header.
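To make the "serial information wrapped in a TCP header" description concrete, here is a minimal Python sketch. It assumes the publicly documented Modbus framing: a serial (RTU) frame is a unit address, a protocol data unit (PDU), and a CRC-16, while a Modbus TCP frame carries the same PDU behind a 7-byte MBAP header. The function names are hypothetical and chosen only for illustration; this is not code from the interview or from any vendor.

```python
import struct


def crc16_modbus(data: bytes) -> int:
    """CRC-16 used by serial Modbus (reflected polynomial 0xA001, initial value 0xFFFF)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def build_rtu_frame(unit_id: int, pdu: bytes) -> bytes:
    """Serial frame: unit address + PDU + CRC (the CRC is sent low byte first)."""
    body = bytes([unit_id]) + pdu
    return body + struct.pack("<H", crc16_modbus(body))


def rtu_to_tcp(rtu_frame: bytes, transaction_id: int) -> bytes:
    """Re-wrap a serial frame for TCP: verify and drop the CRC, prepend an MBAP header."""
    body, crc_bytes = rtu_frame[:-2], rtu_frame[-2:]
    if struct.unpack("<H", crc_bytes)[0] != crc16_modbus(body):
        raise ValueError("CRC mismatch in serial frame")
    unit_id, pdu = body[0], body[1:]
    # MBAP header: transaction id, protocol id (0 for Modbus),
    # length of what follows (unit id + PDU), unit id. All big-endian.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


# Example request: function 0x03 (read holding registers), start address 0, quantity 10.
pdu = bytes([0x03, 0x00, 0x00, 0x00, 0x0A])
rtu = build_rtu_frame(unit_id=1, pdu=pdu)
tcp = rtu_to_tcp(rtu, transaction_id=1)
print("serial frame:", rtu.hex())
print("TCP frame:   ", tcp.hex())
```

Neither frame contains a credential or signature field. The protocol carries the command as-is, which is why the surrounding controls discussed in this conversation (authentication, VPNs, monitoring) matter once such traffic can be reached remotely.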
All of that started getting efficiency, and to some degree, safety and reliability out of the systems in a way that they hadn’t necessarily had before. So, as networking was brought to control systems, there was sort of a “one plus one equals three” kind of effect between that, where people were grabbing a lot more out of the environments than they had before. It was great. So the push on that was, wow, well, if the interconnection of one plant is really exciting, maybe we can start connecting multiple plants and substations, or control centers and things together to start pulling data more effectively, efficiently, and making even better decisions. So, we started seeing devices getting connected to the internet so that they could be accessed remotely.
Sort of, this next push that we’re seeing is more of the industrial IoT. Can we connect all the different valves and sensors and actuators and actually start having better control of our environments? And let’s do that across global dispersed plants and facilities, and can we make decisions for all these remote sites in a better way? So, the push of technology and interconnected and internet-connected devices has been on taking more value out of that control element, and of course, with that comes risk, especially if it’s connected up to the internet without authentication, or without the proper measures, like two-factor authentication or VPNs. That’s where challenges occur.
And so, on one hand, there’s a real risk there. Now, on the other hand, just because you see a PLC connected to Shodan, you search it on the internet, you’re like, “Oh, it’s a PLC connected to Shodan,” that doesn’t mean that you can just access it and everyone is going to die. There’s always that nuance, right? Just because you can touch one doesn’t mean you can kill everybody, but it introduces risks, which is inappropriate, and I think it goes a little bit further. And when we see poorly configured equipment that is internet connected without the appropriate safeguards, then that is usually a telltale sign that they’re also not taking additional and appropriate safeguards inside the network itself, of what you can’t see. So, seeing internet-connected devices to me isn’t so much the, “Oh my gosh, we’re all going to die,” risk, because that thing is connected. It’s the, “Oh my gosh, they’re not taking security at that facility correctly, probably at all.” That low-hanging fruit is still a challenge for them.
Long story short, the desire for reliability, safety, and effectiveness and efficiency out of the process has introduced connections. Now, we have to think about the security controls to put around them to make sure that we do that in a safe and reliable way.
And is this a situation where that connectivity sort of predated the bad guys’ ability and desire to get at those systems?
Robert M. Lee:
Yeah, absolutely. I think there’s always been jerks in the world that want to do damage to industrial infrastructure, and some people come up with really early theories of attacks on an ICS, which are very much unfounded, and when you do the discovery … I think the pipeline in Russia comes up all the time, back in the 1980s of the CIA modified logic and stuff, and the pipeline exploded. Actually, when you dig into it, none of the facts actually hold up. Doesn’t mean that something didn’t happen, but whoever reported it, at least, got everything wrong about it.
So, there’s early theories, don’t get me wrong, but we didn’t really start seeing real targeted intrusions towards industrial control systems probably until the late 90s. Maybe early 90s in more classified settings — who knows — but, definitely in the late 90s, we started seeing more stuff. Even the first big campaign that probably caught a lot of people’s attention was Night Dragon, and Night Dragon predated APT1, if we’ll remember that big report from Mandiant. But Night Dragon was like McAfee and like Dmitri over there. In there and in the reporting, they even note, like, “Hey, there’s a lot of targeting of these weird energy and petrochemical companies, too. This is weird.” And you could tell the authors of the report were very good and smart security people, but they didn’t have the visibility or knowledge of the industrial environment to, sort of, take it further than that, and in reality, having talked to a lot of people that were involved in those cases on the asset owner level, yeah, there was a lot of stuff going on inside the industrial environment.
So, the desire to interconnect and take advantage of these systems predates the adversaries, to some extent, but the adversaries targeting these environments is a 20-plus year concept by now.
Can you take us through the distinction, and whether or not it’s a distinction without a difference between the bad guys accessing these sorts of things, and espionage?
Robert M. Lee:
Yeah, absolutely. So, if you’re looking to target industrial environments, there are certain things we are worried about, and scales of that. In terms of, not necessarily the intentions and motivations, but let’s just talk about, sort of, the scale. First scale is like espionage, right? Adversary gets in and steals intellectual property, and there’s a lot of intellectual property actually contained down in the industrial control environment, like, how you’re making the recipes, the efficiency to which you’re achieving, how is the steel being produced, where is the heat sensors and injects into the blast furnace? At what level is that occurring? There’s a lot of intellectual property in the production of things to be stolen, so there’s definitely a value of espionage there.
The next step up is, of course, damage, and then you scale out from there. Widescale damage, or is it just disruption, or is it physical damage to equipment? With all varying degrees of difficulty and challenges. Belief structure that forms … And I do use that word, as in, the “belief structure,” because it’s almost religious to people. The belief structure that forms … The American power grid is going to go down, is one of those things that drives a lot of the fear, and I’m not saying … I don’t think we should ever sit back and say, “This can’t happen,” but it is significantly and exponentially more difficult than people realize, especially since they always oversimplify it. There is no one grid. We have a lot of reliability and redundancy built in. There’s multiple portions of the grid. There’s major interconnects between it, and taking down “the grid” is something that I don’t even know how you would accomplish. We shouldn’t be limited by my imagination, but being someone who at least thinks that he knows something about ICS, I would say it’s much more difficult than people realize.
There’s a real fear, though — an adversary taking down power in D.C. for three hours. That’s totally manageable, totally accomplishable. It probably wouldn’t rise to the level of military conflict. It’s that “under the bar” kind of position where, you know what? Maybe Congress wouldn’t do anything about it, or wouldn’t be able to. Maybe the president wouldn’t be able to authorize anything that really was meaningful, but at the same time, the fear in the populace on a three-hour power outage would be uncontrollable. Somebody — a foreign power — turned off the lights? That would be a difficult concept to grapple with, and it would instantly scale to, “All of the grid is going to go down. We’re going to die.” It would change the industry overnight in a very, very bad way.
So, there’s a lot of things that can go wrong, and there’s definitely environments that are not as resilient as others. When we talk about the resiliency of the American power grid, that does not translate to the cookie factory down the road.
I have three variables, which I tend to look at and pontificate about, if you will, that I think about with adversary disruption. The first is complexity of the system — and you can add security into adding to that complexity — but how complex is the system? Is it an isolated cookie factory where I can learn everything about that cookie factory, and it’s got no interconnects, and everything else? Well, that’s not a very complex system. Is it an interconnect for the American power grid with a major region, where I’ve got redundant lines and infrastructure and everything else? Well, that’s a very complex system. A substation in Baltimore is not the same as a substation in D.C. That adds additional complexity, et cetera, et cetera.
The next variable is, what is the duration of the outage, or what is the duration of the disruption that you want to accomplish? Are we talking three hours in D.C.? That’s not too bad. We talking three hours across the eastern interconnect? That’s really difficult. Are we talking about a day in D.C.? Well, that’s pretty difficult. Are we talking about a day across the eastern interconnect? Wow, we’re talking exponentially more. So, there’s that variable.
The last one is impact itself, and specifically, are we talking disruption? Are we talking destruction? What level are we trying to achieve in that? And more so than just scale … Excuse me. One’s duration, one’s scale, so, combining those durations, scale, and complexity together produces a view that as the complexity of the problem increases for the adversary, the complexity of the system that you have, you have to spend exponential resources to achieve the same level of impact. As the impact that you want to achieve (and that, really, is the same as duration), as the duration or impact that you want to achieve increases, it is an exponential of that exponential to increase the scale. So, again, can I take down the cookie factory down the road? I could probably come up with some variables, and it’s not as difficult. Can I take down the eastern interconnect for a month? That is an exponential of an exponential.
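The "exponential of an exponential" idea above can be sketched as a toy model. Everything in the following Python snippet is an assumption made for illustration: the functional form, the constants `k` and `m`, and the unitless scores assigned to each scenario are invented here and are not figures from the interview or from any measurement. The only property it tries to capture is that effort grows exponentially with system complexity, and doubly exponentially with the duration or impact sought.

```python
import math


def relative_effort(complexity: float, duration: float,
                    k: float = 1.0, m: float = 1.0) -> float:
    """Toy model (an assumption, not a measured relationship).

    effort = exp(k * complexity * exp(m * duration))

    With duration fixed, effort is exponential in complexity.
    With complexity fixed, effort is an exponential of an exponential in duration.
    """
    if complexity < 0 or duration < 0:
        raise ValueError("scores must be non-negative")
    return math.exp(k * complexity * math.exp(m * duration))


# Hypothetical, unitless (complexity, duration) scores for the scenarios mentioned.
scenarios = {
    "cookie factory, brief disruption": (1.0, 0.0),
    "one city, three hours": (2.0, 0.5),
    "whole interconnect, one month": (4.0, 3.0),
}

for name, (c, d) in scenarios.items():
    magnitude = math.log10(relative_effort(c, d))
    print(f"{name}: roughly 10^{magnitude:.0f} effort units")
```

The absolute numbers mean nothing. The point of the sketch is the gap between the rows, which widens far faster than either input score does.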
It’s interesting because I think, certainly, a lot of the popular media reporting, you see situations of equipment being damaged and there not being backups available for equipment, and then, it would be three months before we’d have power back up and running, and people would be starving.
Robert M. Lee:
Yeah, and so, there’s some truth to the problems. We do have problems. I don’t think we should ever, sort of, sit back and be like, “We’re cool.” No, no, no. There are serious challenges, and isolated events. These things are very, very true. If you damaged a key turbine and you physically caused it to destroy, do we have a backup of it? No, and you can’t just keep backups of turbines, or really, we’re talking transformers in this case, but regardless, you can’t keep backups of the physical equipment because they’re all custom-made. It’s not like I can have a store of five and put them anywhere in the grid. No, it’s a one-for-one relationship, for the one that we’re trying to replace, so backups aren’t as scalable or reliable as people realize, or want to believe. And if it was actually physically destroyed, what would the impact be? Yeah, like three months before you get another one, for sure. I think that’s completely realistic, and that scares the hell out of everybody, and probably rightfully so. That’s a really bad case.
But then you say, “Okay, well how many … ” So, you destroy one of those things. Are we out for power and everyone is starving? Oh, no. We rerouted power, and because of the redundant lines, everyone is actually fine. Oh, well shit. Okay. Well, what if you took out two of them? And that’s what always comes next. It’s the, “Well, what if, what if,” scenarios, which are good to consider. Just because there’s a low probability of an event occurring, does not mean that we don’t assess it, especially if it has a high impact. If the impact is high enough, you safeguard it, regardless. So, there is a high-impact scenario of, like, nine or 10 key transformers going out in the American power grid with physical destruction, and that high-impact scenario would take down major portions of the grid for a considerable amount of time, and you need to understand what you’re going to do about it.
But the complexity of physically destroying multiple pieces of equipment is much more so than people realize. “Oh, we could just do a cycling of power to physically destroy the generators because we saw that in an Aurora video.” No, it doesn’t quite work that way. The Aurora event is just physics. Sure, you can do it, but from an adversary’s ability to do it remotely, there’s a lot of complexity that goes into that, especially since the one transformer site …
Say you’re going after nine of these things. Each one of them will be different. Each one of the designs of the attack you have to achieve will be different. The operations to put it in place and to learn and do the intelligence ahead of time will be different. So now, you’re saying instead of doing one adversary operation that’s already very complex and very, very difficult against one site, I need to repeat it nine times, and all at the same time. Because I don’t need to have a long period of time go by and get caught, because if I get caught, I’m screwed, and it’s the level that’s probably like a wartime scenario of preparation that I’m going to get really dinged in the international community for.
So it’s just, yeah. I push back, and I say, “Should we prepare for this? Yes. Should we do more security? Absolutely. Should we be freaking out? Not a chance. Is it a probable event? Not at all.” But we don’t measure probability in oil spills either. What is the probability of this pipeline rupturing and having an oil spill? It doesn’t matter. Go do your math. That’s fine about probability for, like, insurance, but for the fact that we’re going to have to protect against that and safeguard it for environmental reasons and safety reasons, the impact is so significant that we have to do something. It’s kind of the same discussion.
What part does threat intelligence play in the work that you all do?
Robert M. Lee:
Yeah. So, for us, from a Dragos perspective and doing it from an ICS perspective, I worry about a couple things. One, we don’t have a lot of ICS security professionals, so we’ve got to take the talent we have and scale it as much as possible for the right problems, or the right solutions. The other problem I have is, we really do not understand our threat landscape. The traditional threat intelligence communities in IT have been going out and collecting information on forums and analyzing intrusions that happen during security operations, or incident response, or firewalls and IDS and antivirus, and things like that, sitting inside of IT networks reporting back to big vendors like FireEye and Microsoft and Symantec, and whoever, and giving them insight into intrusions. And from that, they codify those intrusions into knowledge of the adversary, which is threat intelligence.
That has never existed in ICS. We don’t have internet-connected sensors beaconing back about intrusions that are occurring. We don’t have a lot of incident responders going around and codifying lessons learned. It’s just a different landscape. So, we don’t know what the threat landscape is, or mostly. And what happens is, technology companies and best practices and NIST national guides and whatever, they still need to get produced, so they produce them off of some tribal knowledge, as well as IT security best practices, copy and pasted into the ICS, and that is entirely inappropriate.
So, in my view, intelligence and threat intelligence in industrial control systems is probably more important than in any other field that I work around, because we need to actually understand what the real risk is. Like, “Okay, here’s 50 patches that came out for an HMI. Well, actually, 49 of them don’t introduce any risk whatsoever, don’t make sense for adversary operations, but this one was being leveraged by an activity group targeting SCADA environments.” “Okay, go fix that one. We don’t have time for all 50. Go fix that one.” “Hey, patching doesn’t work.” In the Ukraine attack, it was just them using equipment against themselves. It wasn’t vulnerabilities and exploits. What do you do now? “Okay, well, we need to have monitoring, and you know what? We need to have the ability to cut VPNs for remote workers in this scenario, because that’s how the SCADA hijack played out.” “Okay, cool. Let’s do that.”
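The "50 patches, fix that one" example can be expressed as a very small triage step. The Python sketch below is only an illustration under stated assumptions: the `Patch` type, the patch identifiers, and the `exploited_ids` set are all hypothetical stand-ins for whatever a real vendor advisory feed and a real threat intelligence source would provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    patch_id: str    # hypothetical identifier, not a real advisory number
    component: str


def triage(patches, exploited_ids):
    """Split patches into those tied to observed adversary activity and the rest."""
    urgent = [p for p in patches if p.patch_id in exploited_ids]
    deferred = [p for p in patches if p.patch_id not in exploited_ids]
    return urgent, deferred


# Fifty made-up HMI patches, mirroring the numbers used in the conversation.
patches = [Patch(f"HMI-PATCH-{n:02d}", "HMI") for n in range(1, 51)]

# Assumed input: the one issue an intelligence source reports as actively leveraged.
exploited_ids = {"HMI-PATCH-17"}

urgent, deferred = triage(patches, exploited_ids)
print("fix now:", [p.patch_id for p in urgent])
print("schedule later:", len(deferred), "patches")
```

The filter itself is trivial. The hard part, as described above, is having a trustworthy `exploited_ids` input at all, and remembering that some attacks involve no vulnerability to patch, so the same intelligence also has to drive monitoring and response plans.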
So, I think more than ever, industrial asset owners and operators and their security teams need to truly understand their threat model to understand, what do I care most about, what are my crown jewels, and what are the actual threats out there that could impact them? What are the behaviors that those threats have exhibited? How am I putting compensating controls and response plans and defenses into place to counter those behaviors — not everybody else’s threat behaviors, not every other IT security best practice? And that, then, makes the problem manageable. So, if we don’t have a lot of people, we better guide them correctly. That’s where threat intelligence for ICS comes in.
Our thanks to Robert M. Lee from Dragos for joining us.
Don’t forget to sign up for the Recorded Future Cyber Daily email, where every day you’ll receive the top results for trending technical indicators that are crossing the web, cyber news, targeted industries, threat actors, exploited vulnerabilities, malware, suspicious IP addresses, and much more. You can find that at recordedfuture.com/intel.
We hope you’ve enjoyed the show and that you’ll subscribe and help spread the word among your colleagues and online. The Recorded Future podcast team includes Coordinating Producer Amanda McKeon and Executive Producer Greg Barrette. The show is produced by Pratt Street Media with Editor John Petrik, Executive Producer Pete Kilpe, and I’m Dave Bittner.
Thanks for listening.