By Robert M. Lee

On October 20, 2019, the Twitter account @BabakTaghvaee posted that there was a fire at the Abadan Oil Refinery in Iran; notably, the account claimed that the fire was the result of a confirmed cyber attack. A video of the fire was posted, and the news organization Reuters had reported on the fire just prior to the tweet. The Reuters reporting cited Iranian state broadcaster IRIB as saying that the fire was in a canal carrying waste from the oil refinery and was at that time under control. Various posts on social media took advantage of the claim to spread word of the cyber attack and to assert that it was “probably” a result of the alleged Iranian attacks on Saudi Aramco. A few commentators linked, as proof, to the Reuters story published on October 16th about a secret cyber attack carried out by the U.S. on Iran, and fell victim to the classic Post Hoc Ergo Propter Hoc fallacy of assuming correlation equals causation.

The purpose of this blog is to add some context to such events in order to avoid hype, while clearly pointing out a gap that we have in the industrial cybersecurity community around root cause analysis. It also stresses the importance of setting forth a strategy across collection, visibility, and detection in order to ever get to the point where response scenarios can account for such analysis.

Cyber attacks can absolutely cause devastating effects. Adversaries have become more aggressive over the last few years in this space and are demonstrating an increase in knowledge and sophistication with regard to causing physical effects through cyber intrusions and capabilities. In 2017, the TRISIS malware leveraged by XENOTIME was responsible for a shutdown at a Saudi Arabian petrochemical company, where the adversary targeted safety systems and failed in what was likely their actual intent: to kill people at the facility. In that case, though, one of the interesting details is that the adversary tried multiple times to achieve their effect. The first time TRISIS was deployed it failed, the plant shut down, and the personnel involved attempted to do root cause analysis. Root cause analysis is well understood and practiced in the engineering and operations communities. However, those practices rarely fully consider a cyber component.

In the TRISIS case, the plant engineers could not determine what went wrong; i.e., they did not identify the cyber attack during or after the event and went back into operations, giving the adversary another opportunity. It is not that the cyber attack was undetectable. It was perfectly detectable through a variety of detection approaches in the industrial networks, but the defenders at that site were not performing industrial-specific cyber detection. Because of the lack of detection capabilities, as well as the collection capabilities feeding into them, some of the evidence was not available after the attack to properly support root cause analysis of the event, and what evidence was available was easy to miss. This is like trying to photograph the getaway car of a robbery after the car is already gone; you can still find other evidence such as tire tracks, but it would have been nice to have the photo of the license plate. Oftentimes there are forensic practices that can take place after an attack even without good detection capabilities, but they can be easy to miss if not prepared for properly in the incident response procedures or highlighted through threat detection and intrusion analysis.

In the Abadan case, it is unlikely, from what we know of such incidents and normal engineering practices around root cause analysis, that the personnel on site have had any opportunity at all to properly do root cause analysis. Refinery fires are not rare, but they are serious events that the engineering and operations community usually handles maturely, with safety as the number one priority. While personnel are still trying to get the fire under control, it is very unlikely that anyone is performing root cause analysis of the event that includes a cyber component. Proper root cause analysis including cyber forensics is one of the most difficult tasks to achieve in industrial control systems (ICS) networks. The ICS cybersecurity community is maturing rapidly but is still very far from being able to perform a task of this level reliably.

It is my estimate that only a small subset of the community is gaining visibility into its ICS networks today, though the progress we are seeing is encouraging and a hallmark of increasing maturity. A smaller subset of that community is pursuing a collection and detection strategy factored into the products, processes, and training they implement. A much smaller subset is tying this into the types of events they want to be able to respond to and perform root cause analysis on. Even if Abadan’s oil refinery were world leading in this regard, it is unlikely that enough time has passed for anyone to properly analyze the information collected. For this reason, I would assess that any claims of a cyber attack are premature at this point and unlikely to be founded in proper evidence. Should cyber be considered though? Absolutely, especially with the increasing tension and the demonstrated capabilities of adversaries. But today the larger industry lives closer to Schrödinger’s ICS than to organizations reliably achieving root cause analysis.

It is my recommendation to the ICS cybersecurity community that events like this be used to highlight the gaps we have in our current defenses. We should not hype up such events but instead look inward and determine whether we could answer similar questions of “was it a cyber attack?” in our own industrial and operations networks. I often recommend that organizations start with a few scenarios they want to be able to respond to, drawn from both intelligence-driven and consequence-driven scenarios. From those, determine what types of requirements, such as root cause analysis, reliability, and safety, will be important to the organization and its stakeholders. Develop incident response plans from those scenarios and work backwards to define the type of detection that you’ll need to get to that incident response and the type of collection you’ll need to get to that detection. That will help define your visibility requirements. Instead of starting with visibility and working forward, potentially never getting to the results you need, start with the end in mind and work backwards to ensure the visibility requirements are aligned.
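As a rough illustration of working backwards, the short Python sketch below records two made-up scenarios, the detections each response plan would depend on, and the data each detection would need, then derives the visibility requirements from them. Every scenario name, detection, and data source here is a hypothetical placeholder chosen for illustration, not taken from any real incident or from a Dragos product or paper.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """A response scenario. All names and values are hypothetical placeholders."""
    name: str
    source: str          # "intelligence-driven" or "consequence-driven"
    requirements: list   # e.g. root cause analysis, reliability, safety
    # Maps each detection the response plan depends on to the data
    # that must be collected for that detection to work.
    detections: dict = field(default_factory=dict)


def visibility_requirements(scenarios):
    """Work backwards from scenarios to the data sources they depend on.

    Returns a dict mapping each data source to the set of scenario
    names that need it.
    """
    needed = {}
    for scenario in scenarios:
        for sources in scenario.detections.values():
            for data_source in sources:
                needed.setdefault(data_source, set()).add(scenario.name)
    return needed


# Illustrative scenarios only; a real list would come from the organization.
scenarios = [
    Scenario(
        name="safety system tampering",
        source="intelligence-driven",
        requirements=["root cause analysis", "safety"],
        detections={
            "unexpected logic change on a safety controller": [
                "engineering workstation logs",
                "control network traffic capture",
            ],
        },
    ),
    Scenario(
        name="unexplained process shutdown",
        source="consequence-driven",
        requirements=["root cause analysis", "reliability"],
        detections={
            "unauthorized remote session into the control network": [
                "remote access logs",
                "control network traffic capture",
            ],
        },
    ),
]

for data_source, names in sorted(visibility_requirements(scenarios).items()):
    print(f"{data_source}: needed for {sorted(names)}")
```

A data source that shows up under several scenarios is a reasonable place to start building visibility, while anything being collected that maps to no scenario at all is worth questioning.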

For more information on these topics, I would recommend Dragos’ Collection Management Framework, Four Types of Threat Detection, and Consequence-Driven ICS Cybersecurity papers, as well as the Year In Review reports, which should help you think about the challenges ahead and operate safer and more reliable infrastructure.