Siemens: ‘The most important action you can take to make operations resilient is to develop and implement an IR playbook. Resiliency is based on three key concepts: visibility, relationships, and speed.’
Siemens Gas and Power
Siemens Gas and Power GmbH & Co. KG, the global energy business of the Siemens group, has produced a ‘playbook’ for incident response (IR) based on a hypothetical scenario in which a cyber-attack on a utility caused a city-wide blackout. The playbook, “Simulating a Cyberattack on the Energy Industry: A Playbook for Incident Response,” begins with the premise that teams that have built and practised an IR playbook in advance of a breach will perform better than teams that have to improvise each time.
Report findings – extracts from the Siemens playbook, “Simulating a Cyberattack on the Energy Industry: A Playbook for Incident Response”:
‘Whether an organisation is creating its first IR plan or building on existing capabilities, a clear OT response framework will help build a culture of continuous improvement and constant vigilance. Strong cyber-security IR begins before an incident occurs and continues long after normal operations have been restored.’
UK Energy Emergency Executive (E3CC)
This interactive tabletop exercise in London was held jointly by the UK Energy Emergency Executive (E3CC) and the UK Department for Business, Energy and Industrial Strategy (BEIS). Lessons from the exercise have been described as broadly applicable for regulators, utilities, and OT or IT security experts anywhere in the world.
Leaders will have to choose between competing interests during cyber-incidents and make decisions with partial information in very high-stress situations. Keeping the plant running may prevent the investigation of anomalies or make it much harder to preserve evidence. Someone in authority will then need to decide when and how to engage with partners, vendors, regulators, and the public.
Preparation: practising a methodical response to a wide variety of threats. To prepare, IR teams should build and maintain an industrial forensic toolkit. An organisation should also identify which staff will centrally manage a crisis, define roles, and educate plant personnel. This team will be responsible for rebooting equipment, restoring operations, and eliminating vulnerabilities during an incident.
Identification: recognising that a cyber-attack is under way. An initial signal might come in the form of an operational abnormality or, more directly, as ransomware. Field personnel are especially important in helping distinguish between security and process control system abnormalities. An investigative playbook can help teams diagnose and triage the event and activate responders to assess the impact and determine appropriate next steps.
Containment: ensuring the incident causes no further damage. The overarching priority is to isolate infections, maintain production, and, above all, ensure actions do not further jeopardise plant safety or operations. In an OT context, containment can be difficult; utilities must isolate the source of an attack and determine when to apply a purpose-built passive forensic tool to remove malware from production networks or limit unnecessary data transfers.
Eradication: removing the threat. The forensics team must ensure that essential operations are backed up in case challenges arise with restoration. Possible methods range from system patching or rebuilds to full architecture redesign. The team should preserve evidence, which may range from mapping of employee change control to full-system image capture.
Recovery: enacting a phased recovery plan to restore full-strength operations. This requires restoring critical systems first, or operating in analogue mode, until there is confidence in system-level performance. An environmental and safety check should be done in parallel to control for unintended performance impacts of restoration.
Lessons learned: documenting lessons learned from the incident. The lessons-learned process is an ongoing activity that must capture not only the immediate impacts of an incident but also long-term improvements to plant security. These could range from a better-designed process-control system and the stand-up of a physical command response centre to enhanced monitoring capabilities. The resulting response system should include utility peers, vendors, authorities, and the security community.
Everything learned from those six steps goes into the feedback loop. An IR plan can and should always get smarter and faster after every cyber-incident.
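As an illustration only, the six steps and their feedback loop can be sketched as a small cycle in Python. The names used here (Phase, next_phase, IncidentLog) are hypothetical and do not come from the Siemens playbook; the sketch simply assumes that notes recorded in each phase are gathered in phase order and fed into the next round of preparation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    """The six IR steps, in the order the extract lists them."""
    PREPARATION = 1
    IDENTIFICATION = 2
    CONTAINMENT = 3
    ERADICATION = 4
    RECOVERY = 5
    LESSONS_LEARNED = 6


def next_phase(phase: Phase) -> Phase:
    """Return the following step; lessons learned loops back to preparation."""
    members = list(Phase)  # Enum members iterate in definition order
    return members[(members.index(phase) + 1) % len(members)]


@dataclass
class IncidentLog:
    """Hypothetical record of what was learned during one incident."""
    notes: dict = field(default_factory=dict)

    def record(self, phase: Phase, note: str) -> None:
        # Keep every note under the phase in which it was observed.
        self.notes.setdefault(phase, []).append(note)

    def feedback(self) -> list:
        # Flatten the notes in phase order as input to the next planning round.
        return [f"{p.name}: {n}" for p in Phase for n in self.notes.get(p, [])]


if __name__ == "__main__":
    log = IncidentLog()
    log.record(Phase.CONTAINMENT, "isolate the affected network segment")
    log.record(Phase.RECOVERY, "restore critical systems first")
    for line in log.feedback():
        print(line)
    print(next_phase(Phase.LESSONS_LEARNED).name)
```

Modelling the steps as a cycle rather than a straight line reflects the point that the plan is revised after every incident instead of being closed out.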
The most important action organisations can take to make their operations resilient is to develop and implement an IR playbook. Resiliency is based on three key concepts: visibility, relationships, and speed. These elements are fundamental to developing a forward-looking IR playbook that brings together intelligence and leaders under a single umbrella.
Visibility means that utilities can see and understand the complexities of their systems – continuously monitoring and investigating potential threats.
Relationships matter in a crisis. The ability to share information throughout a common supply chain with trusted vendors makes the difference in reaching a resolution. Relationships need to work at all levels of the organisation, with a clearly defined escalation path.
Speed becomes critical during a crisis. Incident response requires system operators to quickly and accurately understand, contain, and recover from an attack before its full impact can cause outages or spread to other systems.