Cyber Security of Smart Grids
Europe is in chaos. Water, food and fuel are in short supply. Government and social structures fail. These are the scenes from a novel by Marc Elsberg. But this is no post-apocalyptic drama following a nuclear war, invasion of aliens or meteor strike. In his book Blackout, Morgen ist es zu spät, Marc explores the aftermath of a concerted cyberattack on Europe’s electricity infrastructure: smart meters and Supervisory Control and Data Acquisition (SCADA) systems are hacked, resulting in a chain of events that disrupts the electricity supply to most of Europe for a fortnight.
We can question the feasibility, technical details and likelihood of such an attack. The book and other works of fiction, such as Channel 4’s “Drama documentary” Blackout, do however graphically illustrate how dependent society has become on the electricity network.
I have spoken on the cyber security threat to smart grids at a couple of events. This blog entry captures my thoughts on the topic and explores approaches to making grids more secure.
The smart grid is defined as follows on Wikipedia:
A smart grid is a modernized electrical grid that uses analog or digital information and communications technology to gather and act on information - such as information about the behaviours of suppliers and consumers - in an automated fashion to improve the efficiency, reliability, economics, and sustainability of the production and distribution of electricity. Electronic power conditioning and control of the production and distribution of electricity are important aspects of the smart grid.
This, and most other definitions, makes it clear that the smart grid relies on the use and integration of a range of technologies including power electronics, communications, industrial control and information technology.
In a previous series of posts I explored the IT / OT convergence trend; the trend of operational systems using more and more technology that was originally developed for IT systems and the integration of information and operational systems. The second post highlights some security challenges:
- Bespoke OT systems benefited from complete separation and security by obscurity; the integration of systems, adoption of commodity hardware and use of open standards is eroding this “benefit”.
- The exponential growth of the Internet of Things (IoT) increases the risk that consumer grade devices are used beyond their design points; this includes their security characteristics.
- Lessons learnt in high value IT systems (e.g. banking) have not been universally applied to OT systems.
- Our increasing dependence on electricity has increased the value of the grid as a target.
Real attacks such as Stuxnet and research such as this honeypot demonstrate that attacks on utilities are possible and that there are many potential attackers. A growing number of reports illustrate the vulnerability of internet connected devices, such as the recently exposed infusion pump. While many of these reports are sensationalist and don’t present real world threats, they do highlight the need to treat security as an important design feature of critical infrastructure that is based on commodity hardware. The Shodan search engine lets users find specific equipment that is attached to the internet; this can help attackers find vulnerable targets, but it can also be used to find and secure these weak points, as argued in the article “Shodan makes us more secure”.
It is tempting to use the security threat as an argument against making the grid smarter and halting further convergence between OT and IT systems. I don’t think this will happen due to the economic benefits on offer:
- Analytics using information from various sources, both traditional OT (e.g. voltage, current etc.) and IT (e.g. cost of kWh for the next half hour, weather forecast etc.), to optimise operational, tactical and strategic decision making has the potential to make better use of existing assets and significantly reduce the need for investment in new infrastructure.
- Commodity technology hardware, IT standards and protocols are significantly cheaper than bespoke solutions and improve interoperability thereby preventing vendor lock-in and opening up new opportunities for system design, procurement and integration.
Furthermore, security by obscurity only works against broad, random attacks. A determined attacker, with a specific target, will be able to gain enough information to exploit weaknesses, especially now that the internet facilitates sharing and finding such information.
So how can we secure smart grids?
There is no single answer to this question but the following approaches will contribute to a more secure smart grid:
Risk based approach
The amount of money that can be spent on securing smart grids is finite and it is therefore critical to spend it on the education, organisational changes, processes and technologies that will make the biggest difference. Performing a risk assessment is a good start, ideally using one of the available frameworks (e.g. the NIST Cybersecurity Framework). This will identify potential vulnerabilities, the impact that they can have and the likelihood of exploitation. Partnerships with peers and organisations that specialise in the research of vulnerabilities and threats are important in this process, as few network operators have the resources to obtain this information independently and keep it up to date.
Many attacks that are reported don’t rely on rare 0-day vulnerabilities or complex malware but exploit simple vulnerabilities such as:
- Devices that are connected to the internet and don’t have any (or have default) passwords set,
- Devices that are connected to telephone lines, can be reached by dialling their number and don’t have caller-ID or any password verification,
- Open wireless networks,
- Obtaining passwords through social engineering approaches,
- Out of date software with well-known vulnerabilities,
- Shared accounts, no user and role segmentation,
- No antivirus protection,
- Overreliance on edge protection with no network segmentation,
- Data encryption and message signing not enabled,
- Lack of user education.
These should all be identified in a thorough risk assessment. Fixing them can however be more complex in an OT environment when compared to IT systems:
- Operational systems often have a very large number of distributed device types and instances,
- Communication links often only provide low bandwidth,
- Not all devices support remote patching, cryptography and other security features that we take for granted in the IT environment,
- Verifying compliance of all devices can be a time consuming task.
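Much of this verification can be automated once an inventory of devices exists. The following is a minimal sketch in Python, not a production tool: the Device fields, the one-year firmware threshold and the function names are all assumptions made for illustration, and a real inventory would be populated from asset management and network scanning tools.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Device:
    # All fields are hypothetical; a real inventory will look different.
    device_id: str
    internet_facing: bool
    default_password: bool
    firmware_date: date
    encryption_enabled: bool


def audit_device(device: Device, today: date, max_firmware_age_days: int = 365) -> list[str]:
    """Return the list of simple findings for one device (empty if none)."""
    findings = []
    if device.default_password:
        findings.append("default or missing password")
    if (today - device.firmware_date).days > max_firmware_age_days:
        findings.append("firmware older than threshold")
    if not device.encryption_enabled:
        findings.append("encryption or message signing disabled")
    # Exposure makes every other finding easier to exploit, so flag it too.
    if device.internet_facing and findings:
        findings.append("internet facing with open findings")
    return findings


def audit_fleet(devices: list[Device], today: date) -> dict[str, list[str]]:
    """Map device_id to findings, listing only devices that have findings."""
    report = {}
    for device in devices:
        findings = audit_device(device, today)
        if findings:
            report[device.device_id] = findings
    return report
```

Running audit_fleet on a schedule gives a repeatable compliance report, although it can only check what the inventory records; devices that are missing from the inventory remain invisible to it.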
The output of the risk assessment should therefore be prioritised into a cyber-security roadmap to ensure that the most critical risks are addressed first. Many of the risks that are identified are likely not to be smart grid specific; they will apply to the whole organisation (e.g. governance, awareness & training). Ideally the smart grid cyber security roadmap should be integrated into, or at least aligned with, an organisation wide roadmap.
Threats and vulnerabilities change all the time. Risk assessment therefore needs to be an ongoing process to ensure that the most critical risks are addressed.
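A simple way to turn a risk register into a prioritised list is to score each risk as likelihood multiplied by impact and sort on the result. The sketch below assumes a 1 to 5 scale for both values and uses made-up example risks; established frameworks offer richer scoring schemes, so treat this as an illustration of the principle rather than a recommended method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        # Classic risk matrix score: likelihood multiplied by impact.
        return self.likelihood * self.impact


def prioritise(risks: list[Risk]) -> list[Risk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


if __name__ == "__main__":
    # Hypothetical register entries, for illustration only.
    register = [
        Risk("Shared operator accounts", likelihood=3, impact=3),
        Risk("Default passwords on field devices", likelihood=4, impact=5),
        Risk("Open wireless network in depot", likelihood=2, impact=2),
    ]
    for risk in prioritise(register):
        print(f"{risk.score:>2}  {risk.name}")
```

Because threats change, the likelihood and impact values need to be revisited regularly; re-running the prioritisation after each review keeps the roadmap aligned with the current picture.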
Design security into solutions from the start
It is tempting to focus system design on meeting functional requirements. Experience however shows that addressing non-functional requirements, especially security, later in the lifecycle is costly at best and often ineffective. Furthermore, security implications may impact the functional design.
Consider the following smart meter example: The functionality to remotely switch smart meters between pre-payment and credit mode may have a very valid business case for a smart meter deployment. The engineering it takes to add a switch and the meter firmware required to meet this requirement may seem very reasonable. This function however implies the ability to interrupt supply remotely. This has significant security implications, especially if the same (or similar) meter type is rolled out to many premises. If these are not considered at the time the functional specifications are agreed, then the cost attributable to each function cannot be calculated, which in turn leads to sub-optimal design choices.
My experience is that non-functional requirements (such as security, availability and performance) are often left until much later in the design life cycle. “Fixing” security after much of the design is completed is not only expensive (some of the design work may have to be redone) but is also likely to result in a less secure design.
A “secure by design” approach ensures that security implications are considered in all design choices throughout the system lifecycle (requirements specification, design, build, test and operate).
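To make the smart meter example more concrete, the sketch below shows one control that a remote switching function might need: the meter only acts on commands that carry a valid message authentication code and a counter higher than any it has seen before, so that forged and replayed commands are rejected. This is my own simplified illustration using Python's standard hmac module. The Meter class, the message layout and the shared-key scheme are assumptions; real deployments are specified by the relevant metering standards and typically involve key management that is far more elaborate.

```python
import hashlib
import hmac

VALID_ACTIONS = ("disconnect", "reconnect")


def sign_command(key: bytes, meter_id: str, action: str, counter: int) -> bytes:
    """Compute an HMAC-SHA256 tag over a (hypothetical) command layout."""
    # Simplification: assumes meter_id and action never contain "|".
    message = f"{meter_id}|{action}|{counter}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


class Meter:
    """Hypothetical meter that only accepts authenticated, fresh commands."""

    def __init__(self, meter_id: str, key: bytes):
        self.meter_id = meter_id
        self._key = key
        self._last_counter = 0
        self.supply_on = True

    def handle(self, action: str, counter: int, tag: bytes) -> bool:
        expected = sign_command(self._key, self.meter_id, action, counter)
        # compare_digest avoids leaking information through timing.
        if not hmac.compare_digest(expected, tag):
            return False  # forged or tampered command
        if counter <= self._last_counter:
            return False  # replay of an old command
        if action not in VALID_ACTIONS:
            return False  # unknown action
        self._last_counter = counter
        self.supply_on = action == "reconnect"
        return True
```

Even this small example raises design questions that affect cost: how keys are provisioned and rotated, whether one key is shared across many meters, and how many disconnect commands the head-end system may issue in a given period. These are much cheaper to answer when the function is specified than after the meters are installed.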
Defence in depth
As mentioned before, proprietary OT systems relied heavily on separation and secrecy to remain secure. Internet connected IT systems, on the other hand, have long followed a defence in depth approach where different areas (or zones) of the system are separated from each other by firewalls (and other devices), often supplied by different manufacturers and administered by different people.
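The idea of zones can be expressed as a default-deny policy: traffic between two zones is blocked unless a rule explicitly allows it. The sketch below models such a policy in Python. The zone names and port numbers are hypothetical, and in practice the rules would live in firewall configuration rather than application code.

```python
# Hypothetical zones and ports, for illustration only.
# Each key is a (source zone, destination zone) pair; the value is the
# set of destination ports that traffic in that direction may use.
ALLOWED_FLOWS = {
    ("corporate_it", "dmz"): {443},
    ("dmz", "control_centre"): {8443},
    ("control_centre", "substation"): {9000},
}


def is_allowed(source_zone: str, destination_zone: str, port: int) -> bool:
    """Default deny: a flow is allowed only if a rule explicitly lists it."""
    return port in ALLOWED_FLOWS.get((source_zone, destination_zone), set())
```

Note that there is no rule from corporate_it directly to substation, and that rules are directional: an attacker who compromises the corporate network still has to cross two further boundaries to reach field equipment.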
Intrusion detection systems evaluate behaviour within the system to identify potential breaches and suspicious behaviour of authorised users in real time. This applies to network traffic, server activity, data access and other footprints left by internal or external attackers. Analysis (both real time and historical) of the evidence can automate prioritisation and classification of threats to ensure that the most critical ones are acted on first.
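As a very simple illustration of behaviour-based detection, the sketch below flags an observation (for example, the number of control commands a workstation sends per hour) when it deviates strongly from its own history, and then orders the resulting alerts by the criticality of the affected asset. The threshold of three standard deviations, the alert format and the criticality table are assumptions; commercial intrusion detection systems use far more sophisticated models.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], observed: float, threshold: float = 3.0) -> bool:
    """Flag values more than `threshold` standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly constant history: any change at all is unusual.
        return observed != baseline
    return abs(observed - baseline) / spread > threshold


def prioritise_alerts(alerts: list[dict], criticality: dict[str, int]) -> list[dict]:
    """Order alerts so those on the most critical assets come first.

    Each alert is assumed to be a dict with an "asset" key; assets that
    are missing from the criticality table are treated as least critical.
    """
    return sorted(alerts, key=lambda alert: criticality.get(alert["asset"], 0), reverse=True)
```

A statistical baseline like this will produce false positives during unusual but legitimate operating conditions, such as storm response, which is one reason why automated prioritisation and trained analysts are both needed.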
Once an attack has been identified it needs to be met with a well-rehearsed response. In the case of OT systems this will usually happen while the system is still running. An effective response requires trained people as well as processes and procedures that are flexible enough to respond to a range of attacks but prescriptive enough to ensure safe eviction of any attackers and restoration of normal operations. Forensic analysis of attacks should feed back into the risk analysis to ensure that similar attacks are prevented.
This approach can be summarised as:
- Prevent attackers from getting in, but assume that they will.
- Limit the damage by separating different elements of the system.
- Detect when under (internal or external) attack.
- Be able to operate while under attack.
- Respond and get the attacker out.
- Restore system to normal operation.
- Investigate and take steps to minimise risk (probability and / or impact) of similar attacks.
Grids are becoming smarter and given the changing demand (from renewable, distributed generation, new loads such as electric vehicles and space heating) I see this trend increasing unless we want to invest significantly in network assets. The cyber threat is real and will increase as we become ever more dependent on electricity and potential opportunities for cyber-attack increase.
There are however proven frameworks, management principles, design techniques and technologies that can enable us to make grids secure and robust. I am optimistic that we can avert the scenarios portrayed in the works of fiction alluded to earlier, but only if we take the threat seriously and pay attention to it at every stage of the design, build, test and operate life cycle.
Erwin Frank-Schultz, Technical Leader for Energy and Utilities in the UK
PS: Thank you Tom Mellor (@Vintage951) for reviewing the post.
The opinions in this article are my own and don't necessarily represent IBM's positions, strategies or opinions.