Introduction to Securing Cyber Physical Systems
In Part 9 of this new Cyber Physical Systems series, I will introduce the different methods and techniques for securing cyber physical systems, since we now know how vulnerable they are.
CPS Security
Securing CPS is not straightforward and there is no one-size-fits-all solution. However, some of the main security requirements of CPS are:
Privacy: CPS could contain customer data, which needs to be kept confidential and private
Resiliency: must be resilient to accidents and attacks
Dependability: safety (defined in terms of the organization’s goals) and reliability (recovering from disruptions due to attack or natural disaster)
Interaction and coordination: maintain continuous interaction and coordination between cyber and physical systems
Operational security (OpSec): OpSec ensures physical, information and personnel security through careful planning, risk assessment and risk management. This involves critical information identification, threat analysis, vulnerability analysis, risk assessment and countermeasures.
System hardening: defends against a wider range of threats through defence-in-depth, for instance by isolating critical applications, network segmentation and use of a DMZ.
Adopting security measures has many benefits for protecting the CPS components, layers, and domains. However, the CPS may also be impacted by the application of such security measures. Some of these impacts could be:
Reduced performance: security measures can affect the normal performance of the CPS, so there should be a trade-off between security and performance.
Higher power consumption: this is a serious issue for resource-limited and battery-operated devices. High power consumption means a shorter lifespan and higher maintenance cost.
Transmission delay: additional delay due to added encryption. Despite the protective advantage, the delay may not be acceptable in a real-time CPS.
Higher cost: higher security may lead to higher computational cost, higher initial capital, higher training cost, higher update and operation cost
Compatibility issues: CPS may not be compatible with the deployed security measure, due to many reasons such as software, firmware, hardware (architectural) incompatibility.
Operational security delay: after deploying a security measure, there needs to be a training phase before full operational security is reached. During this time, the system may be prone to attack until the OpSec is complete for the new system.
CPS Security Solutions
CPS can be divided into four main types based on their criticality:
Safety Critical: attack can cause loss of life, chronic deadly diseases, significant damage to the environment (fire, flood, radioactivity)
Mission Critical: attack can cause total or partial failure (fatal or non-fatal) of a CPS, preventing it from achieving its objective
Business Critical: attack can cause financial and economic losses, damaged reputation and loss of CPS contractors and clients
Security Critical: attack can cause a security breach of the CPS (rootkits, backdoors, vulnerability exploitation)
The main CPS security solutions are one of the following (or a combination of both):
Cryptographic-based solutions: secure the communication channel from attacks (passive/active) and unauthorized access. Traditional cryptography using ciphers and hash functions is not easily applicable to CPS. Any solution needs to maintain overall efficiency in addition to security.
Non-cryptographic based solutions: mitigate and eliminate cyber attacks and malicious events using techniques other than cryptography such as IDMZ, IDS, firewalls, honeypots, etc.
Cryptographic-Based Solutions (Confidentiality)
The first important aspect of cryptographic based solutions is the idea of confidentiality. The idea is to secure the CPS communication channels. Some of the presented solutions are:
Encompression - reducing the overhead of encryption by compressing data before encrypting it. Even symmetric ciphers can greatly increase energy consumption and thus shorten device battery life. To mitigate this problem, compression techniques can be used before encryption to reduce this added workload as well as the transmission cost. Compared to encryption, compression can be realized with a much lower computational and energy footprint. Encompression (compressive sensing + encryption + integrity checking) has been shown to reduce energy consumption by up to 78% versus traditional encryption and integrity checking (compression + AES + SHA).
Ultra-lightweight and low-latency block ciphers - block ciphers such as PRINCE are designed so that the overhead of decryption on top of encryption is negligible. They are implemented in hardware and can encrypt a block within one clock cycle, which makes them desirable for real-time security applications. In addition, a block cipher converts a fixed-size block of data at a time, so it processes more bits per operation than a stream cipher. Ultra-lightweight block ciphers such as PRESENT, implemented in hardware, can provide both security and hardware efficiency. These novel methods can provide cryptography for any resource-constrained, normal, industrial or even medical device.
Bump-in-the-wire (BITW) - this is a solution that can be used for legacy devices. A BITW is a network appliance that adds integrity, authentication and confidentiality to the network packets exchanged between legacy devices. The legacy device sends unencrypted and unauthenticated packets, which the BITW tunnels over a secure channel to another BITW device at the other end of the communication channel. The AGA (American Gas Association) has its own BITW encryption standard, the AGA-12 standard for the Serial SCADA Protection Protocol (SSPP). The SSPP protocol layer adds additional information to the original SCADA message to encrypt it and provide message authentication. However, this additional information adds latency to serial communication. Also, the BITW is ineffective if the end-point is compromised, since it only secures the communication channel.
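To make the bump-in-the-wire idea a bit more concrete, here is a minimal sketch in Python of what the two BITW boxes do to a legacy frame. This is not AGA-12/SSPP: the frame layout, the function names `bitw_wrap` and `bitw_unwrap`, and the use of HMAC-SHA256 are all my own assumptions for illustration, and the encryption step is left out because the Python standard library does not ship a block cipher. It only shows the integrity and authentication part, plus the extra bytes that cause the added latency.

```python
import hashlib
import hmac
import struct
from typing import Tuple

TAG_LEN = 32  # length in bytes of an HMAC-SHA256 tag


def bitw_wrap(key: bytes, seq: int, legacy_frame: bytes) -> bytes:
    """Sending-side box: prepend a sequence number and append an HMAC tag."""
    header = struct.pack(">I", seq)
    tag = hmac.new(key, header + legacy_frame, hashlib.sha256).digest()
    return header + legacy_frame + tag


def bitw_unwrap(key: bytes, last_seq: int, wire_frame: bytes) -> Tuple[int, bytes]:
    """Receiving-side box: verify tag and sequence number, return the legacy frame."""
    if len(wire_frame) < 4 + TAG_LEN:
        raise ValueError("frame too short")
    header = wire_frame[:4]
    body = wire_frame[4:-TAG_LEN]
    tag = wire_frame[-TAG_LEN:]
    expected = hmac.new(key, header + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    (seq,) = struct.unpack(">I", header)
    if seq <= last_seq:
        raise ValueError("replayed or out-of-order frame")
    return seq, body


# Hypothetical pre-shared key and a made-up 6-byte legacy request.
shared_key = b"k" * 32
legacy_request = b"\x01\x03\x00\x00\x00\x02"
on_the_wire = bitw_wrap(shared_key, 1, legacy_request)
overhead_bytes = len(on_the_wire) - len(legacy_request)  # 4-byte header + 32-byte tag
```

Note how a 6-byte request grows by 36 bytes in this sketch. On a slow serial link that overhead is exactly where the extra latency comes from, and nothing here helps if the legacy device itself is compromised.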
Cryptographic-Based Solutions (Integrity)
The second important aspect of cryptographic based solutions is the idea of integrity. The idea is to prevent any physical or logical modification of incoming or outgoing real-time data. The solutions for this include:
Security Information and Event Management (SIEM) - these are products and services that combine security information management and security event management. They provide real-time analysis of security alerts generated by applications and network hardware. A good video describing this system can be found below.
Another solution is the Trustworthy Autonomic Interface Guardian Architecture (TAIGA). TAIGA monitors communication between the embedded controller and the physical process. This is an autonomous architecture which provides the physical process with a last line of defense against cyber-attacks. TAIGA switches process control to a trusted backup controller if an attack causes a system specification violation. The TAIGA architecture requires the integration of a trusted safety-preserving backup controller.
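The switching behaviour described for TAIGA can be sketched in a few lines of Python. TAIGA itself is a hardware architecture, so treat this purely as an illustration of the logic: the class name `InterfaceGuard`, the two controller functions and the safe range of 0 to 90 are all hypothetical.

```python
class InterfaceGuard:
    """Toy stand-in for a guard sitting between the controller and the process."""

    def __init__(self, main_controller, backup_controller, low, high):
        self.main = main_controller
        self.backup = backup_controller
        self.low, self.high = low, high
        self.using_backup = False

    def step(self, measurement):
        # A reading outside the specified envelope counts as a specification violation.
        if not (self.low <= measurement <= self.high):
            self.using_backup = True  # latch: control stays with the trusted backup
        controller = self.backup if self.using_backup else self.main
        return controller(measurement)


def compromised_controller(level):
    return 100.0  # always drives the (hypothetical) pump at full power


def backup_controller(level):
    return 0.0 if level > 50.0 else 20.0  # conservative, safety-preserving action


guard = InterfaceGuard(compromised_controller, backup_controller, low=0.0, high=90.0)
commands = [guard.step(level) for level in (40.0, 70.0, 95.0, 80.0)]
```

In this sketch the guard never tries to decide whether the main controller is malicious. It only watches the process specification, which is why a trusted backup controller has to exist in the first place.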
Next, there is the Shadow Security Unit (SSU). SSU is a low cost device that is used in parallel with a PLC or RTU to secure SCADA systems. SSU can monitor the communication control channels along with its physical process Input/Output lines to constantly assess both security and operational status of PLC or RTU.
Finally, there is the idea of Watermarking. Physical watermarking is a solution to authenticate the correct operation of a control system. In physical watermarking, a randomly generated input or watermark that is known to legitimate parties is added to the physical system. Watermarking can reveal if the data has been tampered with.
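Here is a small Python sketch of how a physical watermark can expose tampering such as replayed sensor data. Everything in it is an assumption made for illustration: the first-order plant, the Gaussian watermark, the function names and the detection threshold of 0.5. The idea is simply that genuine outputs must correlate with the secret watermark that was injected, while data recorded earlier will not.

```python
import random


def simulate_plant(watermark, noise_rng, a=0.5):
    """Toy plant: y[k] = a*y[k-1] + watermark[k] + small sensor noise."""
    y, outputs = 0.0, []
    for w in watermark:
        y = a * y + w + noise_rng.gauss(0.0, 0.05)
        outputs.append(y)
    return outputs


def watermark_score(watermark, outputs):
    """Average product of each output with the watermark sample that drove it."""
    return sum(w * y for w, y in zip(watermark, outputs)) / len(outputs)


rng = random.Random(0)
watermark = [rng.gauss(0.0, 1.0) for _ in range(2000)]      # secret, current
old_watermark = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # used during the recording

genuine = simulate_plant(watermark, rng)
replayed = simulate_plant(old_watermark, rng)  # attacker replays old sensor data

THRESHOLD = 0.5
genuine_score = watermark_score(watermark, genuine)   # expected near the watermark variance
replay_score = watermark_score(watermark, replayed)   # expected near zero
attack_detected = abs(replay_score) < THRESHOLD
```

The price is visible in the model too: the watermark is extra noise on the control input, so the process is driven slightly away from where the controller wanted it.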
Cryptographic-Based Solutions (Authentication)
Finally, we can talk about authentication solutions. The idea of authentication solutions is to prevent unauthorized parties from accessing the system.
The first solution is Homomorphic Encryption. A homomorphic encryption scheme with a modified decryption algorithm can be used to enhance the confidentiality of the communication. In addition, homomorphic encryption is an advanced cryptographic scheme that enables arithmetic operations directly on encrypted variables without decrypting them.
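To show what "arithmetic on encrypted variables" looks like, here is a toy Python sketch using the Paillier cryptosystem, which is additively homomorphic. Paillier is my choice for the example, not necessarily the scheme used in any particular CPS proposal, and the primes 61 and 53 are far too small to be secure. It needs Python 3.8 or newer for the modular inverse via `pow`.

```python
import math
import random


def keygen(p=61, q=53):
    """Toy key generation with tiny primes (insecure, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # this shortcut is valid because we use g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng):
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


n, private_key = keygen()
rng = random.Random(1)
c_flow_a = encrypt(n, 120, rng)   # hypothetical sensor reading
c_flow_b = encrypt(n, 35, rng)    # hypothetical sensor reading
c_total = add_encrypted(n, c_flow_a, c_flow_b)
total = decrypt(n, private_key, c_total)
```

Whoever computes `c_total` never sees 120 or 35, only ciphertexts. Only the holder of the private key can recover the sum.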
Non-Cryptographic Based Solutions (Authentication)
For non-cryptography based solutions, we can talk about an Intrusion Detection System (IDS). IDSs are designed to detect abnormal behaviour in the network. They can be split into physical-based and cyber-based models. Physical models are based on normal CPS operation and can detect physical anomalies, while cyber-based models recognize cyber attacks. There are also distributed (one for each object) and centralized IDSs. An IDS can also be signature-based or anomaly-based.
Another solution is honeypots and the concept of deception. The goal is to design a virtual, high-interaction, server-based ICS honeypot that is realistic, cost-effective and maintainable, and that captures the attacker’s activities. An example is HoneyBot - the first software hybrid-interaction honeypot specifically designed for networked robotic systems, which fools attackers into believing that their exploits are successful.
Another example is HoneyPhy - a physics-aware framework for complex CPS. Honeypots need to be convincing to fool attackers. For a CPS honeypot to be convincing to the attacker, not only does the networking need to be modelled realistically, but the device actuation fingerprints and the way the attached process responds to actuations need to be modelled realistically as well.
To defend against MitM attacks, one defense is to build a model of the system that considers sensor and actuator channel attacks, and to use this model to detect intrusions and protect the system from the damage caused by MitM attacks on the communication network channels. This defense relies on NA-Safe Controllability (Network-Attack Safe Controllability), which can detect attacks in the network and prevent the system from reaching an unsafe state.
Additionally, to defend against DoS attacks, you can design a control strategy that minimizes control system deviation in DoS attack environments. This could mean lowering the system bandwidth (designing a more stable/restricted control system). However, this can directly impact the performance of the control system, so it very much depends on the requirements of the physical process.
Generally, this means constraining the system so that no drastic (fast or sudden) changes are allowed. At best, DoS attacks can be mitigated by designing a robust controller that can handle a high level of DoS without losing stability.
Event Triggered Control (ETC) can be used to reduce the attacker’s opportunity to analyse the transmitted information and find vulnerabilities. It reduces the data transmitted over the network, since sensor data is sent only when an event is triggered. Less traffic means less chance of network traffic being captured and analyzed.
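A very simple way to picture event-triggered transmission is a sensor that only sends a reading when it has moved far enough from the last value it sent. The Python sketch below assumes exactly that trigger rule. The function name, the threshold and the sample readings are made up for the example, and real event-triggered controllers use more careful triggering conditions.

```python
def event_triggered_stream(samples, threshold):
    """Yield (index, value) only when the reading moved more than
    `threshold` away from the last transmitted value."""
    last_sent = None
    for i, value in enumerate(samples):
        if last_sent is None or abs(value - last_sent) > threshold:
            last_sent = value
            yield i, value


readings = [20.0, 20.1, 20.05, 21.5, 21.6, 19.0]  # hypothetical temperature samples
transmitted = list(event_triggered_stream(readings, threshold=1.0))
# Only 3 of the 6 samples go over the network: indices 0, 3 and 5.
```

With a periodic scheme all six samples would be on the wire. Here an eavesdropper sees half as much, at the cost of the controller working with slightly stale values between events.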
Attack Detection
One big difference in control systems compared to traditional IT systems is that instead of creating models of network traffic or software behaviour, we can use a representative model of the physical system. Basically, if we can estimate the behaviour/output of the physical system in response to the control input, then by comparing the expected and observed behaviour/output, we can identify any attack on the sensor data or actuator.
However, this depends on the quality of our estimate and we may have some false alarms. Therefore, we would need:
A model of the physical system to estimate its behaviour
An anomaly detection algorithm to detect anomalies in the system’s behaviour
However, for the majority of process control systems, development of an accurate model is very difficult.
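The two ingredients above can be sketched together in Python. The model here is an assumed first-order system, and the anomaly detector is a simple CUSUM-style accumulator over the difference between predicted and observed outputs. The coefficients, the `drift` allowance and the `threshold` are illustrative values I picked, not tuned for any real process.

```python
def predict(prev_output, control_input, a=0.9, b=0.1):
    """Assumed process model: y[k] = a*y[k-1] + b*u[k-1]."""
    return a * prev_output + b * control_input


def cusum_detect(observed, inputs, drift=0.05, threshold=0.5):
    """Return the first index where the accumulated residual exceeds
    the threshold, or None if no anomaly is flagged."""
    score = 0.0
    for k in range(1, len(observed)):
        residual = abs(observed[k] - predict(observed[k - 1], inputs[k - 1]))
        score = max(0.0, score + residual - drift)  # small residuals are forgiven
        if score > threshold:
            return k
    return None


# Generate clean data from the same model with a constant control input.
inputs = [1.0] * 30
clean = [0.0]
for k in range(1, 30):
    clean.append(predict(clean[k - 1], inputs[k - 1]))

# Hypothetical attack: a bias of 1.0 is injected into the sensor data from step 10.
attacked = clean[:10] + [y + 1.0 for y in clean[10:]]

clean_alarm = cusum_detect(clean, inputs)        # None: model and data agree
attack_alarm = cusum_detect(attacked, inputs)    # 10: flagged at the injection step
```

The `drift` term is what trades detection speed against false alarms. A poor model produces larger residuals in normal operation, so `drift` and `threshold` have to be raised and small attacks slip through.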
Physical security is the most rudimentary step. This means implementing measures to secure the remote site, including things like:
Access control using gates and check points
Monitoring and surveillance
Locks and card swipes
Additionally, it means implementing measures that can deter and detect intruders, including:
door/window contacts
motion sensors
lighting
strobes/alarms
signs
anti-intrusion fencing
contamination detection and notification
earthquake sensors
With regards to attack detection, the watermarking technique discussed earlier should not impair the control performance. Watermarking can introduce some degradation in the signal, which might impact the control performance, depending on the system. This degradation should be minimal: an ideal watermarking scheme guarantees that the nominal performance is not affected when there is no attack.
In watermarking, the sensor outputs are watermarked. The watermarked signal is then transmitted through an unsecured network. At the receiving end, the watermark and the original signal are extracted. The watermark is then analyzed for tamper detection (MitM). Finally, the original signal is passed to the controller (PLC).
Watermarking is also used to verify the integrity of data. Fragile Watermarking Scheme (FWC) and Sliding Group Watermarking (SGW) are two examples. In a nutshell, they work by organizing sensor data into groups, then calculating a hash digest for each group and storing it in the least significant bits of the data.
This hash is contained and concealed within the data. Attackers would not know about the watermark, and changing the data would invalidate the watermark embedded in it.
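The group-hash idea can be sketched in Python as follows. This does not reproduce FWC or SGW, which have their own grouping and chaining rules. It only shows the core trick under my own assumptions: readings are non-negative integers below 2^32 (for example ADC counts), a group holds at most 256 samples, and the digest is a keyed HMAC-SHA256 so that an attacker without the key cannot recompute it.

```python
import hashlib
import hmac


def _group_bits(key, cleared, n_bits):
    """Digest of the group (with LSBs cleared), returned as a list of bits."""
    data = b"".join(v.to_bytes(4, "big") for v in cleared)
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(n_bits)]


def embed(key, group):
    """Overwrite each sample's least significant bit with one digest bit."""
    cleared = [v & ~1 for v in group]
    bits = _group_bits(key, cleared, len(group))
    return [v | b for v, b in zip(cleared, bits)]


def verify(key, group):
    """Recompute the digest from the upper bits and compare with the LSBs."""
    cleared = [v & ~1 for v in group]
    bits = _group_bits(key, cleared, len(group))
    return all((v & 1) == b for v, b in zip(group, bits))


key = b"hypothetical-shared-secret"
readings = list(range(500, 532))      # one group of 32 made-up sensor samples
marked = embed(key, readings)

lsb_flipped = list(marked)
lsb_flipped[0] ^= 1                   # attacker touches a watermark bit

value_changed = list(marked)
value_changed[5] += 8                 # attacker changes the measurement itself
```

Each sample changes by at most one count, which is the "fragile" part: the data stays usable, but any edit to a group no longer matches the bits hidden inside it.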
Finally, as a last note, we can talk about fault injection. Fault injection (fuzzing) is a technique to understand the behaviour of a system. It stresses the system by exposing components (software or hardware) to conditions beyond their operating limits. It is commonly used to test for vulnerabilities in communication interfaces and protocols.
However, instead of fuzzing expensive and critical controllers, fuzzing can be performed on emulated PLCs, which allows for non-destructive vulnerability detection.
Industrial proprietary communication protocols and devices can be customized and have complicated structures, so a fuzzing system cannot quickly generate test data that adapts to the various protocols.
In recent years, AI and machine learning have been used to learn the features of a protocol and generate mutated test data automatically. ICPFuzzer is an example of such a tool. It is a black-box fuzzing system that can automatically execute the testing process and reveal vulnerabilities that interrupt and crash industrial control communication.
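For a feel of what black-box fuzzing against an emulated PLC looks like, here is a minimal Python sketch. It is not ICPFuzzer and it does not learn anything: it mutates one random byte of a seed frame at a time. The frame format and the emulated handler, including its planted bug, are invented for the example.

```python
import random


def emulated_plc_parse(frame: bytes) -> int:
    """Hypothetical emulated PLC handler with a planted bug:
    it trusts the length byte instead of checking the payload size."""
    if len(frame) < 3:
        raise ValueError("short frame")  # clean, expected rejection
    length = frame[2]
    payload = frame[3:]
    return sum(payload[i] for i in range(length))  # IndexError if length is too large


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    data = bytearray(seed)
    pos = rng.randrange(len(data))
    data[pos] = rng.randrange(256)
    return bytes(data)


def fuzz(target, seed, rounds, rng):
    """Run mutated inputs against the target and collect unexpected exceptions."""
    crashes = []
    for _ in range(rounds):
        case = mutate(seed, rng)
        try:
            target(case)
        except ValueError:
            pass  # the target rejected the input gracefully
        except Exception as exc:
            crashes.append((case, type(exc).__name__))
    return crashes


seed_frame = bytes([1, 3, 2, 10, 20])  # unit id, function, length, two payload bytes
crashes = fuzz(emulated_plc_parse, seed_frame, rounds=500, rng=random.Random(0))
```

Every crash found here comes from a mutated length byte, which is the kind of finding you want to make on an emulator rather than on a controller that is running a plant. The weakness of blind mutation is also clear: it only works because the seed frame already has the right structure, which is the gap the learning-based tools try to close.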
Below you can find a very quick summary image if you want a brief overview: