- 2023-01-04 Added a new experiment comparing attack identification with and without physics information.
- 2023-01-01 This paper has been published by IEEE Transactions on Smart Grid (early access). The paper is available at https://ieeexplore.ieee.org/document/9998121.
- 2022-12-21 This paper has been accepted by IEEE Transactions on Smart Grid. Copyright of the paper is reserved by IEEE.
- 2022-08-20 We updated the paper by adding more baseline experiments on event-triggered Max-Rank MTD and Robust-MTD. We open-sourced the code for DDET-MTD.
Please cite our paper if you find it useful for your research.
@ARTICLE{9998121, author={Xu, Wangkun and Higgins, Martin and Wang, Jianhong and Jaimoukha, Imad M. and Teng, Fei}, journal={IEEE Transactions on Smart Grid}, title={Blending Data and Physics Against False Data Injection Attack: An Event-Triggered Moving Target Defence Approach}, year={2022}, volume={}, number={}, pages={1-1}, doi={10.1109/TSG.2022.3231728}}
This repo contains all the code and data for DDET-MTD: Data-Driven Event-Triggered MTD against Power System FDI Attack, accompanying our paper Blending Data and Physics Against False Data Injection Attack: An Event-Triggered Moving Target Defence Approach, coauthored by Wangkun Xu, Martin Higgins, Jianhong Wang, Imad M. Jaimoukha, and Fei Teng.
Fast and accurate detection of cyberattacks is a key element for a cyber-resilient power system. Recently, data-driven detectors and physics-based Moving Target Defences (MTD) have been proposed to detect false data injection (FDI) attacks on state estimation. However, the uncontrollable false positive rate of the data-driven detector and the extra cost of frequent MTD usage limit their wide applications. Few works have explored the overlap between these two areas. To fill this gap, this paper proposes blending data-driven and physics-based approaches to enhance the detection performance. To start, a physics-informed data-driven attack detection and identification algorithm is proposed. Then, an MTD protocol is triggered by the positive alarm from the data-driven detector. The MTD is formulated as a bilevel optimisation to robustly guarantee its effectiveness against the worst-case attack around the identified attack vector. Meanwhile, MTD hiddenness is also improved so that the defence cannot be detected by the attacker. To guarantee feasibility and convergence, the convex two-stage reformulation is derived through duality and linear matrix inequality. The simulation results verify that blending data and physics can achieve extremely high detection rate while simultaneously reducing the false positive rate of the data-driven detector and the extra cost of MTD. All codes are available at https://github.com/xuwkk/DDET-MTD.
The proposed DDET-MTD has three successive components in one execution cycle.
First, the LSTM-AE detector is trained on the normal dataset offline and then applied on the sensor measurement collected from SCADA in real-time operation. If a positive alarm is raised when solving the state estimation at time
In the last component, based on the identified attack, a robust MTD algorithm is triggered to verify the positive alarm from the LSTM-AE detector at the next state estimation time
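The execution cycle described above can be sketched as a small control loop. This is only an illustration and not the code in this repository: `identify` and `run_mtd` are hypothetical callables standing in for the attack identification step and the robust MTD, and the threshold rule (a quantile of reconstruction errors on normal data) is an assumption.

```python
from typing import Callable, Optional, Sequence


def quantile_threshold(normal_errors: Sequence[float], q: float = 0.95) -> float:
    """Pick a detection threshold from reconstruction errors on normal data.

    Assumption: a simple index-based quantile; the repository may use
    a different rule.
    """
    if not normal_errors:
        raise ValueError("normal_errors must not be empty")
    ordered = sorted(normal_errors)
    # Index of the q-quantile, clamped to the valid range.
    idx = min(len(ordered) - 1, max(0, int(q * len(ordered))))
    return ordered[idx]


def execution_cycle(
    error: float,
    threshold: float,
    identify: Callable[[], Sequence[float]],
    run_mtd: Callable[[Sequence[float]], bool],
) -> Optional[bool]:
    """Run one cycle: detect, identify, then trigger the MTD only on an alarm.

    Returns None when no alarm is raised (the MTD is not triggered),
    otherwise the MTD verdict (True = attack confirmed, False = alarm rejected).
    """
    if error <= threshold:
        return None  # no alarm: skip identification and MTD
    attack_estimate = identify()  # hypothetical identification step
    return run_mtd(attack_estimate)  # hypothetical MTD verification
```

Because the MTD is only invoked after an alarm, its cost is paid only on suspicious samples, while a rejected alarm (`False`) corresponds to a false positive of the data-driven detector being filtered out.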
The key packages used in this project are:
- PyPower for power system operations. PYPOWER is a power flow and Optimal Power Flow (OPF) solver. It is a port of MATPOWER to the Python programming language.
- PyTorch for constructing the deep learning detector. PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment.
- CVXPY for solving the convex MTD optimization problems. CVXPY is an open source Python-embedded modeling language for convex optimization problems. It lets you express your problem in a natural way that follows the math, rather than in the restrictive standard form required by solvers. This package is free to use.
- MOSEK for solving the convex SDP in the hidden-effective MTD quickly and accurately. MOSEK is a software package for the solution of linear, mixed-integer linear, quadratic, mixed-integer quadratic, quadratically constrained, conic and convex nonlinear mathematical optimization problems. The solver is widely applicable and is commonly used in areas such as engineering, finance and computer science. Academic licenses are available for academic and research institutions.
The main structure of this repo is summarised as follows:
.
├── configs
│ ├── config.py # cases, OPF, SE, and MTD configurations
│ ├── config_mea_idx.py # sensor deployment and measurement noise configurations
│ └── nn_setting.py # hyperparameters for neural network
├── utils
│ ├── class_se.py # class to support fundamental power system operation such as OPF, SE, and Max-Rank MTD
│ ├── fdi_att.py # functions to generate false data injection attack
│ └── load_data.py # convenient methods to return case, data, and dataloader
├── figures
│ └── case14.png # figures for case14 system
├── gen_data
│ ├── case14 # simulated measurements and states for the case14 system
│ ├── raw_data # the raw load and PV profiles
│ └── gen_data.py # functions to modify the IEEE standard case and generate the data
├── metric
│ └── case14 # metrics for the case14 system
├── models
│ ├── dataset.py # torch dataset and dataloader
│ ├── early_stopping.py
│ ├── model.py # torch LSTM-AE model
│ └── evaluation.py # functions to detect and identify attacks using LSTM-AE
├── optim
│ ├── optimization.py # functions for the two-stage MTD optimisation and its evaluation methods
│ └── robust_mtd.py # functions for the baseline robust MTD
├── repo_figure
├── saved_model
│ └── checkpoint_rnn.pt # trained LSTM-AE model for the case14 system
├── draw_metric.ipynb # ipython notebook to draw all the figures in the paper
├── draw_profile.ipynb # ipython notebook to draw the load and pv profiles
├── evaluate_convergence.py # evaluate the convergence performance of stage one and stage two MTDs
├── evaluation_ddd_no_physics.py # evaluate the performance of the data-driven (LSTM-AE) detector identification without physics
├── evaluation_ddd.py # evaluate the performance of the data-driven (LSTM-AE) detector
├── evaluation_baseline_no_attack.py # evaluate the baseline algorithms without attack
├── evaluation_baseline_with_attack.py # evaluate the baseline algorithms with attack
├── evaluation_event_trigger.py # evaluate the performance of the DDET-MTD under attack
├── evaluation_fpr.py # evaluate the performance of the DDET-MTD with attack
├── gen_load_pv.ipynb # generate the load and pv profiles for the case14 system
├── test_jacobian.ipynb # test the accuracy of approximation of effectiveness and hiddenness
└── readme.md
The fundamental algorithms, e.g. class_se.py, fdi_att.py, and gen_data.py, are copied from our repository steady-state-power-system.
The only data needed for this repository are the raw load consumption and PV data, which can be downloaded from Google Drive. The load data is cleaned from the ElectricityLoadDiagrams20112014 Data Set and the PV data is cleaned from Elia. After downloading, unzip the contents (load.csv and pv.csv) directly under the folder gen_data/raw_data.
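Before generating data, it can help to confirm that the raw files are where the scripts expect them. The helper below is a minimal sketch and not part of the repository: the function name is hypothetical, and the file names `load.csv` and `pv.csv` and the `gen_data/raw_data` location are assumptions taken from the text and the repository tree above.

```python
from pathlib import Path
from typing import List, Sequence


def missing_raw_files(
    raw_dir: Path, names: Sequence[str] = ("load.csv", "pv.csv")
) -> List[str]:
    """Return the expected raw-data file names that are absent from raw_dir."""
    return [name for name in names if not (raw_dir / name).is_file()]
```

For example, `missing_raw_files(Path("gen_data/raw_data"))` returns an empty list once both files have been unzipped into place.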
The data generation is implemented in gen_data/gen_data.py:
1. Clean Active and Reactive Load and PV. We use 4-month data in this project. The original 15 min load and PV data are linearly interpolated to 5 min resolution. The total length of the load and PV data is 35136 (roughly 12x24x30x4) (see gen_data/gen_data.py).
2. IEEE Case File. The standard IEEE case files shipped with PyPower are modified. For the 14-bus system, we modify the standard case file by adding load consumption on the non-reference buses, increasing the load levels, reducing the branch power flow limits, and changing the generator costs (see gen_data/gen_data.py).
3. Construct Load and PV Suitable for Different Cases. It is important that the OPF converges during each run. Therefore the load and PV levels are adjusted so that they are not too large.
   - Load: Rescale each raw load profile by the default load level from step 2. The reactive power is determined by a power factor drawn randomly between 0.97 and 0.99.
   - PV: Rescale the PV profile to a penetration level of 30%, meaning that the maximum total PV generation per 5 min is 30% of the maximum total load consumption. We also apply a random reduction with a maximum probability to the PV data to mimic cloud cover. Although reactive power for the PV is generated, we do not use it further. The PV is considered in restricted mode, meaning that no reactive power can be generated for voltage regulation.
4. Measurement. We consider RTU measurements including the active and reactive power injections at all buses and the from-end active and reactive power flows at all branches. Instead of using a constant measurement noise, we run the AC Optimal Power Flow (OPF) on the default load condition from step 2 and calculate the standard deviation of the noise as 2% of the power measurements. To generate the measurement dataset for the LSTM-AE, we run the OPF at each of the 35136 load points and record the measurements and estimated states.
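The data preparation steps above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code in gen_data/gen_data.py: all function names are hypothetical, the last 15 min sample is held constant over the final interval, and the noise standard deviation is taken as 2% of each measurement's magnitude in the default case. Note that 35136 equals 288 x 122, i.e. 122 days of 5 min samples.

```python
import math
from typing import List, Sequence


def interpolate_15min_to_5min(series: Sequence[float]) -> List[float]:
    """Linearly interpolate a 15 min series onto a 5 min grid.

    Each 15 min interval yields three points; the last sample is held
    for the final interval (assumption), so the output has exactly
    3 * len(series) points.
    """
    out: List[float] = []
    for i, value in enumerate(series):
        nxt = series[i + 1] if i + 1 < len(series) else value
        for k in range(3):
            out.append(value + (nxt - value) * k / 3)
    return out


def reactive_from_power_factor(p: float, pf: float) -> float:
    """Reactive power for active power p at power factor pf (0 < pf <= 1)."""
    return p * math.tan(math.acos(pf))


def scale_pv_to_penetration(
    total_pv: Sequence[float], total_load: Sequence[float], penetration: float = 0.3
) -> List[float]:
    """Rescale PV so that max(PV) equals penetration * max(load)."""
    peak_pv = max(total_pv)
    if peak_pv <= 0:
        raise ValueError("PV series must have a positive peak")
    factor = penetration * max(total_load) / peak_pv
    return [x * factor for x in total_pv]


def noise_std(default_measurements: Sequence[float], ratio: float = 0.02) -> List[float]:
    """Noise std per measurement as a fixed ratio of its default-case magnitude."""
    return [ratio * abs(z) for z in default_measurements]
```

A power factor drawn with `random.uniform(0.97, 0.99)` can be passed to `reactive_from_power_factor` to mimic the random power factor described in step 3.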