Documentation Update (#2)
* Dropped in a few example shell commands in README.md to help with installation
* Briefly described the major architectural components of Revizor in docs/architecture.md
* Created docs/cli.md to describe the different execution modes and their command-line switches
* Created docs/modules.md to briefly describe the various Python modules and some test case generation details

Co-authored-by: Connor Shugg <[email protected]>
cwshugg and Connor Shugg authored Sep 6, 2022
1 parent def3889 commit e74373f
Showing 7 changed files with 285 additions and 20 deletions.
52 changes: 37 additions & 15 deletions README.md
@@ -21,25 +21,36 @@ Make sure you're not running these experiments on an important machine.

## Requirements & Dependencies

1. Hardware Requirements
### 1. Hardware Requirements

So far, Revizor supports only Intel CPU. It was tested on Intel Core i7-6700 and i7-9700, but it should work on any other Intel CPU just as well.
So far, Revizor supports only Intel CPUs. It was tested on Intel Core i7-6700 and i7-9700, but it should work on any other Intel CPU just as well.

1. Software Requirements
### 2. Software Requirements

* Linux v5.6+ (tested on Linux v5.6.6-300 and v5.6.13-100; there is a good chance it will work on other versions as well, but it's not guaranteed).
* Linux Kernel Headers

```shell
# check linux version
cat /proc/version
```

* Linux Kernel Headers

```shell
# On Ubuntu
sudo apt install linux-headers-$(uname -r)
```
* MSR Tools:
sudo apt-get install linux-headers-$(uname -r)
```

* MSR Tools

```shell
# On Ubuntu
sudo apt install msr-tools
```
* Python 3.9+
```

* [Python 3.9+](https://www.python.org/downloads/)

```shell
# On Ubuntu 18
sudo apt install python3.9 python3.9-distutils
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 2
@@ -50,21 +61,30 @@ pip3 install --upgrade setuptools
pip3 install --upgrade pip
pip3 install --upgrade distlib
```

* [Unicorn 1.0.2+](https://www.unicorn-engine.org/docs/)
* Python bindings to Unicorn:

```shell
sudo apt install unicorn
```

* Python bindings to Unicorn

```shell
pip3 install --user unicorn

# OR, if installed from sources
cd bindings/python
sudo make install
```

* Python packages `pyyaml`, `types-pyyaml`, `numpy`, `iced-x86`:

```shell
pip3 install --user pyyaml types-pyyaml numpy iced-x86
```

1. Software Requirements for Revizor Development
### 3. Software Requirements for Revizor Development

Tests:
* [Bash Automated Testing System](https://bats-core.readthedocs.io/en/latest/index.html)
@@ -74,7 +94,7 @@ Tests:
Documentation:
* [pdoc3](https://pypi.org/project/pdoc3/)

1. (Optional) System Configuration
### 4. (Optional) System Configuration

For more stable results, disable hyperthreading (there's usually a BIOS option for it).
If you do not disable hyperthreading, you will see a warning every time you invoke Revizor; you can ignore it.
@@ -85,13 +105,15 @@ In addition, you might want to stop any other actively-running software on the t

## Installation

1. Get the x86-64 ISA description:
### 1. Get the x86-64 ISA description:

```bash
cd src/x86/isa_loader
./get_spec.py --extensions BASE SSE SSE2 CLFLUSHOPT CLFSH
```

2. Install the executor kernel module:
### 2. Install the executor kernel module:

```bash
cd src/x86/executor
make uninstall # the command will give an error message, but it's ok!
@@ -167,7 +189,7 @@ The fuzzer is controlled via a single command line interface `cli.py` (located i

# Documentation

For more details, see [docs/_main.md](docs/_main.md).
For more details, see [docs/main.md](docs/main.md).

## Contributing

54 changes: 51 additions & 3 deletions docs/architecture.md
@@ -1,8 +1,56 @@
# Architecture
# Revizor's Architecture

![architecture](Arch.png)
![architecture](diagrams/arch.png)

**Under construction**
Revizor has **five** chief components:

1. Test Case Generator
2. Input Generator
3. Model
4. Executor
5. Analyser

The **Test Case Generator** and **Input Generator** are responsible for
generating random test cases to be run through the **Model** and **Executor**.
The results are examined by the **Analyser** for contract violations.

## Test Case Generator

The TCG is responsible for generating random assembly test cases. It takes an
Instruction Set Specification as input in order for it to understand the
instructions and syntax it can use for generation.

## Input Generator

The IG is responsible for generating the *inputs* that are passed into a test
case created by the TCG. Largely, this means **register** and **memory** values
that the microarchitecture will be primed with before executing the test case.
In this way, a single test case program can be run across several different
inputs, allowing for multiple contract traces (and later, hardware traces) to be
collected for analysis.
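
As a rough illustration of the idea, the sketch below generates reproducible register and memory values from a seeded random number generator. It is not Revizor's `input_generator.py`: the class name `SketchInput`, the function `generate_inputs`, and the sizes `NUM_REGISTERS` and `MEMORY_WORDS` are all assumptions made for this example.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical sizes, chosen only for illustration.
NUM_REGISTERS = 6
MEMORY_WORDS = 8


@dataclass(frozen=True)
class SketchInput:
    """One input: the values that registers and memory are primed with."""
    registers: Tuple[int, ...]
    memory: Tuple[int, ...]


def generate_inputs(count: int, seed: int) -> List[SketchInput]:
    """Generate `count` pseudo-random inputs; the same seed gives the same inputs."""
    rng = random.Random(seed)
    inputs = []
    for _ in range(count):
        registers = tuple(rng.getrandbits(64) for _ in range(NUM_REGISTERS))
        memory = tuple(rng.getrandbits(64) for _ in range(MEMORY_WORDS))
        inputs.append(SketchInput(registers, memory))
    return inputs
```

Seeding the generator is one simple way to make sure that the Model and the Executor can be fed identical inputs for the same test case.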

## Model

The Model's job is to accept test cases and inputs from the TCG & IG and
*emulate* the test case to collect **contract traces**. A single test case seeded
with several inputs (`N` inputs) will create several contract traces (`N`
contract traces) as the model's output. These are passed to the Analyser to
determine **input classes**.

## Executor

The Executor is the Model's counterpart: it runs the *same* test cases (with the
*same* inputs) on physical hardware to collect **hardware traces**. Hardware
traces from the same input class are collected and studied by the Analyser to
detect **contract violations**.

## Analyser

The Analyser receives contract traces from the Model and hardware traces from
the Executor to accomplish two primary goals:

1. Compare contract traces to set up **input classes**.
2. Compare hardware traces to detect **contract violations**.
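
The two goals above can be sketched in a few lines of Python. This is a simplified illustration, not the code in `analyser.py`: the function names are hypothetical, traces are assumed to be hashable values, and traces are compared by plain equality, whereas the real analyser may be more tolerant (for example, of measurement noise).

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def build_input_classes(ctraces: Sequence[Hashable]) -> List[List[int]]:
    """Group input indices whose contract traces are equal."""
    classes: Dict[Hashable, List[int]] = defaultdict(list)
    for input_id, ctrace in enumerate(ctraces):
        classes[ctrace].append(input_id)
    return list(classes.values())


def find_violations(ctraces: Sequence[Hashable],
                    htraces: Sequence[Hashable]) -> List[List[int]]:
    """Return the input classes whose members disagree on hardware traces."""
    violations = []
    for input_class in build_input_classes(ctraces):
        observed = {htraces[i] for i in input_class}
        if len(observed) > 1:  # same contract trace, different hardware traces
            violations.append(input_class)
    return violations


# Inputs 0 and 1 share contract trace "A"; inputs 2 and 3 share "B".
ctraces = ["A", "A", "B", "B"]
htraces = [1, 1, 2, 3]
print(find_violations(ctraces, htraces))  # [[2, 3]]
```

Inputs 2 and 3 are indistinguishable according to the contract but produce different hardware traces, so their class is reported.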

[comment]: <> (## Instruction Set Spec)

71 changes: 71 additions & 0 deletions docs/cli.md
@@ -0,0 +1,71 @@
# Command-Line Interface

Revizor can run in one of three modes:

* **Fuzzing mode** is revizor's main form of execution. It's what invokes all
components of [revizor's architecture](architecture.md) to enable hardware
fuzzing.
* **Analysis mode** invokes the analyser to compare existing contract traces and
hardware traces.
* **Minimize mode** accepts a test case and attempts to minimize its size.
It acts as a "watered-down" version of **fuzzing mode** that focuses solely on
a single test case.

To select a mode on the command-line, begin your command with:

```shell
cli.py MODE # ... arguments go here

# Where MODE can be:
# fuzz for fuzzing mode
# analyse for analysis mode
# minimize for test case minimization mode
```

## Fuzzing Mode

The following command-line arguments are supported in `fuzz` mode:

* `-s` / `--instruction-set` - accepts a path to an XML file specifying the
instruction set revizor should use.
* `-c` / `--config` - accepts a path to a YAML configuration file for revizor.
* `-n` / `--num-test-cases` - accepts an integer specifying the number of test
cases to create and test during the fuzzing campaign.
* `-i` / `--num-inputs` - accepts an integer specifying the number of inputs to
generate for each test case (which corresponds to the number of contract
traces to collect).
* `-w` / `--working-directory` - accepts a path to a directory into which
revizor will place its output files during the campaign.
* `-t` / `--testcase` - accepts a path to an existing test case for the fuzzer
to run. (Revizor will *only* run this test case if this is specified.)
* `--timeout` - accepts an integer specifying the number of seconds to run the
fuzzer. Once the timeout has been reached, fuzzing will cease.
* `--nonstop` - if enabled, this keeps the fuzzer running after it encounters a
violation. (Otherwise, if it's not specified, revizor will stop after the
first violation is found.)
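
To show how these switches combine, here is a small Python helper that assembles a `fuzz` command line. The helper `build_fuzz_command` and the file names in the usage lines are hypothetical; only the mode name and the flags come from the list above.

```python
import shlex
from typing import List


def build_fuzz_command(isa_spec: str, config: str, num_test_cases: int,
                       num_inputs: int, workdir: str,
                       timeout: int = 0, nonstop: bool = False) -> List[str]:
    """Assemble an argument list for `cli.py fuzz` from the documented switches."""
    cmd = ["cli.py", "fuzz",
           "-s", isa_spec,
           "-c", config,
           "-n", str(num_test_cases),
           "-i", str(num_inputs),
           "-w", workdir]
    if timeout > 0:
        cmd += ["--timeout", str(timeout)]
    if nonstop:
        cmd.append("--nonstop")
    return cmd


# Placeholder file names; substitute your own paths.
cmd = build_fuzz_command("isa.xml", "config.yaml", 100, 50, "results", timeout=3600)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` as is; `shlex.join` is used here only to print it in a shell-friendly form.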

## Analysis Mode

The following command-line arguments are supported in `analyse` mode:

* `--ctraces` - accepts a path to a file containing contract traces.
* `--htraces` - accepts a path to a file containing hardware traces.
* `-c` / `--config` - accepts a path to a YAML configuration file for revizor.

## Minimize Mode

The following command-line arguments are supported in `minimize` mode:

* `-i` / `--infile` - accepts a path to the test case revizor will attempt to
minimize.
* `-o` / `--outfile` - accepts a path specifying where the minimized version of
the original test case will be written to.
* `-c` / `--config` - accepts a path to a YAML configuration file for revizor.
* `-n` / `--num-inputs` - accepts an integer specifying the number of inputs to
try for the test case.
* `-f` / `--add-fences` - if enabled, revizor will add as many `LFENCE`
instructions as possible to the test case's assembly code while still
preserving the violation-inducing behavior.
* `-s` / `--instruction-set` - accepts a path to an XML file specifying the
instruction set revizor should use.
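
The three modes and their switches map naturally onto `argparse` sub-commands. The sketch below mirrors the lists in this document; it is an illustration only and makes no claim about how `cli.py` is actually written. Note that `-i` and `-n` mean different things in `fuzz` and `minimize` mode.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one sub-command per documented mode."""
    parser = argparse.ArgumentParser(prog="cli.py")
    modes = parser.add_subparsers(dest="mode", required=True)

    fuzz = modes.add_parser("fuzz")
    fuzz.add_argument("-s", "--instruction-set")
    fuzz.add_argument("-c", "--config")
    fuzz.add_argument("-n", "--num-test-cases", type=int)
    fuzz.add_argument("-i", "--num-inputs", type=int)
    fuzz.add_argument("-w", "--working-directory")
    fuzz.add_argument("-t", "--testcase")
    fuzz.add_argument("--timeout", type=int)
    fuzz.add_argument("--nonstop", action="store_true")

    analyse = modes.add_parser("analyse")
    analyse.add_argument("--ctraces")
    analyse.add_argument("--htraces")
    analyse.add_argument("-c", "--config")

    minimize = modes.add_parser("minimize")
    minimize.add_argument("-i", "--infile")  # in fuzz mode, -i is --num-inputs
    minimize.add_argument("-o", "--outfile")
    minimize.add_argument("-c", "--config")
    minimize.add_argument("-n", "--num-inputs", type=int)  # in fuzz mode, -n is --num-test-cases
    minimize.add_argument("-f", "--add-fences", action="store_true")
    minimize.add_argument("-s", "--instruction-set")
    return parser
```

For example, `build_parser().parse_args(["minimize", "-i", "tc.asm", "-o", "min.asm", "-f"])` yields a namespace with `mode == "minimize"` and `add_fences == True` (the file names are placeholders).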

2 changes: 1 addition & 1 deletion docs/config.md
@@ -34,7 +34,7 @@ For a complete list, see `src/config.py`.
* `model_max_nesting` [int]: Maximum number of simulated mispredictions.
* `model_max_spec_window` [int]: Size of the speculation window.

## Generator Configuration
# Generator Configuration

* `instruction_set` [str]: Tested ISA.
Only one option is currently supported - "x86-64" (default).
File renamed without changes
2 changes: 1 addition & 1 deletion docs/_main.md → docs/main.md
@@ -21,4 +21,4 @@

# Tutorials

None so far
None so far
124 changes: 124 additions & 0 deletions docs/modules.md
@@ -0,0 +1,124 @@
# Revizor Modules and Interfaces

Revizor's implementation follows its [architecture](architecture.md) and is
split across multiple Python files:

* `cli.py` - implements the command-line interface of revizor.
* `config.py` - implements parsing and managing of revizor's YAML configuration
file.
* `generator.py` - implements the **Test Case Generator** portion of
[revizor's architecture](architecture.md).
* `input_generator.py` - implements the **Input Generator** portion of
[revizor's architecture](architecture.md).
* `model.py` - implements the Unicorn-based **Model** portion of
[revizor's architecture](architecture.md).
* `executor.py` - implements the **Executor** portion of
[revizor's architecture](architecture.md).
* `analyser.py` - implements the **Analyser** portion of
[revizor's architecture](architecture.md).
* `postprocessor.py` - defines the `MinimizerViolation` class, used during
`minimize` mode to reduce a violation-inducing test case down to a smaller
size while still maintaining the violation-inducing behavior.
* `fuzzer.py` - implements `fuzz` mode that utilizes all main components to
perform end-to-end hardware fuzzing.
* `coverage.py` - will collect coverage in the future; currently not in use.
* `factory.py` - used to configure revizor according to the user-provided
YAML configuration. Implements a simplified version of the Factory pattern:
it defines a series of dictionaries that allow revizor to choose
between various contracts, generation techniques, executors, analysers, etc.
In the future, it will also be used to implement multiple-ISA support.
* `interfaces.py` - defines abstract classes (i.e., interfaces) of all main
components of revizor (e.g., abstract `Executor`, `Model`, `TestCase`,
`Input`, etc.).
* `isa_loader.py` - defines the `InstructionSet` class, used to load an
ISA's specifications from a JSON file provided via the
[command-line interface](cli.md).
* `service.py` - defines logging, statistical, and other services to all other
modules within revizor.
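
The dictionary-based factory described for `factory.py` can be illustrated as follows. This is a minimal sketch of the pattern, not the module itself: the class names, the `"model"` configuration key, and the option strings `"seq"` and `"spec"` are all hypothetical.

```python
from typing import Dict, Type


class ModelSketch:
    """Stand-in for an abstract model interface."""


class SequentialModelSketch(ModelSketch):
    """Hypothetical implementation selected by the option "seq"."""


class SpeculativeModelSketch(ModelSketch):
    """Hypothetical implementation selected by the option "spec"."""


# The "factory" is a dictionary from configuration strings to classes.
MODELS: Dict[str, Type[ModelSketch]] = {
    "seq": SequentialModelSketch,
    "spec": SpeculativeModelSketch,
}


def get_model(config: Dict[str, str]) -> ModelSketch:
    """Instantiate the model named in the (already parsed) configuration."""
    name = config["model"]
    if name not in MODELS:
        raise ValueError(f"unknown model '{name}'; options: {sorted(MODELS)}")
    return MODELS[name]()
```

With this layout, adding a new option only requires a new dictionary entry; the code that calls `get_model` stays unchanged.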

## Architecture-specific Implementation

The modules above are ISA-independent. The architecture-specific implementations
are located in the subdirectories. For example, the implementation of the modules
for the x86-64 architecture is located in `src/x86/`. Its structure largely
mirrors the main modules of revizor (e.g., `x86_model.py` contains x86-specific
parts of the **Model** module). The only unique parts are:

* `*_target_desc.py` - defines constants describing the ISA (e.g., a list of
available registers) and some helper functions.
* `isa_spec/get_spec.py` - a script for transforming the ISA description provided
by the CPU vendor (different for every vendor) into a unified JSON format.
* `executor/` - contains a low-level implementation of the executor. The
implementation will be different for each architecture. For black-box x86 CPUs,
it is a Linux kernel module.

## Abstract Test Case

This describes a number of Python classes within revizor that define parts of an
assembly test case. Revizor's TCG uses them to generate syntactically-valid
assembly. The classes are defined in `interfaces.py`.

#### `OperandSpec`

The `OperandSpec` class defines a set of valid operands for any given assembly
instruction. Each `InstructionSpec` object (described below) contains a list of
these operand specifications. It contains properties such as:

* The `type` of operand
* The `width` of the operand
* Whether the operand is a `src` or a `dest` operand

#### `InstructionSpec`

This class represents a single instruction specification. It contains a name
(i.e. the actual instruction mnemonic, such as `ADD`) and a list of
`OperandSpec`s, defining valid operands for the instruction. It also has a
number of boolean flags that indicate unique attributes about the instruction,
such as:

* If the instruction contains a memory write
* If the instruction is a control-flow instruction
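
A rough picture of how the two specification classes fit together is given below. The sketch uses dataclasses with the properties listed above (`type`, `width`, `src`, `dest`); the class names and the flag names `has_mem_write` and `is_control_flow` are assumptions for illustration, and the real definitions in `interfaces.py` may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OperandSpecSketch:
    """Describes which operands are allowed in one operand position."""
    type: str    # e.g. "REG", "MEM", "IMM"
    width: int   # operand width in bits
    src: bool    # the instruction reads this operand
    dest: bool   # the instruction writes this operand


@dataclass
class InstructionSpecSketch:
    """Describes one instruction: its mnemonic, operand specs and flags."""
    name: str
    operands: List[OperandSpecSketch] = field(default_factory=list)
    has_mem_write: bool = False     # hypothetical flag name
    is_control_flow: bool = False   # hypothetical flag name


# ADD with a 64-bit register destination and a 32-bit immediate source.
add_spec = InstructionSpecSketch(
    name="ADD",
    operands=[
        OperandSpecSketch(type="REG", width=64, src=True, dest=True),
        OperandSpecSketch(type="IMM", width=32, src=True, dest=False),
    ],
)
```

A generator can then pick an `InstructionSpecSketch` at random and choose a concrete operand for each entry in `operands`.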

#### `Operand`

The `Operand` class defines an actual operand to be used in an instruction
placed into the TCG's generated test case (not to be confused with
`OperandSpec`, which is a set of rules used to define possible operand choices
for an instruction). This is an **abstract base class** that provides a number
of sub-classes:

* `RegisterOperand`
* `MemoryOperand`
* `ImmediateOperand`
* `LabelOperand`
* `AgenOperand`
* `FlagsOperand`

#### `Instruction`

Similar to the relationship between `OperandSpec` and `Operand`, the
`Instruction` class defines an actual instruction, constrained by an
`InstructionSpec`, that is used during test case generation. It contains a list
of `Operand`s and is linked to its neighboring instructions via object
references.
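
The relationship between an abstract operand, its sub-classes, and linked instructions might look roughly like this. Only two of the sub-classes listed above are sketched, and every name ending in `Sketch`, as well as the `render` and `link_after` methods, is hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class OperandSketch(ABC):
    """Abstract base class for concrete operands."""

    @abstractmethod
    def render(self) -> str:
        """Return the operand as assembly text."""


class RegisterOperandSketch(OperandSketch):
    def __init__(self, name: str) -> None:
        self.name = name

    def render(self) -> str:
        return self.name


class ImmediateOperandSketch(OperandSketch):
    def __init__(self, value: int) -> None:
        self.value = value

    def render(self) -> str:
        return str(self.value)


class InstructionSketch:
    """A concrete instruction, linked to its neighbours by object references."""

    def __init__(self, name: str, operands: List[OperandSketch]) -> None:
        self.name = name
        self.operands = operands
        self.previous: Optional["InstructionSketch"] = None
        self.next: Optional["InstructionSketch"] = None

    def link_after(self, other: "InstructionSketch") -> None:
        """Place this instruction directly after `other`."""
        other.next = self
        self.previous = other

    def render(self) -> str:
        operands = ", ".join(op.render() for op in self.operands)
        return f"{self.name} {operands}".strip()
```

Keeping `previous` and `next` references makes it cheap to insert or remove a single instruction without rebuilding a list.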

#### `BasicBlock`

This class represents a single basic block within the generated test case (a
**basic block** is a straight-line sequence of assembly instructions that has a
single entry and exit point). It contains a list of all instructions contained
within, references to its successor basic block(s), and a list of "terminator"
instructions (instructions that exit the basic block, such as a branch).

#### `Function`

This object represents a collection of basic blocks that form a function. It has
an "entry" basic block and an "exit" basic block, along with a list of all basic
blocks that comprise the function.

#### `TestCaseDAG`

**DAG** is short for **Directed Acyclic Graph**. This object represents the
*entire* test case's control flow. It contains a list of functions that, within,
define all instructions to be written out to the test case's assembly file.
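
The containment hierarchy of `BasicBlock`, `Function`, and `TestCaseDAG` can be sketched with three small dataclasses. This is an illustration under simplifying assumptions: instructions are plain strings, the first and last blocks of a function are treated as its entry and exit, and all class, field, and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BasicBlockSketch:
    label: str
    instructions: List[str] = field(default_factory=list)
    terminators: List[str] = field(default_factory=list)
    successors: List["BasicBlockSketch"] = field(default_factory=list)


@dataclass
class FunctionSketch:
    name: str
    blocks: List[BasicBlockSketch]

    @property
    def entry(self) -> BasicBlockSketch:
        return self.blocks[0]

    @property
    def exit(self) -> BasicBlockSketch:
        return self.blocks[-1]


@dataclass
class DagSketch:
    functions: List[FunctionSketch]

    def to_assembly_lines(self) -> List[str]:
        """Flatten every function and basic block into assembly lines."""
        lines: List[str] = []
        for function in self.functions:
            for block in function.blocks:
                lines.append(f"{block.label}:")
                lines.extend(block.instructions)
                lines.extend(block.terminators)
        return lines


entry = BasicBlockSketch(".bb_entry", ["MOV RAX, 1"], ["JMP .bb_exit"])
exit_block = BasicBlockSketch(".bb_exit", ["NOP"])
entry.successors.append(exit_block)
dag = DagSketch([FunctionSketch("main", [entry, exit_block])])
print("\n".join(dag.to_assembly_lines()))
```

Because successor references only point forward, the structure stays acyclic, which is what lets the whole test case be written out in a single pass.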
