Local Benchmarks

When running benchmarks, the codebase is automatically compiled with the benchmark feature flag. This flag makes the nodes print special log entries that the Python scripts then read to compute performance. These special log entries are clearly indicated with comments in the code; make sure not to modify or delete them, otherwise the benchmark scripts will fail to interpret the logs.
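
The Fabric tasks handle this compilation automatically, but for reference, building the code with the flag enabled by hand would look something like this (assuming a standard Cargo feature named benchmark, as described above):

$ cargo build --release --features benchmark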

Parametrize the benchmark

After cloning the repo and installing all dependencies, you can use Fabric to run benchmarks on your local machine. Locate the task called local in the file fabfile.py:

@task
def local(ctx):
    ...

The task specifies two types of parameters: the benchmark parameters and the node parameters. The benchmark parameters look as follows:

bench_params = {
    'faults': 0,
    'nodes': 4,
    'rate': 1_000,
    'tx_size': 512,
    'duration': 20,
}

They specify the number of faulty nodes (faults), the number of nodes to deploy (nodes), the input rate (in tx/s) at which the clients submit transactions to the system (rate), the size of each transaction in bytes (tx_size), and the duration of the benchmark in seconds (duration). The minimum transaction size is 9 bytes; this ensures that the transactions of a client are all different. The benchmarking script deploys as many clients as nodes and divides the input rate equally amongst them. For instance, if you configure the testbed with 4 nodes and an input rate of 1,000 tx/s (as in the example above), the script deploys 4 clients, each submitting transactions to one node at a rate of 250 tx/s. When the parameter faults is set to f > 0, the last f nodes and clients are not booted; the system thus runs with n-f nodes (and n-f clients).
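
As a quick sanity check, the rate split described above can be reproduced with a few lines of Python. This is only an illustration of the arithmetic, using the values from the example, and is not part of the benchmark scripts:

# Illustration only: how the input rate is divided among the clients.
nodes, faults, rate = 4, 0, 1_000

clients = nodes - faults           # one client per booted node
rate_per_client = rate // clients  # 1_000 / 4 = 250 tx/s per client
print(f'{clients} clients, each submitting at {rate_per_client} tx/s')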

The node parameters contain the configuration of the consensus and the mempool:

node_params = {
    'consensus': {
        'timeout_delay': 1_000,
        'sync_retry_delay': 10_000,
    },
    'mempool': {
        'gc_depth': 50,
        'sync_retry_delay': 5_000,
        'sync_retry_nodes': 3,
        'batch_size': 15_000,
        'max_batch_delay': 10
    }
}

They are defined as follows:

  • timeout_delay (consensus): Nodes trigger a view-change when this timeout (in milliseconds) is reached.
  • sync_retry_delay (consensus and mempool): Nodes re-broadcast sync requests when this timeout (in milliseconds) is reached.
  • gc_depth (mempool): Depth (in rounds) of the garbage collection; it ensures the mempool does not run out of memory during periods of network asynchrony.
  • sync_retry_nodes (mempool): Number of nodes to pick at random upon trying to sync missing batches.
  • batch_size (mempool): The maximum size (in bytes) of a batch.
  • max_batch_delay (mempool): Maximum delay (in milliseconds) after which the mempool seals the current batch, even if it did not reach batch_size (see the sketch after this list).
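
To make the last two mempool parameters concrete, the following simplified Python sketch models the sealing rule they imply. It is only an illustration under the assumptions above; the actual mempool is part of the Rust codebase, and the class and method names here are hypothetical:

import time

class ToyBatchMaker:
    """Toy model: seal a batch when it is full or when the delay expires."""

    def __init__(self, batch_size=15_000, max_batch_delay_ms=10):
        self.batch_size = batch_size                       # bytes
        self.max_batch_delay = max_batch_delay_ms / 1_000  # seconds
        self.current, self.current_bytes = [], 0
        self.last_seal = time.monotonic()

    def add(self, tx: bytes):
        self.current.append(tx)
        self.current_bytes += len(tx)
        if self.current_bytes >= self.batch_size:
            self.seal()  # the batch reached its maximum size

    def tick(self):
        # Called periodically: seal a non-empty batch once the delay expires.
        if self.current and time.monotonic() - self.last_seal >= self.max_batch_delay:
            self.seal()

    def seal(self):
        print(f'sealing batch: {self.current_bytes} B, {len(self.current)} txs')
        self.current, self.current_bytes = [], 0
        self.last_seal = time.monotonic()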

Run the benchmark

Once you have specified both bench_params and node_params as desired, run:

$ fab local

This command first recompiles your code in release mode (and with the benchmark feature flag activated), thus ensuring you always benchmark the latest version of your code. This may take a long time the first time you run it. The command then generates the configuration files and keys for each node, and runs the benchmark with the specified parameters. It finally parses the logs and displays a summary of the execution similar to the one below. All the configuration and key files are hidden JSON files; that is, their names start with a dot (.), such as .committee.json.

-----------------------------------------
 SUMMARY:
-----------------------------------------
 + CONFIG:
 Faults: 0 nodes
 Committee size: 4 nodes
 Input rate: 1,000 tx/s
 Transaction size: 512 B
 Execution time: 20 s

 Consensus timeout delay: 1,000 ms
 Consensus sync retry delay: 10,000 ms
 Mempool GC depth: 50 rounds
 Mempool sync retry delay: 5,000 ms
 Mempool sync retry nodes: 3 nodes
 Mempool batch size: 15,000 B
 Mempool max batch delay: 10 ms

 + RESULTS:
 Consensus TPS: 967 tx/s
 Consensus BPS: 495,294 B/s
 Consensus latency: 2 ms

 End-to-end TPS: 960 tx/s
 End-to-end BPS: 491,519 B/s
 End-to-end latency: 9 ms
-----------------------------------------

The 'Consensus TPS' and 'Consensus latency' respectively report the average throughput and latency of the consensus core. The consensus latency thus refers to the time elapsed between the block's creation and commit. In contrast, 'End-to-end TPS' and 'End-to-end latency' report the performance of the whole system, starting from when the client submits the transaction. The end-to-end latency is often called 'client-perceived latency'. To accurately measure this value without degrading performance, the client periodically submits 'sample' transactions that are tracked across all the modules until they get committed into a block; the benchmark scripts use sample transactions to estimate the end-to-end latency.
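
For intuition, the end-to-end latency estimate boils down to comparing, for each sample transaction, the time the client submitted it with the time its block was committed. The snippet below is a hypothetical illustration of that computation (made-up timestamps and identifiers); the real scripts extract these timestamps from the node and client logs:

# Hypothetical illustration: estimate client-perceived latency from
# sample transactions (timestamps in seconds).
submitted = {'sample-1': 10.002, 'sample-2': 12.490, 'sample-3': 15.113}
committed = {'sample-1': 10.011, 'sample-2': 12.499, 'sample-3': 15.122}

latencies = [committed[tx] - submitted[tx] for tx in submitted if tx in committed]
avg_latency_ms = 1_000 * sum(latencies) / len(latencies)
print(f'End-to-end latency: {avg_latency_ms:.0f} ms')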

The next section provides a step-by-step tutorial to run benchmarks on Amazon Web Services (AWS).
