Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add benchmarks for varied forms of lookups #2142

Merged
merged 15 commits into from
Sep 3, 2024
Merged

Conversation

rymnc
Copy link
Member

@rymnc rymnc commented Aug 28, 2024

This PR introduces benchmarks for varied methods of database lookups for blocks and included transactions, namely -

  1. header_and_tx_lookup
  2. full_block_lookup
  3. multi_get_lookup

We can use these benchmarks to ascertain which form is best for this use-case, and then implement it.

It is suggested to alter the bench matrix before running and using a beefy machine to do so.

Run it with cargo bench --bench db_lookup_times

Linked Issues/PRs

Description

  • moved the implementation of KeyValueMutate for rocksdb out of the test module.
  • each benchmark is located in benches/benches/db_lookup_times.rs
  • stores FullFuelBlocks along with the CompressedBlocks and Transactions table to not affect single header/tx lookup times in other usages.
  • implemented various traits for the FullFuelBlocks table
  • the notion doc has more information about the analysis we did.
  • each benchmark randomizes the block height that is queries per iteration to ensure randomness and fair comparison
  • rocksdb caching has been disabled for these benchmarks

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog
  • New behavior is reflected in tests
  • The specification matches the implemented behavior (link update PR if changes are needed)

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

After merging, notify other teams

[Add or remove entries as needed]

let height = get_random_block_height(block_count);
group.bench_function(format!("{block_count}/{tx_count}"), |b| {
b.iter(|| {
let view = database.latest_view().unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to create view before the benchmark=)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its draft, im still cleaning up !

Copy link
Member Author

@rymnc rymnc Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed! (sorry, I don't have the commit hash, it was included in a rather large commit)


for (block_count, tx_count) in matrix() {
let database = open_db(block_count, tx_count, method);
let height = get_random_block_height(block_count);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we choose a random height on each iteration? We could create rng before the iterations and just pass it between iterations

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in f716ba2

@rymnc rymnc force-pushed the test/multi-get-perf branch 5 times, most recently from 38772a1 to 42fea47 Compare August 29, 2024 08:11
@rymnc rymnc marked this pull request as ready for review August 29, 2024 09:50
@rymnc rymnc requested review from a team and xgreenx August 29, 2024 09:50
@rymnc rymnc self-assigned this Aug 29, 2024
Copy link
Contributor

@netrome netrome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is not the prettiest as you mentioned in our call yesterday, but it seems to be doing what it's supposed to do.

I'd like to see the remaining todo comments removed or clarified with follow-up issues before approving.

benches/benches/db_lookup_times_utils/matrix.rs Outdated Show resolved Hide resolved
benches/benches/db_lookup_times_utils/matrix.rs Outdated Show resolved Hide resolved
benches/benches/db_lookup_times_utils/utils.rs Outdated Show resolved Hide resolved
benches/benches/db_lookup_times.rs Outdated Show resolved Hide resolved
@rymnc rymnc requested a review from netrome August 29, 2024 15:29
Copy link
Contributor

@netrome netrome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one more todo comment to handle.

@rymnc rymnc requested a review from netrome August 29, 2024 17:31
netrome
netrome previously approved these changes Aug 29, 2024
netrome
netrome previously approved these changes Aug 30, 2024

/// See [`FullFuelBlocks`](crate::tables::FullFuelBlocks)
/// we don't use this table at the moment, only used for benchmarks
FullFuelBlocks = 22,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]: If we're gonna keep it, maybe put it above GenesisMetadata.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenesisMetadata is 21, so logically this should come after, right? or is it the general preference to keep everything before the GenesisMetadata table?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. If you look at previous additions we've always kept GenesisMetadata on the bottom.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, will make that change

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in c2d5cdc

@@ -863,6 +867,33 @@ fn next_prefix(mut prefix: Vec<u8>) -> Option<Vec<u8>> {
None
}

impl<Description> KeyValueMutate for RocksDb<Description>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should be behind a feature flag, since it wasn't exposed publicly before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be behind the rocksdb feature flag, and it doesn't affect any other api's :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be behind the "test-helpers" feature. We shouldn't allow modification of the Rocksdb outside of the commit method

Copy link
Member Author

@rymnc rymnc Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in #2159, precisely in 24644c2

benches/benches/db_lookup_times_utils/seed.rs Outdated Show resolved Hide resolved
Comment on lines 147 to 148
// we seed compressed blocks and transactions to not affect individual
// lookup times
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't make sense to me inside the context of seed_full_blocks. I'm also not sure what it means to "affect individual lookup times".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the context is that if we do include FullFuelBlocks in the codebase at some point and use it, we want to use it in tandem with the FuelBlocks and Transactions tables so as to not affect their individual lookup times, i.e cases where we make individual block header calls (to get the block height, for example), and where we make individual tx lookups.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand now. I think this comment should be above the insert_full_block in that case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 7cca278

Comment on lines 118 to 121
// we seed compressed blocks and transactions to not affect individual
// lookup times
fn insert_full_block(database: &mut RocksDb<OnChain>, height: u32, tx_count: u32) {
let block = insert_compressed_block(database, height, tx_count);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I know I said here, but I think this is best.

Suggested change
// we seed compressed blocks and transactions to not affect individual
// lookup times
fn insert_full_block(database: &mut RocksDb<OnChain>, height: u32, tx_count: u32) {
let block = insert_compressed_block(database, height, tx_count);
fn insert_full_block(database: &mut RocksDb<OnChain>, height: u32, tx_count: u32) {
// we seed compressed blocks and transactions to not affect individual
// lookup times
let block = insert_compressed_block(database, height, tx_count);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 20184a4

@rymnc rymnc requested a review from netrome September 3, 2024 13:50
@rymnc rymnc enabled auto-merge (squash) September 3, 2024 14:06
@rymnc rymnc merged commit 09cf975 into master Sep 3, 2024
32 of 33 checks passed
@rymnc rymnc deleted the test/multi-get-perf branch September 3, 2024 14:13
@@ -863,6 +867,33 @@ fn next_prefix(mut prefix: Vec<u8>) -> Option<Vec<u8>> {
None
}

impl<Description> KeyValueMutate for RocksDb<Description>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be behind the "test-helpers" feature. We shouldn't allow modification of the Rocksdb outside of the commit method

@@ -67,6 +67,9 @@ pub enum Column {
UploadedBytecodes = 19,
/// See [`Blobs`](fuel_vm_private::storage::BlobData)
Blobs = 20,
/// See [`FullFuelBlocks`](crate::tables::FullFuelBlocks)
/// we don't use this table at the moment, only used for benchmarks
FullFuelBlocks = 22,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarks shouldn't affect the columns that are used in the real logic. RocksDB allows opening it with a custom set of columns.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

continuing work in #2159

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -47,6 +60,21 @@ impl TableWithBlueprint for FuelBlocks {
}
}

impl TableWithBlueprint for FullFuelBlocks {
type Blueprint = Merklized<
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to create a Merkle tree over the blocks used only by P2P

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in #2159

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let mut group = c.benchmark_group(method);

for (block_count, tx_count) in matrix() {
let database = open_db(block_count, tx_count, method);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to clean up the database after benchmarks. If you want to keep the results there, then we can add additional ENV variable that you can set to disable clean up logic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in #2159

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fn generate_bench_transactions(tx_count: u32) -> Vec<Transaction> {
let mut txes = vec![];
for _ in 0..tx_count {
txes.push(Transaction::default_test_tx());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] It would be nice to make transactions random to be sure that compression will not affect it too much

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

under the hood the transactions are randomized ~ not sure if I understand correctly!
image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add_random_fee_input always return the same result=D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, okay ~ will alter

rymnc added a commit that referenced this pull request Sep 9, 2024
>[!NOTE]
> This PR follows up #2142 with comments after merge

## Linked Issues/PRs
<!-- List of related issues/PRs -->
- #2142 
- #2023 

## Description
<!-- List of detailed changes -->
- refactors the test database to use a custom set of columns that will
not create any new tables/columns in `fuel-core`
- refactors the functions to return `Result`s instead of unwrapping
everywhere
- automated cleaning up of benchmark databases
- `KeyValueMutate` impl of `RocksDb` is now behind the `test-helpers`
feature flag
- moved the `db_lookup_times_utils` to the `src` directory in the
benches crate so we can test our insertion/fetching logic

## Checklist
- [x] Breaking changes are clearly marked as such in the PR description
and changelog
- [x] New behavior is reflected in tests
- [x] [The specification](https://github.com/FuelLabs/fuel-specs/)
matches the implemented behavior (link update PR if changes are needed)

### Before requesting review
- [x] I have reviewed the code myself
- [ ] I have created follow-up issues caused by this PR and linked them
here

### After merging, notify other teams

[Add or remove entries as needed]

- [ ] [Rust SDK](https://github.com/FuelLabs/fuels-rs/)
- [ ] [Sway compiler](https://github.com/FuelLabs/sway/)
- [ ] [Platform
documentation](https://github.com/FuelLabs/devrel-requests/issues/new?assignees=&labels=new+request&projects=&template=NEW-REQUEST.yml&title=%5BRequest%5D%3A+)
(for out-of-organization contributors, the person merging the PR will do
this)
- [ ] Someone else?

---------

Co-authored-by: Green Baneling <[email protected]>
@xgreenx xgreenx mentioned this pull request Sep 17, 2024
xgreenx added a commit that referenced this pull request Sep 18, 2024
## Version v0.36.0

### Added
- [2135](#2135): Added metrics
logging for number of blocks served over the p2p req/res protocol.
- [2151](#2151): Added
limitations on gas used during dry_run in API.
- [2188](#2188): Added the new
variant `V2` for the `ConsensusParameters` which contains the new
`block_transaction_size_limit` parameter.
- [2163](#2163): Added
runnable task for fetching block committer data.
- [2204](#2204): Added
`dnsaddr` resolution for TLD without suffixes.

### Changed

#### Breaking
- [2199](#2199): Applying
several breaking changes to the WASM interface from backlog:
- Get the module to execute WASM byte code from the storage first, an
fallback to the built-in version in the case of the
`FUEL_ALWAYS_USE_WASM`.
- Added `host_v1` with a new `peek_next_txs_size` method, that accepts
`tx_number_limit` and `size_limit`.
- Added new variant of the return type to pass the validation result. It
removes block serialization and deserialization and should improve
performance.
- Added a V1 execution result type that uses `JSONError` instead of
postcard serialized error. It adds flexibility of how variants of the
error can be managed. More information about it in
FuelLabs/fuel-vm#797. The change also moves
`TooManyOutputs` error to the top. It shows that `JSONError` works as
expected.
- [2145](#2145): feat:
Introduce time port in PoA service.
- [2155](#2155): Added trait
declaration for block committer data
- [2142](#2142): Added
benchmarks for varied forms of db lookups to assist in optimizations.
- [2158](#2158): Log the
public address of the signing key, if it is specified
- [2188](#2188): Upgraded the
`fuel-vm` to `0.57.0`. More information in the
[release](https://github.com/FuelLabs/fuel-vm/releases/tag/v0.57.0).

## What's Changed
* chore(p2p_service): add metrics for number of blocks requested over
p2p req/res protocol by @rymnc in
#2135
* Weekly `cargo update` by @github-actions in
#2149
* Debug V1 algorightm and use more realistic values in gas price
analysis by @MitchTurner in
#2129
* feat(gas_price_service): include trait declaration for block committer
data by @rymnc in #2155
* Convert gas price analysis tool to CLI by @MitchTurner in
#2156
* chore: add benchmarks for varied forms of lookups by @rymnc in
#2142
* Add label nochangelog on weekly cargo update by @AurelienFT in
#2152
* Log consensus-key signer address if specified by @acerone85 in
#2158
* chore(rocks_db): move ShallowTempDir to benches crate by @rymnc in
#2168
* chore(benches): conditional dropping of databases in benchmarks by
@rymnc in #2170
* feat: Introduce time port in PoA service by @netrome in
#2145
* Get DA costs from predefined data by @MitchTurner in
#2157
* chore(shallow_temp_dir): panic if not panicking by @rymnc in
#2172
* chore: Add initial CODEOWNERS file by @netrome in
#2179
* Weekly `cargo update` by @github-actions in
#2177
* fix(db_lookup_times): rework core logic of benchmark by @rymnc in
#2159
* Add verification on transaction dry_run that they don't spend more
than block gas limit by @AurelienFT in
#2151
* bug: fix algorithm overflow issues by @MitchTurner in
#2173
* feat(gas_price_service): create runnable task for expensive background
polling for da metadata by @rymnc in
#2163
* Weekly `cargo update` by @github-actions in
#2197
* Fix bug with gas price factor in V1 algorithm by @MitchTurner in
#2201
* Applying several breaking changes to the WASM interface from backlog
by @xgreenx in #2199
* chore(p2p): dnsaddr recursive resolution by @rymnc in
#2204

## New Contributors
* @acerone85 made their first contribution in
#2158

**Full Changelog**:
v0.35.0...v0.36.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants