Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MTG-1057 Add consistency check tools #343

Open
wants to merge 16 commits into
base: new-main
Choose a base branch
from
Open

MTG-1057 Add consistency check tools #343

wants to merge 16 commits into from

Conversation

n00m4d
Copy link
Contributor

@n00m4d n00m4d commented Dec 19, 2024

What

This PR adds new package with two binaries: compressed_assets and regular_assets. That two binaries are verifying if our RocksDB is consistent.

How

Compressed assets

This binary takes a csv file with cNFT tree addresses and verifying if each minted asset in a tree has valid proofs.

Regular assets

This binary takes solana accounts snapshot, iterates over it and checks if RocksDB contains all the NFTs and token accounts.

@n00m4d n00m4d changed the title MTG 1062 Add consistency check tools MTG 1057 Add consistency check tools Dec 19, 2024
@n00m4d n00m4d changed the title MTG 1057 Add consistency check tools MTG-1057 Add consistency check tools Dec 19, 2024
Copy link
Contributor

@StanChe StanChe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly spelling, several comments on args types and some improvements suggested. It'll work as it is, so either create additional tickets and merge or fix some comments and put non done into tickets

clap = { workspace = true, features = ["env"] }
tokio = { workspace = true, features = ["sync"] }
tokio-util = { workspace = true }
solana-sdk = "~1.18.13"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not the version from the workspace? Same question applies to other versions below


## Compressed assets

Binary `compressed_assets` is taking `csv` file with tree keys and check proof for each minted asset in a tree.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Binary `compressed_assets` is taking `csv` file with tree keys and check proof for each minted asset in a tree.
The binary `compressed_assets` takes a `csv` file with tree keys and checks the proof for each minted asset in a tree.


Binary `compressed_assets` is taking `csv` file with tree keys and check proof for each minted asset in a tree.

Here is example of `csv` file it expects to receive:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Here is example of `csv` file it expects to receive:
Here is an example of the `csv` file it expects to receive:

cargo r --bin compressed_assets -- --rpc-endpoint https://solana.rpc --db-path /path/to/rocksdb --trees-file-path ./trees.csv --workers 50 --inner-workers 300
```

Workers parameter points how many trees will be processed in parallel.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Workers parameter points how many trees will be processed in parallel.
The `workers` parameter points to how many trees will be processed in parallel.


Workers parameter points how many trees will be processed in parallel.

Inner workers parameter points how many threads each worker will use to process the tree.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Inner workers parameter points how many threads each worker will use to process the tree.
The `inner-workers` parameter points to how many threads each worker will use to process the tree.

rpc_endpoint: String,

#[arg(long)]
db_path: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
db_path: String,
db_path: PathBuf,

db_path: String,

#[arg(long)]
trees_file_path: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
trees_file_path: String,
trees_file_path: PathBuf,


if let Ok(proofs) = get_proof_for_assets::<MaybeProofChecker, Storage>(
rocks_cloned,
vec![asset_id],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method is fixed and will return proofs for a list of assets. I'd suggest we pass in at least a 1000 assets, but in theory a whole chunk will be possible (although memory intensive)


const CHANNEL_SIZE: usize = 100_000_000;

const MISSED_ASSET_FILE: &str = "./missed_asset_data.csv";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be an arg with PathBuf type and this value as default

tokio::task::spawn_blocking(move || {
let cf = &db.cf_handle(C::NAME).unwrap();
let encoded_key = C::encode_key(key);
Ok(db.key_may_exist_cf(cf, &encoded_key))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may produce false positives. We already have the method

  pub(crate) fn sync_has_key(db: Arc<DB>, key: C::KeyType) -> crate::Result<bool> {
        let cf = &db.cf_handle(C::NAME).unwrap();
        let encoded_key = C::encode_key(key);
        if !db.key_may_exist_cf(cf, &encoded_key) {
            return Ok(false);
        }
        let res = db.get_cf(cf, &encoded_key)?;
        Ok(res.is_some())
    }

I'd suggest wrapping it with async instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a mix of some custom code and some code copied from Solana. If anything changes on their end, it'll be tricky to support the changes. I presume we couldn't reuse the types from Solana, could we? If we couldn't I'd suggest putting all the copied code in a separate file/files with minimal to no changes and adding custom logic beside it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants