Add test checkpoint data builder #20749
I'm probably assuming a lot, but it looks like the purpose of `TestCheckpointDataBuilder` is to quickly generate transactions and checkpoints that are self-contained and individually correct, in order to benchmark indexer performance?
A noob question, but why not use Simulacrum? With #20729 approved, we could also generate test checkpoint data by writing test files for the transactional test runner, which would similarly let us generate arbitrary checkpoint data.
pub fn new(checkpoint: u64) -> Self {
    Self {
        checkpoint,
        epoch: 0,
Instead of defaulting to 0, I think we should set it to `None`, and have `TestCheckpointDataBuilder::build_checkpoint` error out if it was never explicitly set.
Otherwise, during testing, we might neglect to set the correct epoch, resulting in strange behavior in our indexer.
Perhaps a struct that wraps the checkpoint data builder could be solely responsible for advancing the epoch correctly.
Edit: returning to this, since state isn't maintained across transactions or across checkpoints, I see that this isn't meant to emulate a lockstep network the way Simulacrum does, but to generate txs and checkpoints in bulk.
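For reference, the `None`-based suggestion above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: `CheckpointSketch`, `BuiltCheckpoint`, `with_epoch`, and the `String` error type are all hypothetical names chosen for the example.

```rust
/// Hypothetical, simplified stand-in for the PR's builder.
/// Only the epoch handling is modeled here.
#[derive(Debug)]
pub struct CheckpointSketch {
    checkpoint: u64,
    /// `None` until the test sets it explicitly.
    epoch: Option<u64>,
}

/// Hypothetical output type; the real builder produces full checkpoint data.
#[derive(Debug, PartialEq, Eq)]
pub struct BuiltCheckpoint {
    pub checkpoint: u64,
    pub epoch: u64,
}

impl CheckpointSketch {
    pub fn new(checkpoint: u64) -> Self {
        Self {
            checkpoint,
            epoch: None,
        }
    }

    /// Explicitly set the epoch for this checkpoint.
    pub fn with_epoch(mut self, epoch: u64) -> Self {
        self.epoch = Some(epoch);
        self
    }

    /// Fails instead of silently falling back to epoch 0.
    pub fn build_checkpoint(self) -> Result<BuiltCheckpoint, String> {
        let epoch = self
            .epoch
            .ok_or_else(|| format!("epoch not set for checkpoint {}", self.checkpoint))?;
        Ok(BuiltCheckpoint {
            checkpoint: self.checkpoint,
            epoch,
        })
    }
}
```

The trade-off is that every test has to set the epoch, which is extra ceremony for a builder whose goal is to produce checkpoints in bulk.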
/// Mutate an existing object in the transaction.
/// `object_idx` is a convenient representation of the object's ID.
pub fn mutate_object(mut self, object_idx: u64) -> Self {
    let tx_builder = self.next_transaction.as_mut().unwrap();
    let object_id = derive_object_id(object_idx);
    let object = self
        .object_map
        .get(&object_id)
        .cloned()
        .expect("Mutating an object that doesn't exist");
    tx_builder.mutated_objects.push(object);
    self
}
It looks like this doesn't make any changes to the object; it just registers it in the tx builder's `mutated_objects`. So I guess this is useful when testing object mutation where we don't care what exactly changed, or when mocking complex object mutation behavior.
That is correct!
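To illustrate the "register only" behavior confirmed above, here is a small self-contained sketch. It assumes a plain `u64` ID and a `version` field in place of the real object type, and `BuilderSketch`, `TxSketch`, `ObjectSketch`, `with_objects`, `start_transaction`, and `pending_mutations` are hypothetical names, not the PR's API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an on-chain object.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ObjectSketch {
    pub id: u64,
    pub version: u64,
}

/// Hypothetical per-transaction scratch space.
#[derive(Default)]
pub struct TxSketch {
    pub mutated_objects: Vec<ObjectSketch>,
}

pub struct BuilderSketch {
    object_map: HashMap<u64, ObjectSketch>,
    next_transaction: Option<TxSketch>,
}

impl BuilderSketch {
    /// Seed the builder with existing objects, all at version 1.
    pub fn with_objects(ids: &[u64]) -> Self {
        let object_map = ids
            .iter()
            .map(|&id| (id, ObjectSketch { id, version: 1 }))
            .collect();
        Self {
            object_map,
            next_transaction: None,
        }
    }

    pub fn start_transaction(mut self) -> Self {
        self.next_transaction = Some(TxSketch::default());
        self
    }

    /// Records the object as mutated; its contents are copied unchanged.
    pub fn mutate_object(mut self, object_idx: u64) -> Self {
        let object = self
            .object_map
            .get(&object_idx)
            .cloned()
            .expect("Mutating an object that doesn't exist");
        let tx = self
            .next_transaction
            .as_mut()
            .expect("No transaction in progress");
        tx.mutated_objects.push(object);
        self
    }

    /// Objects registered as mutated in the in-progress transaction.
    pub fn pending_mutations(&self) -> &[ObjectSketch] {
        self.next_transaction
            .as_ref()
            .map(|tx| tx.mutated_objects.as_slice())
            .unwrap_or(&[])
    }
}
```

In this sketch the registered copy is identical to the stored object, so any change a test wants to observe has to be applied when the transaction is finalized.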
/// Complete the current transaction and add it to the checkpoint.
pub fn finish_transaction(mut self) -> Self {
    let TransactionBuilder {
        sender_idx,
        gas,
        created_objects,
        mutated_objects,
        deleted_objects,
        events,
    } = self.next_transaction.take().unwrap();
So within the checkpoint builder, state isn't preserved from one transaction to the next: we consume the contents of the `TransactionBuilder` and finalize it into a standalone transaction.
Just updated the PR to support generating multiple checkpoints.
This will be used to write unit tests for various handler pipelines in indexer-alt, not for benchmarking.
#20761 is a good demonstration of how much easier it becomes to test the handlers.
Gotcha, this looks good to me. This would indeed be nice to have for indexer unit tests, where we care less about network-level correctness and more about whether we're indexing into the data or feature tables correctly.
/// Wrap an existing object in the transaction.
/// `object_idx` is a convenient representation of the object's ID.
pub fn wrap_object(mut self, object_idx: u64) -> Self {
    let tx_builder = self.checkpoint_builder.next_transaction.as_mut().unwrap();
    let object_id = Self::derive_object_id(object_idx);
    assert!(self.live_objects.contains_key(&object_id));
    tx_builder.wrapped_objects.insert(object_id);
    self
}
In this case, should we remove it from the `live_objects` set, and also update the `mutated_objects` set?
Returning to this: I agree that it makes more sense to keep each object function restricted to its corresponding set, i.e., `wrap_object` and `unwrap_object` only touch the `wrapped_objects` and `unwrapped_objects` fields, and so on, and then consolidate everything in `finish_transaction`.
self.live_objects
    .extend(output_objects.iter().map(|o| (o.id(), o.clone())));
Ah, I see: when we unwrap, it's added to `output_objects`, and then we extend `live_objects` here.
And this is because we derive the `input_objects` from `live_objects`, so if we do this earlier we'll erroneously add unwrapped objects.
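The ordering point above can be sketched with plain IDs. This is a simplified illustration under stated assumptions: objects are reduced to `u64` IDs, inputs are taken to be every touched object that is currently live, and `StateSketch` and `TxEffectsSketch` are hypothetical names rather than the PR's types.

```rust
use std::collections::BTreeSet;

/// Hypothetical, ID-only view of a finished transaction.
#[derive(Debug)]
pub struct TxEffectsSketch {
    pub input_objects: Vec<u64>,
    pub output_objects: Vec<u64>,
}

/// Hypothetical builder state: the set of currently live object IDs.
pub struct StateSketch {
    live_objects: BTreeSet<u64>,
}

impl StateSketch {
    pub fn new(live: &[u64]) -> Self {
        Self {
            live_objects: live.iter().copied().collect(),
        }
    }

    pub fn is_live(&self, id: u64) -> bool {
        self.live_objects.contains(&id)
    }

    /// Consolidates the per-transaction sets in one place.
    pub fn finish_transaction(
        &mut self,
        mutated: &[u64],
        wrapped: &[u64],
        unwrapped: &[u64],
    ) -> TxEffectsSketch {
        // 1. Derive inputs from the live set BEFORE updating it, so objects
        //    unwrapped by this transaction are not counted as inputs.
        let input_objects: Vec<u64> = mutated
            .iter()
            .chain(wrapped)
            .chain(unwrapped)
            .copied()
            .filter(|id| self.live_objects.contains(id))
            .collect();

        // 2. Outputs are the objects that exist after the transaction.
        let output_objects: Vec<u64> = mutated.iter().chain(unwrapped).copied().collect();

        // 3. Only now update the live set: wrapped objects leave it,
        //    output objects (including unwrapped ones) enter it.
        for id in wrapped {
            self.live_objects.remove(id);
        }
        self.live_objects.extend(output_objects.iter().copied());

        TxEffectsSketch {
            input_objects,
            output_objects,
        }
    }
}
```

If step 3 ran before step 1 in this sketch, the unwrapped ID would already be in `live_objects` and would show up in `input_objects`.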
@wlmyng Thank you for the review!
What's the difference or benefit of this compared to Simulacrum, which already provides a way to build chain state?
The biggest thing for me over Simulacrum is that while it simplifies the checkpoint and epoch creation process, there's still a lot of overhead to do even a simple transaction, and it gets unwieldy when trying to emulate complex transaction effects. But perhaps we can consolidate? A Simulacrum that can also mock out transaction execution... or is that a thing already?
It's really about how easy it is to write tests that need some form of checkpoint data.
Ah, so if I understand correctly, this doesn't do any execution but lets you craft a txn/effects however you want, to more easily create test scenarios that would otherwise take a lot of setup?
Correct!
Description
This PR adds a builder that can generate arbitrary checkpoint data.
It will make it easy to test the indexer.
Test plan
Added unit tests.