-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DPLT-1020 Fetch indexing metadata file to determine last_indexed_block. #96
Conversation
gabehamilton
commented
Jun 5, 2023
•
edited
Loading
edited
- Retrieves last_indexed_block from indexing metadata file to delineate where direct filtering from S3 Lake blocks should start.
- Use archival RPC nodes to determine date of start_block_height so historical processing can go further back than five epochs.
Use archival RPC nodes to determine data of start_block_height
"Last indexed block from latest_block.json: {:?}", | ||
last_indexed_block | ||
); | ||
let last_indexed_block: u64 = from_str(last_indexed_block.as_str().unwrap()).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if last_indexed_block
happens to null because there is an issue in S3 or any network issues, it will cause this line to panic without any helpful error messages.
Can we change the function signature to return a Result<BlockHeight, Error> and either use ?
to propagate the error message to the caller of this method or use an `.except("Unable to convert to a u64") on this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All set now. More error handling for the S3 files coming in https://pagodaplatform.atlassian.net/browse/DPLT-1012
@@ -116,21 +135,53 @@ async fn process_historical_messages( | |||
block_difference | |||
} | |||
|
|||
pub(crate) async fn last_indexed_block_from_metadata( | |||
aws_config: &SdkConfig, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're instantiating a new S3 client in a few different places within this file, probably easier just to create one which we can share across all these functions :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had started with that but ran into errors where I needed to clone it. Maybe I can narrow it down though now that more functions are using it. Also possible that those errors were related to my attempt to use the s3_fetcher module from near-lake-framework; another reason to review.
let metadata: serde_json::Value = serde_json::from_str(&metadata).unwrap(); | ||
let last_indexed_block = metadata["last_indexed_block"].clone(); | ||
let last_indexed_block = last_indexed_block.as_str(); | ||
if last_indexed_block.is_none() { | ||
return Err(anyhow::anyhow!( | ||
"No last_indexed_block found in latest_block.json" | ||
)); | ||
} | ||
let last_indexed_block = last_indexed_block.unwrap(); | ||
let last_indexed_block = from_str(last_indexed_block); | ||
if last_indexed_block.is_err() { | ||
return Err(anyhow::anyhow!( | ||
"last_indexed_block couldn't be converted to u64" | ||
)); | ||
} | ||
let last_indexed_block = last_indexed_block.unwrap(); | ||
tracing::info!( | ||
target: crate::INDEXER, | ||
"Last indexed block from latest_block.json: {:?}", | ||
last_indexed_block | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably remove this entire block of code by parsing metadata
as a strong type in serde
. That would do all the type & existence checking for us meaning we wouldn't have to do it here.
Having concrete types is also nice from a documentation standpoint, as it's more clear what the JSON actually contains.
if last_indexed_block.is_err() { | ||
tracing::error!( | ||
target: crate::INDEXER, | ||
last_indexed_block = ?last_indexed_block, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I didn't know you could do this
Merging to test longer historical processing on dev. Will address PR comments in subsequent PR. |