Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

draft: reference implementation NEP-539 #10918

Closed
wants to merge 3 commits into from

Conversation

jakmeier
Copy link
Contributor

@jakmeier jakmeier commented Apr 2, 2024

Reference implementation of NEP-539.

Advice: It is best to review per commit.

The reference implementation is without tests and certain features might not be working properly. But it should show how all the relevant parts are handled.

While the most relevant parts will be patched here as well, most of the further development to make this ready for nearcore will happen on separate branches that build on this.

Copy link
Contributor

@wacban wacban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far but I only got half the way. Will continue next week.

random thoughts:

  • Iterating all the delayed / buffered receipts sounds like a slow operation and one that mightly contributes to state witness size. We may want to optimize it to maintaining dedicated counters for size and gas of respective queues or buffers and update it on the fly as we push or pop receipts.

Comment on lines 265 to 267
// TODO: If `Nep539CongestionControl` is enabled, we
// need to compute it by iterating actual receipts.
chunk_extra.stored_receipts_info().cloned().unwrap_or_default(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary function of this method is to take the chunk extra of the parent and split the info to the children shards. This is going to be done for all blocks in the epoch preceding switch to the new epoch. In theory we should do what you suggested in the TODO - recalculate it based on the chunk.

That being said those values are not going to be used except maybe for the last one in the epoch. I think we can consider just copying the info from the parent and accept that we'll have overestimated congestion for one block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overestimating would be fine, yes, but we need to bootstrap the correct value in order to have correct values at all. This is unless you want to iterate the entire queues each time, which is not what the current implementation does.

Comment on lines +743 to +750
// TODO: Is this the right epoch ID?
let shard_uid = self.epoch_manager.shard_id_to_uid(shard_id, block_header.epoch_id())?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use the epoch id of the previous block. This is very confusing though.

Comment on lines 753 to 780
let stored_receipts_info = match prev_chunk_extra.stored_receipts_info() {
Some(it) => it.clone(),
None => {
// No receipts information stored previously. If the cross-shard
// congestion feature is not enabled, simply return a dummy value,
// it won't be used anyway.
// If the feature is enabled but the info is not available, it must
// be the first chunk with the feature. In that case, compute it.

// TODO: check if we can really use flat storage here
let use_flat_storage = true;
let trie = self.runtime_adapter.get_trie_for_shard(
shard_id,
block_header.prev_hash(),
*prev_chunk_extra.state_root(),
use_flat_storage,
)?;
let prev_epoch_id = self.epoch_manager.get_epoch_id(block_header.prev_hash())?;
let config = self.runtime_adapter.get_protocol_config(&prev_epoch_id)?;
node_runtime::compute_stored_receipts_info(&trie, &config.runtime_config)?
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to avoid this complexity I would consider just returning default if it's none. We can afford to have congestion control kick in one block later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we need to bootstrap the correct value at least once anyway, so I think we cannot get rid of the complexity, only move it around.

Comment on lines 892 to 896
// TODO: figure out real info instead of `unwrap_or_default`
stored_receipts_info: chunk_extra
.stored_receipts_info()
.cloned()
.unwrap_or_default(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances can the info in chunk extra be none?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only when the protocol version changes. Perhaps also after resharding, depends on how that is implemented.

@@ -835,8 +889,17 @@ impl<'a> ChainUpdate<'a> {
gas_limit: chunk_extra.gas_limit(),
is_new_chunk: false,
is_first_block_with_chunk_of_version: false,
// TODO: figure out real info instead of `unwrap_or_default`
stored_receipts_info: chunk_extra
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunk_extra should really be called prev_chunk_extra
Outside of the scope of this PR, just venting.

}

impl StoredReceiptsInfo {
pub fn nep539_congestion_info(&self) -> CongestionInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we avoid referencing the nep number in the method names?

If you're foreseeing changes to the stored info perhaps we append struct names with V1 and put them inside an enum like we do for Block at al?

Comment on lines 71 to 73
const MAX_OUTGOING_GAS_PER_SHARD: Gas = 30 * 10u64.pow(15);
let gas = MAX_OUTGOING_GAS_PER_SHARD as u128 * (u16::MAX as u128 + 1)
/ (self.incoming_congestion as u128 + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting something more along the lines of

mix(max_send_limit, min_send_limit, incoming_congestion);

Comment on lines 85 to 87
let gas = MAX_TX_GAS as u128
- (MAX_TX_GAS - MIN_TX_GAS) as u128 * (u16::MAX as u128 + 1)
/ (self.incoming_congestion as u128 + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about mix helper method? :)

Comment on lines 92 to 72
pub fn shard_accepts_transactions(&self) -> bool {
// TODO: find better home for the const?
const MEMORY_CONGESTION_THRESHOLD: u16 = u16::MAX / 2;
self.memory_congestion <= MEMORY_CONGESTION_THRESHOLD
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not within the NEP but we can also do linear interpolation here.

fn u16_fraction(value: u64, max: u64) -> u16 {
let bounded_value = std::cmp::min(value, max);
(bounded_value as u128 * u16::MAX as u128 / max as u128) as u16
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self
continue review from here

This is a part of NEP-539.

This PR alone only adds the fields, without actually making use of them.
But it adds all the necessary data for the congestion control strategy
to be able to function. This makes this PR by far the largest.

Splitting the feature into multiple PRs is supposed to make reviewing it
easier.
This adds one outgoing buffer for each other shard
to the trie.

This PR is a part of NEP-539.
This PR builds on other parts of NEP-539.
It uses the congestion info and the outgoing receipts
buffers introduced in previous commits.
@jakmeier
Copy link
Contributor Author

jakmeier commented Jun 5, 2024

Closing this because the NEP was approved and the actual implementation was included in dozens of smaller PRs.

@jakmeier jakmeier closed this Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants