-
Notifications
You must be signed in to change notification settings - Fork 740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rpc v2: backpressure chainHead_v1_storage #5741
Conversation
substrate/client/rpc-spec-v2/src/chain_head/chain_head_storage.rs
Outdated
Show resolved
Hide resolved
substrate/client/rpc-spec-v2/src/chain_head/chain_head_storage.rs
Outdated
Show resolved
Hide resolved
The CI pipeline was cancelled due to failure one of the required jobs. |
Storage::query_iter
substrate/client/rpc-spec-v2/src/chain_head/subscription/inner.rs
Outdated
Show resolved
Hide resolved
substrate/client/rpc-spec-v2/src/chain_head/subscription/inner.rs
Outdated
Show resolved
Hide resolved
substrate/client/rpc-spec-v2/src/chain_head/subscription/inner.rs
Outdated
Show resolved
Hide resolved
impl OperationState { | ||
pub fn stop(&mut self) { | ||
if !self.stop.is_stopped() { | ||
self.operations.lock().remove(&self.operation_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
annoying to lock the mutex here instead of AtomicBool but this is needed to have an async notification when operation is stopped.
couldn't find a better way for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is acceptable since the operationStop
shouldn't be happening too often if at all. We are also acquiring the mutex on dropping RegisteredOperations
to clean up the tracing of operation IDs.
substrate/client/rpc-spec-v2/src/chain_head/chain_head_storage.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for tackling this 🙏
Tiny nits around the return value of the chainHead_continue
method and some open question about the number of reserved operation that should discard items
Co-authored-by: James Wilson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and a nice general improvement in the code!
Hey! Just circling back on this, do you think it is feasible to backport this PR into |
Close #5589 This PR makes it possible for `rpc_v2::Storage::query_iter_paginated` to be "backpressured" which is achieved by having a channel where the result is sent back and when this channel is "full" we pause the iteration. The chainHead_follow has an internal channel which doesn't represent the actual connection and that is set to a very small number (16). Recall that the JSON-RPC server has a dedicate buffer for each connection by default of 64. - Because `archive_storage` also depends on `rpc_v2::Storage::query_iter_paginated` I had to tweak the method to support limits as well. The reason is that archive_storage won't get backpressured properly because it's not an subscription. (it would much easier if it would be a subscription in rpc v2 spec because nothing against querying huge amount storage keys) - `query_iter_paginated` doesn't necessarily return the storage "in order" such as - `query_iter_paginated(vec![("key1", hash), ("key2", value)], ...)` could return them in arbitrary order because it's wrapped in FuturesUnordered but I could change that if we want to process it inorder (it's slower) - there is technically no limit on the number of storage queries in each `chainHead_v1_storage call` rather than the rpc max message limit which 10MB and only allowed to max 16 calls `chainHead_v1_x` concurrently (this should be fine) - Iterate over 10 accounts on westend-dev -> ~2-3x faster - Fetch 1024 storage values (i.e, not descedant values) -> ~50x faster - Fetch 1024 descendant values -> ~500x faster The reason for this is because as Josep explained in the issue is that one is only allowed query five storage items per call and clients has make lots of calls to drive it forward.. --------- Co-authored-by: command-bot <> Co-authored-by: James Wilson <[email protected]>
I had a look and stable2409 looks possible to backport and I have created a PR for it but stable2407 requires a bunch of others PRs that are not backported. Hopefully stable2409 is sufficient? |
I'd be great to backport all recent PRs around rpc server v2 to both supported versions. This will ensure that nodes get those fixes faster. I leave it to you to decide 👍🏻 |
Close #5589 This PR makes it possible for `rpc_v2::Storage::query_iter_paginated` to be "backpressured" which is achieved by having a channel where the result is sent back and when this channel is "full" we pause the iteration. The chainHead_follow has an internal channel which doesn't represent the actual connection and that is set to a very small number (16). Recall that the JSON-RPC server has a dedicate buffer for each connection by default of 64. - Because `archive_storage` also depends on `rpc_v2::Storage::query_iter_paginated` I had to tweak the method to support limits as well. The reason is that archive_storage won't get backpressured properly because it's not an subscription. (it would much easier if it would be a subscription in rpc v2 spec because nothing against querying huge amount storage keys) - `query_iter_paginated` doesn't necessarily return the storage "in order" such as - `query_iter_paginated(vec![("key1", hash), ("key2", value)], ...)` could return them in arbitrary order because it's wrapped in FuturesUnordered but I could change that if we want to process it inorder (it's slower) - there is technically no limit on the number of storage queries in each `chainHead_v1_storage call` rather than the rpc max message limit which 10MB and only allowed to max 16 calls `chainHead_v1_x` concurrently (this should be fine) - Iterate over 10 accounts on westend-dev -> ~2-3x faster - Fetch 1024 storage values (i.e, not descedant values) -> ~50x faster - Fetch 1024 descendant values -> ~500x faster The reason for this is because as Josep explained in the issue is that one is only allowed query five storage items per call and clients has make lots of calls to drive it forward.. --------- Co-authored-by: command-bot <> Co-authored-by: James Wilson <[email protected]>
Yeah, ok I had another look and it was possible. Opened #6114 as well puuh :) |
yeah, cool! |
Close #5589
This PR makes it possible for
rpc_v2::Storage::query_iter_paginated
to be "backpressured" which is achieved by having a channel where the result is sent back and when this channel is "full" we pause the iteration.The chainHead_follow has an internal channel which doesn't represent the actual connection and that is set to a very small number (16). Recall that the JSON-RPC server has a dedicate buffer for each connection by default of 64.
Benchmarks using subxt on localhost
The reason for this is because as Josep explained in the issue is that one is only allowed query five storage items per call and clients has make lots of calls to drive it forward..