
Use Message Queue pallet for UMP dispatch #6271

Merged
merged 216 commits into master from gav-message-queue
May 19, 2023

Conversation

gavofyork
Member

@gavofyork gavofyork commented Nov 11, 2022

Depends on paritytech/substrate#12485, #4097, #7105
Fixes #6129
Partially addresses paritytech/polkadot-sdk#489
Partially addresses paritytech/polkadot-sdk#488

This utilises the new pallet-message-queue for UMP message dispatch. With the unified dispatch system, there is no need for "UMP" to have a pallet of its own, so it is removed in favour of a single ump_tests.rs file that ensures the message-queue functionality works as expected for UMP. The UMP::ExecutedUpward event was replaced by MessageQueue::Processed.
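For orientation, the relay-chain wiring of the new pallet ends up looking roughly like the sketch below. This is illustrative only: the constant values, the `MessageProcessor`/`ParaInclusion` aliases and the weights path are placeholders, not the exact runtime code.

```rust
// Sketch only: placeholder values and type aliases, not the exact runtime wiring.
parameter_types! {
	/// Maximum size of a message-queue heap page (placeholder value).
	pub const MessageQueueHeapSize: u32 = 65_536;
	/// How many pages may become stale before reaping is expected (placeholder).
	pub const MessageQueueMaxStale: u32 = 16;
	/// Per-block weight budget for servicing the queues (placeholder).
	pub MessageQueueServiceWeight: Option<Weight> =
		Some(Weight::from_parts(100_000_000, 65_536));
}

impl pallet_message_queue::Config for Runtime {
	type RuntimeEvent = RuntimeEvent;
	type Size = u32;
	type HeapSize = MessageQueueHeapSize;
	type MaxStale = MessageQueueMaxStale;
	type ServiceWeight = MessageQueueServiceWeight;
	// Hands UMP datagrams to the XCM executor (illustrative alias).
	type MessageProcessor = MessageProcessor;
	// Lets the inclusion pallet keep the per-para well-known keys up to date.
	type QueueChangeHandler = ParaInclusion;
	// Generated weights for the pallet (path is an assumption).
	type WeightInfo = weights::pallet_message_queue::WeightInfo<Runtime>;
}
```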

This also means that the two weight fields of the HostConfiguration struct are no longer needed, so there is now a v5 of said struct with them gone. A migration to v5 is provided and should be used when this is merged.

Small auxiliary changes:

  • Respect MAX_UPWARD_MESSAGE_SIZE_BOUND in the configuration pallet.
  • Add a WellKnownKey struct for typed access to well-known keys (roughly sketched below).
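Conceptually, WellKnownKey is a thin typed wrapper around a raw well-known storage key. The following is a sketch of the idea only; the struct added in this PR may differ in naming and detail:

```rust
use core::marker::PhantomData;
use parity_scale_codec::{Decode, Encode};

/// Typed access to a raw well-known storage key (sketch; not the exact API).
pub struct WellKnownKey<T> {
	/// The raw storage key this value lives under.
	pub key: Vec<u8>,
	_phantom: PhantomData<T>,
}

impl<T> From<Vec<u8>> for WellKnownKey<T> {
	fn from(key: Vec<u8>) -> Self {
		Self { key, _phantom: PhantomData }
	}
}

impl<T: Decode> WellKnownKey<T> {
	/// Read and decode the value, if it exists and decodes cleanly.
	pub fn get(&self) -> Option<T> {
		sp_io::storage::get(&self.key).and_then(|raw| T::decode(&mut &raw[..]).ok())
	}
}

impl<T: Encode> WellKnownKey<T> {
	/// Encode and write the value.
	pub fn set(&self, value: &T) {
		sp_io::storage::set(&self.key, &value.encode());
	}
}
```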

Migrations

  • Configuration pallet v5 -> v6
  • Enqueueing a HostConfiguration update

Also re-writes the V4 -> V5 migration of the config pallet to be re-usable.
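For readers unfamiliar with the pattern, such a storage-version-gated migration has roughly the following shape. This is a sketch only: the module paths, weight accounting and the actual translation of the HostConfiguration fields are illustrative, not the code in this PR.

```rust
use frame_support::{
	traits::{GetStorageVersion, OnRuntimeUpgrade, StorageVersion},
	weights::Weight,
};
// Path is an assumption: the configuration pallet of the parachains runtime crate.
use crate::configuration::{Config, Pallet};

/// Sketch of the v5 -> v6 configuration-pallet migration shape.
pub struct MigrateToV6<T>(core::marker::PhantomData<T>);

impl<T: Config> OnRuntimeUpgrade for MigrateToV6<T> {
	fn on_runtime_upgrade() -> Weight {
		if Pallet::<T>::on_chain_storage_version() == 5 {
			// Translate the stored HostConfiguration(s), dropping the two
			// now-unused UMP dispatch-weight fields, then bump the version.
			// ...translation elided in this sketch...
			StorageVersion::new(6).put::<Pallet<T>>();
			T::DbWeight::get().reads_writes(1, 2)
		} else {
			// Not at v5: nothing to do.
			T::DbWeight::get().reads(1)
		}
	}
}
```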

HostConfiguration

An update for the HostConfiguration is queued to update the UMP constants. This has to happen in lockstep with upgrading the UMP queue and is therefore not done via a normal referendum. Considerations about the affected values are in gdrive or here as PDF:

| Constant (per para) | Polkadot | Kusama | Westend | Rococo |
|---|---|---|---|---|
| max_upward_queue_size [bytes] | 1048576 | 4194304 | 8388608 | 8388608 |
| max_upward_message_num_per_candidate | 16 | 128 | 512 | 1024 |
| max_upward_message_size [bytes] | 65531 | 65531 | 131067 | 32763 |
| max_upward_queue_count (1-byte msgs) | 174762 | 699050 | 1398101 | 1398101 |
| MessageQueueMaxStale | 8 | 16 | 48 | 96 |
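As a sanity check on the table: the values appear internally consistent if one assumes roughly 5 bytes of per-message overhead in the queue, i.e. max_upward_message_size is a power of two minus 5, and max_upward_queue_count is max_upward_queue_size / 6 for 1-byte messages. This is an inference from the numbers, not something stated in the PR:

```rust
/// Rough consistency check of the Polkadot column above, assuming ~5 bytes of
/// per-message overhead in the queue (an inference, not taken from the PR).
fn main() {
	const OVERHEAD: u32 = 5;

	let max_upward_queue_size: u32 = 1_048_576; // 1 MiB
	let max_upward_message_size: u32 = 65_531; // 64 KiB - 5
	let max_upward_queue_count: u32 = 174_762;

	// 65_536 - 5 == 65_531
	assert_eq!(65_536 - OVERHEAD, max_upward_message_size);
	// 1_048_576 / (1 + 5) == 174_762 (integer division)
	assert_eq!(max_upward_queue_size / (1 + OVERHEAD), max_upward_queue_count);

	println!("Polkadot UMP constants look internally consistent");
}
```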

Note For Para* Teams

The UMP message queue has a lazy message-deletion mechanism to unclog it in case it gets filled with overweight messages.
This reaping mechanism can be triggered by any account on the relay chain by calling MessageQueue::reap_page. In practice this requires an offchain worker on the relay chain, either one covering all parachains at once or one per parachain. This observation mainly comes from a security perspective, since a completely clogged UMP queue is a possible DoS vector. More convenience is planned in paritytech/substrate#13877.
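For concreteness, a reaping call could look roughly like the sketch below, e.g. from a test. The module paths, the ParaId type and the ALICE account are assumptions for illustration and are expected to be in scope in such a test environment.

```rust
// Sketch: reaping a stale UMP page for one parachain from a signed account.
// Paths and names below are assumptions, not verbatim from this PR.
use runtime_parachains::inclusion::{AggregateMessageOrigin, UmpQueueId};

fn reap_stale_ump_page(para: ParaId, page_index: u32) {
	// Permissionless: any signed origin may reap, and only pages the pallet
	// already considers stale/overweight are actually removed.
	MessageQueue::reap_page(
		RuntimeOrigin::signed(ALICE),
		AggregateMessageOrigin::Ump(UmpQueueId::Para(para)),
		page_index,
	)
	.expect("page is stale and can be reaped");
}
```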

TODO

  • Introduce MessageQueue pallet and configure properly in Relay-chain runtimes.
  • Enable HostConfiguration migration in Relay-chain runtimes.
  • Manage avoidance of message deletion:
    • Prior to allowing a para to be retired, it should have an empty dispatch queue.
    • After a para is scheduled to be retired, it should not be allowed to receive any UMP messages.
    • Tests for both conditions
  • Permanently remove queue from the MQ pallet after para offboarding.
  • Fix XCM simulator
  • Benchmark and weight for receive_upward_messages.
  • OnFinalize hook to update the well-known-keys queue-capacity values. (I.e. after they've been executed this block.)
  • Update implementers guide

Follow Ups

  • Remove MAX_UPWARD_MESSAGE_SIZE_BOUND and use the bound from the MQ pallet.

@gavofyork gavofyork added the A3-in_progress Pull request is in progress. No review needed at this stage. label Nov 11, 2022
@gavofyork gavofyork added B7-runtimenoteworthy C3-medium PR touches the given topic and has a medium impact on builders. D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited. E0-runtime_migration PR introduces code that might require downstream chains to run a runtime upgrade. labels Nov 11, 2022
Member

@eskimor eskimor left a comment


Nice!

When wrapping my head around XCMP/HRMP and in particular UMP, one of the first questions that comes to mind for UMP is: why do we need queuing at all? I can think of a couple of reasons:

  1. Execution is expected to be significantly more costly than storing.
  2. We don't want to be forced to drop messages. Given 1, queuing buys us
    time to react, e.g. by ramping up the price so we can apply economic back
    pressure. If we executed immediately and were above the weight limit, we
    would have no option other than dropping messages. In the case of UMP, parachains can see that the queue is filling up and stop sending.
  3. We already have a one-element (per para) queue via
    PendingAvailabilityCommitments. We could in theory back-pressure right there
    and not free a core until we have processed its messages. The problem, apart
    from potentially worse performance (block times degrade on bursts), is fairness:
    the load might actually be coming from other parachains, and we would
    potentially back-pressure "light" ones just because the others have already
    filled up the block. Other issues?

Other observations:

  • Since the message gets into state on backing but is only queued
    on inclusion: the block that causes the enqueuing of the message does not actually
    contain it. So far just an observation, triggered by me wondering whether we could still be forced to drop messages somewhere because we exceed the weight limit, e.g. due to the required state access.

  • Apart from the bullet points already mentioned, what seems to be missing is the actual processing of messages. There is ProcessXcmMessage::process_message, but it does not seem to be called anywhere, and the corresponding test is also blank.

runtime/parachains/src/inclusion/mod.rs (review thread, outdated and resolved)
@ordian ordian mentioned this pull request Nov 25, 2022
@gavofyork
Member Author

> When wrapping my head around XCMP/HRMP and in particular UMP, one of the first questions that comes to mind for UMP is: why do we need queuing at all?

It is crucial to understand that incoming messages might have arbitrary effects within a runtime. There may be constraints related to the conditions under which these effects may be realised. The most obvious example of this is that a certain amount of weight might be required. However, more exotic chains may need to impose further constraints relevant to their own abstractions, and these we cannot possibly imagine right now.

Therefore, necessarily, from the point of view of the low level protocols XCMP, UMP and DMP (*MP), we must restrict our concern to ferrying datagrams and nothing more. The message queuing and execution subsystem(s) are left with the concern of ensuring that the proper effects take place ASAP given the datagrams received and taking into account quality of service and ephemeral constraints and resource limitations.

This message queue pallet provides this functionality, accepting datagrams infallibly as fast as they can be delivered by the underlying message passing system(s) and promising that (subject to certain well-understood static assumptions) they will be executed eventually. It takes into account 2D weight and QoS ensuring that one parachain's message-passing activity cannot unfairly starve another parachain from attention.
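As a purely conceptual illustration of that model (not the pallet's actual code): datagrams are enqueued per origin as they arrive, and a servicing loop later round-robins over the origins under a weight budget, so one busy parachain cannot starve the others. Everything below is made up for illustration:

```rust
use std::collections::VecDeque;

/// A datagram queued for later execution, with a pre-estimated weight.
struct Message {
	weight: u64,
	payload: Vec<u8>,
}

/// One FIFO queue per origin (e.g. per sending parachain).
struct Queues {
	per_origin: Vec<VecDeque<Message>>,
	/// Where the previous servicing pass stopped, so no origin is favoured.
	cursor: usize,
}

impl Queues {
	/// Accept a datagram infallibly; execution happens later.
	fn enqueue(&mut self, origin: usize, msg: Message) {
		self.per_origin[origin].push_back(msg);
	}

	/// Service the queues round-robin until the weight budget is exhausted
	/// or nothing more fits. Returns the number of messages processed.
	fn service(&mut self, mut budget: u64) -> usize {
		let n = self.per_origin.len();
		let mut processed = 0;
		let mut idle = 0; // consecutive origins that yielded no work
		while n > 0 && idle < n {
			let origin = self.cursor;
			self.cursor = (self.cursor + 1) % n;
			let fits = self.per_origin[origin]
				.front()
				.map_or(false, |m| m.weight <= budget);
			if fits {
				let msg = self.per_origin[origin].pop_front().expect("front checked");
				budget -= msg.weight;
				// The real pallet would hand `msg.payload` to its message
				// processor (e.g. the XCM executor) at this point.
				drop(msg.payload);
				processed += 1;
				idle = 0;
			} else {
				// Empty, or the head message does not fit the remaining
				// budget: leave it for a later block rather than drop it.
				idle += 1;
			}
		}
		processed
	}
}

fn main() {
	let mut queues = Queues {
		per_origin: (0..3).map(|_| VecDeque::new()).collect(),
		cursor: 0,
	};
	// A "heavy" para 0 and a "light" para 1; para 2 stays empty.
	for _ in 0..10 {
		queues.enqueue(0, Message { weight: 10, payload: vec![0] });
	}
	queues.enqueue(1, Message { weight: 10, payload: vec![1] });
	// Budget for 5 messages: para 1 still gets served despite para 0's burst.
	assert_eq!(queues.service(50), 5);
}
```

The actual pallet does this with paged, bounded heaps and two-dimensional weight metering (ref time and proof size), but the fairness idea is the same.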

Signed-off-by: Oliver Tale-Yazdi <[email protected]>
@redzsina redzsina added D1-audited 👍 PR contains changes to critical logic that has been properly reviewed and externally audited. and removed D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited. labels May 16, 2023
runtime/parachains/src/inclusion/mod.rs (review thread, outdated and resolved)
Co-authored-by: muharem <[email protected]>

Signed-off-by: Oliver Tale-Yazdi <[email protected]>
@ggwpez
Member

ggwpez commented May 19, 2023

bot merge

@paritytech-processbot paritytech-processbot bot merged commit 7c1dc94 into master May 19, 2023
@paritytech-processbot paritytech-processbot bot deleted the gav-message-queue branch May 19, 2023 16:14
@ggwpez ggwpez added T6-XCM This PR/Issue is related to XCM. T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. T2-API This PR/Issue is related to APIs. labels May 21, 2023
ordian added a commit that referenced this pull request May 23, 2023
* master: (60 commits)
  Ensure all `StorageVersion`s on Rococo/Westend are correct and migration hooks pass (#7251)
  Try-runtime proper return types (#7146)
  Have OCW mined election once a week on Westend (#7248)
  Bump enumn from 0.1.5 to 0.1.8 (#7226)
  Companion to #14183: FRAME: Allow message ID to be mutated in `ProcessMessage` (#7262)
  Remove TODO comment (#7260)
  Fix build (#7261)
  Update syn (#7258)
  Use Message Queue pallet for UMP dispatch (#6271)
  Freeze chain if there are byzantine threshold + 1 invalid votes against a local candidate (#7225)
  Revert chain if at least f+1 validators voted against a candidate (#7151)
  Ensure all `StorageVersion`s on Polkadot/Kusama are correct (#7199)
  Forgotten pub reexport for `GlobalConsensusParachainConvertsFor` (#7238)
  PVF: Vote invalid on panics in execution thread (after a retry) (#7155)
  PVF: Remove `rayon` and some uses of `tokio` (#7153)
  [xcm] Foreign global consensus parachain LocationToAccountId converter (#7016)
  Update docs (#7230)
  Bump parity-db to 0.4.8 (#7231)
  Merge branch 'master' of https://github.com/paritytech/polkadot (#7224)
  Relax the watermark rule in the runtime (#7188)
  ...
ordian added a commit that referenced this pull request May 23, 2023
…slashing-client

* ao-past-session-slashing-runtime: (61 commits)
  Ensure all `StorageVersion`s on Rococo/Westend are correct and migration hooks pass (#7251)
  Try-runtime proper return types (#7146)
  Have OCW mined election once a week on Westend (#7248)
  Bump enumn from 0.1.5 to 0.1.8 (#7226)
  Companion to #14183: FRAME: Allow message ID to be mutated in `ProcessMessage` (#7262)
  Remove TODO comment (#7260)
  Fix build (#7261)
  Update syn (#7258)
  Use Message Queue pallet for UMP dispatch (#6271)
  Freeze chain if there are byzantine threshold + 1 invalid votes against a local candidate (#7225)
  Revert chain if at least f+1 validators voted against a candidate (#7151)
  Ensure all `StorageVersion`s on Polkadot/Kusama are correct (#7199)
  Forgotten pub reexport for `GlobalConsensusParachainConvertsFor` (#7238)
  PVF: Vote invalid on panics in execution thread (after a retry) (#7155)
  PVF: Remove `rayon` and some uses of `tokio` (#7153)
  [xcm] Foreign global consensus parachain LocationToAccountId converter (#7016)
  Update docs (#7230)
  Bump parity-db to 0.4.8 (#7231)
  Merge branch 'master' of https://github.com/paritytech/polkadot (#7224)
  Relax the watermark rule in the runtime (#7188)
  ...
Labels
A0-please_review Pull request needs code review. B1-note_worthy Changes should be noted in the release notes C3-medium PR touches the given topic and has a medium impact on builders. D1-audited 👍 PR contains changes to critical logic that has been properly reviewed and externally audited. E0-runtime_migration PR introduces code that might require downstream chains to run a runtime upgrade. T1-runtime This PR/Issue is related to the topic “runtime”. T2-API This PR/Issue is related to APIs. T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. T6-XCM This PR/Issue is related to XCM.
Development

Successfully merging this pull request may close these issues.

XCM: General XCM Queue Pallet
10 participants