Conversation
It's not clear precisely why this is desired, but it's a pattern I've seen in several places, so I'm doing this to be on the safe side. Worst case, we can revert this commit pretty easily.
From the problem description:

> This Proposer will require an OverseerHandle to make requests via.

That's next on the plate.
When a validator is selected by BABE to author a block, it becomes a block producer. The provisioner is the subsystem best suited to choosing which specific backed candidates and availability bitfields should be assembled into the block. To engage this functionality, a `ProvisionerMessage::RequestInherentData` is sent; the response is a set of non-conflicting candidates and the appropriate bitfields. Non-conflicting generally means that there are never two distinct parachain candidates included for the same parachain.
One might ask: given `ProvisionerMessage::RequestInherentData`, what's the point of `ProvisionerMessage::RequestBlockAuthorshipData`? The answer is that the block authorship data includes more information than is present in the inherent data; disputes, for example.
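To make the distinction concrete, here is a rough sketch of what the two request variants could look like. The field names, payload structs, and the use of a oneshot response channel are all assumptions for illustration; they are not the node's actual definitions.

```rust
// Illustrative only: the real types live in the Polkadot node; the shapes
// below (oneshot senders, field names, payload structs) are assumptions.
use futures::channel::oneshot;

struct BackedCandidate;   // a parachain candidate with sufficient backing votes
struct SignedBitfield;    // an availability bitfield signed by a validator
struct MisbehaviorReport; // e.g. a dispute or double-vote report

/// Just enough to build the inclusion inherent: non-conflicting candidates
/// plus the availability bitfields.
#[derive(Default)]
struct ProvisionerInherentData {
    bitfields: Vec<SignedBitfield>,
    backed_candidates: Vec<BackedCandidate>,
}

/// A superset of the inherent data, carrying extra information such as disputes.
struct BlockAuthorshipData {
    inherent_data: ProvisionerInherentData,
    misbehavior_reports: Vec<MisbehaviorReport>,
}

enum ProvisionerMessage {
    /// Answered with non-conflicting candidates and bitfields only.
    RequestInherentData(oneshot::Sender<ProvisionerInherentData>),
    /// Answered with everything a block author might need, disputes included.
    RequestBlockAuthorshipData(oneshot::Sender<BlockAuthorshipData>),
}
```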
Wouldn't any consumer of `RequestInherentData` be interested in disputes as well?
In practice our main consumer is the block authorship logic, which is clearly interested in disputes so it can report them to the chain. I am not aware of any other component which would be interested in disputes or dispute-related reports.
The proposer consumes `RequestInherentData`, and I haven't yet seen why it would care about disputes.
Yeah, on further inspection we probably don't need the proposer to be aware of disputes. I was thinking about a more antiquated approach where the `Proposer` constructs the dispute transactions and submits them to the block directly.

It seems that the "right way" to submit reports (based on @andresilva's work in GRANDPA/BABE) is to call a runtime API that submits a transaction to the queue. I guess that will end up being the main role of the MA subsystem - acting as an intermediary between POD reports and the Runtime API subsystem. Maybe we should rename it to Misbehavior Reporting, as most reports are "complete", and the actual dispute resolution / arbitration process will happen on-chain in the `Validity` module.
Then we shouldn't need the dispute records to be present in the `Provisioner` either, because the assumption is that the Provisioner is responsible for everything that the Proposer will need. If the Proposer doesn't need them, then the Provisioner shouldn't either. And then I suggest removing `RequestBlockAuthorshipData`, as I don't see an immediate need for it if there are no disputes and no other intended users of the `Provisioner`.
I think I agree with those proposals, but I also think they're out of scope for this PR.
Yup, that's fair. We have a discussion on-record to reference now, at least.
Created an issue from this so we don't lose it: #1362.
node/core/proposer/src/lib.rs
record_proof: RecordProof,
) -> Self::Proposal {
    let mut proposal = async move {
        let provisioner_inherent_data = self.provisioner_inherent_data.await.map_err(Error::ClosedChannelFromProvisioner)?;
The implication of returning an error here is that the validator node will not author any block. I think that it's better to author an "empty" block (at least in terms of inherent data) than no block, if the overseer or subsystems are unresponsive. Coupled with a critical warning of some kind. This would make the chain more robust.
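For illustration, a minimal sketch of that fallback idea, reusing the hypothetical `ProvisionerInherentData` from the sketch above and a stand-in `ChannelClosed` error type; this is not the code in this PR.

```rust
// Sketch only: `ChannelClosed` and `ProvisionerInherentData` are stand-ins,
// not the PR's actual types.
struct ChannelClosed;

async fn inherent_data_or_empty(
    from_provisioner: impl std::future::Future<Output = Result<ProvisionerInherentData, ChannelClosed>>,
) -> ProvisionerInherentData {
    match from_provisioner.await {
        Ok(data) => data,
        Err(ChannelClosed) => {
            // Subsystems unresponsive: warn loudly, but still author a block
            // with empty (default) inherent data rather than no block at all.
            log::warn!("provisioner unavailable; authoring with empty inherent data");
            ProvisionerInherentData::default()
        }
    }
}
```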
Does it actually make the chain more robust, though? If this inherent data doesn't get injected, the inclusion_inherent module panics:
polkadot/runtime/parachains/src/inclusion_inherent.rs
Lines 133 to 134 in aeb79d4
data.get_data(&Self::INHERENT_IDENTIFIER)
    .expect("inclusion inherent data failed to decode")
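For contrast, a purely hypothetical defensive variant of that read, which falls back to empty data instead of panicking; this is illustration only (it assumes the decoded type implements `Default`), not what the runtime currently does.

```rust
// Hypothetical alternative only: substitute default (empty) data when the
// inherent is missing or fails to decode, instead of panicking.
let inherent_data: ProvisionerInherentData = data
    .get_data(&Self::INHERENT_IDENTIFIER)
    .ok()       // decode failure -> None
    .flatten()  // inherent not provided -> None
    .unwrap_or_default();
```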
I think 9bae56e takes care of this as well; do you agree?
This could use a rebase onto master to only contain the commits that are part of this PR.
I agree, but after having attempted it for a long while, I don't think it's worth the effort.
This means that we can wrap the delayed call into the same timeout check used elsewhere.
This satisfies two requirements:

- it only applies the timeout to actually fetching the provisioner data, not to constructing the block afterwards
- it simplifies the problem of injecting default data if we could not get the real provisioner data in time
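A rough sketch of what that scoping could look like, assuming a hypothetical `PROVISIONER_TIMEOUT` constant, the stand-in types from the sketches above, and `futures`/`futures-timer` for the race; the actual implementation may differ.

```rust
use std::time::Duration;
use futures::{future::{self, Either}, pin_mut};

// Hypothetical constant; the real timeout value is a design decision.
const PROVISIONER_TIMEOUT: Duration = Duration::from_millis(2500);

async fn provisioner_data_with_timeout(
    from_provisioner: impl std::future::Future<Output = Result<ProvisionerInherentData, ChannelClosed>>,
) -> ProvisionerInherentData {
    let timeout = futures_timer::Delay::new(PROVISIONER_TIMEOUT);
    pin_mut!(from_provisioner, timeout);

    match future::select(from_provisioner, timeout).await {
        // Real data arrived in time: use it. Block construction afterwards is
        // not covered by this timeout.
        Either::Left((Ok(data), _)) => data,
        // Channel closed or timed out: inject default (empty) data so a block
        // can still be authored.
        _ => ProvisionerInherentData::default(),
    }
}
```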
@coriolinus the "hack" I always use when facing a bad rebase is to take these steps:
It's kind of messy, but it works.
Looks fine but I would still like the rebase if not too much trouble.
If I understand correctly, these "transactions" have fairly bespoke logic, however: a fuller bitfield replaces an emptier one, and backing statements change the meaning of bits from a block onwards, while older bitfields remain useful. We also have censorship concerns that become much harder to discuss, formalize, etc.; probably less bad than with slashing, but I'm not sure.
Yes, we'd want nodes to delay issuing their first bitfield on some relay chain block until they can receive some pieces, but neither should they get stuck between this delay, advancing the relay chain block on which they build bitfields, and the torrent of new blocks.
Yes, we should wait for bitfields before expiring candidates and accepting newly backed candidates. In my mind, we should handle the backing statements mostly off the relay chain, which sounds like only a minor optimization that an MVP can ignore. In all cases, we could politely decline to gossip backing statements until the previously backed candidate either expires or advances from availability to approvals, which I suppose fits with our gossip strategy elsewhere?
This sounds internal to the node, right? We check all the gossip signatures when receiving, and we do not have that many parachains.

There are some interesting cases here, like including older availability statements for slow expired candidates. I'd skip doing that for the MVP, but maybe some such things should be collected by some "polkadot informational log parachain" eventually. We should prevent regular transactions from competing with the A&V system for block space, or else we should remove tipping entirely, which eventually might require removing user transactions from the relay chain entirely, but maybe block space reserves work fine over the short term.
Yeah, this timeout is about handling what is essentially a node bug. However, we also want a related timeout: how long to wait for bitfields / backing before authoring the relay-chain block. Intuitively, this is going to have to be slightly longer than the timeout for signing bitfields.
Closes #1248.
Includes #1318, #1280.
A custom proposer with an overseer handle will suit our needs better than the basic authorship one.