Runtime message results #4443

ptrus · 2022-01-21T13:13:13Z

Closes: #4402

Implementation of runtime message results as described in ADR 0012.

TODO:

finish the remaining todos
implement rust types for results
add checks to e2e test runtime
update the result types in the ADR

codecov · 2022-01-24T11:07:18Z

Codecov Report

Merging #4443 (63f9a4e) into master (795eb65) will increase coverage by 0.13%.
The diff coverage is 69.97%.

❗ Current head 63f9a4e differs from pull request most recent head 0e1bdcb. Consider uploading reports for the commit 0e1bdcb to get more accurate results

@@            Coverage Diff             @@
##           master    #4443      +/-   ##
==========================================
+ Coverage   68.66%   68.79%   +0.13%     
==========================================
  Files         415      415              
  Lines       46571    46798     +227     
==========================================
+ Hits        31976    32194     +218     
- Misses      10635    10638       +3     
- Partials     3960     3966       +6

Impacted Files	Coverage Δ
go/consensus/tendermint/abci/mux.go	`57.12% <0.00%> (+1.25%)`	⬆️
go/consensus/tendermint/api/app.go	`0.00% <0.00%> (ø)`
go/consensus/tendermint/apps/beacon/beacon.go	`73.77% <0.00%> (ø)`
...consensus/tendermint/apps/keymanager/keymanager.go	`62.63% <0.00%> (ø)`
go/consensus/tendermint/apps/registry/genesis.go	`47.32% <0.00%> (ø)`
...o/consensus/tendermint/apps/scheduler/scheduler.go	`71.14% <0.00%> (ø)`
...nt/apps/supplementarysanity/supplementarysanity.go	`77.61% <0.00%> (ø)`
go/roothash/api/api.go	`80.00% <ø> (ø)`
go/runtime/registry/host.go	`71.88% <0.00%> (-4.81%)`	⬇️
go/staking/api/api.go	`62.33% <ø> (ø)`
... and 54 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d51556...0e1bdcb. Read the comment docs.

pro-wh · 2022-01-27T19:37:22Z

go/consensus/tendermint/abci/messages.go

-func (md *messageDispatcher) Publish(ctx *api.Context, kind, msg interface{}) error {
-	if len(md.subscriptions[kind]) == 0 {
-		return api.ErrNoSubscribers
+func (md *messageDispatcher) Publish(ctx *api.Context, kind, msg interface{}) ([]interface{}, error) {


now that I actually see the tests, the interface for this feels so idiosyncratic. for one, it's one of those rare functions where the first return value is meaningful when the error is non-nil.

from the code in this PR, the behavior is like this:

the results slice always has one element per subscriber

the error is a multierror with an element for each subscriber that errored, thus varying length

the subscriber's interface treats its response as undefined when the error is non-nil

for a subscriber that errors, its corresponding slot in the results is left as nil

I just wonder if we will really be able to write rigorous multi-subscriber programs with this. my concerns are:

it's not clear to what extent the code calling Publish is expected to know the composition and ordering of the subscribers. the previous code where the errors are piled together suggest the calling code doesn't know/doesn't care. but surely that knowledge must be increased now that we have responses that the calling code will be looking for.

it's not clear how the calling code should find the result it's looking for. by known index? or should it maybe loop through and find it by type? the ADR suggests that the zeroth result is the important one that the runtime will want to know about.

some calling code that's less knowledgeable about the composition of subscribers won't be able to differentiate subscribers that succeed and return nil from subscribers that errored. at least not from looking at the results slice. not sure if this will come up.

for calling code that's more knowledgeable about the composition of subscribers, I think it will be hard to match up errors with what subscriber it came from. maybe those use cases will have to include more metadata in the errors to work around that.

do we have cases where there are multiple subscribers for a message? or if not, what are the intended use cases? how do/will we handle the situation where some subscribers fail and some succeed? is the zeroth result always the one that a runtime will want to see?

the error is a multierror with an element for each subscriber that errored, thus varying length

The error return type should probably be changed to be a list of errors, so then the caller knows which responses are non failed.

do we have cases where there are multiple subscribers for a message? or if not, what are the intended use cases?

No, no such cases. Not sure about intended use-cases.

how do/will we handle the situation where some subscribers fail and some succeed

I guess it's up to the caller to decide.

is the zeroth result always the one that a runtime will want to see?

For all existing cases, yes, as there's always a single subscriber. Maybe the code should propagate all results to the caller, and not just the first one (although this makes no difference at this point in time).

There's at least a use case for zero subscribers, so that we can support non-lock-step changes to paratimes and core. Maybe now is the time to consider switching to a zero-or-one subscriber model of message handlers?

error slice would be unambiguous I guess. potentially verbose retrofitting work though.

I see some places in this PR's callsite changes like this

// Notify other interested applications about the resumed runtime. if _, err = app.md.Publish(ctx, registryApi.MessageRuntimeResumed, rt); err != nil {

I guess we've been using this message dispatcher class for things other than runtime-sent messages too. as I understand it, the requirements there are "if any handler fails, abort tx or halt consensus"

I guess we've been using this message dispatcher class for things other than runtime-sent messages too.

Yes, this is a general pubsub mechanism used by the consensus layer services.

We could simplify this by doing something like:

If any subscribe handler failed, only an error is returned (with nil result).

Only the last non-nil result is returned, possibly panic in case multiple handlers return a non-nil result.

This should be enough for the current use cases.

Ok agreed, i'll make Publish return a single result then

Updated. Please take a look, I'll update the ADR once this new approach is finalized.

go/consensus/tendermint/abci/messages.go

kostko · 2022-02-02T09:30:14Z

go/consensus/tendermint/abci/messages.go

-func (md *messageDispatcher) Publish(ctx *api.Context, kind, msg interface{}) error {
-	if len(md.subscriptions[kind]) == 0 {
-		return api.ErrNoSubscribers
+func (md *messageDispatcher) Publish(ctx *api.Context, kind, msg interface{}) ([]interface{}, error) {


We could simplify this by doing something like:

If any subscribe handler failed, only an error is returned (with nil result).

Only the last non-nil result is returned, possibly panic in case multiple handlers return a non-nil result.

This should be enough for the current use cases.

go/consensus/tendermint/apps/staking/transactions_test.go

go/roothash/api/api.go

runtime/src/consensus/roothash.rs

pro-wh

new dispatcher interface looks good, thanks!

pro-wh · 2022-02-03T22:39:36Z

go/consensus/tendermint/abci/messages.go

+				result = resp
+			case resp != nil && result != nil:
+				// Multiple non-nil results, this is unexpected and unsupported by the pub-sub interface at this time.
+				panic(fmt.Sprintf("unexpected result: got: %d, previous result: %d", resp, result))


Should be sufficient to set a flag and return an error (and possibly nil result), I think. Callers will either abort the tx or halt consensus as appropriate.

Maybe in the future we can split up the dispatcher to have SubscribeRespondingly and SubscribeQuietly, so that we can detect problems at setup time instead of at dispatch time

Agreed, but I think that panic is probably more suitable, as this is considered a programmer error at the moment.

go/consensus/tendermint/abci/messages_test.go

go/consensus/tendermint/apps/roothash/messages.go

go/consensus/tendermint/apps/staking/transactions_test.go

Implementation of runtime message results as described in [ADR 0012]. [ADR 0012]: docs/adr/0012-runtime-message-results.md

ptrus force-pushed the ptrus/feature/runtime-message-results branch 4 times, most recently from e71d44e to b69946e Compare January 24, 2022 10:33

ptrus added the s:ready-ci Status: ready for CI label Jan 24, 2022

ptrus force-pushed the ptrus/feature/runtime-message-results branch from b69946e to 6d6832a Compare January 24, 2022 10:35

ptrus force-pushed the ptrus/feature/runtime-message-results branch 8 times, most recently from edcd45f to ceaf072 Compare January 27, 2022 14:55

ptrus marked this pull request as ready for review January 27, 2022 14:55

ptrus requested review from kostko, peterjgilbert, pro-wh and Yawning as code owners January 27, 2022 14:55

pro-wh reviewed Jan 27, 2022

View reviewed changes

ptrus force-pushed the ptrus/feature/runtime-message-results branch 3 times, most recently from bba46ac to 63f9a4e Compare February 1, 2022 15:13

kostko reviewed Feb 2, 2022

View reviewed changes

ptrus force-pushed the ptrus/feature/runtime-message-results branch 5 times, most recently from 3335210 to 20553c3 Compare February 3, 2022 18:47

ptrus force-pushed the ptrus/feature/runtime-message-results branch from 20553c3 to fc9b76c Compare February 3, 2022 18:53

pro-wh approved these changes Feb 3, 2022

View reviewed changes

kostko approved these changes Feb 4, 2022

View reviewed changes

go/consensus/tendermint/abci/messages_test.go Outdated Show resolved Hide resolved

go/consensus/tendermint/apps/roothash/messages.go Outdated Show resolved Hide resolved

go/consensus/tendermint/apps/staking/transactions_test.go Outdated Show resolved Hide resolved

ptrus force-pushed the ptrus/feature/runtime-message-results branch 3 times, most recently from 3e3669c to 3ac4212 Compare February 4, 2022 15:10

ptrus enabled auto-merge February 4, 2022 15:10

ptrus force-pushed the ptrus/feature/runtime-message-results branch 2 times, most recently from 26651f9 to 3390ff9 Compare February 4, 2022 15:33

Runtime message results

0e1bdcb

Implementation of runtime message results as described in [ADR 0012]. [ADR 0012]: docs/adr/0012-runtime-message-results.md

ptrus force-pushed the ptrus/feature/runtime-message-results branch from 3390ff9 to 0e1bdcb Compare February 4, 2022 15:41

ptrus merged commit 44b27c2 into master Feb 4, 2022

ptrus deleted the ptrus/feature/runtime-message-results branch February 4, 2022 16:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runtime message results #4443

Runtime message results #4443

ptrus commented Jan 21, 2022 •

edited

Loading

codecov bot commented Jan 24, 2022 •

edited

Loading

pro-wh Jan 27, 2022 •

edited

Loading

ptrus Jan 27, 2022 •

edited

Loading

pro-wh Jan 27, 2022

pro-wh Jan 27, 2022 •

edited

Loading

kostko Jan 28, 2022

kostko Feb 2, 2022

ptrus Feb 3, 2022 •

edited

Loading

ptrus Feb 3, 2022

kostko Feb 2, 2022

pro-wh left a comment

pro-wh Feb 3, 2022

ptrus Feb 4, 2022

Runtime message results #4443

Runtime message results #4443

Conversation

ptrus commented Jan 21, 2022 • edited Loading

codecov bot commented Jan 24, 2022 • edited Loading

Codecov Report

pro-wh Jan 27, 2022 • edited Loading

Choose a reason for hiding this comment

ptrus Jan 27, 2022 • edited Loading

Choose a reason for hiding this comment

pro-wh Jan 27, 2022

Choose a reason for hiding this comment

pro-wh Jan 27, 2022 • edited Loading

Choose a reason for hiding this comment

kostko Jan 28, 2022

Choose a reason for hiding this comment

kostko Feb 2, 2022

Choose a reason for hiding this comment

ptrus Feb 3, 2022 • edited Loading

Choose a reason for hiding this comment

ptrus Feb 3, 2022

Choose a reason for hiding this comment

kostko Feb 2, 2022

Choose a reason for hiding this comment

pro-wh left a comment

Choose a reason for hiding this comment

pro-wh Feb 3, 2022

Choose a reason for hiding this comment

ptrus Feb 4, 2022

Choose a reason for hiding this comment

ptrus commented Jan 21, 2022 •

edited

Loading

codecov bot commented Jan 24, 2022 •

edited

Loading

pro-wh Jan 27, 2022 •

edited

Loading

ptrus Jan 27, 2022 •

edited

Loading

pro-wh Jan 27, 2022 •

edited

Loading

ptrus Feb 3, 2022 •

edited

Loading