Runtime message results #4443
Codecov Report
@@            Coverage Diff             @@
##           master    #4443      +/-   ##
==========================================
+ Coverage   68.66%   68.79%   +0.13%
==========================================
  Files         415      415
  Lines       46571    46798     +227
==========================================
+ Hits        31976    32194     +218
- Misses      10635    10638       +3
- Partials     3960     3966       +6
func (md *messageDispatcher) Publish(ctx *api.Context, kind, msg interface{}) error {
	if len(md.subscriptions[kind]) == 0 {
		return api.ErrNoSubscribers
func (md *messageDispatcher) Publish(ctx *api.Context, kind, msg interface{}) ([]interface{}, error) {
now that I actually see the tests, the interface for this feels so idiosyncratic. for one, it's one of those rare functions where the first return value is meaningful when the error is non-nil.
from the code in this PR, the behavior is like this:
- the results slice always has one element per subscriber
- the error is a multierror with an element for each subscriber that errored, thus varying length
- the subscriber's interface treats its response as undefined when the error is non-nil
- for a subscriber that errors, its corresponding slot in the results is left as nil
I just wonder if we will really be able to write rigorous multi-subscriber programs with this. my concerns are:
- it's not clear to what extent the code calling Publish is expected to know the composition and ordering of the subscribers. the previous code where the errors are piled together suggests the calling code doesn't know/doesn't care. but surely that knowledge must be increased now that we have responses that the calling code will be looking for.
- it's not clear how the calling code should find the result it's looking for. by known index? or should it maybe loop through and find it by type? the ADR suggests that the zeroth result is the important one that the runtime will want to know about.
- some calling code that's less knowledgeable about the composition of subscribers won't be able to differentiate subscribers that succeed and return nil from subscribers that errored. at least not from looking at the results slice. not sure if this will come up.
- for calling code that's more knowledgeable about the composition of subscribers, I think it will be hard to match up errors with what subscriber it came from. maybe those use cases will have to include more metadata in the errors to work around that.
do we have cases where there are multiple subscribers for a message? or if not, what are the intended use cases? how do/will we handle the situation where some subscribers fail and some succeed? is the zeroth result always the one that a runtime will want to see?
> the error is a multierror with an element for each subscriber that errored, thus varying length
The error return type should probably be changed to be a list of errors, so then the caller knows which responses are non failed.
> do we have cases where there are multiple subscribers for a message? or if not, what are the intended use cases?
No, no such cases. Not sure about intended use-cases.
> how do/will we handle the situation where some subscribers fail and some succeed
I guess it's up to the caller to decide.
> is the zeroth result always the one that a runtime will want to see?
For all existing cases, yes, as there's always a single subscriber. Maybe the code should propagate all results to the caller, and not just the first one (although this makes no difference at this point in time).
There's at least a use case for zero subscribers, so that we can support non-lock-step changes to paratimes and core. Maybe now is the time to consider switching to a zero-or-one subscriber model of message handlers?
error slice would be unambiguous I guess. potentially verbose retrofitting work though.
I see some places in this PR's callsite changes like this
// Notify other interested applications about the resumed runtime.
if _, err = app.md.Publish(ctx, registryApi.MessageRuntimeResumed, rt); err != nil {
I guess we've been using this message dispatcher class for things other than runtime-sent messages too. as I understand it, the requirements there are "if any handler fails, abort tx or halt consensus"
> I guess we've been using this message dispatcher class for things other than runtime-sent messages too.
Yes, this is a general pubsub mechanism used by the consensus layer services.
We could simplify this by doing something like:
- If any subscribe handler failed, only an error is returned (with nil result).
- Only the last non-nil result is returned, possibly panic in case multiple handlers return a non-nil result.
This should be enough for the current use cases.
Ok agreed, I'll make Publish return a single result then
Updated. Please take a look, I'll update the ADR once this new approach is finalized.
new dispatcher interface looks good, thanks!
	result = resp
case resp != nil && result != nil:
	// Multiple non-nil results, this is unexpected and unsupported by the pub-sub interface at this time.
	panic(fmt.Sprintf("unexpected result: got: %d, previous result: %d", resp, result))
Should be sufficient to set a flag and return an error (and possibly nil result), I think. Callers will either abort the tx or halt consensus as appropriate.
Maybe in the future we can split up the dispatcher to have SubscribeRespondingly and SubscribeQuietly, so that we can detect problems at setup time instead of at dispatch time
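One way the split could look, purely as a sketch. Only the names `SubscribeRespondingly` and `SubscribeQuietly` come from the comment above; the `dispatcher` type, the string-typed `kind`, the handler signatures, and the order in which handlers run are all assumptions.

```go
package main

import "fmt"

// Hypothetical handler shapes: quiet handlers return no result.
type quietHandler func(msg interface{}) error
type respondingHandler func(msg interface{}) (interface{}, error)

// dispatcher allows any number of quiet subscribers per kind, but at most
// one responding subscriber.
type dispatcher struct {
	quiet      map[string][]quietHandler
	responding map[string]respondingHandler
}

func newDispatcher() *dispatcher {
	return &dispatcher{
		quiet:      make(map[string][]quietHandler),
		responding: make(map[string]respondingHandler),
	}
}

func (d *dispatcher) SubscribeQuietly(kind string, h quietHandler) {
	d.quiet[kind] = append(d.quiet[kind], h)
}

// SubscribeRespondingly rejects a second responding subscriber, so the
// conflict surfaces at setup time rather than at dispatch time.
func (d *dispatcher) SubscribeRespondingly(kind string, h respondingHandler) error {
	if _, ok := d.responding[kind]; ok {
		return fmt.Errorf("kind %q already has a responding subscriber", kind)
	}
	d.responding[kind] = h
	return nil
}

// Publish runs quiet handlers first (an assumed order) and stops at the
// first failure, then returns the responding handler's result, if any.
func (d *dispatcher) Publish(kind string, msg interface{}) (interface{}, error) {
	for _, h := range d.quiet[kind] {
		if err := h(msg); err != nil {
			return nil, err
		}
	}
	if h, ok := d.responding[kind]; ok {
		return h(msg)
	}
	return nil, nil
}

func main() {
	d := newDispatcher()
	d.SubscribeQuietly("example", func(interface{}) error { return nil })
	respond := func(interface{}) (interface{}, error) { return "ok", nil }
	if err := d.SubscribeRespondingly("example", respond); err != nil {
		panic(err)
	}
	err := d.SubscribeRespondingly("example", respond)
	fmt.Println("second responding subscriber rejected:", err != nil)
	res, err := d.Publish("example", nil)
	fmt.Println(res, err)
}
```

With this shape the multiple-results condition cannot occur at dispatch time, so Publish needs neither the flag nor the panic.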
Agreed, but I think that panic is probably more suitable, as this is considered a programmer error at the moment.
Implementation of runtime message results as described in [ADR 0012].

[ADR 0012]: docs/adr/0012-runtime-message-results.md
Closes: #4402
Implementation of runtime message results as described in ADR 0012.
TODO: