-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ADR-038 State Streaming Plugin System Updates #11175
ADR-038 State Streaming Plugin System Updates #11175
Conversation
@egaxhaj-figure Can you rename the PR title? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new HaltAppOnDeliveryError
approach greatly simplifies things vs the previous channel based communication approach, but it does require that the listening methods (e.g. ListenBeginBlock
) be synchronous with the app. Obviously this is not an issue- and indeed is the entire point- when HaltAppOnDeliveryError
is true. But doesn't this mean that when HaltAppOnDeliveryError
is false we still block and wait for an error to return (even though we'll ignore it)? So we would need a different implementation of the listening service to have asynchronous listening when HaltAppOnDeliveryError
is false (an implementation where a nil err is immediately returned).
Whereas with the channel communication whether or not we wait on the external service is managed by the SDK using the channel communication and the same external service implementation can be used in either case.
Needing a second implementation was a dramatization, we just need to add additional configuration option for the external service that tells it whether or not to immediately return a |
We also need to handle listeners concurrently. Something like below?
|
@egaxhaj-figure oh yeah, good point. If Overall, if we add some language around using |
I can't think of a reason to wait for other goroutines to finish when I agree, we do need to mention it in the docs that client application need to account for duplicate data when blocks are replayed on a node restart.
The only concern I have about immediately returning Would the following cover our two cases?
|
I see what you mean, although in my experience it is not unusual for an async function to return an
LGTM! But this approach of execute asynchronously then syncing back up (or not) with the use of a wg is starting to look a lot like the original approach using the |
How about the following?
|
@i-norden I'm updating the PR and removing the suggested WaitGroup Adding a timeout in the commit cycle will cause more problems than the one we're trying to solve for. For example:
I understand your concern about a goroutine possibly blocking indefinitely but this is an extreme edge case that we should not cover for given the two reasons above. Client APIs for databases like Postgres and streaming platforms like Kafka have built-in timeouts for when they fail to communicate with the server side. If the edge case does occur then the a node will simply Also, the community has been waiting for this feature for quite some time now and we are at a good point where we can get this out and wait for feedback. |
Does this mean that if a client subscribes to some events, and then doesn't actually receive any events from the connection, the node will halt? |
There are two modes that subscribers can operate in:
Both scenarios are controlled by |
@egaxhaj-figure
And that configuration parameter is defined by the node, not the subscriber, correct? If so, 👍 (I will note again that events don't go through consensus and therefore are not verifiable or necessarily accurate and cannot be treated as a source of truth by consumers.) |
Yes. Here's an example of what that might look like.
|
Can you point me to the code that ensures these invariants? Also, what informed the specific values for each of those configuration settings? |
You can read more about If you have question beyond the docs, you can reach out to members of the Kafka core team on slack |
Oh, Kafka is a requirement for this feature? That's surprising; Kafka is an enormous operational burden. But, okay! |
The |
You're right, I missed that in the quoted config. Mea culpa. I am curious what "delivery error" means specifically for each of the various plugins. |
|
@i-norden @marbar3778 @robert-zaremba Please review the latest ADR changes and approve? Checkout this PR for it's implementation. Love to see this make it in the 0.46 release. We have teams that will hugely benefit from state listening. In addition, please note that my time to support for this feature going forward will be very limited as resources within our team are being moved to support other efforts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! The primary point of debate that remains is whether or not to include the kafka plugin code here or in another repo as the core SDK teams are concerned about providing ongoing support for that pkg. I think that discussion can be brought over to your implementation PR.
This is something we would be happy to host separately and maintain over on the Provenance side. |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
does this only need to be rereviewed? |
Yes. |
Got caught up now, and talked with Ian. He will review and we should be ready to merge this soon I think the scope of adr-038 needs to be evaluated it seems like the spec is still influx and its hard to grasp what is going on for at least me. @alexanderbez do you have thoughts to this? |
extending app.toml is already possible so there should be a way to get the possible app.toml mentioned above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
for _, listener := range app.abciListeners { | ||
listener.ListenDeliverTx(app.deliverState.ctx, req, res) | ||
wg := new(sync.WaitGroup) | ||
for _, streamingListener := range app.abciListeners { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there any concerns for performance with this approach?
closing in favour of #12629 to get this merged |
For #10096
This PR introduces updates to ADR-038 for the plugin-based streaming services. These updates reflect the implementation approach taken in provenance-io#49.
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
!
to the type prefix if API or client breaking changeCHANGELOG.md
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...
!
in the type prefix if API or client breaking change