fix: deadlock when closing transient push query #4297

Merged


big-andy-coates
Contributor

Description

fixes: #4296

The produce side now calls `offer` in a loop, with a short timeout, to try to put the row into the blocking queue. When the consume side closes the query, e.g. on an `EOFException` if the user has closed the connection, the query first closes the queue. This sets a flag that the producers check on each iteration, causing any blocked producers to exit the loop. The query can then safely close the Kafka Streams (KS) topology.
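
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `ClosableRowQueue`, its methods, and the timeout parameter are all hypothetical, and the real ksqlDB classes may be structured differently.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the transient query's row queue; not the ksqlDB class.
final class ClosableRowQueue<T> {
  private final BlockingQueue<T> queue;
  private final long offerTimeoutMs;
  private final AtomicBoolean closed = new AtomicBoolean(false);

  ClosableRowQueue(final int capacity, final long offerTimeoutMs) {
    this.queue = new LinkedBlockingQueue<>(capacity);
    this.offerTimeoutMs = offerTimeoutMs;
  }

  // Producer side: retry a short, timed offer until it succeeds or the queue is closed.
  // Returns true if the row was enqueued, false if the queue was closed first.
  boolean accept(final T row) throws InterruptedException {
    while (!closed.get()) {
      if (queue.offer(row, offerTimeoutMs, TimeUnit.MILLISECONDS)) {
        return true;
      }
    }
    return false;
  }

  // Consumer side: wait up to timeoutMs for a row; returns null if none arrived.
  T poll(final long timeoutMs) throws InterruptedException {
    return queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
  }

  // Called first during shutdown, so producers stuck on a full queue leave their loop.
  void close() {
    closed.set(true);
  }
}
```

With an untimed `put`, a producer thread blocked on a full queue would never return, so a shutdown that waits for that thread could hang. In this sketch the producer notices the flag within one offer timeout, which is why the queue is closed before the topology.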

Testing done

Added unit tests and tested manually.

Reviewer checklist

  • Ensure docs are updated if necessary (e.g. if a user-visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@big-andy-coates big-andy-coates requested a review from a team as a code owner January 13, 2020 17:26
@big-andy-coates
Contributor Author

We may want to backport this onto 5.4.1. But I'll do that manually.

Contributor

@agavra agavra left a comment

Nice find @big-andy-coates!

@agavra agavra requested a review from a team January 13, 2020 18:28
@agavra agavra self-assigned this Jan 13, 2020
@big-andy-coates big-andy-coates merged commit 6b5ce0c into confluentinc:master Jan 14, 2020
@big-andy-coates big-andy-coates deleted the transient_deadlock branch January 14, 2020 12:52
big-andy-coates added a commit that referenced this pull request Jan 14, 2020
(cherry picked from commit 6b5ce0c)
big-andy-coates added a commit that referenced this pull request Jan 14, 2020
big-andy-coates added a commit to big-andy-coates/ksql that referenced this pull request Jan 14, 2020
(cherry picked from commit 6b5ce0c)
big-andy-coates added a commit to big-andy-coates/ksql that referenced this pull request Jan 14, 2020
big-andy-coates added a commit to big-andy-coates/ksql that referenced this pull request Jan 14, 2020
(cherry picked from commit 6b5ce0c)
@big-andy-coates big-andy-coates mentioned this pull request Jan 14, 2020
2 tasks
big-andy-coates added a commit that referenced this pull request Jan 26, 2020
* fix: deadlock when closing transient push query (#4297)

(cherry picked from commit 6b5ce0c)
Development

Successfully merging this pull request may close these issues.

Deadlock in transient queries
2 participants