Fix ERROR Graceful stop of task failed
#677
It looks like @dmitryikh hasn't signed our Contributor License Agreement yet.
You can read and sign our full Contributor License Agreement here. Once you've signed, reply with [clabot:check].
Appreciation of efforts, clabot
@confluentinc It looks like @dmitryikh just signed our Contributor License Agreement. 👍 Always at your service, clabot
Hi @dmitryikh, it looks like your diagnosis of the issue is spot on and the changes in your PR would help prevent the error from occurring. However, they would also cause the source database to be queried 10 times a second, which may be undesirable if, for example, reading from a large number of tables concurrently with a large number of tasks from the same database. One possible alternative approach I can think of is to add some kind of interrupt semantics to the
Thanks for your PR, looking forward to working with you on this!
Hi @C0urante, I don't agree that querying will occur 10 times per second. Only this part of the code will be repeated 10 times per second:
    while (running.get()) {
      final TableQuerier querier = tableQueue.peek();
      if (!querier.querying()) {
        ...
That doesn't perform any actual querying of the database. We just peek the first table from
Am I wrong?
@dmitryikh apologies, you are correct! This approach seems fine; one request I have is that we alter the log message on line 301 ( I think either logging
Again, apologies for misreading your PR and thank you for correcting me. Looking forward to merging this!
Force-pushed from 2a14535 to a66412f.
@C0urante, please have a look. Thanks for spending your time on this!
@dmitryikh looks good to me!
Since this is a bug fix, we'd like to backport these changes to the earliest relevant feature branch, which in this case is 3.3.x. Could you please retarget the PR to that branch, and then we can merge?
Force-pushed from a66412f to d9ed197.
@C0urante, done.
Thanks @dmitryikh!
This fix will be available in all future bug fix releases for CP 3.3 through 5.3, and will also be included in all future releases for CP 5.4 onward.
I've bumped into the problem:
Which leads to duplicates in Kafka topics while rebalancing.
The reason is that the JDBC source task sleeps in poll() for too long (exactly POLL_INTERVAL_MS_CONFIG, which is 60000 ms in my case), so the worker is unable to finish all tasks within task.shutdown.graceful.timeout.ms, which is 5000 ms by default.
This can be seen in the log below (see the *** marks):
I propose to sleep no more than 100 ms at a time in order to be able to react to task shutdown.
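The proposal above can be illustrated with a minimal, standalone sketch. It is not the connector's actual code: the class ChunkedSleep, the method sleepWhileRunning and the constant MAX_SLEEP_MS are hypothetical names chosen for this example, and only the running flag and the 100 ms cap come from the discussion. The idea is to split one long wait into short sleeps and re-check the flag between them, so a stop request is noticed within roughly 100 ms instead of after the full poll interval.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ChunkedSleep {
    // Hypothetical cap on a single sleep, matching the 100 ms proposed above.
    static final long MAX_SLEEP_MS = 100;

    /**
     * Waits for up to totalMs, waking at least every MAX_SLEEP_MS to check the flag.
     * Returns true if the whole wait elapsed, false if the flag was cleared first.
     */
    static boolean sleepWhileRunning(AtomicBoolean running, long totalMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + totalMs * 1_000_000L;
        while (running.get()) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return true; // the full interval has passed
            }
            // Never sleep longer than the cap, so a stop request is seen quickly.
            Thread.sleep(Math.min(remainingMs, MAX_SLEEP_MS));
        }
        return false; // stop was requested before the interval elapsed
    }

    public static void main(String[] args) throws Exception {
        AtomicBoolean running = new AtomicBoolean(true);

        // Simulate a shutdown request arriving 250 ms into a 60 s wait.
        Thread stopper = new Thread(() -> {
            try {
                Thread.sleep(250);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            running.set(false);
        });
        stopper.start();

        long start = System.nanoTime();
        boolean completed = sleepWhileRunning(running, 60_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("completed=" + completed + ", elapsedMs=" + elapsedMs);
    }
}
```

With a 60000 ms interval and a 5000 ms graceful shutdown timeout, a wait structured this way should return well inside the timeout once the flag is cleared, which is the behavior the PR is aiming for.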