[ML] Fix possible race condition starting datafeed #51646
Datafeeds being closed while starting could result in an NPE. This was handled like any other failure, which masked the NPE. However, this conflicts with the changes in elastic#50886. Related to elastic#50886 and elastic#51302
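For readers unfamiliar with the code, the race can be sketched in isolation. The following is a minimal, self-contained illustration and not the actual Elasticsearch code: the class `DatafeedRaceSketch`, the simplified `Holder`, and the `register`/`stop` methods are hypothetical stand-ins, and the real `runTask` takes a `TransportStartDatafeedAction.DatafeedTask`. It only shows the pattern of looking the holder up once and skipping the run when a concurrent stop has already removed it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DatafeedRaceSketch {

    /** Hypothetical stand-in for the per-datafeed state kept on a node. */
    public static final class Holder {
        final String datafeedId;

        Holder(String datafeedId) {
            this.datafeedId = datafeedId;
        }
    }

    // Assumed shape: running datafeeds keyed by allocation id.
    private final Map<Long, Holder> runningDatafeeds = new ConcurrentHashMap<>();

    public void register(long allocationId, String datafeedId) {
        runningDatafeeds.put(allocationId, new Holder(datafeedId));
    }

    /** A stop request removes the entry, possibly before the start task has run. */
    public void stop(long allocationId) {
        runningDatafeeds.remove(allocationId);
    }

    /** Returns true if the datafeed was run, false if it had already been removed. */
    public boolean runTask(long allocationId, String datafeedId) {
        // Look the holder up exactly once; it may be null if stop() won the race.
        Holder holder = runningDatafeeds.get(allocationId);
        if (holder == null) {
            System.out.println("Datafeed [" + datafeedId + "] was stopped while being started");
            return false;
        }
        innerRun(holder);
        return true;
    }

    private void innerRun(Holder holder) {
        // Without the null check above, this dereference is where the NPE would surface.
        System.out.println("Running datafeed [" + holder.datafeedId + "]");
    }

    public static void main(String[] args) {
        DatafeedRaceSketch node = new DatafeedRaceSketch();
        node.register(1L, "feed-a");
        node.register(2L, "feed-b");
        node.stop(2L); // simulates a stop arriving before the start task executes
        node.runTask(1L, "feed-a");
        node.runTask(2L, "feed-b");
    }
}
```

In this sketch the second `runTask` call logs a warning and returns instead of dereferencing a missing holder, which mirrors the intent of the change in this PR.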
Pinging @elastic/ml-core (:ml)
@@ -520,7 +520,12 @@ private void runTask(TransportStartDatafeedAction.DatafeedTask task) {
 // a context with sufficient permissions would coincidentally be in force in some single node
 // tests, leading to bugs not caught in CI due to many tests running in single node test clusters.
 try (ThreadContext.StoredContext ignore = threadPool.getThreadContext().stashContext()) {
-    innerRun(runningDatafeedsOnThisNode.get(task.getAllocationId()), task.getDatafeedStartTime(), task.getEndTime());
+    Holder holder = runningDatafeedsOnThisNode.get(task.getAllocationId());
Putting in a sleep(100) before this line provokes the NPE.
LGTM if you could just change two words.
Thanks for fixing this. Maybe it will help with occasional weird failures we get during test cleanup.
+    if (holder != null) {
+        innerRun(holder, task.getDatafeedStartTime(), task.getEndTime());
+    } else {
+        logger.warn("Datafeed [{}] was closed while being opened", task.getDatafeedId());
We use the terms “started” and “stopped” with datafeeds instead of “opened” and “closed”, so please could you change those two words in this message.
Thanks David.
The test that failed in CI was