tr: Fetch Wait channel before killTask in restart #5889
Currently, if killTask results in the termination of a process before calling WaitTask, Restart() will incorrectly return a TaskNotFound error when using the raw_exec driver on Windows.
Great catch. Should kill have similar treatment or doesn't it matter since kill always transitions to a terminal state?
I’m debugging a similar issue during executor shutdown right now which might end up leading to something similar, but not fully sure yet.
Good catch! Is this something we can test? It feels like a subtle thing that we might miss with enough refactoring. Also, I'm confused about how the ordering causes the TaskNotFound error; would you mind elaborating on how it gets to that state? In raw_exec, seeing:
[1] nomad/drivers/rawexec/driver.go, lines 383 to 387 in 079cfb4
[2] nomad/drivers/rawexec/driver.go, line 461 in 079cfb4
@notnoop DestroyTask gets called as part of terminating a task. Testing this reliably seems pretty hard right now, because it's very much integration test-y and would need a full driver matrix. It might be doable when we invest in more test infra soon, though.
Ah - makes sense - I missed that call - thanks for the clarification!
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
Currently, if killTask results in the termination of a process before
calling WaitTask, Restart() will incorrectly return a TaskNotFound
error when using the raw_exec driver on Windows.
This fetches the WaitCh before killing the process to avoid this race
condition.
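
For readers who want to see the ordering problem in isolation, here is a minimal sketch of the race described above. It is not Nomad's code: fakeDriver, restartKillFirst, restartWaitFirst, and the exit code 137 are all hypothetical, and the fake driver simply assumes that a task's handle is dropped as soon as the task is killed, so that a later WaitTask call cannot find it.

```go
package main

import (
	"errors"
	"fmt"
)

// errTaskNotFound stands in for the driver's TaskNotFound error (hypothetical name).
var errTaskNotFound = errors.New("task not found")

// fakeDriver is a hypothetical stand-in for a task driver. It forgets a task
// as soon as the task is killed, modeling a handle that is cleaned up before
// the caller asks to wait on it.
type fakeDriver struct {
	tasks map[string]chan int // task ID -> channel delivering the exit code
}

func newFakeDriver(ids ...string) *fakeDriver {
	d := &fakeDriver{tasks: make(map[string]chan int)}
	for _, id := range ids {
		d.tasks[id] = make(chan int, 1)
	}
	return d
}

// WaitTask returns the exit channel for a task the driver still tracks.
func (d *fakeDriver) WaitTask(id string) (<-chan int, error) {
	ch, ok := d.tasks[id]
	if !ok {
		return nil, errTaskNotFound
	}
	return ch, nil
}

// KillTask delivers an (arbitrary) exit code and drops the handle immediately.
func (d *fakeDriver) KillTask(id string) error {
	ch, ok := d.tasks[id]
	if !ok {
		return errTaskNotFound
	}
	ch <- 137
	close(ch)
	delete(d.tasks, id)
	return nil
}

// restartKillFirst kills the task and only then asks for the wait channel.
// With this fake driver the handle is already gone by that point.
func restartKillFirst(d *fakeDriver, id string) (int, error) {
	if err := d.KillTask(id); err != nil {
		return 0, err
	}
	ch, err := d.WaitTask(id)
	if err != nil {
		return 0, err
	}
	return <-ch, nil
}

// restartWaitFirst fetches the wait channel before killing, so the channel
// stays usable even after the driver has forgotten the task.
func restartWaitFirst(d *fakeDriver, id string) (int, error) {
	ch, err := d.WaitTask(id)
	if err != nil {
		return 0, err
	}
	if err := d.KillTask(id); err != nil {
		return 0, err
	}
	return <-ch, nil
}

func main() {
	_, err := restartKillFirst(newFakeDriver("web"), "web")
	fmt.Println("kill first:", err)

	code, err := restartWaitFirst(newFakeDriver("web"), "web")
	fmt.Println("wait first:", code, err)
}
```

In this sketch the kill-first ordering returns the not-found error, while the wait-first ordering still receives the exit code, because the channel was obtained while the handle existed. The real driver's cleanup is asynchronous, which is why the failure is a race there rather than a certainty.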