-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent kill_timeout greater than progress_deadline #16761
Conversation
af0c8b2
to
e0dfa06
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, thanks @Juanadelacuesta! It looks like the new validation has meant some tests are now breaking, which need to be fixed up. Lets also add a changelog entry to this PR.
nomad/structs/structs.go
Outdated
@@ -5136,7 +5144,7 @@ func (u *UpdateStrategy) IsEmpty() bool { | |||
return true | |||
} | |||
|
|||
return u.MaxParallel == 0 | |||
return *u == *DefaultUpdateStrategy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this doing an equality check rather than checking if the strategy is empty?
nomad/structs/structs.go
Outdated
@@ -5145,6 +5153,38 @@ func (u *UpdateStrategy) Rolling() bool { | |||
return u.Stagger > 0 && u.MaxParallel > 0 | |||
} | |||
|
|||
func (u *UpdateStrategy) Canonicalize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry that having this new functionality will mean users upgrading will see their job specification change between versions. I wonder why we need this, when the canonicalization should be handled by before it reaches the RPC.
I took a look at the use of structs.DefaultUpdateStrategy
and it seems this only currently gets used by mocks and tests, which suggests we currently rely on the API default instantiation as mentioned above.
.changelog/16761.txt
Outdated
@@ -0,0 +1,3 @@ | |||
```release-note:improvement | |||
code: Prevent Kill Timeout greater than Progress Deadline on Update block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code or core? It'd be nice if the parameters were detailed in their jobspec form, as it makes it easier for users to grok.
ad6cd2f
to
4052150
Compare
73845bc
to
af22534
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of nit pick changes, but LGTM, thanks!
We will need to add backport labels before merging this in.
Co-authored-by: James Rasell <[email protected]>
Co-authored-by: James Rasell <[email protected]>
* func: add validation for kill timeout smaller than progress dealine * style: add changelog * style: typo in changelog * style: remove refactored test * Update .changelog/16761.txt Co-authored-by: James Rasell <[email protected]> * Update nomad/structs/structs.go Co-authored-by: James Rasell <[email protected]> --------- Co-authored-by: James Rasell <[email protected]>
* func: add validation for kill timeout smaller than progress dealine * style: add changelog * style: typo in changelog * style: remove refactored test * Update .changelog/16761.txt * Update nomad/structs/structs.go --------- Co-authored-by: Juana De La Cuesta <[email protected]> Co-authored-by: James Rasell <[email protected]>
* func: add validation for kill timeout smaller than progress dealine * style: add changelog * style: typo in changelog * style: remove refactored test * Update .changelog/16761.txt Co-authored-by: James Rasell <[email protected]> * Update nomad/structs/structs.go Co-authored-by: James Rasell <[email protected]> --------- Co-authored-by: James Rasell <[email protected]>
…release/1.4.x (#17205) * test: fix TestJobEndpoint_Scale_BatchJob * Prevent kill_timeout greater than progress_deadline (#16761) * func: add validation for kill timeout smaller than progress dealine * style: add changelog * style: typo in changelog * style: remove refactored test * Update .changelog/16761.txt Co-authored-by: James Rasell <[email protected]> * Update nomad/structs/structs.go Co-authored-by: James Rasell <[email protected]> --------- Co-authored-by: James Rasell <[email protected]> --------- Co-authored-by: Luiz Aoqui <[email protected]> Co-authored-by: Juana De La Cuesta <[email protected]> Co-authored-by: James Rasell <[email protected]>
This PR addresses the bug reported on #8487
If a kill_timeout is greater than the job's progress_deadline, allocations may keep running (after initial kill signal) for long enough that the deploy fails. However, the allocations that were scheduled by that job will still be pending after job failure, and will be placed and started once the previous allocations do exit. This PR adds a validation to avoid this situation.