Shutdown_delay not considered /w group defined services #6704
Comments
Thanks for the bug report @djenriquez! Shutdown delay was only implemented for task services, but should apply when using group services as well. The 2 implementation options I can think of are:
@schmichael Thank you very much for the quick response! I'm not sure how difficult this work would be; it sounds like it would require some struct changes. Would this be a quick one? There is some patience internally since our apps are mostly fault-tolerant, but if not, we may need to back out of network namespaces, as the 502s are not pretty to see in our highly dynamic environment.
Also, regarding option 1, I'm not sure how that would work, since the services being registered would represent all tasks in the group. You'd probably have to introduce logic to use the greatest shutdown delay of all the tasks.
#1 wouldn't require any struct changes but is arguably the least user friendly: when an allocation is killed each task would wait its own shutdown_delay between deregistering services and sending the signal. So if you have 3 tasks in a group and only 1 sets
Ah, I see: the shutdown signal would be handled differently for each task. That makes a lot of sense, thanks for clarifying.
Another scenario for In my specific case, I have batch periodic jobs running every hour... they run really fast and generate some logs. I have filebeat running in the same task group to send the logs to logstash but what I noticed is, the I have |
@drewbailey should this issue have been closed by #6746?
Yes thanks, not sure why it didn't auto-close :(
Nomad version
Output from `nomad version`
Operating system and Environment details
Amazon Linux 2
Issue
Allocations do not seem to respect shutdown_delay on shutdown. I believe this is because previously, when services were defined on a task, there was a 1:N mapping from the task to the Consul services to deregister before sending the kill signal. Now that the task has no service defined (since it's at the group level), I believe `shutdown_delay` is being ignored completely.
We see this happening in our production environment: on an allocation shutdown, a kill signal is sent and the service terminates almost immediately, even though we have a shutdown_delay of 10s defined for the tasks within the group. This results in problematic 502s.
Is this a known issue/regression from upgrading to network namespaces? Should a group-level `shutdown_delay` field be introduced?
I see that shutdown_delay is included in the `sidecar_task` stanza. Should it have been included in the more generic `group` stanza?