As of #2596, Nomad deregisters any Consul services associated with a task just before "soft"-killing the task. The task is then expected to terminate within the configured kill_timeout or face a "hard" kill from Nomad.
However, in some deployments other components of the distributed system rely on those Consul service deregistrations to make decisions, for example a web load balancer (like fabio, traefik, or HAProxy + consul-template) that forwards traffic to a pool of currently registered services. Deregistrations do not propagate immediately, so if a task handles its soft kill faster than the deregistration reaches the load balancer, the load balancer may send some traffic to containers that have already disappeared.
To support this kind of use case, I propose adding a configurable delay between the time the Consul services are deregistered and the time Nomad actually initiates the kill sequence for the task. The delay would default to zero, so existing solutions would not be affected.
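To make the proposed ordering concrete, here is a rough sketch of the sequence I have in mind. It is not Nomad's code; the class name `TaskShutdown`, the setting name `deregister_delay`, and the event strings are all hypothetical and used only for illustration. The `sleep` function is injectable so the sequence can be exercised without real waiting.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TaskShutdown:
    """Hypothetical model of the proposed shutdown sequence (not Nomad's implementation)."""

    deregister_delay: float = 0.0  # hypothetical name for the proposed delay; 0 keeps current behavior
    kill_timeout: float = 5.0      # grace period between the soft kill and the hard kill
    sleep: Callable[[float], None] = time.sleep
    events: List[str] = field(default_factory=list)

    def run(self, task_exits_after: float) -> List[str]:
        # Step 1: remove the task's services from Consul first.
        self.events.append("deregister")

        # Step 2 (the proposal): give downstream consumers such as load
        # balancers time to notice the deregistration.
        if self.deregister_delay > 0:
            self.sleep(self.deregister_delay)
            self.events.append("delay")

        # Step 3: soft kill, then wait up to kill_timeout for the task to exit.
        self.events.append("soft kill")
        if task_exits_after <= self.kill_timeout:
            self.sleep(task_exits_after)
            self.events.append("task exited")
        else:
            # Step 4: the task outlived its grace period, so force it down.
            self.sleep(self.kill_timeout)
            self.events.append("hard kill")
        return self.events
```

With `deregister_delay` left at zero the sequence collapses to deregister, soft kill, and then either a clean exit or a hard kill, which is the behavior described above for #2596.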
There has already been some discussion of this idea in #2596, but @schmichael has requested that we discuss the idea further in this ticket.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.