Cleanly draining envoy with DaemonSet/NLB deployment #145
Thanks for raising this issue. There's a PR in flight for adding a healthz endpoint to envoy and contour, #135.

Healthz alone wouldn't solve it with an NLB, though, since only TCP health checks are available for a TCP NLB; but healthz on an extra port could definitely work.

This issue has been p1 without a milestone for over a year. I need to get product input on this before we can prioritise it.

We do now have the preStop hooks enabled. When an Envoy pod is starting to shut down, the hook should fire, telling Envoy to drain all connections: https://github.com/heptio/contour/blob/master/examples/ds-hostnet-split/03-envoy.yaml#L60-L63
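For reference, a hook of roughly that shape, sketched from the description above (the admin port 9001 and the use of wget are assumptions, not necessarily what the linked example does):

```yaml
lifecycle:
  preStop:
    exec:
      command:
      - bash
      - -c
      # tell Envoy to start failing its health check and begin draining
      - wget -qO- --post-data='' http://localhost:9001/healthcheck/fail
```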
@stevesloka I've been testing with this recently, and it seems like only telling Envoy to stop accepting connections is not enough, since Kubernetes will send the SIGTERM to the pod regardless. I've found a somewhat stable solution with these configuration changes:

With this I managed to get zero-downtime Envoy rollouts on top of AWS NLB in a few simple tests (running hey against the NLB; the service below Envoy is a pod that sleeps for 1s before returning the HTTP response). [1]
@stevesloka sure, I'm up for it. I didn't do it before because I think it is still a bit ugly and I want to do more tests.

I'm not using contour in my current environment (I'm now at a different company than where I was using it) and don't have the pieces in place to test this properly.
I've been testing the mentioned configurations in an attempt to clean up and confirm the minimum required steps; I still need a few more checks before sending the patch. During the tests, I found out that NLBs provisioned through Kubernetes Service objects have fixed healthcheck settings (30s interval, 3 unhealthy threshold, tested on v1.12.10): the current annotations for customizing this behavior are ignored, which I believe is due to the alpha/beta state of the NLB integration. Not sure if this was improved in more recent Kubernetes versions or when using the external cloud-provider. Configuring the Envoy service as

Right now, I've achieved good results with the following settings on top of the current examples:

This gives enough time for the Kubernetes components and the NLB to notice the Unready status and remove the node/pod from the target group, and it delays Envoy's shutdown until traffic has stopped arriving. [1]
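The concrete settings from this comment did not survive, so as a rough sketch only (the values and port are illustrative assumptions, not the author's): a long terminationGracePeriodSeconds plus a preStop delay that outlasts the NLB's 30s x 3 health-check window would match the timing described.

```yaml
spec:
  # must cover the NLB removal window plus time for connections to drain
  terminationGracePeriodSeconds: 300
  containers:
  - name: envoy
    lifecycle:
      preStop:
        exec:
          command:
          - bash
          - -c
          # start failing the readiness/health check, then keep the pod
          # Unready but still serving for longer than 3 x 30s, so the NLB
          # stops sending new connections before Envoy receives SIGTERM
          - wget -qO- --post-data='' http://localhost:9001/healthcheck/fail; sleep 120
```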
I have been experimenting with this as well, and I have a proposal for properly draining Envoy that builds on @rochacon's suggestion.

Currently in master, we have a preStop hook in the Envoy container that POSTs to the /healthcheck/fail endpoint. This tells Envoy that it needs to start draining connections. In addition, the failing healthcheck triggers the readiness probe to fail, which in turn tells the system to stop sending traffic to the Envoy that is shutting down.

The problem with the current preStop hook is that it does not wait until Envoy has drained all the connections. To address this, I think we could leverage the stats endpoint to create a while loop that sleeps until all the connections have been drained. Once the connections have been drained, the preStop hook returns and lets Kubernetes continue the pod shutdown process.

Curious what everyone thinks of this idea. I have a PoC of this that I'd be happy to contribute.
Ping @youngnick who has strong opinions on this one.
I really like the stats watch idea, that's really good, much better than guessing timeouts. I'd be very interested to see the PoC.
Yeah, I totally agree. What I ended up doing was adding a sidecar to handle this. The sidecar sends the POST request to Envoy, and then starts watching the stats endpoint. Once the drain is complete, it creates a file in the already-existing config volume which signals Envoy that the drain is complete.

```yaml
- name: graceful-terminator
  image: busybox
  command:
  - ash
  - -c
  - while [[ ! -f /config/drain-complete ]]; do sleep 1; done; echo "Drain Complete"
  lifecycle:
    preStop:
      exec:
        command:
        - ash
        - -c
        - >
          wget -q -O- --post-data "" localhost:9001/healthcheck/fail;
          while [[ $(wget -q -O- localhost:9001/stats | grep http.ingress_http.downstream_cx_active | awk '{print $2}') != 0 ]];
          do echo "waiting until active connections are drained"; sleep 1; done;
          touch /config/drain-complete;
  volumeMounts:
  - mountPath: /config
    name: contour-config
```

The envoy preStop hook waits until the file appears:

```yaml
lifecycle:
  preStop:
    exec:
      command:
      - bash
      - -c
      - while [[ ! -f /config/drain-complete ]]; do echo "waiting for drain confirmation"; sleep 1; done;
```

The end result is that the Envoy pod keeps running until the active downstream connections have drained (or the grace period runs out). All of this is governed by the pod's termination grace period. Would love to upstream this if we think it is a reasonable approach.
Would a small go program make this implementation simpler? I like the idea, but worry about all the code inline in the yaml definition. We could add tests, etc. to the go code, and maybe open it up for other types of hooks that we might need in the future to manage Envoy's state.
That is certainly another option!
I like the use of the config volume for signal-passing, that's neat. I agree that a small go program that handles SIGTERM and then watches the Envoy stats until it's drained would be very clean, but doing it in bash is definitely a good start.
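A minimal sketch of what such a helper could look like, run from the preStop hook (or wired to SIGTERM); the admin address, the stat checked, and the deadline below are assumptions for illustration, not an agreed design:

```go
// envoy-drain-wait: tell Envoy to start failing its health check (so load
// balancers stop sending new traffic), then poll the admin stats endpoint
// until active downstream connections have drained or a deadline expires.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"os"
	"strconv"
	"strings"
	"time"
)

const adminAddr = "http://localhost:9001" // assumed Envoy admin address

// failHealthcheck POSTs to /healthcheck/fail, which flips the readiness
// probe and tells Envoy to begin draining.
func failHealthcheck() error {
	resp, err := http.Post(adminAddr+"/healthcheck/fail", "text/plain", nil)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// activeConnections scrapes /stats for the ingress listener's active
// downstream connection count.
func activeConnections() (int, error) {
	resp, err := http.Get(adminAddr + "/stats")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "http.ingress_http.downstream_cx_active:") {
			return strconv.Atoi(strings.TrimSpace(strings.SplitN(line, ":", 2)[1]))
		}
	}
	return 0, fmt.Errorf("downstream_cx_active stat not found")
}

func main() {
	if err := failHealthcheck(); err != nil {
		fmt.Fprintln(os.Stderr, "failed to start drain:", err)
		os.Exit(1)
	}
	deadline := time.Now().Add(5 * time.Minute) // the pod grace period remains the hard limit
	for time.Now().Before(deadline) {
		n, err := activeConnections()
		if err == nil && n == 0 {
			fmt.Println("drain complete")
			return
		}
		fmt.Printf("waiting for %d active connections to drain\n", n)
		time.Sleep(time.Second)
	}
	fmt.Println("drain deadline reached, exiting anyway")
}
```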
One possibility would be to manage this cordoning via Contour. Now that envoy passes its $POD_NAME, we have the ability to customise the configuration sent to each Envoy should we choose. If, for the Envoy we wanted to isolate, we sent an empty LDS table, this would put the current listeners into draining mode (600s default timeout) and immediately close the currently accepting sockets. That would prevent, possibly abruptly, establishing more connections via the cordoned envoy.
Next steps for this issue:
On to the backlog for now.
Sometimes it's necessary to re-deploy newer versions of contour / envoy into the cluster, and that involves killing off the running containers and replacing them with new ones.

If I'm running with an AWS ALB or classic load balancer, I could probably add HTTP health checks against the envoy health check endpoint, plus pod lifecycle hooks to make envoy start failing its healthcheck, so it can be cleanly removed from the load balancer without losing traffic.

For my deployment, to get HTTP/2 and minimal hops, I'm using the DaemonSet / NLB deployment of contour and envoy. The NLB does TCP forwarding to contour; currently, it health checks the port of contour it's forwarding traffic to (8080 or 8443). When I want or need to replace envoy, the pod is killed and envoy stops listening, but the AWS NLB does not fail its health checks and redirect traffic to the other hosts quickly enough, so some traffic is lost.
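For context, the Service in front of such a deployment is roughly of this shape (a generic sketch with assumed names; only the 8080/8443 target ports come from the description above, and this is not the project's actual example manifest):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: envoy
  annotations:
    # ask the AWS cloud provider for an NLB instead of a classic ELB
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  # route only to local pods so health checks reflect the node's own Envoy
  externalTrafficPolicy: Local
  selector:
    app: envoy
  ports:
  - name: http
    port: 80
    targetPort: 8080
    protocol: TCP
  - name: https
    port: 443
    targetPort: 8443
    protocol: TCP
```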
I'd like to help contribute to closing this gap so that envoy and contour can be safely / cleanly upgraded without loss of traffic.

Two thoughts on how to do this:

SIGTERM: as part of its lifecycle hooks (https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods), envoy shuts down an extra listener that exists only for health checks, which should cause health checks to fail and the instance to be removed from new traffic through the AWS NLB. After a configurable number of seconds from receiving SIGTERM, contour should put envoy into its "drain down" state, where it stops accepting incoming connections and drains every existing connection. When the grace period for the pod finally runs out, or once envoy is fully drained, the pod is killed and any remaining pieces die non-gracefully / have their connections cut off.

I think the second option is easier to implement and get working correctly in all cases than the first. With the first, I see potential issues with the envoy<->contour communication conflicting between "new" and old pods, as well as with how you cleanly roll out a replacement of an existing daemon set.