# Readiness check behaves wrongly #3233
Comments
You can already configure this: see `router/helm/chart/router/values.yaml`, line 207 (at daf9dda).
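For reference, the thread later makes clear that the chart value in question is `terminationGracePeriodSeconds`. A sketch of the relevant fragment (the exact default at that chart revision may differ; 30 is the Kubernetes default):

```yaml
# Excerpt (sketch) of router/helm/chart/router/values.yaml —
# the value referenced above. 30s is the Kubernetes default.
terminationGracePeriodSeconds: 30
```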
Yes, but unfortunately that's not sufficient. For readiness to behave correctly, both that value and how long the router itself spends draining connections need to be configurable. We need to keep responding to requests while the LB is still removing us, and only after that removal stop serving new requests and drain the old connections.
I've worked around this by adding a shell script that traps the signals and creates a file when shutdown begins. The readiness probe then checks whether that file exists to determine if the pod is still ready or not. This works, but it's not especially nice :(
I agree this isn't a great situation. I've got a WIP PR which addresses this problem and I've bumped the priority to P1. My hope is that I'll find some time to fix this soon.
@SvenW Could you share exactly what you did to overcome this issue, please? I have tried to add
Sure. Disregard the ugliness in my little bash script here:
To make it work, I had to send a SIGINT to the router pid; if I didn't, it never received the termination signal as it should. Then, in my deployment.yaml for the router, I had to rewrite the readiness probe to a bash-based one like this:
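The script itself isn't reproduced in the thread, but based on the description (trap the signal, create a marker file, keep draining for ~20s, then SIGINT the router) a minimal runnable sketch of the pattern looks like this. All names are hypothetical, `sleep 60` stands in for the router binary, and the demo compresses the 20s drain window to 1s and sends itself a TERM so it can be run directly:

```shell
#!/usr/bin/env bash
# Demo of the wrapper-script pattern described in the thread.
# Hypothetical names; "sleep 60" stands in for the router binary.
set -u
MARKER=/tmp/router_demo_not_ready   # existence of this file means "not ready"
DRAIN_SECONDS=1                     # the real script used 20 (< terminationGracePeriodSeconds)

sleep 60 &                          # stand-in for launching the router process
PID=$!

on_term() {
  touch "$MARKER"                   # from now on the readiness probe fails
  sleep "$DRAIN_SECONDS"            # keep serving while the LB deregisters the pod
  kill -INT "$PID"                  # then SIGINT the router so it actually shuts down
}
trap on_term TERM INT

# Simulate kubernetes sending SIGTERM to the container's PID 1:
( sleep 0.2; kill -TERM $$ ) &

wait "$PID" || true                 # interrupted by the trap; router exits during drain

# What a bash-based readiness probe would check:
if [ -f "$MARKER" ]; then echo "not ready"; else echo "ready"; fi   # prints "not ready"
```

The key design point, per the discussion, is that the marker file flips readiness immediately, while the router only receives its shutdown signal after the drain window has passed.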
That makes sense, that is actually beautiful, thanks!
@SvenW And do you have any special configuration in the deployment?
I just use the default.
All right, one more question: why do you have that 20s sleep?
It's anything less than `terminationGracePeriodSeconds`, so we have a little time to report not-ready before the router drains the currently running requests for us.
But this `trap 'kill -TERM $PID' TERM INT` — won't the first command send a TERM signal to the router as soon as it receives one? What is that for? I thought we wanted to catch the signal, first create the shutdown file, wait 20s, and only then send the TERM signal to the router? Or am I missing something?
It is catching the TERM signal now. We send the SIGINT signal to the pid after 20s. You can test it locally by building your local Dockerfile and then sending a SIGINT or SIGTERM to that container to validate that it works as it should. If you have ideas for improving the bash script I would really appreciate it.
All right, it works. I just had to add a preStop hook with a 10-second sleep for some reason; after that, 0 502 errors on deployment. Thanks!
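Putting the pieces from this thread together, the deployment changes might look roughly like this. This is a sketch only: the marker-file path, probe timing, and preStop form are assumptions, not the posters' exact manifests:

```yaml
# Sketch of the relevant parts of the router Deployment (values assumed).
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 30   # k8s default; the drain sleep must stay below this
      containers:
        - name: router
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "10"]  # the extra pause that eliminated the 502s
          readinessProbe:
            exec:
              # bash-based probe: not ready once the wrapper script's
              # marker file exists (path is hypothetical)
              command: ["bash", "-c", "[ ! -f /tmp/shutdown-marker ]"]
            periodSeconds: 1
```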
The router is a complex beast and it's not entirely straightforward to determine if a router is live/ready. From a kubernetes point of view, a service is:

- live [if it isn't deadlocked](https://www.linkedin.com/posts/llarsson_betterdevopsonkubernetes-devops-devsecops-activity-7018587202121076736-LRxE)
- ready if it is able to start accepting traffic

(For more details: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)

Our current health check isn't actually doing either of these things. It returns a payload which indicates whether the router is "healthy" or not, and it always returns "UP" (hard-coded) with HTTP status code 200. As long as it is serving traffic it appears "healthy".

To address this we need:

1. A mechanism within the router to define whether it is ready or live.
2. A new interface whereby an interested party may determine if the router is ready or live.

Both of these questions are significantly complicated by the fact that the router is architected to maintain connectivity in the face of changing configuration, schemas, etc. Multiple interested parties may be operating against the same router with different states (via different connections) and thus could, in theory, get different answers. However, I'm working on the assumption that, for k8s control purposes, a client would only be interested in the current (latest) state of the router, so that's what we'll track.

---

1. How do we know if the router is ready or live?

Ready:
- Not currently processing a change to its set of inputs
- Is in state Running
- Resources available (open files limit not reached) (although this removes a pod from service, which would be bad)
- Is live

Live:
- Health check enabled and responding

2. Extending our existing health interface

If we add support for query parameters named "ready" and "live" to our existing health endpoint, then we can provide a backwards compatible interface which leverages the existing "/health" endpoint. We would support both POST and GET for maximum compatibility. Sample queries:

```
curl -XPOST "http://localhost:8088/health?ready"
OR
curl "http://localhost:8088/health?ready"
curl -XPOST "http://localhost:8088/health?live"
OR
curl "http://localhost:8088/health?live"
```

You could only query for one at a time (which I think is fine).

Note: Our health check returns a 200 with a JSON payload. It isn't easy to process that in k8s, especially when you can't install useful utilities into your container image (like curl, jq, etc.) because of security concerns and CVEs. When the router is not live or ready, we return a 503, which k8s interprets as a probe failure.

Fixes #3233

<!-- start metadata -->

**Checklist**

Complete the checklist (and note appropriate exceptions) before a final PR is raised.

- [x] Changes are compatible[^1]
- [ ] Documentation[^2] completed
- [ ] Performance impact assessed and acceptable
- Tests added and passing[^3]
  - [ ] Unit Tests
  - [ ] Integration Tests
  - [ ] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]: It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.
[^2]: Configuration is an important part of many changes. Where applicable please try to document configuration examples.
[^3]: Tick whichever testing boxes are applicable. If you are adding Manual Tests:
    - please document the manual testing (extensively) in the Exceptions.
    - please raise a separate issue to automate the test and label it (or ask for it to be labeled) as `manual test`

---------

Co-authored-by: Bryn Cooke <[email protected]>
Co-authored-by: Pierre Tondereau <[email protected]>
Co-authored-by: Avery Harnish <[email protected]>
Co-authored-by: bryn <[email protected]>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Geoffroy Couprie <[email protected]>
Co-authored-by: Simon Sapin <[email protected]>
Co-authored-by: o0Ignition0o <[email protected]>
Co-authored-by: Jesse Rosenberger <[email protected]>
Co-authored-by: Jeremy Lempereur <[email protected]>
Co-authored-by: Dylan Anthony <[email protected]>
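With the extended `/health` endpoint described above, the shell-script workaround from earlier in the thread becomes unnecessary: the probes can hit the endpoint directly, and no curl or jq is needed in the image because kubernetes issues the HTTP request itself. A sketch, using the port from the sample queries:

```yaml
# Sketch: probes against the extended health endpoint.
# A 200 means live/ready; the 503 returned otherwise fails the probe.
livenessProbe:
  httpGet:
    path: "/health?live"
    port: 8088
readinessProbe:
  httpGet:
    path: "/health?ready"
    port: 8088
```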
**Describe the bug**
The default readiness check behaves strangely. If the router gets a SIGINT/SIGTERM it cleans up all remaining connections; however, it keeps responding

`{"status":"UP"}`

throughout the whole termination process, which means it keeps accepting new connections that it won't be able to terminate in time before being killed by kubernetes. This is fine for the liveness probe, but not for the readiness probe.

**To Reproduce**
Steps to reproduce the behavior:
**Expected behavior**
As soon as the router gets a termination signal it should stop answering

`{"status":"UP"}`

on the readiness probe checks. It's still fine to do so on the liveness probe, though. It would also be nice to be able to configure how long this termination grace period should be in the YAML.

**Additional context**
This is when running on GKE v1.24.12