Investigate if Liveness Probing is Production Grade #72
Comments
Just a small correction. e2e tests check for liveness of all pods - https://github.com/cockroachdb/cockroach-operator/blob/master/e2e/assert.go#L138
@chrisseto can we close this?
I'm tempted to say that we should leave this open. In CC, we actually don't have liveness probes installed. We've found that the probes time out at high loads, so k8s kills the pods, making the issue even worse.
Can we close this?
Noting here that the upstream manifests have disabled liveness probes, and we experienced liveness probe failures under heavy load, which caused more latency in the cluster. Support referenced these:
During Monday's (7/13) demo of the Operator, it was identified that our liveness probing for Kubernetes may need to improve. Getting this right will help us have proper end-to-end testing.
For our end-to-end testing, we verify that clusters are set up properly, but we don't check whether they're running properly. We don't run any SQL queries, and we don't check for liveness.