This repository has been archived by the owner on Aug 25, 2021. It is now read-only.

PodDisruptionBudget calculation is not working? #71

Closed
Typositoire opened this issue Nov 28, 2018 · 7 comments

@Typositoire

I have a 3-replica setup, and for some reason the PDB was set to 0.

ceil (sub (div (int .Values.server.replicas) 2) 1) looks good to me, but is it possible the calculation is off in Go templates?

@Typositoire Typositoire changed the title PodDisruptionBudget calculation are not working? PodDisruptionBudget calculation is not working? Nov 28, 2018
@Typositoire
Author

I had to manually delete the PDB and recreate it as you can't update a PDB.

@adilyse
Contributor

adilyse commented Dec 4, 2018

@Typositoire In this case, this is the expected behavior. It is recommended to run with 3-5 Consul servers, so this calculation gives the desired value of 0 allowed voluntary disruptions with 3 server replicas.

As for the calculation itself, the behavior is a bit unintuitive since it's using integer division, which evaluates 3/2 as 1. We've accounted for this, though, so the calculation is correct.

Hopefully that answers your question!

@adilyse adilyse closed this as completed Dec 4, 2018
@Typositoire
Author

The problem happens during rolling upgrades.

Given 3 nodes with 1 consul-server each, I can never upgrade gracefully, because draining a node would violate the PDB. This is what happens in a setup with KOPS, and it is annoying.

With 3 nodes you can survive a 1-node failure, so I'm not sure why you force it to be 0.
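
For reference, the failure-tolerance claim above can be checked with a small sketch. It assumes the usual majority-quorum rule for a Raft-based cluster such as Consul servers; the function names `quorum` and `failure_tolerance` are made up for illustration and do not come from the chart.

```python
def quorum(servers: int) -> int:
    """Smallest majority of the server set (assumed majority-quorum rule)."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can be lost while a majority is still available."""
    return servers - quorum(servers)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"servers={n} quorum={quorum(n)} tolerance={failure_tolerance(n)}")
```

Under that assumption, 3 servers need a quorum of 2 and tolerate 1 failure, which is why a budget of 0 allowed disruptions looks stricter than necessary for a node drain.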

@Typositoire
Author

I1204 14:49:44.381510   13119 request.go:874] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Cannot evict pod as it would violate the pod's disruption budget.","reason":"TooManyRequests","details":{"causes":[{"reason":"DisruptionBudget","message":"The disruption budget consul-server needs 3 healthy pods and has 3 currently"}]},"code":429}
I1204 14:49:49.382064   13119 request.go:874] Request Body: {"kind":"Eviction","apiVersion":"policy/v1beta1","metadata":{"name":"consul-server-1","namespace":"consul","creationTimestamp":null},"deleteOptions":{}}

@adilyse
Contributor

adilyse commented Dec 4, 2018

Thank you for the additional information! I'll do some more digging and see what we can do about this.

@adilyse adilyse reopened this Dec 4, 2018
@s3than
Contributor

s3than commented Dec 5, 2018

@adilyse I've been chasing this issue up today as well. It appears that the sprig library doesn't currently support floats, so what happens is the following:

3/2 = 1
1-1 = 0
ceil(0) = 0
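
The same steps can be reproduced outside the template. This is only a sketch in Python, not the chart's code: `pdb_integer_division` mimics the integer-only evaluation described above, and `pdb_float_division` shows what the expression would give if the division kept its fractional part. Both function names are hypothetical.

```python
import math


def pdb_integer_division(replicas: int) -> int:
    """Mimic ceil (sub (div replicas 2) 1) when div works on integers only."""
    halved = replicas // 2     # div: 3 // 2 == 1, the fraction is dropped here
    reduced = halved - 1       # sub: 1 - 1 == 0
    return math.ceil(reduced)  # ceil of an integer changes nothing


def pdb_float_division(replicas: int) -> int:
    """The same expression if the division preserved the fractional part."""
    return math.ceil(replicas / 2 - 1)


if __name__ == "__main__":
    for n in (3, 5):
        print(n, pdb_integer_division(n), pdb_float_division(n))
```

For 3 replicas the integer version yields 0 while the fractional version yields 1, which matches the behavior reported in this issue.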

My current fix is below. I can put together a pull request and update the tests for you, hopefully this evening.
{{- define "consul.pdb.maxUnavailable" -}}
{{- if eq (int .Values.server.replicas) 1 -}}
{{ 0 }}
{{- else if .Values.server.disruptionBudget.maxUnavailable -}}
{{ .Values.server.disruptionBudget.maxUnavailable -}}
{{- else -}}
{{- div (sub (div (mul (int .Values.server.replicas) 10) 2) 1) 10 -}}
{{- end -}}
{{- end -}}
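
To sanity-check the scaled arithmetic in that helper, here is a rough Python equivalent. It is a sketch, not the chart's implementation: the function name `max_unavailable` and the `override` parameter are stand-ins for the template name and `.Values.server.disruptionBudget.maxUnavailable`, and it assumes positive replica counts, where Python's `//` and Go's integer division agree.

```python
from typing import Optional


def max_unavailable(replicas: int, override: Optional[int] = None) -> int:
    """Approximate the template's branches with integer arithmetic."""
    if replicas == 1:
        return 0                    # a single server can never be disrupted
    if override:                    # like a template `if`, an unset or 0 value falls through
        return override
    scaled = replicas * 10 // 2     # mul by 10, then div by 2: keeps the ".5" as a digit
    return (scaled - 1) // 10       # sub 1, then div by 10 to scale back down


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, max_unavailable(n))
```

Multiplying by 10 before dividing keeps the half that plain integer division would discard, so 3 replicas give 1 and 5 replicas give 2. One consequence visible in the sketch: an explicit override of 0 is treated as unset and falls through to the calculated value.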

@s3than
Contributor

s3than commented Dec 5, 2018

@adilyse Hopefully this solves the issue. I implemented something different for the calculations and provided tests.

If there are any problems, let me know.
