
Rollout is not scaling down old replicasets properly #70

Closed · jessesuen opened this issue May 14, 2019 · 3 comments
Labels: bug (Something isn't working)

jessesuen commented May 14, 2019

Here is a rollout in a Suspended state, but with four ReplicaSets scaled to two replicas each:

[screenshot: the rollout's ReplicaSets, four of them scaled above zero]

The expectation is that in a steady state (Suspended), only two ReplicaSets (active and preview) should be scaled higher than 0.

I ran a diff against the last three revisions (12, 11, 9) of the ReplicaSet. I'm not sure what happened to ReplicaSet revision 10. Notice that the only differences are in metadata and status; the ReplicaSet spec is the same, which means the pod template is the same. The bug, however, is that the ReplicaSet hash names are not the same.

$ diff rs-12 rs-11
6,8c6,7
<     rollout.argoproj.io/revision: '12'
<     rollout.argoproj.io/revision-history: '10'
<   creationTimestamp: '2019-05-14T21:54:49Z'
---
>     rollout.argoproj.io/revision: '11'
>   creationTimestamp: '2019-05-14T22:16:39Z'
16c15
<     rollouts-pod-template-hash: 65c456b799
---
>     rollouts-pod-template-hash: 7d58696fd9
18c17
<   name: web-service-integration-65c456b799
---
>   name: web-service-integration-7d58696fd9
27c26
<   resourceVersion: '95549146'
---
>   resourceVersion: '95556597'
29,30c28,29
<     /apis/apps/v1/namespaces/fdp-connectivity-web-service-integration-usw2-ppd-qal/replicasets/web-service-integration-65c456b799
<   uid: e584ea76-7692-11e9-9427-0a985b86565a
---
>     /apis/apps/v1/namespaces/fdp-connectivity-web-service-integration-usw2-ppd-qal/replicasets/web-service-integration-7d58696fd9
>   uid: f1c6de3d-7695-11e9-9427-0a985b86565a
36c35
<       rollouts-pod-template-hash: 65c456b799
---
>       rollouts-pod-template-hash: 7d58696fd9
57c56
<         rollouts-pod-template-hash: 65c456b799
---
>         rollouts-pod-template-hash: 7d58696fd9
$ diff rs-12 rs-9
6,7c6
<     rollout.argoproj.io/revision: '12'
<     rollout.argoproj.io/revision-history: '10'
---
>     rollout.argoproj.io/revision: '9'
16c15
<     rollouts-pod-template-hash: 65c456b799
---
>     rollouts-pod-template-hash: 748b545485
18c17
<   name: web-service-integration-65c456b799
---
>   name: web-service-integration-748b545485
27c26
<   resourceVersion: '95549146'
---
>   resourceVersion: '95539841'
29,30c28,29
<     /apis/apps/v1/namespaces/fdp-connectivity-web-service-integration-usw2-ppd-qal/replicasets/web-service-integration-65c456b799
<   uid: e584ea76-7692-11e9-9427-0a985b86565a
---
>     /apis/apps/v1/namespaces/fdp-connectivity-web-service-integration-usw2-ppd-qal/replicasets/web-service-integration-748b545485
>   uid: e58000cc-7692-11e9-9427-0a985b86565a
36c35
<       rollouts-pod-template-hash: 65c456b799
---
>       rollouts-pod-template-hash: 748b545485
57c56
<         rollouts-pod-template-hash: 65c456b799
---
>         rollouts-pod-template-hash: 748b545485

This implies that hashing may produce different hashes for the same pod template.

During this time, we know from talking to the user that the rollout's spec.template.spec was changed only to modify resource requests/limits to equivalent values (e.g. 2000m -> '2'). I suspect the underlying issue is that when we call controller.ComputeHash(), it does not consider these values to be the same, and so it produces different pod template hashes.
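
To make the suspicion concrete, here is a minimal standalone Go sketch (not code from this repo) showing that two equivalent quantities compare as equal while carrying different parsed internal state, which is exactly the kind of difference a reflection-based deep hash such as the one behind controller.ComputeHash() would pick up:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	a := resource.MustParse("2000m")
	b := resource.MustParse("2")

	// Semantically these are the same amount of CPU.
	fmt.Println(a.Cmp(b) == 0) // true

	// But the parsed internal state differs (2000 at scale -3 vs.
	// 2 at scale 0), so a deep, reflection-based hash of a pod
	// template containing them can produce different results.
	fmt.Printf("%#v\n%#v\n", a, b)
}
```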

jessesuen added the bug label May 14, 2019

jessesuen commented

We confirmed that the pod template hash is sensitive to these resource-format differences. The solution is to remarshal the object to normalize it before computing the pod template hash.
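
A minimal sketch of that idea, with a hypothetical helper name (not necessarily the exact change that landed): round-trip the template through JSON so equivalent quantities are re-parsed into a single canonical form before delegating to the upstream hash:

```go
package hash

import (
	"encoding/json"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/kubernetes/pkg/controller"
)

// normalizedComputeHash is a hypothetical helper illustrating the fix:
// marshal the pod template to JSON and unmarshal it into a fresh object
// so that equivalent resource quantities (e.g. "2000m" vs. "2") are
// re-parsed into one canonical internal form before hashing.
func normalizedComputeHash(template *corev1.PodTemplateSpec, collisionCount *int32) (string, error) {
	data, err := json.Marshal(template)
	if err != nil {
		return "", err
	}
	normalized := corev1.PodTemplateSpec{}
	if err := json.Unmarshal(data, &normalized); err != nil {
		return "", err
	}
	// Delegate to the upstream hash once the object is normalized.
	return controller.ComputeHash(&normalized, collisionCount), nil
}
```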

jessesuen commented

The pod template hash inconsistency has been resolved in #75.

A second fix is still needed to scale down the old ReplicaSets.
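
For context, an illustrative sketch (hypothetical helper, not the actual controller code) of the invariant that second fix needs to enforce: any ReplicaSet whose hash matches neither the active nor the preview ReplicaSet should be scaled to zero.

```go
package rollout

import appsv1 "k8s.io/api/apps/v1"

// oldReplicaSetsToScaleDown is an illustrative helper, not the actual
// controller code: it selects every ReplicaSet that is neither the
// active nor the preview ReplicaSet but is still scaled above zero,
// so the reconciler can scale it down to 0.
func oldReplicaSetsToScaleDown(all []*appsv1.ReplicaSet, activeHash, previewHash string) []*appsv1.ReplicaSet {
	var old []*appsv1.ReplicaSet
	for _, rs := range all {
		hash := rs.Labels["rollouts-pod-template-hash"]
		if hash == activeHash || hash == previewHash {
			continue // the two ReplicaSets allowed to stay scaled up
		}
		if rs.Spec.Replicas != nil && *rs.Spec.Replicas > 0 {
			old = append(old, rs)
		}
	}
	return old
}
```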

jessesuen commented

Fixed.
