Istio Subset-Level traffic splitting cannot be used for Rolling Update #1405

Closed
gyz2009 opened this issue Aug 10, 2021 · 9 comments
Labels
bug Something isn't working

Comments

gyz2009 commented Aug 10, 2021

Summary

The update process does not scale down the old ReplicaSet as the canary weight increases, so it behaves more like a blue-green deployment.
It waits for all pods of the new ReplicaSet to start and only then scales the old ReplicaSet down to 0.

strategy:
  canary:
    # maxSurge: "25%"
    # maxUnavailable: "25%"
    scaleDownDelaySeconds: 30
    trafficRouting:
      istio:
        virtualService:
          name: helloworld-vsvc # required
          routes:
            - primary # optional if there is a single route in VirtualService, required otherwise
        destinationRule:
          name: helloworld-destrule # required
          canarySubsetName: canary # required
          stableSubsetName: stable # required
    steps:
      - setWeight: 20
      - pause: {duration: 20s}
      - setWeight: 40
      - pause: {duration: 20s}
      - setWeight: 60
      - pause: {duration: 20s}
      - setWeight: 80
      - pause: {duration: 20s}

Diagnostics

What version of Argo Rollouts are you running?
1.0.4


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@gyz2009 gyz2009 added the bug Something isn't working label Aug 10, 2021
@huikang
Member

huikang commented Aug 10, 2021

Hi @gyz2009, could you paste the rollout status in the middle of an update, e.g., after step 5 (setWeight: 60)?

@huikang
Member

huikang commented Aug 10, 2021

Never mind; I can reproduce this bug. Will look into this. Thanks.

NAME                                                         KIND         STATUS        AGE   INFO
⟳ rollouts-demo-istio                                        Rollout      ॥ Paused      2m4s  
├──# revision:2                                                                               
│  ├──⧉ rollouts-demo-istio-6b5cbf7465                       ReplicaSet   ✔ Healthy     107s  canary
│  │  ├──□ rollouts-demo-istio-6b5cbf7465-5djdj              Pod          ✔ Running     107s  ready:1/1
│  │  └──□ rollouts-demo-istio-6b5cbf7465-9blgw              Pod          ✔ Running     99s   ready:1/1
│  └──α rollouts-demo-istio-6b5cbf7465-2-1                   AnalysisRun  ✔ Successful  106s  ✔ 1
│     └──⊞ d5c11d15-7bfa-4555-93c1-72c13bc858c2.sleep-job.1  Job          ✔ Successful  106s  
└──# revision:1                                                                               
   └──⧉ rollouts-demo-istio-6668595956                       ReplicaSet   ✔ Healthy     2m4s  stable
      ├──□ rollouts-demo-istio-6668595956-jd7s7              Pod          ✔ Running     2m4s  ready:1/1
      ├──□ rollouts-demo-istio-6668595956-mdsq7              Pod          ✔ Running     2m4s  ready:1/1
      ├──□ rollouts-demo-istio-6668595956-sb8ql              Pod          ✔ Running     2m4s  ready:1/1
      ├──□ rollouts-demo-istio-6668595956-w4fqc              Pod          ✔ Running     2m4s  ready:1/1
      └──□ rollouts-demo-istio-6668595956-wgrxw              Pod          ✔ Running     2m4s  ready:1/1

@huikang
Member

huikang commented Aug 10, 2021

@jessesuen, I am looking at the following check:

if rollout.Spec.Strategy.Canary.TrafficRouting != nil {
	return desiredNewRSReplicaCount, rolloutSpecReplica
}

This seems to cause the old RS (or other RSs) not to be scaled down. Is this the desired behavior for traffic routing?
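To make the effect of that early return concrete, here is a minimal Go sketch. It is not the Argo Rollouts implementation: `desiredCounts` is a hypothetical helper, the ceiling-based canary count is an assumption, and the non-traffic-routing branch is deliberately simplified (it ignores maxSurge, maxUnavailable, and pod availability). It only illustrates that when traffic routing is configured, the stable ReplicaSet stays at the full spec.replicas while the canary grows.

```go
package main

import (
	"fmt"
	"math"
)

// desiredCounts is a hypothetical, simplified stand-in for the controller's
// replica calculation. It returns the desired canary and stable replica
// counts for a given canary weight (0-100).
func desiredCounts(specReplicas, weight int32, trafficRouting bool) (canary, stable int32) {
	// Assumption: the canary count is the weight share of spec.replicas, rounded up.
	canary = int32(math.Ceil(float64(specReplicas) * float64(weight) / 100))
	if trafficRouting {
		// Mirrors the early return quoted above: the stable ReplicaSet
		// keeps the full spec.replicas for the whole update.
		return canary, specReplicas
	}
	// Simplified basic canary: stable shrinks as the canary grows.
	return canary, specReplicas - canary
}

func main() {
	for _, w := range []int32{20, 40, 60, 80} {
		c, s := desiredCounts(5, w, true)
		fmt.Printf("weight=%d canary=%d stable=%d total=%d\n", w, c, s, c+s)
	}
}
```

Under these assumptions, a 5-replica Rollout at weight 40 ends up with 2 canary pods next to 5 stable pods, which matches the status output pasted earlier in this thread.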

@harikrongali
Contributor

@huikang https://github.com/argoproj/argo-rollouts/pull/341/files#r363676969

@huikang
Member

huikang commented Aug 11, 2021

@harikrongali, thanks so much for pointing me to the change. So this is the expected behavior.

@gyz2009, I think we can close this one as working as designed.

@gyz2009
Author

gyz2009 commented Aug 11, 2021

@huikang OK. I have some ReplicaSets with 100+ replicas; because of insufficient resources, the update will be stuck in a pending state.

@harikrongali
Contributor

@gyz2009 #1029 solves what you are seeing and should be available in the 1.1 release.

@jessesuen
Member

Duplicate of #1029

@jessesuen jessesuen marked this as a duplicate of #1029 Aug 17, 2021
@gyz2009
Author

gyz2009 commented Sep 28, 2021

@harikrongali I tested 1.1.0-RC1 and there was no change in behavior:

strategy:
  canary: 
    dynamicStableScale: true
    # maxSurge: "25%"
    # maxUnavailable: "25%"
    # scaleDownDelaySeconds: 30
    trafficRouting:
      istio:
        virtualService:
          name: helloworld-vsvc
          routes:
            - primary
        destinationRule:
          name: helloworld 
          canarySubsetName: canary # required
          stableSubsetName: stable # required
    steps:
      - setCanaryScale:
          matchTrafficWeight: true
      - setCanaryScale:
          weight: 30
      - pause: {duration: 60s}
      - setCanaryScale:
          weight: 60
      - setCanaryScale:
          weight: 100

4 participants