argo deletes persistant volume claims prematurely when parallelism is set lower than the number of items in loop #11119

Closed
2 of 3 tasks
abrabah opened this issue May 23, 2023 · 2 comments · Fixed by #11138
Labels
P3 Low priority type/bug

Comments

abrabah (Contributor) commented May 23, 2023

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

Submitting a workflow that loops over items in parallel and uses persistent volume claims causes the whole workflow to hang if workflow.spec.parallelism is set lower than the number of items. The controller logs indicate that "Max parallelism reached" is treated as an error in template execution, which in turn causes the PVCs to be garbage collected. I expected Argo Workflows to honor the parallelism field by throttling execution, that is, by not launching more sub-tasks than allowed.
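
The expected behaviour can be sketched as a toy model in Python. This is not the Argo controller code (which is written in Go) and it is not the change made in #11138; `Workflow`, `MaxParallelismReached`, `schedule`, `reconcile`, and `finish_one` are hypothetical names used only for illustration. The sketch shows the distinction the report is asking for: reaching the parallelism limit means "retry on the next pass", not "clean up the volumes".

```python
from dataclasses import dataclass, field


class MaxParallelismReached(Exception):
    """Hypothetical signal: no more pods may be started right now."""


@dataclass
class Workflow:
    parallelism: int
    pending: list                                  # items still waiting to run
    active: set = field(default_factory=set)       # items currently running
    pvcs: set = field(default_factory=set)         # volume claims still present
    done: list = field(default_factory=list)       # finished items


def schedule(wf):
    """Start pending items until the parallelism limit is hit."""
    while wf.pending:
        if len(wf.active) >= wf.parallelism:
            raise MaxParallelismReached()
        wf.active.add(wf.pending.pop(0))


def reconcile(wf):
    """One controller pass. PVCs are deleted only when all work is finished."""
    try:
        schedule(wf)
    except MaxParallelismReached:
        # Throttled, not failed: keep the volumes and retry on the next pass.
        return "throttled"
    if not wf.active:
        wf.pvcs.clear()
        return "completed"
    return "running"


def finish_one(wf):
    """Mark one active item as finished (stands in for a pod succeeding)."""
    wf.done.append(wf.active.pop())
```

With `parallelism=1` and two items, the first pass returns `"throttled"` and leaves both claims in place; the claims are cleared only on the pass where nothing is pending or active.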

Version

v3.4.7

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: loop-fail-3
spec:
  entrypoint: constrained-loops-with-volumes
  parallelism: 1
  volumeClaimTemplates:
  - metadata:
      name: one
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
  - metadata:
      name: two
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
  
  
  templates:
  - name: constrained-loops-with-volumes
    steps:
    - - name: write-and-read-msg-from-pvc
        template: write-and-read-msg-from-pvc
        arguments:
          parameters:
          - name: pvc
            value: '{{item}}'
        withItems: 
          - one
          - two
  - name: write-and-read-msg-from-pvc
    inputs:
      parameters:
      - name: pvc
    steps:
    - - name: write-msg
        template: write-msg
        arguments:
          parameters:
          - name: pvc
            value: '{{inputs.parameters.pvc}}'        
    - - name: read-msg
        template: read-msg
        arguments:
          parameters:
          - name: pvc
            value: '{{inputs.parameters.pvc}}'
  - name: write-msg
    inputs:
      parameters:
      - name: pvc
    script:
      image: busybox:stable
      command: ["sh"]
      source: | 
        echo "Hello! i'm writing to pvc; '{{inputs.parameters.pvc}}'" >> /pvc/msg
      volumeMounts:
      - name: '{{inputs.parameters.pvc}}'
        mountPath: /pvc
  - name: read-msg
    inputs:
      parameters:
      - name: pvc
    script:
      image: busybox:stable
      command: ["sh"]
      source: | 
        echo "Found the following message;"
        cat /pvc/msg
      volumeMounts:
      - name: '{{inputs.parameters.pvc}}'
        mountPath: /pvc

Logs from the workflow controller

workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.621Z" level=info msg="Processing workflow" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.627Z" level=info msg="Updated phase  -> Running" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.627Z" level=info msg="Creating pvc loop-fail-3-one" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.640Z" level=info msg="Creating pvc loop-fail-3-two" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.646Z" level=info msg="Steps node loop-fail-3 initialized Running" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.646Z" level=info msg="StepGroup node loop-fail-3-3565738732 initialized Running" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.647Z" level=info msg="Steps node loop-fail-3-1631888154 initialized Running" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.647Z" level=info msg="StepGroup node loop-fail-3-4073678796 initialized Running" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.651Z" level=info msg="Pod node loop-fail-3-37876413 initialized Pending" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.665Z" level=info msg="Created pod: loop-fail-3[0].write-and-read-msg-from-pvc(0:one)[0].write-msg (loop-fail-3-write-msg-37876413)" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.665Z" level=info msg="Workflow step group node loop-fail-3-4073678796 not yet completed" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.666Z" level=info msg="workflow active pod spec parallelism reached 1/1" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.666Z" level=info msg="Workflow step group node loop-fail-3-3565738732 not yet completed" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.666Z" level=info msg="TaskSet Reconciliation" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.666Z" level=info msg=reconcileAgentPod namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:07.740Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=2544 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.667Z" level=info msg="Processing workflow" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.667Z" level=info msg="Task-result reconciliation" namespace=default numObjs=0 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.667Z" level=info msg="node changed" namespace=default new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=loop-fail-3-37876413 old.message= old.phase=Pending old.progress=0/1 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.669Z" level=info msg="workflow active pod spec parallelism reached 1/1" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.669Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.669Z" level=info msg="Deleting PVC loop-fail-3-one" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.675Z" level=info msg="Deleting PVC loop-fail-3-two" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.682Z" level=info msg="Removing PVC \"kubernetes.io/pvc-protection\" finalizer" claimName=loop-fail-3-one namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.695Z" level=info msg="Removing PVC \"kubernetes.io/pvc-protection\" finalizer" claimName=loop-fail-3-two namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.709Z" level=info msg="Deleted 2/2 PVCs" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:17.726Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=2586 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.726Z" level=info msg="Processing workflow" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.727Z" level=info msg="Task-result reconciliation" namespace=default numObjs=0 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.727Z" level=info msg="node changed" namespace=default new.message= new.phase=Succeeded new.progress=0/1 nodeID=loop-fail-3-37876413 old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.727Z" level=info msg="Creating pvc loop-fail-3-one" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.734Z" level=info msg="Creating pvc loop-fail-3-two" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.747Z" level=info msg="Step group node loop-fail-3-4073678796 successful" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.747Z" level=info msg="node loop-fail-3-4073678796 phase Running -> Succeeded" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.747Z" level=info msg="node loop-fail-3-4073678796 finished: 2023-05-23 09:43:27.747649033 +0000 UTC" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.747Z" level=info msg="StepGroup node loop-fail-3-4140642177 initialized Running" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.747Z" level=info msg="SG Outbound nodes of loop-fail-3-37876413 are [loop-fail-3-37876413]" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.748Z" level=info msg="Pod node loop-fail-3-3172455969 initialized Pending" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.767Z" level=info msg="Created pod: loop-fail-3[0].write-and-read-msg-from-pvc(0:one)[1].read-msg (loop-fail-3-read-msg-3172455969)" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.767Z" level=info msg="Workflow step group node loop-fail-3-4140642177 not yet completed" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.767Z" level=info msg="workflow active pod spec parallelism reached 1/1" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.767Z" level=info msg="Workflow step group node loop-fail-3-3565738732 not yet completed" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.767Z" level=info msg="TaskSet Reconciliation" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.767Z" level=info msg=reconcileAgentPod namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.831Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=2623 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:27.838Z" level=info msg="cleaning up pod" action=labelPodCompleted key=default/loop-fail-3-write-msg-37876413/labelPodCompleted
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.754Z" level=info msg="Processing workflow" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.754Z" level=info msg="Task-result reconciliation" namespace=default numObjs=0 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.755Z" level=info msg="node changed" namespace=default new.message= new.phase=Succeeded new.progress=0/1 nodeID=loop-fail-3-3172455969 old.message= old.phase=Pending old.progress=0/1 workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.756Z" level=info msg="SG Outbound nodes of loop-fail-3-37876413 are [loop-fail-3-37876413]" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.756Z" level=info msg="Step group node loop-fail-3-4140642177 successful" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.756Z" level=info msg="node loop-fail-3-4140642177 phase Running -> Succeeded" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.756Z" level=info msg="node loop-fail-3-4140642177 finished: 2023-05-23 09:43:37.756936656 +0000 UTC" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.756Z" level=info msg="Outbound nodes of loop-fail-3-3172455969 is [loop-fail-3-3172455969]" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.757Z" level=info msg="Outbound nodes of loop-fail-3-1631888154 is [loop-fail-3-3172455969]" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.757Z" level=info msg="node loop-fail-3-1631888154 phase Running -> Succeeded" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.757Z" level=info msg="node loop-fail-3-1631888154 finished: 2023-05-23 09:43:37.757031699 +0000 UTC" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.757Z" level=info msg="Checking daemoned children of loop-fail-3-1631888154" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.757Z" level=info msg="Steps node loop-fail-3-1336303057 initialized Running" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.757Z" level=info msg="StepGroup node loop-fail-3-1269459701 initialized Running" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.757Z" level=info msg="Pod node loop-fail-3-2820639080 initialized Pending" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.774Z" level=info msg="Created pod: loop-fail-3[0].write-and-read-msg-from-pvc(1:two)[0].write-msg (loop-fail-3-write-msg-2820639080)" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.774Z" level=info msg="Workflow step group node loop-fail-3-1269459701 not yet completed" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.774Z" level=info msg="Workflow step group node loop-fail-3-3565738732 not yet completed" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.774Z" level=info msg="TaskSet Reconciliation" namespace=default workflow=loop-fail-3
workflow-controller-575d78d78f-kb4b5 workflow-controller time="2023-05-23T09:43:37.774Z" level=info msg=reconcileAgentPod namespace=default workflow=loop-fail-3
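
The pattern in the log above (a "Max parallelism reached" error followed by "Deleting PVC" in the same reconciliation pass) can be located mechanically. The following is a minimal Python sketch, assuming the controller log lines use the key=value layout shown above; `parse_logfmt` and `premature_pvc_deletions` are hypothetical helper names, not part of Argo.

```python
import re

# key=value pairs, where the value is either a double-quoted string
# (with backslash escapes) or a run of non-space characters.
_PAIR = re.compile(r'([\w.]+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_logfmt(line):
    """Return the key=value fields of one controller log line as a dict."""
    fields = {}
    for key, raw in _PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            # Strip the surrounding quotes and unescape embedded quotes.
            raw = raw[1:-1].replace('\\"', '"')
        fields[key] = raw
    return fields


def premature_pvc_deletions(lines):
    """Return PVC names deleted after a 'Max parallelism reached' error
    within the same reconciliation pass."""
    hits = []
    throttled = False
    for line in lines:
        f = parse_logfmt(line)
        msg = f.get("msg", "")
        if f.get("error") == "Max parallelism reached":
            throttled = True
        elif msg.startswith("Deleting PVC ") and throttled:
            hits.append(msg[len("Deleting PVC "):])
        elif msg == "Processing workflow":
            throttled = False  # a new reconciliation pass begins
    return hits
```

Any leading pod and container name on a line is ignored, because only `key=value` tokens are matched.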

Logs from your workflow's wait container

N/A

ajoskowski commented May 25, 2023

I can also confirm that the bug exists in 3.4.7.

We upgraded Argo Workflows from 3.3.8 to 3.4.7.
Now, when we run a build with parallelism set to 2, we observe the following sequence:

  • The first two steps start
  • The PersistentVolumeClaim is created
  • The PersistentVolume is created
  • The parallelism limit is reached
  • The PersistentVolumeClaim is removed
  • The next steps cannot start: they fail with "unable to attach volume" because the PersistentVolumeClaim is missing

Logs from Workflow Controller:

time="2023-05-25T10:30:17.804Z" level=info msg="resolved artifact repository" artifactRepositoryRef=default-artifact-repository
time="2023-05-25T10:30:17.804Z" level=info msg="Updated phase  -> Running" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.807Z" level=info msg="Creating pvc day2ops-cassandra-nodetool-4lhbx-workdir" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.811Z" level=info msg="Create events 201"
time="2023-05-25T10:30:17.823Z" level=info msg="Create persistentvolumeclaims 201"
time="2023-05-25T10:30:17.824Z" level=info msg="Steps node day2ops-cassandra-nodetool-4lhbx initialized Running" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.824Z" level=info msg="StepGroup node day2ops-cassandra-nodetool-4lhbx-4189665980 initialized Running" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.824Z" level=info msg="Pod node day2ops-cassandra-nodetool-4lhbx-3577700546 initialized Pending" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.848Z" level=info msg="Create pods 201"
time="2023-05-25T10:30:17.849Z" level=info msg="Created pod: day2ops-cassandra-nodetool-4lhbx[0].welcome (day2ops-cassandra-nodetool-4lhbx-welcome-msg-3577700546)" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.849Z" level=info msg="Pod node day2ops-cassandra-nodetool-4lhbx-77669838 initialized Pending" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.879Z" level=info msg="Create pods 201"
time="2023-05-25T10:30:17.880Z" level=info msg="Created pod: day2ops-cassandra-nodetool-4lhbx[0].clone-dok-infrastructure-tooling-repository (day2ops-cassandra-nodetool-4lhbx-git-clone-77669838)" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.880Z" level=info msg="workflow active pod spec parallelism reached 2/2" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.880Z" level=info msg="workflow active pod spec parallelism reached 2/2" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.880Z" level=info msg="Workflow step group node day2ops-cassandra-nodetool-4lhbx-4189665980 not yet completed" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.880Z" level=info msg="TaskSet Reconciliation" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.880Z" level=info msg=reconcileAgentPod namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.880Z" level=info msg="Workflow to be dehydrated" Workflow Size=11429
time="2023-05-25T10:30:17.900Z" level=info msg="Update workflows 200"
time="2023-05-25T10:30:17.901Z" level=info msg="Workflow update successful" namespace=vct-dok-argo-workflows phase=Running resourceVersion=187417323 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:17.909Z" level=info msg="Create events 201"
time="2023-05-25T10:30:17.928Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:17.932Z" level=info msg="Create events 201"
time="2023-05-25T10:30:17.942Z" level=info msg="Update leases 200"
time="2023-05-25T10:30:22.947Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:22.954Z" level=info msg="Update leases 200"
time="2023-05-25T10:30:27.853Z" level=info msg="Processing workflow" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.853Z" level=info msg="Task-result reconciliation" namespace=vct-dok-argo-workflows numObjs=1 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.853Z" level=info msg="task-result changed" namespace=vct-dok-argo-workflows nodeID=day2ops-cassandra-nodetool-4lhbx-3577700546 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.853Z" level=info msg="node changed" namespace=vct-dok-argo-workflows new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=day2ops-cassandra-nodetool-4lhbx-77669838 old.message= old.phase=Pending old.progress=0/1 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.853Z" level=info msg="node changed" namespace=vct-dok-argo-workflows new.message= new.phase=Succeeded new.progress=0/1 nodeID=day2ops-cassandra-nodetool-4lhbx-3577700546 old.message= old.phase=Pending old.progress=0/1 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.858Z" level=info msg="Pod node day2ops-cassandra-nodetool-4lhbx-2423292963 initialized Pending" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.874Z" level=info msg="Create pods 201"
time="2023-05-25T10:30:27.875Z" level=info msg="Created pod: day2ops-cassandra-nodetool-4lhbx[0].clone-dok-infrastructure-tooling-state (day2ops-cassandra-nodetool-4lhbx-git-clone-2423292963)" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.875Z" level=info msg="workflow active pod spec parallelism reached 2/2" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.875Z" level=info msg="Workflow step group node day2ops-cassandra-nodetool-4lhbx-4189665980 not yet completed" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.875Z" level=info msg="TaskSet Reconciliation" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.875Z" level=info msg=reconcileAgentPod namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.876Z" level=info msg="Workflow to be dehydrated" Workflow Size=12433
time="2023-05-25T10:30:27.894Z" level=info msg="Update workflows 200"
time="2023-05-25T10:30:27.896Z" level=info msg="Workflow update successful" namespace=vct-dok-argo-workflows phase=Running resourceVersion=187417554 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:27.901Z" level=info msg="Create events 201"
time="2023-05-25T10:30:27.907Z" level=info msg="Create events 201"
time="2023-05-25T10:30:27.959Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:27.966Z" level=info msg="Update leases 200"
time="2023-05-25T10:30:32.897Z" level=info msg="cleaning up pod" action=deletePod key=vct-dok-argo-workflows/day2ops-cassandra-nodetool-4lhbx-welcome-msg-3577700546/deletePod
time="2023-05-25T10:30:32.930Z" level=info msg="Delete pods 200"
time="2023-05-25T10:30:32.973Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:32.979Z" level=info msg="Update leases 200"
time="2023-05-25T10:30:37.878Z" level=info msg="Processing workflow" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.878Z" level=info msg="Task-result reconciliation" namespace=vct-dok-argo-workflows numObjs=1 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.878Z" level=info msg="node changed" namespace=vct-dok-argo-workflows new.message= new.phase=Running new.progress=0/1 nodeID=day2ops-cassandra-nodetool-4lhbx-2423292963 old.message= old.phase=Pending old.progress=0/1 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.878Z" level=info msg="node changed" namespace=vct-dok-argo-workflows new.message= new.phase=Running new.progress=0/1 nodeID=day2ops-cassandra-nodetool-4lhbx-77669838 old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.879Z" level=info msg="workflow active pod spec parallelism reached 2/2" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.879Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.879Z" level=info msg="Deleting PVC day2ops-cassandra-nodetool-4lhbx-workdir" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.888Z" level=info msg="Delete persistentvolumeclaims 200"
time="2023-05-25T10:30:37.888Z" level=info msg="Removing PVC \"kubernetes.io/pvc-protection\" finalizer" claimName=day2ops-cassandra-nodetool-4lhbx-workdir namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.892Z" level=info msg="Get persistentvolumeclaims 200"
time="2023-05-25T10:30:37.901Z" level=info msg="Update persistentvolumeclaims 200"
time="2023-05-25T10:30:37.902Z" level=info msg="Deleted 1/1 PVCs" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.902Z" level=info msg="Workflow to be dehydrated" Workflow Size=12403
time="2023-05-25T10:30:37.914Z" level=info msg="Update workflows 200"
time="2023-05-25T10:30:37.916Z" level=info msg="Workflow update successful" namespace=vct-dok-argo-workflows phase=Running resourceVersion=187417716 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:37.924Z" level=info msg="Create events 201"
time="2023-05-25T10:30:37.930Z" level=info msg="Create events 201"
time="2023-05-25T10:30:37.986Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:37.995Z" level=info msg="Update leases 200"
time="2023-05-25T10:30:43.000Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:43.006Z" level=info msg="Update leases 200"
time="2023-05-25T10:30:47.919Z" level=info msg="Processing workflow" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.919Z" level=info msg="Task-result reconciliation" namespace=vct-dok-argo-workflows numObjs=1 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.919Z" level=info msg="node unchanged" namespace=vct-dok-argo-workflows nodeID=day2ops-cassandra-nodetool-4lhbx-2423292963 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.919Z" level=info msg="node unchanged" namespace=vct-dok-argo-workflows nodeID=day2ops-cassandra-nodetool-4lhbx-77669838 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.919Z" level=info msg="Creating pvc day2ops-cassandra-nodetool-4lhbx-workdir" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.942Z" level=info msg="Create persistentvolumeclaims 201"
time="2023-05-25T10:30:47.942Z" level=info msg="workflow active pod spec parallelism reached 2/2" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.942Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.942Z" level=info msg="Deleting PVC day2ops-cassandra-nodetool-4lhbx-workdir" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.951Z" level=info msg="Delete persistentvolumeclaims 200"
time="2023-05-25T10:30:47.951Z" level=info msg="Removing PVC \"kubernetes.io/pvc-protection\" finalizer" claimName=day2ops-cassandra-nodetool-4lhbx-workdir namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.964Z" level=info msg="Get persistentvolumeclaims 200"
time="2023-05-25T10:30:47.979Z" level=info msg="Update persistentvolumeclaims 200"
time="2023-05-25T10:30:47.979Z" level=info msg="Deleted 1/1 PVCs" namespace=vct-dok-argo-workflows workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:47.980Z" level=info msg="Workflow to be dehydrated" Workflow Size=12404
time="2023-05-25T10:30:48.001Z" level=info msg="Update workflows 200"
time="2023-05-25T10:30:48.003Z" level=info msg="Workflow update successful" namespace=vct-dok-argo-workflows phase=Running resourceVersion=187417826 workflow=day2ops-cassandra-nodetool-4lhbx
time="2023-05-25T10:30:48.012Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:48.021Z" level=info msg="Update leases 200"
time="2023-05-25T10:30:53.026Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:53.032Z" level=info msg="Update leases 200"
time="2023-05-25T10:30:58.039Z" level=info msg="Get leases 200"
time="2023-05-25T10:30:58.045Z" level=info msg="Update leases 200"
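
To see how often a claim is recreated and removed while the workflow is still running, the create and delete messages can be tallied per claim. This is a small Python sketch that assumes the message wording shown in the logs above ("Creating pvc …" and "Deleting PVC …"); `pvc_churn` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the controller's PVC create/delete messages as they appear in the logs.
_PVC_EVENT = re.compile(r'msg="(Creating pvc|Deleting PVC) ([^"\s]+)"')


def pvc_churn(lines):
    """Return {pvc_name: (times_created, times_deleted)} for the given log lines."""
    created, deleted = Counter(), Counter()
    for line in lines:
        match = _PVC_EVENT.search(line)
        if match is None:
            continue
        action, name = match.groups()
        if action == "Creating pvc":
            created[name] += 1
        else:
            deleted[name] += 1
    names = set(created) | set(deleted)
    return {name: (created[name], deleted[name]) for name in names}
```

Applied to the log above, this reports two creations and two deletions for day2ops-cassandra-nodetool-4lhbx-workdir, all while the workflow phase is still Running.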

abrabah changed the title from "argo deletes persistant volume claims prematurely when paralellism is set lower than the number of items in loop" to "argo deletes persistant volume claims prematurely when parallelism is set lower than the number of items in loop" May 26, 2023
sarabala1979 added the P3 Low priority label Jun 1, 2023

stale bot commented Jun 18, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.

stale bot added the problem/stale label Jun 18, 2023
JPZ13 pushed a commit to pipekit/argo-workflows that referenced this issue Jul 4, 2023
agilgur5 removed the problem/stale label Aug 28, 2023
dpadhiar pushed a commit to dpadhiar/argo-workflows that referenced this issue May 9, 2024
4 participants