Daemonset stuck in progressing #16951

Open · JBodkin-Amphora opened this issue Jan 22, 2024 · 2 comments
Labels: bug, component:application-controller, component:health-check, version:EOL

Comments

@JBodkin-Amphora

Describe the bug

According to ArgoCD, the daemonset is stuck in the progressing phase, even though it is running on each of the two nodes in the spot node pool.

Clicking on the daemonset shows the following message in the health details: Waiting for daemon set "opentelemetry-collector-agent" rollout to finish: 0 of 3 updated pods are available...

The status field on the live manifest is:

```yaml
status:
  currentNumberScheduled: 2
  desiredNumberScheduled: 2
  numberAvailable: 2
  numberMisscheduled: 0
  numberReady: 2
  observedGeneration: 3
  updatedNumberScheduled: 2
```

The output of `kubectl -n opentelemetry rollout status daemonset/opentelemetry-collector-agent` is `daemon set "opentelemetry-collector-agent" successfully rolled out`.
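
For reference, the live status shown above can be pulled straight from the cluster with something like (namespace and name as in the manifest below):

```sh
kubectl -n opentelemetry get daemonset opentelemetry-collector-agent \
  -o jsonpath='{.status}'
```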

To Reproduce

Deploy the OpenTelemetry Collector as an application with two node pools on Azure:

  1. System Node Pool
  2. Spot Node Pool
```yaml
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  namespace: argocd
  name: opentelemetry-collector
spec:
  project: default
  source:
    chart: opentelemetry-collector
    repoURL: https://open-telemetry.github.io/opentelemetry-helm-charts
    targetRevision: 0.78.1
    helm:
      valuesObject:
        mode: daemonset
        tolerations:
          - key: kubernetes.azure.com/scalesetpriority
            operator: Equal
            value: spot
            effect: NoSchedule
  destination:
    server: https://kubernetes.default.svc
    namespace: opentelemetry
  syncPolicy:
    automated:
      prune: true
    syncOptions:
      - CreateNamespace=true
```
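
For reference, a spot node pool like the one in step 2 can be created along these lines with the Azure CLI; `my-rg` and `my-aks-cluster` are placeholder names, and AKS adds the `kubernetes.azure.com/scalesetpriority=spot:NoSchedule` taint to spot pools automatically:

```sh
# Placeholder resource group and cluster names; adjust for your environment.
az aks nodepool add \
  --resource-group my-rg \
  --cluster-name my-aks-cluster \
  --name spotpool \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --node-count 2
```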

Expected behavior

The daemonset should be marked as healthy because it is running two pods, one on each of the spot nodes. The tolerations do not allow the daemonset to run on the system node pool, as the pod does not have the critical addons toleration. Since daemonsets are a built-in Kubernetes resource, I would expect this to work without having to implement a custom health check.
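
As a stopgap (not what this issue is asking for), Argo CD's built-in assessment can be overridden with a resource health customization in `argocd-cm`. A minimal Lua sketch that reports Healthy once the status fields shown above are consistent with the spec:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Override the built-in health assessment for apps/DaemonSet.
  resource.customizations.health.apps_DaemonSet: |
    hs = {}
    hs.status = "Progressing"
    hs.message = "Waiting for daemon set rollout to finish"
    -- Only trust the status once the controller has observed the latest spec.
    if obj.status ~= nil and obj.status.observedGeneration ~= nil
       and obj.metadata.generation <= obj.status.observedGeneration then
      if obj.status.updatedNumberScheduled ~= nil
         and obj.status.desiredNumberScheduled ~= nil
         and obj.status.updatedNumberScheduled >= obj.status.desiredNumberScheduled
         and obj.status.numberAvailable ~= nil
         and obj.status.numberAvailable >= obj.status.desiredNumberScheduled then
        hs.status = "Healthy"
        hs.message = "All desired pods are scheduled and available"
      end
    end
    return hs
```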


Version

```text
argocd: v2.9.3+6eba5be
  BuildDate: 2023-12-01T23:24:09Z
  GitCommit: 6eba5be864b7e031871ed7698f5233336dfe75c7
  GitTreeState: clean
  GoVersion: go1.21.4
  Compiler: gc
  Platform: windows/amd64
argocd-server: v2.9.4+bb06722
```
@JBodkin-Amphora added the bug label on Jan 22, 2024
@Samir-NT (Contributor) commented on Feb 20, 2024

Same issue here (v2.10.1+a79e0ea), but this looks like an (earlier) duplicate of #17208.

@andrii-korotkov-verkada (Contributor) commented

ArgoCD versions 2.10 and below have reached EOL. Can you upgrade and let us know if the issue is still present, please?

@andrii-korotkov-verkada added the version:EOL label on Nov 11, 2024