Application controller should be more resilient to network latency or K8s API server hiccups #7692
Closed · jannfis opened this issue on Nov 12, 2021 · 0 comments · Fixed by #16154 · May be fixed by aborilov/argo-cd#3
Labels
component:core (Syncing, diffing, cluster state cache)
enhancement (New feature or request)
type:scalability (Issues related to scalability and performance)
type:supportability (Enhancements that help operators to run Argo CD)
jannfis added the enhancement, component:core, type:scalability, and type:supportability labels on Nov 12, 2021
This was referenced Oct 27, 2023

The fixing commit ("add retry logic for k8s client" and "add docs for retry logic and envs to manifests", Signed-off-by: Pavel Aborilov <[email protected]>) was subsequently referenced in commits on the following repositories:

alexmt pushed a commit that referenced this issue on Nov 2, 2023
jmilic1 pushed a commit to jmilic1/argo-cd on Nov 13, 2023
aborilov added a commit to aborilov/argo-cd on Nov 21, 2023
vladfr pushed a commit to vladfr/argo-cd on Dec 13, 2023
tesla59 pushed a commit to tesla59/argo-cd on Dec 16, 2023
alexmt pushed two commits to alexmt/argo-cd on Jan 19, 2024
lyda pushed a commit to lyda/argo-cd on Mar 28, 2024
aborilov added a commit to aborilov/argo-cd on Apr 29, 2024
Hariharasuthan99 pushed a commit to AmadeusITGroup/argo-cd on Jun 16, 2024
Summary
On network links with very high latency, or which are very slow, the remote Kubernetes API server sometimes drops the connection before a request (e.g. fetching remote resources or the list of available APIs) can be completed. This currently leads to comparison errors in the application, or to other operations being aborted (e.g. retrieving manifests using argocd app manifests). Effectively, such a hiccup during an operation can prevent a sync from happening, because the errors are considered fatal. This can be observed whenever the network link between the Argo CD control plane and the managed cluster is a little flaky.
Motivation
Improve stability on high-latency networks and don't fail operations on the first hiccup.
Proposal
When a request to a remote Kubernetes API endpoint fails with a non-permanent error (for example a timeout or a dropped connection), the request should be retried instead of immediately failing the whole operation.
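To illustrate the idea, here is a minimal Go sketch of retrying a single API call with client-go's retry.OnError helper and exponential backoff. The kubeconfig path and the isTransient predicate are placeholder assumptions for this sketch, not Argo CD's actual implementation; the fix referenced above ("add retry logic for k8s client") wires comparable retry behavior into the controller's Kubernetes clients.

```go
package main

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	utilnet "k8s.io/apimachinery/pkg/util/net"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/retry"
)

// isTransient reports whether an error from the API server is likely
// temporary (throttling, timeout, dropped connection) and worth retrying.
// The exact set of retriable errors is an assumption for this sketch.
func isTransient(err error) bool {
	return apierrors.IsServerTimeout(err) ||
		apierrors.IsTimeout(err) ||
		apierrors.IsTooManyRequests(err) ||
		apierrors.IsServiceUnavailable(err) ||
		utilnet.IsConnectionReset(err) ||
		utilnet.IsProbableEOF(err)
}

func main() {
	// Hypothetical kubeconfig path, used only for this example.
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Retry the list call with exponential backoff on transient errors
	// instead of treating the first hiccup as fatal.
	err = retry.OnError(retry.DefaultBackoff, isTransient, func() error {
		_, listErr := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
		return listErr
	})
	if err != nil {
		fmt.Println("list failed after retries:", err)
	}
}
```

A bounded backoff keeps this safe: transient drops are retried a few times, while a genuinely unreachable API server still surfaces an error rather than stalling the controller indefinitely.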