Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application Controller reconciling apps every 15s - even though timeout.reconcilliation=30m #17165

Closed
2 of 3 tasks
diranged opened this issue Feb 9, 2024 · 1 comment
Closed
2 of 3 tasks
Labels
bug Something isn't working

Comments

@diranged
Copy link

diranged commented Feb 9, 2024

Checklist:

  • I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug

We're monitoring our argocd-application-controller-0 pod and noticing that it is reconciling Applications much more frequently than we would expect. We use Github Webhooks to trigger updates, and we set the timeout.reconciliation: 30m flag to prevent unnecessary work. We have hundreds of applications, so the work adds up quite quickly.

In our logs we can see that the reconciliation happens every 15s across every single application we have. Here's a small snapshot from one particular application:

time="2024-02-09T19:07:58Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Refreshing app status (comparison expired, requesting refresh. reconciledAt: 2024-02-09 18:38:26 +0000 UTC, expiry: 30m0s), level (2)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Refreshing app status (comparison expired, requesting refresh. reconciledAt: 2024-02-09 18:38:26 +0000 UTC, expiry: 30m0s), level (2)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV

The comparison expired ... refresh was valid ... it had been 30m since the status.reconciledAt field on this application had been updated:

Before that particular refresh:

% k get applications APP-app-APPENV -o yaml | yq .status.reconciledAt
2024-02-09T18:38:26Z

After 19:08:28...

% k get applications APP-app-APPENV -o yaml | yq .status.reconciledAt
2024-02-09T19:08:28Z

I started to dig through the code and based on the Refreshing app status (controller refresh requested), level (1) message, I believe we are hitting this part of appcontroller.go that is calling ctrl.isRefreshRequested(app.QualifiedName()). When I read that code though - I don't really understand the goal.

func (ctrl *ApplicationController) isRefreshRequested(appName string) (bool, CompareWith) {
ctrl.refreshRequestedAppsMutex.Lock()
defer ctrl.refreshRequestedAppsMutex.Unlock()
level, ok := ctrl.refreshRequestedApps[appName]
if ok {
delete(ctrl.refreshRequestedApps, appName)
}
return ok, level
}

The isRefreshRequested code seems to determine if the AppName is in a list of apps that need to be refreshed. If it is, it deletes the app from that list, and then returns true signaling a refresh is required. To sanity check my understanding I went over to goland and wrote up this quick app:

// You can edit this code!
// Click here and start typing.
package main

import (
	"fmt"
	"math"
	"strconv"
)

type CompareWith int

const (
	CompareWithLatestForceResolve CompareWith = 3
	CompareWithLatest             CompareWith = 2
	CompareWithRecent             CompareWith = 1
	ComparisonWithNothing         CompareWith = 0
)

func (a CompareWith) Max(b CompareWith) CompareWith {
	return CompareWith(math.Max(float64(a), float64(b)))
}

func (a CompareWith) Pointer() *CompareWith {
	return &a
}

type ApplicationController struct {
	refreshRequestedApps map[string]CompareWith
}

func main() {
	fmt.Println("Hello, 世界")

	ctrl := ApplicationController{
		refreshRequestedApps: make(map[string]CompareWith),
	}

	// example: app not there
	level, ok := ctrl.refreshRequestedApps["foo"]
	if ok {
		delete(ctrl.refreshRequestedApps, "foo")
	}
	fmt.Printf("Foo Level: %d, OK: %s\n", level, strconv.FormatBool(ok))

	// example: app is there
	ctrl.refreshRequestedApps["foobar"] = CompareWithRecent
	level, ok = ctrl.refreshRequestedApps["foobar"]
	if ok {
		delete(ctrl.refreshRequestedApps, "foobar")
	}
	fmt.Printf("FooBar Level: %d, OK: %s\n", level, strconv.FormatBool(ok))

	level, ok = ctrl.refreshRequestedApps["foobar"]
	fmt.Printf("FooBar Level (second check): %d, OK: %s\n", level, strconv.FormatBool(ok))
}

When we run it, we get the expected output:

Hello, 世界
Foo Level: 0, OK: false
FooBar Level: 1, OK: true
FooBar Level (second check): 0, OK: false

Program exited.

With this understanding, it implies that our APP-app-APPENV application is being added to the ctrl.refreshRequestedApps map out of this codepath, and then this codepath is iterating every 15s and determining "oh, i do need to refresh" because the app has been put into the map.

For the life of me though, I cannot figure out where else in the code it's being added to the map. This search yields no outside callers that I can find..

Logs

Here's a more complete log:

time="2024-02-09T19:07:58Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=7 unmarshal_ms=7 version_ms=0
time="2024-02-09T19:07:58Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="No status changes. Skipping patch" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=15 fields.level=1 git_ms=7 health_ms=3 live_ms=12 patch_ms=0 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=73
time="2024-02-09T19:07:58Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=9 unmarshal_ms=8 version_ms=0
time="2024-02-09T19:07:58Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="No status changes. Skipping patch" application=argocd/APP-app-APPENV
time="2024-02-09T19:07:58Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=10 fields.level=1 git_ms=9 health_ms=3 live_ms=18 patch_ms=0 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=68
time="2024-02-09T19:08:13Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=7 unmarshal_ms=7 version_ms=0
time="2024-02-09T19:08:13Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="No status changes. Skipping patch" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=12 fields.level=1 git_ms=7 health_ms=3 live_ms=15 patch_ms=0 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=78
time="2024-02-09T19:08:13Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=7 unmarshal_ms=7 version_ms=0
time="2024-02-09T19:08:13Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="No status changes. Skipping patch" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:13Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=14 fields.level=1 git_ms=7 health_ms=3 live_ms=14 patch_ms=0 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=75
time="2024-02-09T19:08:28Z" level=info msg="Refreshing app status (comparison expired, requesting refresh. reconciledAt: 2024-02-09 18:38:26 +0000 UTC, expiry: 30m0s), level (2)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=197 unmarshal_ms=196 version_ms=0
time="2024-02-09T19:08:28Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Update successful" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=113 fields.level=2 git_ms=197 health_ms=3 live_ms=15 patch_ms=43 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=410
time="2024-02-09T19:08:28Z" level=info msg="Refreshing app status (comparison expired, requesting refresh. reconciledAt: 2024-02-09 18:38:26 +0000 UTC, expiry: 30m0s), level (2)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=17 unmarshal_ms=16 version_ms=0
time="2024-02-09T19:08:28Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Update successful" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=100 fields.level=2 git_ms=17 health_ms=3 live_ms=13 patch_ms=28 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=189
time="2024-02-09T19:08:28Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=7 unmarshal_ms=7 version_ms=0
time="2024-02-09T19:08:28Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="No status changes. Skipping patch" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:28Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=9 fields.level=1 git_ms=7 health_ms=3 live_ms=10 patch_ms=0 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=57
time="2024-02-09T19:08:43Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=7 unmarshal_ms=7 version_ms=0
time="2024-02-09T19:08:43Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="No status changes. Skipping patch" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=14 fields.level=1 git_ms=7 health_ms=3 live_ms=13 patch_ms=0 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=73
time="2024-02-09T19:08:43Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="Comparing app state (cluster: https://XXX.us-west-2.eks.amazonaws.com, namespace: APP)" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="getRepoObjs stats" application=argocd/APP-app-APPENV build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=7 unmarshal_ms=7 version_ms=0
time="2024-02-09T19:08:43Z" level=info msg="Skipping auto-sync: application status is Synced" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="No status changes. Skipping patch" application=argocd/APP-app-APPENV
time="2024-02-09T19:08:43Z" level=info msg="Reconciliation completed" application=argocd/APP-app-APPENV dedup_ms=0 dest-name= dest-namespace=APP dest-server="https://XXX.us-west-2.eks.amazonaws.com" diff_ms=9 fields.level=1 git_ms=7 health_ms=3 live_ms=13 patch_ms=0 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=60

Version

argocd-server: v2.9.5+f943664
@diranged diranged added the bug Something isn't working label Feb 9, 2024
@diranged
Copy link
Author

diranged commented Feb 9, 2024

Ok. I am closing this bug - but leaving it around as a reference for others. What I missed was searching for the callers of the requestAppRefresh function. When you search for that, you can find this bit with the Requesting app refresh caused by object update log message.. flipping on debug logs makes it really clear.

In our case, it's mostly the HorizontalPodAutoscaler and ScaledObject resources causing these updates... which lead me to #13912 (comment).

@diranged diranged closed this as completed Feb 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant