Bug: Sending resource to Azure even if there are no configuration changes #2590
Comments
The behaviour you're seeing is by design, but we need to do something to prevent the throttling you're experiencing. ASO treats the custom resources in your cluster as the goal state and issues PUTs of those resources to Azure to ensure they haven't drifted from that goal state. Much earlier in the life of ASO, we only issued updates to Azure when the Custom Resource was modified, which caused problems for some customers when changes were made in Azure and the resources drifted from the desired configuration. We also discovered that this was contrary to the expected behaviour of a Kubernetes Operator. We removed the use of a spec-hash in #2022. Our desired behaviour is to be smarter about how we reconcile; there's discussion on this in #1491.
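To illustrate why gating updates on a spec hash misses drift, here is a minimal sketch. It is not ASO's actual code: `spec_hash`, `should_put_with_hash`, and the sample storage account fields are hypothetical names chosen for the example.

```python
import hashlib
import json


def spec_hash(spec: dict) -> str:
    """Stable hash of a resource spec (keys sorted so ordering does not matter)."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def should_put_with_hash(spec: dict, last_applied_hash: str) -> bool:
    """Hash-gated reconcile: only PUT when the custom resource itself changed."""
    return spec_hash(spec) != last_applied_hash


# Hypothetical goal state held in the cluster.
goal = {"sku": "Standard_LRS", "allowBlobPublicAccess": False}
applied = spec_hash(goal)

# Someone edits the resource directly in Azure; the custom resource is untouched.
azure_state = dict(goal, allowBlobPublicAccess=True)

drifted = azure_state != goal
skipped = not should_put_with_hash(goal, applied)
print(f"drifted={drifted} put_skipped={skipped}")
```

Because the hash only covers the custom resource, the Azure-side change is never corrected until someone edits the custom resource again.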
I'll check in with the rest of the ASO team - I would have expected the reconciles to be somewhat spread out in time.
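One common way to spread periodic reconciles out in time is to add random jitter to each requeue delay. The sketch below is only an illustration of that general technique; whether and how ASO does this is not stated here, and `jittered_delay` and the 10% jitter value are assumptions made for the example.

```python
import random


def jittered_delay(sync_period_s: float, jitter: float, rng: random.Random) -> float:
    """Return a delay drawn uniformly from sync_period_s * [1 - jitter, 1 + jitter]."""
    if not 0.0 <= jitter < 1.0:
        raise ValueError("jitter must be in [0, 1)")
    low = sync_period_s * (1.0 - jitter)
    high = sync_period_s * (1.0 + jitter)
    return rng.uniform(low, high)


rng = random.Random(0)  # seeded only so the sketch is repeatable
# 300 resources, 1h sync period, +/-10% jitter (all illustrative values).
delays = [jittered_delay(3600.0, 0.1, rng) for _ in range(300)]
spread_s = max(delays) - min(delays)
print(f"requeue times spread over about {spread_s:.0f} seconds")
```

With jitter, resources that were all reconciled at the same moment gradually drift apart over successive cycles instead of hitting Azure in a single burst.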
As @theunrepentantgeek mentioned, it is by design that ASO issues requests to Azure periodically even when in steady state. This is to correct drift on the Azure side if changes have been made there without the operator's knowledge. As you correctly determined, the If you really want ASO to do nothing while in steady state, you can set the We also just recently changed the default syncPeriod from 15m to 1h (#2578) due to throttling concerns. This will be included in the upcoming beta.4 release. It is unlikely that ASO will ever recommend issuing no requests to Azure while you're in steady state. With that said, we are definitely aware that throttling is a problem with this pattern and in general. The changes I see requested here are:
What you can do in the meantime: The other thing to check for is a resource stuck in a bad state triggering much faster requests (ASO logs or metrics can help you find this). We had another customer report something like this, and an improvement is coming in beta.4 as well (see #2575).
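As a rough way to spot a resource that is being sent to Azure far more often than the others, the controller logs can be tallied per resource. This is a sketch that assumes the log line format shown in the bug report in this issue; `count_sends` is a hypothetical helper, and the second sample line is a made-up repeat used only to demonstrate counting.

```python
import re
from collections import Counter

# Matches the "About to send resource to Azure" entries and captures name/namespace.
SEND_RE = re.compile(
    r'"msg"="About to send resource to Azure".*?"name"="([^"]+)"\s+"namespace"="([^"]+)"'
)


def count_sends(log_lines):
    """Count 'About to send resource to Azure' entries per (namespace, name)."""
    counts = Counter()
    for line in log_lines:
        match = SEND_RE.search(line)
        if match:
            name, namespace = match.groups()
            counts[(namespace, name)] += 1
    return counts


sample = [
    '1115 14:58:12.587253 1 generic_controller.go:281] controllers/StorageAccountController "msg"="Reconcile invoked" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName"',
    '1115 14:58:12.587648 1 azure_generic_arm_reconciler_instance.go:288] controllers/StorageAccountController "msg"="About to send resource to Azure" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName"',
    '1115 14:58:42.000000 1 azure_generic_arm_reconciler_instance.go:288] controllers/StorageAccountController "msg"="About to send resource to Azure" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName"',
]

for (namespace, name), n in count_sends(sample).most_common():
    print(f"{namespace}/{name}: {n} sends")
```

Resources whose count is much higher than the rest over the same time window are the ones worth inspecting first.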
Version of Azure Service Operator
v2.0.0-beta.0, v2.0.0-beta.3
AKS version - 1.24.6
Describe the bug
Each reconcile cycle finishes by sending a BeginCreateOrUpdate request to Azure, even if the configuration has not changed since the previous reconcile cycle.
In our case, with a large number of resources managed by Azure Service Operator (>300), we got this error:
Number of write requests for subscription '***' exceeded the limit of '1200' for time interval '01:00:00'. Please try again after '303' seconds.
Changing azureSyncPeriod didn't resolve our issue, because on the next cycle all resources were updated at the same time.
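For a sense of scale, here is a back-of-the-envelope calculation of steady-state write volume. It assumes exactly one write per resource per sync period and uses 300 as a lower bound for the resource count; the 15m and 1h periods are the old and new defaults mentioned in the maintainers' comments, and `writes_per_hour` is a hypothetical helper.

```python
def writes_per_hour(resource_count: int, sync_period_minutes: float) -> float:
    """Steady-state writes per hour if every resource is re-sent once per sync period."""
    return resource_count * (60.0 / sync_period_minutes)


ARM_WRITE_LIMIT_PER_HOUR = 1200  # limit quoted in the error message above

at_15m = writes_per_hour(300, 15)
at_1h = writes_per_hour(300, 60)

print(f"15m sync period: {at_15m:.0f} writes/hour (limit {ARM_WRITE_LIMIT_PER_HOUR})")
print(f"1h sync period:  {at_1h:.0f} writes/hour (limit {ARM_WRITE_LIMIT_PER_HOUR})")
```

Under these assumptions, 300 resources on a 15m period already account for the full hourly quota, so any count above 300 (or any other writes in the subscription) would exceed it.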
Expected behavior
There should be no requests to Azure if the resource configuration has not changed.
Screenshots
1115 14:58:12.587253 1 generic_controller.go:281] controllers/StorageAccountController "msg"="Reconcile invoked" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName" "generation"=1 "kind"="*v1beta20210401storage.StorageAccount" "resourceVersion"="103415751"
1115 14:58:12.587312 1 azure_generic_arm_reconciler_instance.go:157] controllers/StorageAccountController "msg"="DetermineCreateOrUpdateAction" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName" "condition"="Condition [Ready], Status = "True", ObservedGeneration = 1, Severity = "", Reason = "Succeeded", Message = "", LastTransitionTime = "2022-11-15 14:30:15 +0000 UTC"" "pollerID"="" "resumeToken"=""
1115 14:58:12.587340 1 azure_generic_arm_reconciler_instance.go:65] controllers/StorageAccountController "msg"="Reconciling resource" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName" "action"="BeginCreateOrUpdate"
1115 14:58:12.587648 1 azure_generic_arm_reconciler_instance.go:288] controllers/StorageAccountController "msg"="About to send resource to Azure" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName"
1115 14:58:13.672958 1 azure_generic_arm_reconciler_instance.go:306] controllers/StorageAccountController "msg"="Successfully sent resource to Azure" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName" "id"="/subscriptions/subscriptionId/resourceGroups/rgName/providers/Microsoft.Storage/storageAccounts/storageName"
1115 14:58:13.673040 1 azure_generic_arm_reconciler_instance.go:378] controllers/StorageAccountController "msg"="Resource successfully created" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName" "resourceID"="/subscriptions/subscriptionId/resourceGroups/rgName/providers/Microsoft.Storage/storageAccounts/storageName"
1115 14:58:13.673257 1 recorder.go:103] events "msg"="Normal" "message"="Successfully sent resource to Azure with ID "/subscriptions/subscriptionId/resourceGroups/rgName/providers/Microsoft.Storage/storageAccounts/storageName"" "object"={"kind":"StorageAccount","namespace":"namespaceName","name":"storageName","uid":"b4a1asd0-52d6-4795-a1cc-474e11b20ab4","apiVersion":"storage.azure.com/v1beta20210401storage","resourceVersion":"103415751"} "reason"="BeginCreateOrUpdate"
1115 14:58:13.727399 1 secrets_retriever.go:51] controllers/StorageAccountController "msg"="Retrieving secrets from Azure" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName"
1115 14:58:13.908235 1 secrets_retriever.go:57] controllers/StorageAccountController "msg"="Successfully retrieved secrets" "azureName"="storageName" "name"="storageName" "namespace"="namespaceName" "SecretsToWrite"=1