Resource group should get updated if tags are added #1696
Comments
I'm assuming no one is working on this as of now |
Signed-off-by: Karuppiah Natarajan <[email protected]>
@karuppiah7890 go for it, let me know if you have any questions |
@CecileRobertMichon Are there tests for the
and based on the comment from here - #1667 (comment) and this issue's comments, looks like the code needs changes over here - cluster-api-provider-azure/azure/services/groups/spec.go Lines 51 to 55 in 47caf96
I was thinking I could check some existing tests and write a test accordingly by following it and keep things consistent, but couldn't find any. I was thinking of adding a unit test with mocks to Azure Client or integration test, as E2E test seemed too heavy for this situation, or it could be added as an additional check as part of some update test in E2E tests. How are the |
I could add a unit test for now for updating additional tags, if there are any, in cluster-api-provider-azure/azure/services/groups/spec.go Lines 51 to 55 in 47caf96
spec_test.go . Parameters seems like the smallest unit that can be tested, compared to testing the other levels / functions / methods. I'll also implement the feature to fix the failing test that I write. Then I'll think about how to test the etag part, add that test, and implement that too. After that I'll check out the CAPZ managed vs CAPZ non-managed part
|
Also, I noticed the test in |
I have a question regarding updates - should the update only allow updating the additional tags of resource groups? At least for this issue. Should we ensure we don't update the other data? I see that we use
So we can only change tags and we need to keep location as is from the existing resource group cluster-api-provider-azure/azure/services/groups/spec.go Lines 56 to 65 in 47caf96
If we are not changing the location, do we need to pass it because of the |
Okay, looks like |
For CAPZ managed vs CAPZ non-managed, we figure this out through the tags, right? So, we will check if the tags here cluster-api-provider-azure/azure/services/groups/spec.go Lines 58 to 64 in 47caf96
infrav1.Build to get the exact tags with prefixes etc cluster-api-provider-azure/api/v1alpha4/tags.go Lines 180 to 197 in 47caf96
|
I guess for checking if the group is managed or not we could just use cluster-api-provider-azure/azure/services/groups/groups.go Lines 104 to 117 in 47caf96
azure/services/groups/groups.go . Maybe we could use similar code in azure/services/groups/client.go to avoid calling spec.Parameters here cluster-api-provider-azure/azure/services/groups/client.go Lines 145 to 151 in 47caf96
spec.Parameters code itself in cluster-api-provider-azure/azure/services/groups/spec.go Lines 50 to 55 in 47caf96
spec.Parameters
Later we will have to see if we want to remove the duplicate code that checks whether a group is managed, and decide where to place the common code and call it accordingly |
Question: Let's say a cluster is created and managed using CAPZ, and an additional tag is added to the resource group from outside (by the user or something) without using CAPZ. Now, when we manage the resource group, do we manage the extra tag? Do we remove it and only ensure that the CAPZ's |
Fixes kubernetes-sigs#1696 Signed-off-by: Karuppiah Natarajan <[email protected]>
About checking if the resource group is managed or not, looking at the code it looks like we want to move a lot of the logic into cluster-api-provider-azure/azure/services/groups/groups.go Lines 115 to 116 in 47caf96
|
cc @CecileRobertMichon ^ |
Following the question on managed resource groups and declarative behavior, let's say a resource group only has the tag |
I created a draft / WIP PR so that one can see the changes live as I'm working on it. #1721 cc @CecileRobertMichon |
About checking the ETag: it looks like the Azure API does not provide an ETag for resource groups - https://docs.microsoft.com/en-us/rest/api/resources/resource-groups/get - GET. But yeah, other APIs like Network Security Group have an ETag in the response. This is also reflected in the Azure Go SDK where |
Any thoughts on how we can go about avoiding race conditions for resource group updates? |
One possible option is to use https://docs.microsoft.com/en-us/rest/api/resources/tags/update-at-scope where the scope is the resource group and the PATCH operation would only contain the tags we would like to mutate. |
Interesting! Thanks for sharing @devigned ! I'll check it out in detail and also check the corresponding Azure Go SDK API for using it. One tricky thing is, this would mean that we would detect resource group tag updates separately and use separate API calls for it and deviate from the path of using the Create Or Update API of resource group which is currently being used as part of the reconciliation. Are we okay with that? |
It will introduce more complexity, but I don't see another way around the possibility of wiping out tags applied between GET and PUT. I think the trade off of slightly more complexity in update is better than the harm of possible data loss. |
Complexity - cool 👍 And I'm still trying to grok the problem and the solution we are aiming for. This is what I understand from the race condition problem -
- Entity A does a GET on the resource group, does some processing (checking the diff etc. and deciding whether to update), finds out that it needs to update tags, and then does a PUT
- In the meanwhile, exactly after entity A does the GET but before it does the PUT, another entity B updates the resource group tags
- So entity A is working with stale data (a stale read problem) while it's doing the PUT, and its PUT overwrites B's update
I have a few questions here -
- Is the above understanding a correct explanation of the problem?
- If yes, what are the possible actors that represent entity A and entity B? Two different instances of the CAPZ controller? Or something like the CAPZ controller and an external entity, say a user using the Azure Web UI or Azure CLI to update the resource group tags?
I'm assuming you want to avoid this, correct? Based on I also want to ask what's the strategy of handling tags - like, the declarative behavior of resource group tags - what kind of behavior are we looking for? I had asked a similar question here - #1696 (comment) , #1696 (comment) |
We're actually already updating tags using update-at-scope for VMs, we could do something similar for resource groups: https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/main/azure/scope/machine.go#L114 |
If A and B GET at the same time they see the same RG. If A then updates the RG and we now have RG' stored, then B may be working with stale data. Let's say that B only cares about making sure that RG exists, and that tags When updating a resource group, the operator should not call PUT on the resource group. I don't think there is anything else we can change. If the resource group exists, CAPZ should only call PATCH on the tags for the resource group, which will scope the data mutation to only the data we own, thus eliminating the possibility of overwriting tags. In the end, the stale read will not affect the end result we are seeking. |
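To make the PUT vs PATCH difference described above concrete, here is a minimal in-memory sketch. It does not call Azure at all: `Tags`, `putTags`, and `patchTags` are hypothetical names invented for this illustration, and the tag keys are made up. It only models "replace the whole tag set" against "merge the tags we own".

```go
package main

import "fmt"

// Tags is a simplified stand-in for the tags stored on a resource group.
type Tags map[string]string

// putTags models a full replace (PUT): whatever the caller sends becomes the
// complete tag set, so anything added by others since the caller's GET is lost.
func putTags(_ Tags, desired Tags) Tags {
	out := Tags{}
	for k, v := range desired {
		out[k] = v
	}
	return out
}

// patchTags models a merge (PATCH): only the keys the caller sends are
// touched; every other key on the stored resource is left alone.
func patchTags(stored Tags, owned Tags) Tags {
	out := Tags{}
	for k, v := range stored {
		out[k] = v
	}
	for k, v := range owned {
		out[k] = v
	}
	return out
}

func main() {
	// The state that entity A read with its GET.
	snapshot := Tags{"owner": "capz"}
	// Entity B adds a billing tag after A's GET but before A writes.
	stored := Tags{"owner": "capz", "cost-center": "42"}

	// A wants to add one tag, computed from its stale snapshot.
	desired := Tags{"owner": snapshot["owner"], "env": "dev"}

	// With PUT, B's "cost-center" tag is wiped out.
	fmt.Println(putTags(stored, desired))
	// With PATCH, A only sends the tag it owns and B's tag survives.
	fmt.Println(patchTags(stored, Tags{"env": "dev"}))
}
```

The stale read still happens in the PATCH case; it just no longer matters, because A never writes keys it does not own.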
yeah, I was gonna ask about what data we can change on resource groups. Clearly location is constant, name is the identifier, so we are left with tags. So create uses the create or update API and tags update can use the tags API. And about updating tags - are we saying that we want to do a PATCH and only manage the tags that CAPZ owns, in case the resource group is CAPZ managed, and not meddle with any other tags that the external entities add or update that are not part of |
Sorry about the list of questions. I'm basically trying to understand this - |
This is what I understand - if different entities are managing the resource group's tags, then each of them would have their own set of tags to manage, which most probably won't collide. For this reason, just using PUT won't work, as it would completely replace the existing set of tags whenever an entity does an update using PUT, so instead we do a PATCH for the updates. I'm only curious about what these entities are if not CAPZ controllers. Also, if the other entities change the CAPZ managed tags, then CAPZ would change them back to the values that it has as part of reconciliation, right? |
Could be just about anything. For example, tags are used as a facet in billing which allows folks to set tags and break up billing for a given tag. These tags are often managed through outside systems or manually. There's lots of uses outside of what we use.
Yes. This is the intended behavior. |
Cool ! 👍 |
If you do go with the update-at-scope tag service, could you please add a comment in
|
What about deletion of tags? Do we handle that too? I see that the existing code here seems to handle it and I guess we expect that to be the case for resource group too? |
Sure @CecileRobertMichon ! I'll try to follow the pattern of the code that you referred to here #1696 (comment) - and use the tag service from https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/main/azure/services/tags/tags.go |
yes, if the tags are managed, I think we want the tag list to == our default tags + additional tags from spec. The only difference here will be to only do this if the RG is managed (for VMs there is no concept of managed/unmanaged so we update tags no matter what), but you can use tags to determine this (the owned tag specifically). |
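A rough sketch of that rule, assuming nothing about the real CAPZ helpers: `ownedTagKey`, `ownedTagValue`, `isManaged`, and `desiredTags` are hypothetical names for this illustration, and the tag key below is a made-up placeholder rather than the key CAPZ actually builds. Letting default tags win on a key collision is also an assumption of the sketch.

```go
package main

import "fmt"

// ownedTagKey and ownedTagValue are hypothetical placeholders for the tag
// that marks a resource group as managed; the real key is built per cluster.
const (
	ownedTagKey   = "example.io_cluster_my-cluster"
	ownedTagValue = "owned"
)

// isManaged reports whether the existing tags carry the ownership marker.
func isManaged(existing map[string]string) bool {
	return existing[ownedTagKey] == ownedTagValue
}

// desiredTags returns the default tags plus the additional tags from the spec.
// Default tags win on key collisions (an assumption made for this sketch).
func desiredTags(defaults, additional map[string]string) map[string]string {
	out := make(map[string]string, len(defaults)+len(additional))
	for k, v := range additional {
		out[k] = v
	}
	for k, v := range defaults {
		out[k] = v
	}
	return out
}

func main() {
	existing := map[string]string{ownedTagKey: ownedTagValue}
	if !isManaged(existing) {
		fmt.Println("unmanaged resource group: leaving tags alone")
		return
	}
	defaults := map[string]string{ownedTagKey: ownedTagValue}
	additional := map[string]string{"team": "platform"}
	fmt.Println(desiredTags(defaults, additional))
}
```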
Okay! 👍 |
@CecileRobertMichon @devigned Since we want to manage the resource group's tag with updates and deletes too, I'm gonna be using the existing code in https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/main/azure/services/tags/tags.go which uses Tags Create or Update At Scope API rather than using Tags Update At Scope API along with
The existing tags service, built on the Tags Create or Update At Scope API, handles create, update and delete of tags, and it manages only the tags it owns by storing state in the last-applied annotation.
For machines, the annotation name is this -
cluster-api-provider-azure/api/v1alpha4/tags.go Lines 134 to 138 in c878bb9
I hope it's okay to go ahead and use the existing tags service code and use an annotation in the Azure Cluster K8s custom resource to track the resource group's tag's last applied state using an annotation name like |
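The last-applied idea can be sketched roughly as follows. This is not the code from `tags.go`; `tagDiff` and `diffTags` are hypothetical names, the annotation is modelled as a plain JSON string, and the tag values are invented. The point it illustrates is that only tags previously recorded as last-applied are ever deleted, so tags added by others are left alone.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// tagDiff lists the changes needed to move the owned tags to the desired set.
type tagDiff struct {
	CreateOrUpdate map[string]string
	Delete         []string
}

// diffTags compares the desired tags with what is currently on the resource
// and with what we applied last time. A tag is deleted only if we applied it
// before and it is no longer desired; foreign tags are never touched.
func diffTags(lastApplied, desired, current map[string]string) tagDiff {
	d := tagDiff{CreateOrUpdate: map[string]string{}}
	for k, want := range desired {
		if got, ok := current[k]; !ok || got != want {
			d.CreateOrUpdate[k] = want
		}
	}
	for k := range lastApplied {
		if _, keep := desired[k]; !keep {
			d.Delete = append(d.Delete, k)
		}
	}
	sort.Strings(d.Delete) // stable order for logging and tests
	return d
}

func main() {
	// The annotation value as it might be stored on the custom resource.
	annotation := `{"team":"platform","env":"dev"}`
	var lastApplied map[string]string
	if err := json.Unmarshal([]byte(annotation), &lastApplied); err != nil {
		panic(err)
	}

	desired := map[string]string{"team": "infra"}
	current := map[string]string{"team": "platform", "env": "dev", "cost-center": "42"}

	d := diffTags(lastApplied, desired, current)
	fmt.Println(d.CreateOrUpdate, d.Delete)

	// Record the new last-applied state for the next reconcile.
	next, err := json.Marshal(desired)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(next))
}
```

In this run `cost-center` is neither updated nor deleted, because it was never part of the last-applied set.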
I was also wondering how to do the testing for this. I did some manual testing, but for automated tests, I wasn't sure where to put them. Currently I noticed I might have to do too much mocking to get even the existing tests to work because of the way I have implemented it. Could someone maybe take a look at the code in #1721 and let me know how the code looks? The placement of logic etc., and then I can also start looking at how to make the code easily testable. Currently it seems a bit hard to test the part that checks if resource group tags are updated, as it requires more mocks (mocking of groups scope, mocking of azure client etc) |
/kind bug
Currently, we only create the resource group if it doesn't already exist:
cluster-api-provider-azure/azure/services/groups/spec.go
Line 53 in 47caf96
Someone working on this issue might want to take a look at how network security groups look at existing rules and only update them if some are missing, and use an etag to make sure there is no race with others also updating the NSG from other controllers:
cluster-api-provider-azure/azure/services/securitygroups/spec.go
Line 63 in 7a76e16
Environment:
- Kubernetes version (use kubectl version):
- OS (from /etc/os-release):