Make private dns reconcile/delete async #2007
03a1542 to 8dc9b37
zoneClient     async.Creator
vnetLinkClient async.Creator
We need this because deleteZone and deleteLink have to do a GET to check whether the resource is managed. It seems a bit weird to use Creator to do a GET, though.
here's how I dealt with this for vnets: https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/1921/files#diff-f45b26f09f47e6be08607d767760708b985d428a1dc4b9a81f0640ee3e7ee656R56
Did you consider refactoring Creator
// Creator is a client that can create or update a resource asynchronously.
type Creator interface {
	FutureHandler
	CreateOrUpdateAsync(ctx context.Context, spec azure.ResourceSpecGetter, parameters interface{}) (result interface{}, future azureautorest.FutureAPI, err error)
	Getter
}

// Getter is a client that can get a resource.
type Getter interface {
	Get(ctx context.Context, spec azure.ResourceSpecGetter) (result interface{}, err error)
}
so that we don't have to duplicate this?
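The embedding suggested above can be tried out in isolation. The following is a minimal sketch, not the project's code: `ResourceSpec`, `fakeClient`, `errNotFound`, and `exists` are hypothetical stand-ins, and the real `Creator` also embeds `FutureHandler` and returns a future, which is omitted here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ResourceSpec is a hypothetical stand-in for azure.ResourceSpecGetter.
type ResourceSpec interface {
	ResourceName() string
}

// Getter can fetch an existing resource.
type Getter interface {
	Get(ctx context.Context, spec ResourceSpec) (interface{}, error)
}

// Creator embeds Getter, so any Creator can be passed where only a Getter is needed.
type Creator interface {
	Getter
	CreateOrUpdateAsync(ctx context.Context, spec ResourceSpec, parameters interface{}) (interface{}, error)
}

type zoneSpec struct{ name string }

func (z zoneSpec) ResourceName() string { return z.name }

var errNotFound = errors.New("not found")

// fakeClient is an in-memory Creator used only for this illustration.
type fakeClient struct{ store map[string]interface{} }

func (c *fakeClient) Get(_ context.Context, spec ResourceSpec) (interface{}, error) {
	v, ok := c.store[spec.ResourceName()]
	if !ok {
		return nil, errNotFound
	}
	return v, nil
}

func (c *fakeClient) CreateOrUpdateAsync(_ context.Context, spec ResourceSpec, parameters interface{}) (interface{}, error) {
	c.store[spec.ResourceName()] = parameters
	return parameters, nil
}

// exists only needs read access, so it accepts the narrower Getter.
func exists(ctx context.Context, g Getter, spec ResourceSpec) (bool, error) {
	_, err := g.Get(ctx, spec)
	if errors.Is(err, errNotFound) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

// demoExists checks for a zone before and after creating it through the same client.
func demoExists() (before, after bool) {
	var c Creator = &fakeClient{store: map[string]interface{}{}}
	ctx := context.Background()
	spec := zoneSpec{name: "example.internal"}
	before, _ = exists(ctx, c, spec)
	_, _ = c.CreateOrUpdateAsync(ctx, spec, "zone-params")
	after, _ = exists(ctx, c, spec)
	return before, after
}

func main() {
	before, after := demoExists()
	fmt.Println(before, after)
}
```

With this shape, delete paths that only need the managed check can depend on `Getter` alone instead of the full `Creator`.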
Is it because Get in this case is not part of async that you created a new Getter?
> did you consider refactoring Creator
I hadn't, but that's a great idea.
I was mostly trying to refactor the Client interface to make it more specific, since we only need Get now and everything else is being handled by async.Reconciler:
type Client interface {
	Get(context.Context, string, string) (network.VirtualNetwork, error)
	CreateOrUpdate(context.Context, string, string, network.VirtualNetwork) error
	Delete(context.Context, string, string) error
	CheckIPAddressAvailability(context.Context, string, string, string) (network.IPAddressAvailabilityResult, error)
}
Refactored to follow the virtual networks pattern because it seems more appropriate.
implemented your suggestion in f29b0ff
ah we swapped ideas 🙂 I'll rebase it once that pr is merged.
isManaged, err := s.isPrivateDNSManaged(ctx, s.Scope.ResourceGroup(), zoneSpec.ZoneName)
if err != nil && !azure.ResourceNotFound(err) {
	return errors.Wrapf(err, "could not get private DNS zone state of %s in resource group %s", zoneSpec.ZoneName, s.Scope.ResourceGroup())
}
// If the resource is not found, it should be created, so setting isManaged to true
// allows the reconciliation to continue.
if err != nil && azure.ResourceNotFound(err) {
	isManaged = true
}
if !isManaged {
	log.V(1).Info("Skipping reconciliation of unmanaged private DNS zone", "private DNS", zoneSpec.ZoneName)
	log.V(1).Info("Tag the DNS manually from azure to manage it with capz. "+
		"Please see https://capz.sigs.k8s.io/topics/custom-dns.html#manage-dns-via-capz-tool", "private DNS", zoneSpec.ZoneName)
	return nil
}
I have moved this logic to spec.Parameters since we already do a GET to fetch the existing resource. Same for vnet links.
azure/services/privatedns/spec.go (outdated)
// Parameters returns the parameters for the private dns zone.
func (z ZoneSpec) Parameters(existing interface{}) (params interface{}, err error) {
	_, log, done := tele.StartSpanWithLogger(context.TODO(), "privatedns.ZoneSpec.Parameters")
Had to pass context.TODO to get a logger since we don't pass context here. Open to suggestions.
Any thoughts @CecileRobertMichon?
given my comment below about the logging being temporary, I'd vote to keep this as a context.TODO() for now and just remove the context when we remove the logging in an upcoming release.
/test pull-cluster-api-provider-azure-e2e
8dc9b37 to 6206725
/retest
34814ce to 4ac9f38
/test pull-cluster-api-provider-azure-e2e
@shysank --
Is the link in |
This is looking good! I like that the Reconciler is split into helper functions, and in general it looks way easier to read!
azure/services/privatedns/spec.go (outdated)
	log.V(1).Info("Tag the DNS manually from azure to manage it with capz. "+
		"Please see https://capz.sigs.k8s.io/topics/custom-dns.html#manage-dns-via-capz-tool", "private DNS", zone.Name)
	return nil, nil
}
Should we return nil, nil at the end of this block anyway if the resource exists? If that's the case, then the check for isManaged would be redundant.
I thought about this, but didn't want to change the current implementation. wdyt about doing it in a followup pr?
I think it should be fine in either case since if we don't update, we'd just be returning the existing resource anyway. In general, I've set up Parameters() to return nil, nil if we find an existing resource and also added any checks to the fields, i.e. for a subnet, if the VNet exists when it's not managed.
yeah, we discussed this earlier and created a followup for it. I'm fine with doing it here; I left it for now because we want to focus on other things. Happy to do it either way.
updated to return nil, nil if the zone exists. I have updated the pr description to reflect that. ptal.
Okay sounds good.
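The outcome of this thread, where Parameters() returns nil, nil once an existing resource is found, can be sketched roughly as follows. This is an illustration under assumptions: `Zone`, the simplified `ZoneSpec` fields, the `owned-by` tag key, and `needsCreate` are hypothetical and do not mirror the actual spec.go.

```go
package main

import "fmt"

// Zone is a hypothetical stand-in for the SDK's private zone type.
type Zone struct {
	Name string
	Tags map[string]string
}

// ZoneSpec is a simplified spec; the field names are illustrative only.
type ZoneSpec struct {
	Name        string
	ClusterName string
}

// Parameters returns the desired zone, or nil, nil when the zone already
// exists and no create/update request is needed.
func (s ZoneSpec) Parameters(existing interface{}) (interface{}, error) {
	if existing != nil {
		if _, ok := existing.(Zone); !ok {
			return nil, fmt.Errorf("%T is not a Zone", existing)
		}
		// The zone exists: skip the update regardless of who manages it.
		return nil, nil
	}
	// "owned-by" is a made-up tag key for this sketch.
	return Zone{Name: s.Name, Tags: map[string]string{"owned-by": s.ClusterName}}, nil
}

// needsCreate reports whether a caller would issue a create request.
func needsCreate(spec ZoneSpec, existing interface{}) (bool, error) {
	params, err := spec.Parameters(existing)
	if err != nil {
		return false, err
	}
	return params != nil, nil
}

func main() {
	spec := ZoneSpec{Name: "example.internal", ClusterName: "my-cluster"}
	missing, _ := needsCreate(spec, nil)
	present, _ := needsCreate(spec, Zone{Name: "example.internal"})
	fmt.Println(missing, present)
}
```

Because an existing zone short-circuits before any tag inspection, a separate isManaged check inside Parameters() no longer changes the result, which is the redundancy raised in the review.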
azure/services/privatedns/spec.go (outdated)
	log.V(2).Info("Skipping vnet link reconciliation for unmanaged vnet link", "vnet link",
		l.Name, "private dns zone", l.ZoneName)
	return nil, nil
}
Same goes for here, I'm not sure if we still need these checks if we return nil, nil to skip updating an existing spec anyway.
same comment as above, didn't want to change current implementation. wdyt about doing it in a followup pr?
Fixed
Sounds good!
4ac9f38 to 76b0c85
To accommodate #2093, I've made the following changes:
I normally wouldn't want to split up a ready condition into 3, but in this case I think it makes sense with how the code is split into 3 separate reconcilers, clients, and specs. Also, in #2146 we changed the Reconcilers by adding Name() and Managed() functions. I think it makes sense to consider the resource managed if one of them is managed, which is what we're doing in #2146.
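The "managed if one of them is managed" rule mentioned here could look something like the sketch below. The `managedChecker` interface, its simplified method signatures, and the `part` type are assumptions made for illustration; the actual change lives in #2146 and may differ.

```go
package main

import "fmt"

// managedChecker is a hypothetical, simplified slice of the reconciler interface.
type managedChecker interface {
	Name() string
	Managed() bool
}

// part is an illustrative sub-resource such as a zone, vnet link, or record set.
type part struct {
	name    string
	managed bool
}

func (p part) Name() string  { return p.name }
func (p part) Managed() bool { return p.managed }

// anyManaged treats the whole service as managed when at least one part is managed.
func anyManaged(parts ...managedChecker) bool {
	for _, p := range parts {
		if p.Managed() {
			return true
		}
	}
	return false
}

func main() {
	zone := part{name: "zone", managed: false}
	link := part{name: "vnet-link", managed: true}
	fmt.Println(anyManaged(zone, link), anyManaged(zone))
}
```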
}

// isPrivateDNSManaged returns true if the private DNS has an owned tag with the cluster name as value,
// meaning that the DNS lifecycle is managed.
func (s *Service) isPrivateDNSManaged(ctx context.Context, resourceGroup, zoneName string) (bool, error) {
	zone, err := s.client.GetZone(ctx, resourceGroup, zoneName)
func (s *Service) isPrivateDNSManaged(ctx context.Context, spec azure.ResourceSpecGetter) (bool, error) {
I see what you're saying about the redundant Get() calls too. It looks like we have this pattern on several services like VNets, where we need to fetch the existing resource to see if it's managed or not. I'm open to amending the interface, did you have any ideas? Since the tags are all the same, I was thinking we could have an interface function that caches the result for the next Get() call for Create(). Wdyt @shysank @CecileRobertMichon?
I'm inclined towards merging this one as is and updating the interface in a followup. Since it affects only private clusters, this shouldn't cause huge performance degradation imo.
yes +1 to caching as a separate issue, the scope is big enough here. #2146 (the second commit for IsManaged) should also help with that. I had tried to incorporate caching as part of https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/1684/files#diff-b74c6acb4629abe58f8286da9939563cc4404e09fe9966eb61dbde9b87274189R110 but ended up reverting it to keep the changes small.
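For the follow-up, one possible shape for the caching idea floated in this thread is a small wrapper around the Get call. This is only a sketch under assumptions: `getFunc`, `cachingGetter`, and `countCalls` are hypothetical names, and the approach is not taken from #1684 or #2146.

```go
package main

import (
	"fmt"
	"sync"
)

// getFunc is a hypothetical signature for the underlying Get call.
type getFunc func(name string) (interface{}, error)

// cachingGetter remembers successful Get results so that the managed check
// and the later create path do not both hit the API.
type cachingGetter struct {
	mu    sync.Mutex
	get   getFunc
	cache map[string]interface{}
}

func newCachingGetter(get getFunc) *cachingGetter {
	return &cachingGetter{get: get, cache: map[string]interface{}{}}
}

// Get returns the cached value when present; errors are never cached.
func (c *cachingGetter) Get(name string) (interface{}, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.cache[name]; ok {
		return v, nil
	}
	v, err := c.get(name)
	if err != nil {
		return nil, err
	}
	c.cache[name] = v
	return v, nil
}

// countCalls performs two lookups of the same zone and reports how many
// times the underlying getter was actually invoked.
func countCalls() int {
	calls := 0
	g := newCachingGetter(func(name string) (interface{}, error) {
		calls++
		return "zone:" + name, nil
	})
	_, _ = g.Get("example.internal") // managed check
	_, _ = g.Get("example.internal") // create path
	return calls
}

func main() {
	fmt.Println(countCalls())
}
```

A cache like this would presumably need to be scoped to a single reconcile pass, since a longer-lived cache could hide changes made to the resource outside the controller.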
d1d7676 to c55f4aa
/test pull-cluster-api-provider-azure-e2e
@CecileRobertMichon @Jont828 I think I have addressed all review comments. PTAL, whenever you get a chance.
/assign @Jont828
c55f4aa to e2b6523
}

if !managed {
	log.V(1).Info("Skipping reconciliation of unmanaged private DNS zone", "private DNS", zoneSpec.ResourceName())
Should this be log.V(2) instead?
Didn't want to change the current implementation:
	log.V(1).Info("Skipping reconciliation of unmanaged private DNS zone", "private DNS", zoneSpec.ZoneName)
	return nil

err = s.reconcileRecords(ctx, records)
s.Scope.UpdatePutStatus(infrav1.PrivateDNSRecordReadyCondition, serviceName, err)
I think we only want to update the put status if we actually have specs. In this case, we could only call UpdatePutStatus if the number of specs > 0. The same goes for the delete status too.
We check for zero specs in:
	if zoneSpec == nil {
Okay sounds good. In that case I think it LGTM.
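The guard discussed in this thread, returning early when there are no specs so that the status condition is never touched, can be sketched as below. `statusRecorder`, the two-argument `UpdatePutStatus`, and the `apply` callback are hypothetical simplifications of the scope and service used in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// statusRecorder is a hypothetical stand-in for the scope that stores conditions.
type statusRecorder struct{ updates []string }

// UpdatePutStatus records that a condition was written and whether it succeeded.
func (r *statusRecorder) UpdatePutStatus(condition string, err error) {
	r.updates = append(r.updates, fmt.Sprintf("%s ok=%t", condition, err == nil))
}

// reconcileRecords applies each spec and updates the condition only when
// there was something to reconcile.
func reconcileRecords(specs []string, apply func(string) error, rec *statusRecorder) error {
	if len(specs) == 0 {
		// Nothing to do: leave the condition untouched.
		return nil
	}
	var firstErr error
	for _, s := range specs {
		if err := apply(s); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	rec.UpdatePutStatus("PrivateDNSRecordReady", firstErr)
	return firstErr
}

func main() {
	rec := &statusRecorder{}
	ok := func(string) error { return nil }
	fail := func(string) error { return errors.New("boom") }

	_ = reconcileRecords(nil, ok, rec)             // no update recorded
	_ = reconcileRecords([]string{"a"}, ok, rec)   // records ok=true
	_ = reconcileRecords([]string{"b"}, fail, rec) // records ok=false
	fmt.Println(rec.updates)
}
```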
/lgtm
@shysank can you please squash?
e2b6523 to e78ae0b
@shysank: The following test failed, say
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: CecileRobertMichon
What type of PR is this?
/kind cleanup
/kind bug
What this PR does / why we need it:
This PR refactors privatedns reconciliation to use the async pattern set in #1541. A few things to note:
- Zone, VirtualNetworkLink, and Records. This is done by treating each resource as a separate spec, and calling createResource/deleteResource independently for each of them.
- PrivateDnsReadyCondition is updated to indicate the status of reconciliation.
- Delete is NOT implemented for Records because it was not implemented in the current version. Implementing this is not trivial because deleting a record requires a special parameter called RecordType apart from the usual suspects (resourceName, resourceGroupName, ownerResourceName). For create, I did a workaround by reverse engineering the record type from the RecordSetProperties set in parameters.
- createOrUpdateAsync for Records is a synchronous operation. This is because the azure api for record sets does not have support for futures.
- Split client.go into 3 files, one for each of the sub resources mentioned above, as it became too large and difficult to navigate.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1715 #1983 #1872
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: