Add DNS entry for ovsdbserver-sb- services #154
Conversation
Hi @averdagu. Thanks for your PR. I'm waiting for an openstack-k8s-operators member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This is just a drive-by, incomplete review since it's a draft PR for now. I am happy to hear this seems to work. I hope the initial comments may be useful in your polishing work. Kudos!
Force-pushed from 3f4ac62 to bcb7341.
Force-pushed from 09ceab7 to cc683aa.
Force-pushed from cc9a5a0 to f7ce6d7.
api/v1beta1/ovndbcluster_types.go
Outdated
return instance.Status.DBAddress, nil
//return instance.Status.DBAddress, nil
//return "tcp:ovsdbserver-sb.openstack.svc:6642", nil
dns_hostname := "tcp:ovsdbserver-sb." + instance.Namespace + ".svc:6642"
So with this, as all the places are going to be using the DNS name, would it be good to update Status.DBAddress and Status.InternalDBAddress themselves for both NB/SB DBs?
I recall from some discussion that the .cluster.local suffix was required for connecting to the db endpoints, /me doesn't know why though
Yeah, I think it could be good to update DBAddress (with "ovsdbserver-sb." + instance.Namespace + ".svc") and InternalDBAddress (with "ovsdbserver-sb." + instance.Namespace + ".svc.cluster.local") (/me needs to look at exactly how CoreDNS sets up DNS)
Yes, please update the fields in db.
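For illustration, a minimal sketch of what the two DNS-based values being discussed could look like; the function names and hard-coded SB port are assumptions, not the actual ovndbcluster_types.go API:

// Illustrative sketch only: names and port are assumptions, not the real API.
func internalDBAddress(namespace string) string {
	// FQDN with the (assumed default) cluster.local suffix, needed because
	// OVS/unbound ignore resolv.conf search domains (see discussion below).
	return "tcp:ovsdbserver-sb." + namespace + ".svc.cluster.local:6642"
}

func dbAddress(namespace string) string {
	return "tcp:ovsdbserver-sb." + namespace + ".svc:6642"
}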
I recall from some discussion that the .cluster.local suffix was required for connecting to the db endpoints, /me doesn't know why though
OVS DNS support (openvswitch/ovs@4d55a36, but I expect it's unbound that doesn't support this) doesn't look at the search domains in /etc/resolv.conf, so fully-qualified DNS names must be used. It also looks like it only queries A/AAAA records, so a CNAME wouldn't work either.
By default, even when pulling in resolv.conf, it doesn't appear to append search domains.
>>> import unbound
>>> ctx = unbound.ub_ctx()
>>> ctx.resolvconf("/etc/resolv.conf")
0
>>> status, result = ctx.resolve("source")
>>> result.havedata
0
>>> status, result = ctx.resolve("source.redhat.com", unbound.RR_TYPE_A, unbound.RR_CLASS_IN)
>>> result.havedata
1
>>> result.data.address_list
['52.247.118.198']
As far as CNAMEs, that should be fine. www.redhat.com is a CNAME and:
>>> status, result = ctx.resolve("www.redhat.com", unbound.RR_TYPE_A)
>>> result.data.address_list
['96.16.244.143']
>>> status, result = ctx.resolve("www.redhat.com", unbound.RR_TYPE_AAAA)
>>> result.data.address_list
['38.0.20.4.100.0.22.150.0.0.0.0.0.0.13.68', '38.0.20.4.100.0.22.137.0.0.0.0.0.0.13.68']
It may be configurable via an unbound.conf to set a search domain, and we could pass that via an environment variable. But it seems like good practice to use the FQDN.
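A tiny sketch of the FQDN construction this implies, assuming the default cluster.local cluster domain:

// fqdn builds the fully-qualified in-cluster name, so that resolvers which
// ignore search domains (like unbound above) can still resolve it.
func fqdn(service, namespace string) string {
	return service + "." + namespace + ".svc.cluster.local"
}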
var dnsName string
var dnsIP string
var hostnames []string
dnsName = "dns-" + ovnPod.Name
Maybe better to create one DNSData per DBCluster instead of per pod?
It may complicate logic when scaling up/down. I can't see any benefit of having only one DNSData per cluster instead of per service.
With one per pod, will scale-down need explicit logic to delete the DNSData?
Well, we would need new logic to retrieve the DNSData, iterate over all hosts, delete those which had been scaled down, and update the resource.
Currently I just delete the DNSData CR associated with the scaled-down pod/service.
For this case, i.e. keeping one DNSData per DBCluster, I think it's not required to fetch/delete; we can just update the DNSData with the current set of pods on scale up/down.
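A sketch of this single-DNSData-per-DBCluster idea; the types below are simplified stand-ins, not the real infranetworkv1 API (field names may differ):

// Sketch only: a simplified stand-in for the infra-operator DNSData spec.
type dnsHost struct {
	IP        string
	Hostnames []string
}

// clusterDNSHosts rebuilds the host list from the current pod set on each
// reconcile, so scale up/down needs no explicit delete logic: entries for
// scaled-down pods simply drop out.
func clusterDNSHosts(namespace string, podIPs []string) []dnsHost {
	hostname := "ovsdbserver-sb." + namespace + ".svc"
	hosts := make([]dnsHost, 0, len(podIPs))
	for _, ip := range podIPs {
		hosts = append(hosts, dnsHost{IP: ip, Hostnames: []string{hostname}})
	}
	return hosts
}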
Oops, it was not my intention to push anything into your repo (why do I even have permissions there?). I will revert to the original.
Force-pushed from ed2ba16 to dc661ca.
I think this is very close, but some code cleanup is still due.
One major point of contention, as Yatin pointed out, is that it's unclear why we even maintain -%d DNS records for each pod of the cluster. It seems redundant (and hence belongs on the chopping block).
There were some comments of mine from the previous iterations that are not addressed. They are not marked Completed so it should be easy to find them. Let me know if you can't.
pkg/ovndbcluster/dnsdata.go
Outdated
headlessDnsHostname = ServiceNameNB + "." + instance.Namespace + ".svc"
}
if dbAddress == "" {
// Since DNS is used for DBAddress value will be the same for each iteration
"iteration"? I don't think it makes sense in the context of the helper. Let the caller avoid busy work by calling to the helper once; the helper is a pure function returning the result based on input, it doesn't care how you call it.
What I think would be helpful to document is why you think Ports[0] is the right thing to do. (There is an assumption here that only one port is set per service - which is probably true, but I'd like to know why it's true and why it's guaranteed to be true.)
I moved this comment outside of dnsdata.go, and it makes sense again (why we only set dbAddress once in the loop).
Regarding Ports[0], it's something that was done before; I should talk to @karelyatin about that.
kuttl failures still seem to suggest that the kind for DNSData is not known to the operator controller. This will have to be explored.
openstack-k8s-operators/install_yamls#686 will handle that. I posted some comments there; once it's ready I think we can add a Depends-On and that should work.
With the OVN DNS PR (openstack-k8s-operators/ovn-operator#154) OVN consumes the infranetworkv1.DNSData CRD, so the infra operator is needed to run the OVN kuttl tests.
Force-pushed from dc661ca to 2500b87.
/test ovn-operator-build-deploy-kuttl
Just some nits, nothing major. I am ready to merge it once CI is happy.
tests/functional/base_test.go
Outdated
if nad != "" {
ovndbcluster.Status.DBAddress = dbaddr
endpoint := ""
// If cluster NAD is empty, ExtenralEndpoint will never be
Extenral -> External
tests/functional/base_test.go
Outdated
ovndbcluster.Status.DBAddress = dbaddr
endpoint := ""
// If cluster NAD is empty, ExtenralEndpoint will never be
// set, instead look at internalEndpoint
I don't think we should check "instead": internal can always be checked; external only when nad is set.
tests/functional/base_test.go
Outdated
func GetDNSDataHostnameIP(dnsDataName string, namespace string, dnsHostname string) string {
dnsEntry := GetDNSData(types.NamespacedName{Name: dnsDataName, Namespace: namespace})
// DNSData contains Hosts (an array), this array will contain
// ovsdbserver-sb-0.ns.svc and ovsdbserver-sb.ns.svc, need to get
This is no longer correct, right? You removed the -%d records?
True, forgot to remove the comment, good catch.
Force-pushed from 6003fbf to 18efd9a.
Depends-On is not working with the kuttl job; I validated this locally and approved the other patch.
}
// Delete DNS records for deleted services/pods
namespace := helper.GetBeforeObject().GetNamespace()
err = deleteDNSData(ctx, helper, fullServiceName, namespace)
The DNSData no longer exists for fullServiceName, so this call and the definition of the deleteDNSData function are not needed.
No, it does exist.
For each service a DNSData will be created. Inside this DNSData entry there will be the resolution ovsdbserver-sb.openstack.svc -> SB_POD_INTERNALAPI_IP.
But at the end you'll have the same number of entries as services. What was deleted was the resolution ovsdbserver-sb-X.openstack.svc -> SB_POD_INTERNALAPI_IP.
Ok, I expected it to be a single DNSData per service from the past discussion; keeping a DNSData per service just looks unnecessary at this point.
It's something that can be done, but it would miss dp3. I can create a Jira for a follow-up if necessary.
Ok can be checked/done later
pkg/ovndbcluster/dnsdata.go
Outdated
} else {
headlessDnsHostname = ServiceNameSB + "." + instance.Namespace + ".svc"
}
return fmt.Sprintf("tcp:%s:%d", headlessDnsHostname, svc.Spec.Ports[0].Port)
The same seems doable for the port, instead of passing the full service object.
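A sketch of the suggested refactor, with an illustrative signature: the helper takes only the values it needs and stays a pure function of its inputs:

import "fmt"

// dbAddressFor needs only the service name, namespace and port, rather than
// the whole corev1.Service object (Ports[0] stays the caller's concern).
func dbAddressFor(serviceName, namespace string, port int32) string {
	headlessDnsHostname := serviceName + "." + namespace + ".svc"
	return fmt.Sprintf("tcp:%s:%d", headlessDnsHostname, port)
}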
ctx,
helper,
svc,
dnsIP,
This needs to be fixed, as it's not handling IPs from multiple replicas; just the IP of the last pod will persist.
It shouldn't. This will create a DNSData object and call CreateOrPatch. Since this is called on every loop iteration, there will be N DNSData objects, each one resolving ovsdbserver-(nb|sb).namespace.svc to that pod's IP.
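To illustrate why nothing is overwritten, a sketch of the per-pod naming scheme (helpers and shapes hypothetical): each pod gets a distinct DNSData object name, and all N objects map the shared headless hostname to one pod IP:

// Sketch only: distinct "dns-<pod>" object names per pod are why a later
// CreateOrPatch never overwrites an earlier pod's DNSData.
func perPodDNSData(namespace string, podIPsByName map[string]string) map[string]string {
	hostname := "ovsdbserver-sb." + namespace + ".svc"
	out := make(map[string]string, len(podIPsByName))
	for pod, ip := range podIPsByName {
		out["dns-"+pod] = hostname + " -> " + ip
	}
	return out
}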
Thanks, reading it again I got it: those are created with the pod name. But it looks unnecessary to maintain those extra DNSData per pod. If it can be avoided, good; else we can also clean up later if others are ok.
Ok can be checked/done later
Force-pushed from 10ff92f to e754613.
When the ovsdbserver-sb/nb pod gets deleted we can't ensure that it will be recreated with the same internalapi IP, since that IP is populated to the EDPM nodes during the ansibleee-deployment phase. If the IP changes during a reboot of the pod, EDPM won't know until the user retriggers the ansibleee-deployment phase, meaning that meanwhile ovn_controller and neutron-ovn-metadata won't have connectivity to the SB DB. In order to fix this, instead of using a string of IPs for the ovn-remote, a single DNS entry will be used. Every service will add an entry to the openstack-dnsmasq:
- ovsdbserver-xb.openstack.svc
This entry will be the one populated to the EDPM node, as querying it will return one IP from all the SB pods initialized at that moment (DNS will use sequential round-robin to fulfill the request).
Depends-on: openstack-k8s-operators/install_yamls#686
Resolves: OSPRH-660
Force-pushed from e754613 to 5882e62.
/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: averdagu, karelyatin. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Merged commit bd7f402 into openstack-k8s-operators:main.
Storage service hostnames need to be resolvable on the dataplane nodes as well. They are used within the Swift rings to allow node IP changes without updating and redistributing the rings. This patch creates DNSData entries for every storage service pod, similar to [1]. It only creates DNS entries for the storage network, as this is the only one that should be used by the storage backend services. There is no delete function for scale-down, as we don't support scaling down SwiftStorage instances. However, the created DNSData CRs are owned by the SwiftStorage instance, and are thus deleted properly if the instance is deleted. [1] openstack-k8s-operators/ovn-operator#154
When the ovsdbserver-sb pod gets deleted we can't ensure that it will be recreated with the same internalapi IP, since that IP is populated to the EDPM nodes during the ansibleee-deployment phase. If the IP changes during a reboot of the pod, EDPM won't know until the user retriggers the ansibleee-deployment phase. This means that meanwhile ovn_controller and neutron-ovn-metadata won't have connectivity to the SB DB.
In order to fix this, instead of using a string of IPs for the ovn-remote, a single DNS entry will be used. Every service will add two entries to the openstack-dnsmasq:
- ovsdbserver-xb-N.openstack.svc
- ovsdbserver-xb.openstack.svc
The last one will be the one populated to the EDPM node, as querying it will return one IP from all the SB pods initialized at that moment (DNS will use sequential round-robin to fulfill the request).
TODO:
Depends-On: openstack-k8s-operators/install_yamls#686
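The round-robin behaviour described above can be sanity-checked from any client that resolves against the openstack-dnsmasq instance; a minimal Go sketch using only the standard library (the hostname is the one from the commit message):

package main

import (
	"fmt"
	"net"
)

func main() {
	// LookupHost returns the full set of A/AAAA records; a server doing
	// sequential round-robin rotates their order between queries, so
	// clients that pick the first address spread across the SB pods.
	addrs, err := net.LookupHost("ovsdbserver-sb.openstack.svc")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addrs)
}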