-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[internal-dns] remove lookup_ipv6 in favor of lookup_socket_v6 #6320
Conversation
`lookup_ipv6` is both buggy and easy to misuse: * it sends an AAAA query for a domain which should have a SRV record - this works only because #4051 means the SRV record is incorrectly returned, along with the desired AAAA in Additionals * it looks up an IPv6 address from a SRV record *but ignores the port*. in places `lookup_ipv6` was used, it was paired consistently with the hardcoded port NEXUS_INTERNAL_PORT and matched what should be in the resolved SRV record. if we for example wanted to move Nexus' port (or start a test Nexus on an atypical port), the authoritative port number in the SRV response would be ignored for the hardcoded port. lets just use the port that we told DNS we're at! we may still want a bare IPv6 address for a service if we're testing network reachability, for example, but we're not doing that today. and if we need a service's IPv6 address to use with an alternate port to access a different API, we *probably* should have a second SRV record for that API to use instead? (`lookup_ipv6` would be simple enough to fix and retain, but after looking at its uses it seems it might be more trouble than it's worth right now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- it looks up an IPv6 address from a SRV record but ignores the port. in places
lookup_ipv6
was used, it was paired consistently with the hardcoded port NEXUS_INTERNAL_PORT and matched what should be in the resolved SRV record. if we for example wanted to move Nexus' port (or start a test Nexus on an atypical port), the authoritative port number in the SRV response would be ignored for the hardcoded port.lets just use the port that we told DNS we're at!
Yea, that's always been the goal so glad to see this!
Dunno about the CRDB admin service case, but for the other two:
- wicket assumes that the techport proxy can be done on the same IP as Nexus' API, just a different port
Ah yea, the dropshot instance that serves the external api for the techport proxy listens on the underlay network by binding to the same address as the internal nexus api. An explicit SRV/service entry for it though wouldn't be a bad idea to get rid of hardcoding ports everywhere.
- we look up addresses for MGS here via
ServiceName::Dendrite
, but there's aServiceName::ManagementGatewayService
that - if it doesn't already - seems like it should have a SRV record too.
Yea, I think that's just a typo. Looking up ManagementGatewayService
should work.
PR looks good with one suggestion. The specific service fixes can be done as a follow up.
3acffa4
to
0dcec7a
Compare
OK. this should actually pass (finally). failures were not because of changes at any point here, they were because a combination of #6204, this branch being made before #6204, and the illumos used here by buildomat being upgraded to include the change that motivated #6204. so entirely incidentally to this PR, merging main pulls in #6204 and with a ✔️ i'll merge. |
lookup_ipv6
is both buggy and easy to misuse:lookup_ipv6
was used, it was paired consistently with the hardcoded port NEXUS_INTERNAL_PORT and matched what should be in the resolved SRV record. if we for example wanted to move Nexus' port (or start a test Nexus on an atypical port), the authoritative port number in the SRV response would be ignored for the hardcoded port.lets just use the port that we told DNS we're at!
we may still want a bare IPv6 address for a service if we're going to test network reachability, for example, but we're not doing that with this function today. this all is distinct from helpers like
lookup_all_ipv6
.if we need a service's IPv6 address to use with an alternate port to access a different API, we probably should have a distinct SRV record for that lookup to use instead? i've found three instances of this:
ServiceName::Dendrite
, but there's aServiceName::ManagementGatewayService
that - if it doesn't already - seems like it should have a SRV record too.there are some uses of
lookup_all_ipv6
that make a lot of sense still, where we're discovering the rack's network and really do not care about the port that Dendrite happens to be on.(this all said,
lookup_ipv6
would be simple enough to fix and retain. but after looking at its uses it seems it might be more trouble than it's worth right now)this is the counterpart to #6308 which fixes clients (so far as the test suite and i can tell) to not depend on our internal DNS server returning answers we didn't ask for. we could land this first and #6308 at some point later, or i think since Oximeter is the only place we depended on
lookup_ipv6
, i'm pretty sure we can land both?