Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dns): Support alt domains #5940

Merged
merged 15 commits into from
Jun 27, 2019

Conversation

akshayganeshen
Copy link
Contributor

Alternate DNS domains are supplied with the -alt-domain flag or alt_domains configuration parameter.

The DNS resolver will attempt to parse and match each domain during resolution.

See #4165 for use-case.

Add flags for `alt-domain`, and configuration for `alt_domains`.
Support `alt_domains` in `handleQuery` and `resolveCNAME`.
@freddygv
Copy link
Contributor

Thanks for the PR @akshayganeshen!

At first glance it seems like one of the new tests is consistently failing in CI: TestDNS_AltDomains_Overlap

Is this test passing locally?

@akshayganeshen
Copy link
Contributor Author

Hey @freddygv, that test does not pass locally. It's an example of the sort of edge-cases that my implementation does not handle:

# alt_domains = [ "dc.consul.", "consul." ]
dig @127.0.0.1 -p 8600 web.service.dc.consul
# is the domain `consul`, and the datacenter `dc`?
# or is the domain `dc.consul`?

Without that test, I think it fulfills the desired use-case, but this edge-case should have a well-defined expectation. I believe it can be resolved in two ways:

  1. Simply do not allow these ambiguous alt_domains. Raise a validation error in agent/config/builder.go when such a scenario is detected. (In my opinion this allows the greatest flexibility to change/define the behaviour later on.)
  2. Prioritize domain matching in some way to break ties between ambiguous domain matches. For example, prefer matching domains that result in queries without a datacenter. (That would resolve the scenario I have in the test, but there are other possible edge-cases depending on the prioritization method).

I was hoping to get some feedback before assuming which was the desired solution, or if there's another solution entirely that I'm missing.

agent/config/config.go Outdated Show resolved Hide resolved
agent/config/builder.go Show resolved Hide resolved
agent/dns.go Show resolved Hide resolved
agent/dns.go Show resolved Hide resolved
Change `alt_domains` from a string-list to a single string.

Revert changes to loop over all domains in DNS dispatch. Handle
ambiguous matches by taking the longer matching domain/suffix.
@akshayganeshen
Copy link
Contributor Author

I've updated the implementation to use only a single alt_domain. However, I kept the test cases, which includes a failing case, even with the updated trimDomain. (Of course, I had to change my input slightly to demonstrate the problem I initially observed.)

Did I miss something there? Or is this as expected?

Copy link
Member

@hanshasselberg hanshasselberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the changes! This PR looks pretty good now! I found another small thing and it would be also great if you could add the flag and the config option to website/source/docs/agent/options.html.md. 🙏 👍

We are still discussing internally what to do with the issue, when the datacenter is part of the domain.

agent/dns.go Outdated Show resolved Hide resolved
@clly
Copy link
Contributor

clly commented Jun 12, 2019

@akshayganeshen: Thank you so much for working on this.

My personal thought for ambiguous domains is that the primary domain as opposed to the alternative one should be used. I had not thought of this case when originally writing my branch.

A second option is to keep a list of the available datacenters to check a potential datacenter.

I can see a risk where not preferring the previous behavior can cause easily missed issues where a service suddenly fails to discover services that it expected to find in a remote datacenter.

@hanshasselberg
Copy link
Member

hanshasselberg commented Jun 12, 2019

@akshayganeshen @clly I think we found a solution for the ambiguous domains. I suggest we are validating the altDomain in config/builder.go like I did in this playground: https://play.golang.org/p/pWJO35B3pV-. That should hopefully resolve the problems. We discovered that not only the datacenter, but also other keywords could potentially cause troubles.

This validation should also make TestDNS_AltDomains_Overlap pass because the altDomain used in this test fails the checks.

Thanks again, for bearing with us!

@akshayganeshen
Copy link
Contributor Author

@i0rek @clly Thanks for the tips! I'll do the validation in config/builder.go, since that's more in line with what I was thinking originally.

(I'll also update the docs appropriately.)

Validate `alt_domain` when building runtime config.

Prevents situations where domain resolution is ambiguous.
Update to account for config check that disallows ambiguous domains.
Still demonstrate that this successfully resolves overlapping domains.
@akshayganeshen
Copy link
Contributor Author

I've updated the implementation and the tests pass!

I'm still updating the documentation though. I'll also add a test case for the invalid config.

Copy link
Member

@hanshasselberg hanshasselberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, everything seems to work perfectly! Thanks for bearing with us!

agent/config/builder.go Show resolved Hide resolved
Copy link
Member

@hanshasselberg hanshasselberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akshayganeshen
Copy link
Contributor Author

Good catch. Will fix.

@freddygv
Copy link
Contributor

@akshayganeshen given that we have a release coming up shortly, I went ahead and made the remaining finishing touches to your PR.

I hope you don't mind us carrying the PR across the finish line. We appreciate your contribution!

Copy link
Member

@hanshasselberg hanshasselberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

@hanshasselberg hanshasselberg merged commit 98a35fb into hashicorp:master Jun 27, 2019
@akshayganeshen
Copy link
Contributor Author

@freddygv @i0rek thank you both! 🙏

Sorry I couldn't get around to this sooner.

@akshayganeshen akshayganeshen deleted the feature/dns-alt-domain branch June 28, 2019 01:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants