Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redo DNS configuration #2034

Merged
merged 1 commit into from
Aug 19, 2024
Merged

Conversation

kradalby
Copy link
Collaborator

@kradalby kradalby commented Jul 26, 2024

This PR aims to redo (simplify and correct) DNS configuration to match what is possible with Tailscale.

TLDR: currently we have several settings that are either incompatible together, dont do what they sound like, or some dont work the way we thought at the time we, or a contributor implemented them.

DNS has been one of these things that mostly worked and had not been revisited until I got #1963 which I looked at, thought, I can easily knock this one out, and then I messed up everything. DNS config was very much a patchwork that had been changed over time and a lot of this work was from before we had integration tests.
It is embarrassing that it made it into the beta, but I suppose its a very efficient way to prioritise work...

This screwup has caused me to revisit the DNS config and I discovered that:

  • A lot of the settings is hard to understand what does
  • Some of them does not make sense in a Tailscale context
  • Some combination might not work at all.

Notable changes for people coming here from the CHANGELOG:

dns_config is now dns.

override_local_dns: has been removed, if magic_dns or the other dns options are used, Tailscale needs to take over the DNS configuration. Clients can opt out by passing --accept-dns=false (https://tailscale.com/kb/1235/resolv-conf)

domains has been renamed to search_domains to describe what it actually is.

nameservers has become a root key, where:
old nameservers is now nameservers.global and
restricted_nameservers is now nameservers.split to describe the fact that it was configuring Split DNS.

These settings now reflect what Tailscale (https://login.tailscale.com/admin/dns) allows you to do, but note, we still have some additional features like extra_records which has been preserved under the same key ;).

TODO:

  • Cobra can't read ENV vars for new config, breaking integration tests
  • more unit tests
  • more integration tests
  • updated CHANGELOG.md

Reverts DNS search path part of #1987

Fixes #2026
Fixes #2047
Fixes #1936
Fixes #2025
Fixes #2029
Updates #2024
Updates #1963

@kradalby
Copy link
Collaborator Author

This PR is in progress, I think the DNS part is working as it should but I ran into some issues with Viper and loading config from Environment variables which prevented me from writing all the tests that was missing and is needed.

If someone takes it for a spin, that would be great, I will continue next week.

Copy link

@nickbp nickbp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm new here but the new structure looks good to me, just had nitpicks with the example config

config-example.yaml Outdated Show resolved Hide resolved
config-example.yaml Outdated Show resolved Hide resolved
config-example.yaml Show resolved Hide resolved
- 1.1.1.1
- 1.0.0.1
- 2606:4700:4700::1111
- 2606:4700:4700::1001
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this cloudflare example it looks like they have a DoH endpoint at: https://cloudflare-dns.com/dns-query (from https://developers.cloudflare.com/1.1.1.1/encryption/dns-over-https/make-api-requests/)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think that it would work, but I have not actually tested it, happy to include it if someone can test it, but that can be a follow up PR too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question: by default resolv.conf only takes 3 name servers. How does tailscale handle more?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolv.conf is a "fallback" for tailscale, it will try a number of ways before using resolv.conf.

In this case, when you override DNS, the DNS server in resolv.conf will be 100.100.100.100, which lives inside the tailscale daemon, and it will proxy everything.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kradalby kradalby force-pushed the kradalby/redo-dns-config branch 3 times, most recently from 052f780 to ccb2cc7 Compare July 29, 2024 08:48
@Leseratte10
Copy link

Leseratte10 commented Jul 30, 2024

I found this PR linked from bug 2029. After updating to beta1, tailscale started messing with my /etc/resolv.conf and broke internet connectivity under Linux and every client lost its ability to resolve any DNS records ...

I was able to get around that by using --accept-dns=false but that requires manual config on every Linux machine. Will there be a setting in this PR that allows me to be like "don't mess with DNS at all and leave the resolv.conf alone" that doesn't require client-side settings, or do I need to add that switch to every single client?

@kradalby
Copy link
Collaborator Author

Will there be a setting in this PR that allows me to be like "don't mess with DNS at all and leave the resolv.conf alone" that doesn't require client-side settings, or do I need to add that switch to every single client?

You will be able to set a null/empty value to all settings under the dns key. If any of the settings is set that requires Tailscale to take control of DNS, the client will do that as it sees fit, and the way to opt out of that is --accept-dns=false.

I would like to add that the breakage in beta1 was unfortunate, but when the fix is in, I would not expect it to fail in that way as you likely have had Tailscale take over your connection up until this point.

@masterwishx
Copy link

It's a little confusing new setup, was:
server_url: https://headscale.site.com
base_domain: headscale.site.com
Now should be?

@kradalby
Copy link
Collaborator Author

kradalby commented Aug 3, 2024

It's a little confusing new setup, was:

server_url: https://headscale.site.com

base_domain: headscale.site.com

Now should be?

They need to be different, so you could do clients.headscale.site.com for example

@masterwishx
Copy link

If you still need to test it please tell me if this OK?

git clone --branch kradalby/redo-dns-config https://github.com/kradalby/headscale
cd headscale
docker build -f Dockerfile.debug  -t headscale/headscale:redo-dns-config

@kradalby
Copy link
Collaborator Author

@masterwishx how did this work for you?

@kradalby kradalby marked this pull request as ready for review August 16, 2024 10:06
magic_dns: true

# Defines the base domain to create the hostnames for MagicDNS.
# This domain _must_ be different from the server_url domain.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different as in different subdomains or different TLDs?

Would headscale.example.com and scale.example.com work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

different (sub)domains, your example works.

Could also do
headscale.net as the server
clients.headscale.net as clients

Technically could do
server.headscale.net as the server
headscale.net as clients (as long as you dont have a node called server)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont need any changes for Nginx proxy Manger also ?
if was :
server_url: https://headscale.site.com
base_domain: headscale.site.com

and will be :

server_url: https://headscale.site.com
base_domain: clients.headscale.site.com

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think so, but I dont know anything about nginx proxy manager.

Copy link

@Minion3665 Minion3665 Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a real shame - is there a technical reason why this can't be allowed, or is it a case of being too difficult/risky to support over the rewrite?

I guess what I'm really asking is: is there a possibility of me (or someone else) making a patch to change this down the line?

Nevermind, I've reread the code and noticed the error it gives you

server_url cannot contain the base_domain, this will cause the headscale server and embedded DERP to become unreachable from the Tailscale node.

Apologies for the noise 🙏

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, It is one of these things that "we dont want to add this", but it is how upstream works, and we dont really have any choice, and I understand why it happens upstream too. We just initially implemented it wrongly.

config-example.yaml Show resolved Hide resolved
hscontrol/types/config.go Outdated Show resolved Hide resolved
hscontrol/types/config.go Show resolved Hide resolved
hscontrol/types/config.go Outdated Show resolved Hide resolved
"HEADSCALE_DNS_USE_USERNAME_IN_MAGIC_DNS": "true",

// TODO(kradalby): Figure out how to pass these as env vars
// "HEADSCALE_DNS_NAMESERVERS_SPLIT": `{foo.bar.com: ["1.1.1.1"]}`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe something like?

HEADSCALE_DNS_NAMESERVERS_SPLIT=foo.bar.com=[1.1.1.1,1.0.0.1],buzz.bar.com=[8.8.8.8,8.4.4.8]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, doesnt seem to work either, now that we dont add them to the search path, it does have limited value, so happy to leave it for now.

integration/dns_test.go Outdated Show resolved Hide resolved
@kradalby kradalby force-pushed the kradalby/redo-dns-config branch from 0b2c39a to a240842 Compare August 16, 2024 12:43
@kradalby
Copy link
Collaborator Author

Thanks @SuperSandro2000, think I addressed all comments.

@masterwishx
Copy link

masterwishx commented Aug 16, 2024

@masterwishx how did this work for you?

Sorry not checked yet , i see you added more changes here ?
i Also added some time ago Adguard-home DNS to my config so i have not usual setup as befor ..

@kradalby kradalby force-pushed the kradalby/redo-dns-config branch from 23881cf to 63941c0 Compare August 19, 2024 07:25
this commit changes and streamlines the dns_config into a new
key, dns. It removes a combination of outdates and incompatible
configuration options that made it easy to confuse what headscale
could and could not do, or what to expect from ones configuration.

Signed-off-by: Kristoffer Dalby <[email protected]>
@kradalby kradalby force-pushed the kradalby/redo-dns-config branch from 63941c0 to 3a098e5 Compare August 19, 2024 09:13
// - MagicDNS (BaseDomain) for users is on a *.ts.net domain per tailnet (e.g. tail-scale.ts.net)
//
// TODO(kradalby): remove dnsConfig.UserNameInMagicDNS check when removed.
if !dnsConfig.UserNameInMagicDNS && dnsConfig.BaseDomain != "" && strings.Contains(serverURL, dnsConfig.BaseDomain) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be wrong, but I wonder if the string.Contains check might be too strict of a test. I want to use (for example) https://bar.foo.com as my server_url, and just foo as my dns.base_domain, but this check rejects that. I've manually removed the check and this seems to work without issue.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 see: #2210

@motiejus
Copy link
Contributor

Thanks for #2210 @quite!

Since the change got more restrictive, I also posted a question whether it's safe to change base_domain on an existing HS instance with connected clients: #2210 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
10 participants