DNS forwarding with dnsmasq under systemd #4155
Comments
@sandstrom You could probably just disable the systemd-resolved DNS listeners like
I think this should cause systemd-resolved to generate a resolv.conf like normal but not bind to 127.0.0.1:53 so that dnsmasq can. dnsmasq should be either polling or using inotify to monitor changes to resolv.conf so any systemd-resolved updates show up with dnsmasq. Let me know if that works for you and if so I can update the docs. |
@mkeeler I tried that, but There may be some additional command/configuration required to cause systemd-resolved to update the Also, there is no binding conflict (that's not the issue), because the systemd-resolved stub will bind to One ostensible solution would be to disable systemd-resolved's control over resolv.conf (while still having the stub running, bound to 127.0.0.53), and then update resolv.conf to point to dnsmasq, which in turn could be configured to fall back to 127.0.0.53 for namespaces it isn't configured for (non-consul queries). However, I don't think that'll work. systemd-resolved accepts DNS queries via three interfaces: (1) a bus API, (2) a glibc API and (3) the stub listener on 127.0.0.53. So with the apparent solution above, one would still run the risk of having DNS queries for services/applications handled by systemd (which is basically all of them) bypass the dnsmasq layer. So instead we must find a way to configure systemd-resolved to query dnsmasq (or the consul DNS API directly) before any queries are forwarded up the default DNS server chain (usually obtained via DHCP). I hope this helps explain the issue in more detail!
Friendly ping @mkeeler 😄 |
@sandstrom Just getting back from a vacation and will look more into this soon. Disabling systemd-resolved in some cases where systemd-networkd is in use would break pushing down nameservers and domains via DHCP, so it's not a good solution. I have a hunch that something may be able to be done with NSS configuration, as then all getaddrinfo/gethostbyname queries can be routed to the proper resolver. I still need to look into this some more, especially as it relates to Ubuntu.
Another possible area to look into is network manager configuration (assuming you are using it). I think it has various functionality to use resolved or dnsmasq built into it. |
@mkeeler I've been digging some more. I agree that disabling systemd-resolved is a bad idea and should be avoided.
Idea 1
One avenue I've tried is to configure this only via systemd-network and its routing-only domain. However, according to the top of the file networks can only be configured with one block, i.e. this would override other network settings (there doesn't seem to be a way to inject the routing-only domain setting into all networks, without disturbing the other, existing, network configuration).
# /etc/systemd/network/00-test.network
[Match]
# match everything
[Network]
DNS=127.0.0.54
Domains=~consul
Idea 2
Another idea is to put systemd-resolved into the fourth mode described in its man page, where other packages are responsible for Thing I haven't figured out yet though, is how to let dnsmasq know about the upstream resolver (received via DHCP) that it should use for all queries that shouldn't be forwarded to consul. With NetworkManager there was an option for this, but Ubuntu 18.04 doesn't seem to be using NetworkManager any more. Any thoughts on how to let dnsmasq know about the upstream server?
@sandstrom For dnsmasq to know about the upstreams you either have to manually put them in its configuration or have something add them to resolv.conf. It will (unless configured otherwise) watch for changes to resolv.conf and update its nameserver lists. |
There are hooks for dhclient which could be used, for example something like this. Basically one would then:
Does that sound like a good flow? Something you'd be willing to add instructions around to your guides on dns forwarding? |
@mkeeler friendly ping 😄 |
@sandstrom That sounds like a decent flow. However I think I found an alternative. In /etc/systemd/resolved.conf you can have:
Then have dnsmasq serving on 127.0.0.1:53 and doing its normal thing. This should force systemd-resolved to send everything within the *.consul domain to the local system resolver and ignore resolvers configured via DHCP or via systemd-networkd per-link configurations. I ran some tests with wireshark running and no DNS requests for .consul domains are ever sent out to any resolver other than 127.0.0.1:53. To me this seems like the least invasive approach. What are your thoughts? |
I did try to put this in resolved.conf but it doesn't parse
Apparently systemd-resolved doesn't want port numbers in the DNS config. Looking at the systemd-resolved source code, they specifically validate that the full string is an IPv4 or IPv6 address. If I find some time I might try to get a PR into systemd to fix that, and then there would be no need for dnsmasq to run at all; instead we would have a direct systemd-resolved section of the configuration.
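The parse failure described here can be illustrated with a small sketch. This is not systemd's code; it only mimics the check as described in this comment (the whole string must be an IPv4 or IPv6 address) using Python's standard `ipaddress` module. The helper name `is_bare_ip` is made up for illustration.

```python
import ipaddress


def is_bare_ip(value: str) -> bool:
    """Return True only if the whole string parses as an IPv4/IPv6 address.

    Assumption: this mirrors the validation described in the thread,
    it is not taken from the systemd-resolved source.
    """
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "::1", "127.0.0.1:8600"):
        print(candidate, is_bare_ip(candidate))
```

Under such a check an address with a port suffix is rejected, which is why pointing resolved directly at Consul's default DNS port (8600) was not an option in this discussion.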
@mkeeler Yeah, I know (I tried something related, with non-53 ports). But you can have consul bind to port 53 on 127.0.0.54 and then config with Need to add this permission to your systemd unit file for consul to bind to low ports.
[Unit]
Description=Consul Agent
# …
[Service]
# …
AmbientCapabilities=CAP_NET_BIND_SERVICE
# …
[Install]
WantedBy=multi-user.target
To answer the other question: using your suggested
Yeah having systemd-resolved and dnsmasq involved seems a little bit of a hack when we can do it with just systemd-resolved. I think the best course of action is to add a systemd-resolved section to that forwarding DNS doc and specify how to set it up with two options.
Instead of IP-tables I'd suggest using a systemd socket. I'll admit I haven't used it myself yet, it's the systemd method for handling low ports. So since we're in systemd land already (with systemd-resolved) it will make sense to use that mechanism. Some more details:
|
@sandstrom I don't think systemd socket activation is applicable here. You may be able to skip the "activation" part and still use systemd sockets with long-running services, but then it would require systemd-specific modifications to Consul as well as modifications to the miekg/dns library we use to handle unix/file sockets, and probably a few other things.
@mkeeler Yeah, I think you are right (read some more on sockets). Would require some modifications to Consul to accept sockets. Although I think socket support would be useful, I don't think it should block this issue. IP-tables forwarding would certainly work. |
@sandstrom I have a PR open with the documentation fixes if you wanted to take a peek. It probably won't hit the website until the end of June.
@mkeeler I'll have a look and also try this myself. PTR records leaking out is just an effect of how reverse dns lookups work, right? (and those aren't part of "normal" consul usage) |
@sandstrom Actually Consul will handle the .arpa domain for PTR record queries in addition to the configured consul domain. It will then use the configured recursors to resolve PTR records recursively if the IP is not known to Consul. However with just systemd-resolved and Consul it isn't really possible to not expose all PTR queries to the main nameservers without manually configuring the recursors for Consul (which kind of defeats the purpose). For the purposes of the guide it seemed a decent trade off since no consul specific information is being exposed to other nameservers. Just IP addresses. |
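To see why PTR queries fall outside a `consul`-only routing domain, it helps to look at the name a reverse lookup actually asks for. The sketch below uses Python's standard `ipaddress` module; the address `10.0.0.5` and the helper name `ptr_name` are just examples.

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the DNS name queried for a reverse (PTR) lookup of `ip`."""
    return ipaddress.ip_address(ip).reverse_pointer


if __name__ == "__main__":
    name = ptr_name("10.0.0.5")
    print(name)                      # 5.0.0.10.in-addr.arpa
    print(name.endswith(".consul"))  # False: not under the consul domain
```

Because the queried name lives under `in-addr.arpa` and not under `consul`, a routing rule keyed on the `consul` domain does not capture it, which matches the trade-off described above: only IP addresses are exposed, not Consul service names.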
@mkeeler I'll close this issue; the changes in your PR seem to solve this for us. Although we haven't rolled this into production just yet, I'm pretty sure it'll work (can reopen this issue in case it doesn't). I still have my initial caveat on
But I guess that description isn't valid when the Thanks for working with me on solving this! 💯 (I've added a few minor comments on your PR) |
@sandstrom Yes assuming I am reading the docs right (and what I saw with wireshark is generally indicative of its normal behavior) using the |
After running with this setup for about a day I'm seeing this issue in the logs (below). It seems like queries to resolve s3.eu-west-1.amazonaws.com are going through consul somehow, which I hadn't expected. @mkeeler Any thoughts? Not 100% sure this is related to the changes above, but most likely. Perhaps we missed something in our configuration.
@mkeeler I think the issue may be that the configuration we've discussed will forward all lookups to consul, not only those under the The thing is, we need recursors for some externally configured services, such as AWS RDS (hosted databases) and we don't want to stick a public dns in there (such as 8.8.8.8) since I don't think it will resolve AWS internal services correctly. So we're sort of back at square one here 😄 (at least I am, perhaps you can see a way out of here?) |
@sandstrom Given those error messages it looks like Consul is trying to use 127.0.0.1:53 as a recursor. This would be bad as 127.0.0.1:53 would be port forwarded back to itself creating an endless loop (and causing things to run out of sockets and have high CPU utilization). I would think in the scenario with systemd-resolved consul shouldn't have any recursors configured as systemd-resolved should be handling all queries and then just forwarding some off to Consul. Also, in this configuration Consul will be receiving all queries for this system. Without recursors set up it should just return a NXDOMAIN response and allow the other per-link resolvers to present a real answer. |
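The routing behaviour discussed here can be modelled in a few lines. This is a simplified sketch of what the thread describes, not systemd-resolved's actual algorithm: names under a routing domain go only to the server that owns that domain, and every other name goes to all configured servers (Consul's included). The server addresses and the function names are hypothetical.

```python
def in_routing_domain(name: str, domain: str) -> bool:
    """True if `name` equals `domain` or is a subdomain of it ('~' prefix allowed)."""
    name = name.rstrip(".").lower()
    domain = domain.lstrip("~").rstrip(".").lower()
    return name == domain or name.endswith("." + domain)


def servers_for(name: str, servers: dict) -> list:
    """Pick the servers a query is sent to under the simplified model.

    `servers` maps a server address to its list of routing domains.
    """
    matched = [
        server
        for server, domains in servers.items()
        if any(in_routing_domain(name, d) for d in domains)
    ]
    # No routing domain matched: the query fans out to every server.
    return matched or list(servers)


if __name__ == "__main__":
    config = {"127.0.0.1": ["~consul"], "10.0.0.2": []}
    print(servers_for("web.service.consul", config))
    print(servers_for("s3.eu-west-1.amazonaws.com", config))
```

In this model the S3 lookup also reaches 127.0.0.1, which is consistent with Consul seeing non-consul queries; with no recursors configured it would simply answer NXDOMAIN and leave the real answer to the other resolver.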
@mkeeler Yes, exactly. Dropping Are there any downsides to having consul configured without recursors? (i.e. does it have any use-case besides being a fallback when all DNS queries are routed through consul) |
@sandstrom recursors is just so Consul can forward queries for domains it doesn't know about to an upstream, so you could use it for your primary DNS. So if you had a system with a static IP and a set of known name servers that don't change, you could put the recursors into the Consul config and have the system only use 127.0.0.1:8600 for its primary DNS. When fronting Consul with another DNS server there isn't much point to Consul also serving it out. Other downsides include increased network traffic and system load due to more concurrent queries being sent to upstream DNS servers. systemd-resolved will concurrently issue queries to both its upstream servers and Consul. Consul will then recurse and issue queries to more upstreams, which should be unnecessary.
@sandstrom can't you just add the VPC resolver, for example? That should take care of RDS etc. If using Chef: default['consul']['config']['recursors'] = [node['ipaddress'].split('.')[0..1].join('.').concat('.0.2')]
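For readers not using Chef, the same derivation can be sketched in Python. It keeps the first two octets of the node's address and appends `.0.2`, which only lands on the VPC resolver (network base address plus two) if the VPC's network address ends in `.0.0`, e.g. a /16; that is an assumption of the one-liner above as well. The function name and sample address are illustrative.

```python
def vpc_resolver_guess(node_ip: str) -> str:
    """Mirror the Chef expression: first two octets of the node IP plus '.0.2'.

    Assumption: the VPC network address ends in .0.0 (e.g. a /16 VPC).
    """
    first_two_octets = node_ip.split(".")[:2]
    return ".".join(first_two_octets) + ".0.2"


if __name__ == "__main__":
    print(vpc_resolver_guess("10.20.30.40"))  # 10.20.0.2
```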
@scalp42 Yeah I know, we looked at that, it's a good idea! 😄 Works in the VPC, but wouldn't work on a vagrant machine (dev env). It also feels a bit hacky, would like to avoid it if we can (the idea with DHCP-provided DNS servers is a good one, we'd like to stick with it if we can). |
In case anyone is stumbling upon this, I hit this issue in a local vagrant setup. I definitely had to set up a recursor to get
This is what I did, for anyone that happens upon this. I think it's a better solution than what is presented above. The idea is to bind dnsmasq to a different IP address, and run dnsmasq and systemd-resolved in parallel, with systemd-resolved referring to dnsmasq as its DNS server. Note the
And we change the listen address for
And you need to tell dnsmasq to ignore systemd-resolved to prevent a loop:
|
Thanks so much @SwitchedToGitlab To anybody here because they want to integrate with consul, add the following:
|
@SwitchedToGitlab Thanks for sharing your approach. Worked for me, and solved reverse lookup issue when queries round robin between consul and upstream DNS. One thing to add, I had to uncomment |
Thanks for all this. On my system, which was built for AWS using packer and terraform module As Shaggy said, "it wasn't me." Which all makes sense with only
(UPDATE) |
@tpdownes do you have a use case that warrants installing both |
#6462 describes one here. Not my use case, but it appears to be the only way to ensure reverse lookup never goes through consul. |
FWIW - i would describe that as a |
@sarkis there's a use case for having |
Just a FYI, it's possible to use https://gist.github.com/kquinsland/5cdc63614a581d9b392f435740b58729 |
Overview of the Issue
The consul guides explain how to run dnsmasq as root, forwarding certain requests to consul. This worked well before systemd-resolved (e.g. Ubuntu 14.04 and 16.04), but with more recent Linux distros running it (Ubuntu 17.04 and 18.04 LTS) the current guide won't work.
systemd-resolved will control /etc/resolv.conf (it's just a symlink now), so dig will use systemd-resolved by default, instead of dnsmasq.
I see two solutions here:
All three of them would require guide updates. A related issue is this one (#3945).
One partial workaround is to configure the following:
In that case resolved will ask dnsmasq (indirectly consul) simultaneously with asking its upstream dns (usually from dhcp), which seems to work. However, that would leak consul dns queries outside the local machine. It will probably also be brittle, i.e. it relies on the response from dnsmasq coming in before those from the per-link dns servers.
This leaking feels like a rather big downside.
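The brittleness can be illustrated with a toy simulation. Nothing here talks to real DNS: `fake_resolver`, the server labels, the answers and the delays are all made up, and "first answer wins" is the behaviour this issue assumes, not something verified against systemd-resolved.

```python
import asyncio


async def fake_resolver(server, answer, delay):
    """Pretend to resolve a name: wait `delay` seconds, then return an answer."""
    await asyncio.sleep(delay)
    return server, answer


async def first_answer(resolvers):
    """Query all fake resolvers concurrently and keep whichever answers first."""
    tasks = [asyncio.create_task(fake_resolver(*r)) for r in resolvers]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the slower answers are discarded
    return next(iter(done)).result()


def race(dnsmasq_delay, upstream_delay):
    """Run one simulated lookup of a .consul name against both servers."""
    return asyncio.run(first_answer([
        ("dnsmasq", "10.0.0.5", dnsmasq_delay),
        ("upstream", "NXDOMAIN", upstream_delay),
    ]))


if __name__ == "__main__":
    print(race(0.01, 0.2))  # dnsmasq answers first
    print(race(0.2, 0.01))  # upstream answers first
```

In both runs the upstream server still receives the query for the consul name, which is the leak; in the second run it also wins the race, which is the brittleness.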
Reproduction Steps