Question - DNS Resolution from Consul inside of Mesh Network #8343
Does this network/dns stanza help? For example, it lets you replace the DNS servers a task's containers use.
Thanks @mocofound - I have been playing with that stanza, but to no avail. The thing is, the Consul agent is running on the VM (say 192.168.50.91), but the Docker container on the bridge network has no way (that I have been able to find) to do DNS against the host VM (as opposed to the Docker IP). Since I'm unable to even ping 192.168.50.91 from within the container, I don't seem to have any way of going back down the chain. By the way, this is what I tried, with no luck:

```hcl
config {
  image       = "nicholasjackson/fake-service:v0.12.0"
  dns_servers = ["${attr.unique.network.ip-address}", "8.8.8.8"]
}
```

If I look at /etc/resolv.conf I see the two values I expect there, but since I can't even hit the VM from the container, it's no dice :(
If I'm reading this right, you're trying to use Nomad to access Consul DNS "indirectly" by relying on the host DNS and going up and down the stack. I want to point out that Nomad has some native Consul integrations via the template stanza; a template like the one sketched below can pull the IP for an upstream service straight from the Consul catalog.
This is discussed in more detail in this other issue: #8137 (comment)
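For illustration, a template of that shape might look like the following (the service name and file path are placeholders, not taken from the original comment):

```hcl
# Render the address of an upstream service from the Consul catalog
# into environment variables for the task.
template {
  destination = "local/upstream.env"
  env         = true
  data        = <<EOF
{{- range service "fake-service-service2" }}
UPSTREAM_ADDR={{ .Address }}:{{ .Port }}
{{- end }}
EOF
}
```

One caveat, echoed later in the thread: the template re-renders (and by default restarts the task) whenever the service's instances change.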
Same issue here! We'd really like Consul DNS to work inside a Nomad-created network namespace. All our code depends on Consul DNS working, especially with native integration, since we're using dynamic upstreams - we don't know the list of service names ahead of time to use a template.
Thanks @mocofound - we actually use templating quite a bit for resolution of secrets, nodes, etc. in our jobs. We could ask teams to use a template like you proposed, but that opens up a bunch of other things we're trying to avoid (services restarting as services move around, having watchers on internal files, etc.). I am starting to "see" the ingress gateway and terminating gateway value as we move down this path - it's just a bit of a new journey for us, so we have a bit of trial and error (which I'd like to avoid as much as possible).
An alternative solution, if you're using a Linux-based container with Consul Connect, is to add a template stanza like

```hcl
template {
  destination = "local/resolv.conf"
  data        = <<EOF
nameserver {{ env "attr.unique.network.ip-address" }}
nameserver 8.8.8.8
nameserver 8.8.4.4
EOF
}
```

and then add

```hcl
volumes = [
  "local/resolv.conf:/etc/resolv.conf"
]
```

to the task's config stanza.
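Putting the two pieces together, the task's Docker config might look roughly like this (the image name is a placeholder):

```hcl
config {
  image = "<IMAGE>"
  # Bind-mount the rendered resolv.conf over the container's own copy
  volumes = [
    "local/resolv.conf:/etc/resolv.conf"
  ]
}
```

Relative paths like `local/` resolve inside the allocation directory and are allowed by default; absolute host paths would additionally require `docker.volumes.enabled` on the client.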
Hey @alexhulbert - thanks for that approach. We've also been toying with this:

```hcl
driver = "docker"

config {
  image       = "<IMAGE>"
  dns_servers = ["127.0.0.1", "${attr.unique.network.ip-address}", "8.8.8.8"]
}
```

Looks like that had the same result as dropping in the resolver, too. Looks like we're both thinking along similar lines. Thanks!
Worth noting: since Docker scoops up your host's /etc/resolv.conf, host DNS settings normally carry over into containers. However, if you're doing something like @idrennanvmware or @alexhulbert, where Consul is serving DNS on a loopback address, Docker strips those loopback nameservers out and the container falls back to public DNS.
This is also highly problematic for the java and exec drivers when attempting to leverage dnsmasq on the nodes (to merge Consul DNS with public DNS). You get absolutely no DNS resolution, as there are no public DNS servers listed in /etc/resolv.conf. You also cannot use an alternate local IP, as the mesh network namespace cannot reach it.
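For context, the dnsmasq setup being described typically looks something like the following drop-in (addresses are illustrative): forward .consul queries to the local Consul agent, everything else to public resolvers.

```
# /etc/dnsmasq.d/consul.conf (illustrative)
server=/consul/127.0.0.1#8600
server=8.8.8.8
server=8.8.4.4
```

The problem described above is that this only helps processes that can actually reach the node's dnsmasq listener, which tasks inside a bridge-mode network namespace may not be able to do.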
Oh, beautiful - so Docker silently copies in the host /etc/resolv.conf, which our software owns.
Using system DNS is the entire problem. If the node's DNS points to localhost, the mesh network can't see it. I have also verified that even hardcoding an alternate local IP address will not work, as, again, the mesh network cannot see it. The only possible option is exposing Consul DNS to your internal network (if that's an option in your environment).
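A sketch of what exposing Consul DNS could look like on the agent, assuming the node addresses from this thread (binding DNS to the node's routable IP rather than loopback; listening on port 53 needs elevated privileges or a port redirect):

```hcl
# Illustrative Consul agent settings - serve DNS on a routable interface
addresses {
  dns = "192.168.50.91"
}

ports {
  dns = 53
}
```

With something like that in place, tasks in bridge networking can point dns_servers (or the network/dns block) at 192.168.50.91 directly.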
Cross-linking #8900, which may have a related underlying cause.
Hi folks, I wanted to circle back to this issue, as there have been a couple of improvements since it was opened and I think we've got a workable situation.

I'm going to provide an example configuration one could use to expose Consul DNS, but we also have #10665 and #10705 open for further enhancements, and I'm sure my colleague @jrasell would love to hear your thoughts there. This example uses systemd-resolved on the host. Starting with the Vagrantfile at the root of this repo, I've got the following Nomad agent configuration for Consul:

```hcl
# for DNS with systemd-resolved
consul {
  address = "10.0.2.15:8500"
}
```

Consul configuration:

```hcl
ui = true
bootstrap_expect = 1
server = true
log_level = "DEBUG"
data_dir = "/var/consul/data"
bind_addr = "10.0.2.15"
client_addr = "10.0.2.15"
advertise_addr = "127.0.0.1"
connect = {
  enabled = true
}
ports = {
  dns  = 53
  grpc = 8502
}
```

My systemd-resolved configuration forwards the consul domain to that agent address, and the resulting /etc/resolv.conf on the host points at the systemd-resolved stub resolver.
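A minimal systemd-resolved drop-in for this setup might look like the following (the file path and values are assumptions based on the addresses above):

```
# /etc/systemd/resolved.conf.d/consul.conf (illustrative)
[Resolve]
DNS=10.0.2.15
Domains=~consul
```

systemd-resolved then forwards any *.consul query to the Consul agent's DNS listener on port 53 while using the default resolvers for everything else.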
Let's verify Consul DNS is working (from the host) as expected by querying it directly for the Nomad client service:
And then verify we have our stub resolver forwarding configured correctly by looking up the same service via the usual path:
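For reference, those two lookups could be done like this (the commands are illustrative; nomad-client is the default service name Nomad registers for its client agents):

```sh
# Query the Consul agent's DNS listener directly
dig @10.0.2.15 -p 53 nomad-client.service.consul

# Query through the normal resolver path (the systemd-resolved stub),
# which should forward the .consul domain to the same place
dig nomad-client.service.consul
```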
Ok, now let's run a Nomad Connect job. This job has two groups so that we can verify cross-allocation traffic. The client group's task repeatedly curls the www upstream through its Connect sidecar, using the Consul DNS name for the service rather than plain localhost, so we can see that DNS resolution works inside the task's network namespace.
jobspec:

```hcl
job "example" {
datacenters = ["dc1"]
group "server" {
network {
mode = "bridge"
}
service {
name = "www"
port = "8001"
connect {
sidecar_service {}
}
}
task "task" {
driver = "docker"
config {
image = "busybox:1"
command = "httpd"
args = ["-v", "-f", "-p", "8001", "-h", "/local"]
ports = ["www"]
}
template {
data = "<html>hello, world</html>"
destination = "local/index.html"
}
}
}
group "client" {
network {
mode = "bridge"
}
service {
name = "client"
connect {
sidecar_service {
proxy {
upstreams {
destination_name = "www"
local_bind_port = 8080
}
}
}
}
}
task "task" {
driver = "docker"
config {
image = "0x74696d/dnstools"
command = "/bin/sh"
# of course we can just use localhost here as well, but
# this demonstrates we have working DNS!
args = ["-c", "sleep 5; while true; do curl -v http://www.service.dc1.consul:8080 ; sleep 10; done"]
}
}
}
}
```
Run the job:
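Assuming the jobspec above is saved as example.nomad:

```sh
nomad job run example.nomad
```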
See that we're successfully querying Consul DNS for the Connect endpoint (even though it's just localhost here) and that Connect is working:
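One way to see that (the allocation ID is a placeholder):

```sh
# Follow the client task's logs; each loop iteration should show a
# successful curl of http://www.service.dc1.consul:8080 through the sidecar
nomad alloc logs -f <client-alloc-id> task
```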
Let's curl something on the public internet from that same container, to confirm that non-Consul names still resolve:

So that all works for this setup. For more involved setups we'll probably want more DNS configuration than this; #10665 and #10705 mentioned above track further enhancements there. I'm going to close this specific issue as resolved, but feel free to open new issues or Discuss posts to talk through this some more.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
This may be an obvious answer, but we have been stumped getting mesh and Consul DNS resolution to work together.
Here's the scenario:
We have 3 VMs (each running Nomad and Consul):
192.168.50.91
192.168.50.92
192.168.50.93
Originally we had 3 services, all running in Docker with host networking:
fake-service-service1-api
fake-service-service1-backend
fake-service-service2
A call to fake-service-service1-api resulted in:
api->backend->service2
An SSH session to each machine would work fine for the following dig:
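A query of roughly this shape (assuming Consul's default DNS port 8600 on the local agent):

```sh
dig @127.0.0.1 -p 8600 fake-service-service2.service.consul
```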
If we logged in to each container, that same command worked as well (since we were on the host network).
That all worked great, we got answer sections, and service teams were happy.
Now we are moving service1 (api and backend) to the mesh and bridge networking. We are accepting and routing calls to the API with no problem, and the api happily talks to the backend over the mesh - but then... the problem we are having is that the backend service, which uses Consul DNS for "fake-service-service2.service.consul", can no longer resolve that name because of the bridge network.
Is there a way to get the container now running on the bridge network to be able to resolve that name from the host it's running on?
Thanks!