Where is the HNS/HCS Endpoint default DNS config? #1940

Open
TBBle opened this issue Oct 21, 2023 · 5 comments

@TBBle
Contributor

TBBle commented Oct 21, 2023

Just trying to track down the source of an observed behaviour in a containerd/nerdctl/wincni stack, i.e. no running dockerd or similar.

As far as I can tell, by default, no DNS Servers are passed down to HNS in this stack, unless I manually add them to the CNI configuration. But running a container gets a DNS server address of 192.168.1.1. That might have been the DNS server on the local network at my old place, but I've moved home since then and have a different network setup, and I have no idea how to track down where that config might be, or how to change it.

I guess it's also possible that's a burned-in default if no DNS server is passed down to HNS, but I don't recall having this problem in the past, which is why I assume it's been copied from live network into a config at my old place, rather than being copied from the live network on each execution.
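For reference, a minimal sketch of what "manually adding DNS servers to the CNI configuration" could look like. It assumes the plugin honours the `dns.nameservers` key from the CNI network configuration format; the `netConf` type, the `0.3.0` version string, and the `10.0.0.53` address are placeholders for illustration, and a real configuration needs further plugin-specific keys.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dnsConfig mirrors the "dns" dictionary of the CNI network configuration
// format. Whether a given Windows plugin honours every field is an assumption.
type dnsConfig struct {
	Nameservers []string `json:"nameservers,omitempty"`
	Search      []string `json:"search,omitempty"`
}

// netConf is a hypothetical, trimmed-down network configuration. A real one
// needs plugin-specific keys (ipam, capabilities, and so on).
type netConf struct {
	CNIVersion string    `json:"cniVersion"`
	Name       string    `json:"name"`
	Type       string    `json:"type"`
	DNS        dnsConfig `json:"dns"`
}

// buildConf returns the JSON for a NAT network that pins its DNS servers.
func buildConf(nameservers ...string) ([]byte, error) {
	conf := netConf{
		CNIVersion: "0.3.0",
		Name:       "nat",
		Type:       "nat",
		DNS:        dnsConfig{Nameservers: nameservers},
	}
	return json.MarshalIndent(conf, "", "  ")
}

// nameserversOf decodes a configuration and returns its DNS server list.
func nameserversOf(conf []byte) ([]string, error) {
	var parsed netConf
	if err := json.Unmarshal(conf, &parsed); err != nil {
		return nil, err
	}
	return parsed.DNS.Nameservers, nil
}

func main() {
	out, err := buildConf("10.0.0.53") // placeholder address
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With no `nameservers` entry, nothing in this document tells the plugin which DNS server to use, which is the situation described above.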

@TBBle
Contributor Author

TBBle commented Oct 23, 2023

Hmm. Poking around (prompted by microsoft/Windows-Containers#216 (comment)), I noticed that when Docker creates a NAT network endpoint, it passes EnableInternalDNS: true into endpoint creation by default, and WinCNI does not. So perhaps that IP address is actually an unconfigured default: I believe the NAT network in question was created by Docker, which probably never set up a DNS server list because it expects to provide DNS services from dockerd itself.

So perhaps this is a relatively-untested code-flow? I wonder if I delete and recreate the NAT network, will it capture today's DNS server address? Or is 192.168.1.1 also the default for an unconfigured setup?

I also don't know what EnableInternalDNS does. I'm not sure if it just points the DNS server at the host for systems like Docker, or if it also provides a DNS bridge and so should probably be the default for the WinCNI nat driver too. I need to work out if I can slip it into the WinCNI config somewhere, or will need to expose it as a config option to experiment with it.

Edit: After being confused for quite a few minutes, I just realised that EnableInternalDNS is an HNS thing (which Docker uses), but WinCNI is using HCN, which doesn't expose this particular flag as far as I can see. I did see an EnableDnsProxy in the HCN Network API, but not in the relevant hcsshim code, which looks to be hand-written rather than schema-generated, and either out-of-date or just incomplete.

So I need to come back to this fresh later, and actually untangle HNS and HCN in my mind, as I had not realised those were two different things until now. I also need to work out what CLI tools exist to explore either or both, e.g., the HNS PowerShell module. Now that I look, I can see that HCN is marked as HNS v2, and offers a full syscall API where HNS had just a single "HTTP call" endpoint...

hcsshim/hcn/hcn.go

Lines 15 to 20 in 434adf3

```go
/// HNS V1 API
//sys SetCurrentThreadCompartmentId(compartmentId uint32) (hr error) = iphlpapi.SetCurrentThreadCompartmentId
//sys _hnsCall(method string, path string, object string, response **uint16) (hr error) = vmcompute.HNSCall?
/// HCN V2 API
```

And with all that, something I read earlier makes a lot more sense. "[HCN is] implemented as C API hosted by the Host Network Service (HNS) on the OnCore/VM." So it's still HNS underneath, just with a new API and JSON and OpenAPI and similar niceties.
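To make the "single HTTP call endpoint" shape concrete, here is a rough sketch of how a V1 endpoint-creation request might be assembled before being handed to something like the `_hnsCall(method, path, object, ...)` signature quoted above. The `endpointV1` type is a hypothetical subset modelled on my reading of hcsshim's HNSEndpoint, and the `POST` method and `/endpoints/` path are assumptions for illustration; nothing here actually talks to HNS.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// endpointV1 is a hypothetical subset of the V1 endpoint document. The field
// names are assumed to match hcsshim's HNSEndpoint; everything else is omitted.
type endpointV1 struct {
	Name              string `json:",omitempty"`
	DNSServerList     string `json:",omitempty"`
	EnableInternalDNS bool   `json:",omitempty"`
}

// v1Request models the three inputs of the V1 call: an HTTP-like method,
// a path, and a JSON document.
type v1Request struct {
	Method string
	Path   string
	Body   string
}

// newCreateEndpointRequest builds the request triple for creating an endpoint.
// The method and path used here are assumptions, not confirmed API details.
func newCreateEndpointRequest(ep endpointV1) (v1Request, error) {
	body, err := json.Marshal(ep)
	if err != nil {
		return v1Request{}, err
	}
	return v1Request{Method: "POST", Path: "/endpoints/", Body: string(body)}, nil
}

func main() {
	req, err := newCreateEndpointRequest(endpointV1{Name: "example", EnableInternalDNS: true})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s\n%s\n", req.Method, req.Path, req.Body)
}
```

The point of the sketch is only that, in V1, a flag such as EnableInternalDNS is just another key in the JSON body, whereas the V2 wrapper needs a typed field before a caller like WinCNI can set it.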

Gosh, this has been a journey. But at least next time I come to this, I know what I don't know, which is a nice step forward.

And I have a couple of new free-time projects...

TBBle changed the title from "Where is the HNS Endpoint default DNS config?" to "Where is the HNS/HCS Endpoint default DNS config?" on Oct 24, 2023
@MikeZappa87

For the DNS resolver of 192.168.1.1 (assuming that's the gateway), I found this:
https://github.com/moby/libnetwork/blob/3797618f9a38372e8107d8c06f6ae199e1133ae8/drivers/windows/windows.go#L50C7-L50C19

When I create the docker network, I use -o com.docker.network.windowsshim.disable_gatewaydns=false so that it doesn't use that as a DNS server, as it doesn't work. Not sure why that is the default behavior. I have the steps in the 216 issue.

@TBBle
Contributor Author

TBBle commented Oct 24, 2023

Yeah, that DisableGatewayDNS flag in Docker is what becomes (inverted) EnableInternalDNS in the HNSEndpoint struct later in that file, but only for nat networks. The weird thing is that 192.168.1.1 isn't currently a gateway address (or even on a reachable subnet), so either that IP address was somehow saved when the HNS/HCN network was created (assuming 192.168.1.1 was my gateway at the time, I don't recall), or it's a hard-coded default somewhere I can't see.
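The inversion described above can be written out as a small truth table. This is a paraphrase of the behaviour as I understand it, not the libnetwork code itself: `enableInternalDNS` is a hypothetical helper, and "transparent" is only there as an example of a non-nat driver.

```go
package main

import "fmt"

// enableInternalDNS is a hypothetical helper that paraphrases the mapping:
// the endpoint gets EnableInternalDNS only on the nat driver, and only when
// the network was not created with DisableGatewayDNS set.
func enableInternalDNS(driverName string, disableGatewayDNS bool) bool {
	return driverName == "nat" && !disableGatewayDNS
}

func main() {
	for _, driver := range []string{"nat", "transparent"} {
		for _, disabled := range []bool{false, true} {
			fmt.Printf("driver=%s DisableGatewayDNS=%t -> EnableInternalDNS=%t\n",
				driver, disabled, enableInternalDNS(driver, disabled))
		}
	}
}
```

Only one of the four combinations (nat driver, flag left unset) ends up with EnableInternalDNS turned on.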

I need to work out how to pass the equivalent thing into the HCN API, and from poking around, this might require both updating hcsshim's HCN wrapper to the current OpenAPI spec (assuming the docs are more up-to-date than the code) and exposing it from WinCNI, and possibly making it the default for WinCNI in NAT-mode network endpoints. And I don't know for sure that the same option is exposed in HCN.

@MikeZappa87

Did you ever find this? I just don't understand the reasoning to set the default gateway as the primary dns server.

@TBBle
Contributor Author

TBBle commented Nov 9, 2023

I haven't had a chance to chase this up yet and work out why it's using an old gateway (and how I can configure or fix the NAT CNI plugin to not use it at all), but I'm hoping to have time for that this weekend.

I suspect that what this flag did is capture the host's primary DNS server setting at the time the network was created (and 192.168.1.1 was probably my LAN gateway at the time, and also the network DNS server). When this flag is not set, dockerd needs to provide a DNS server of its own, which also lets it inject container names for inter-container networking support; I assume this is what happens when using custom networks rather than a default NAT network.

See moby/libnetwork#2021, found via discussion of the problem this solves in docker/for-win#397.

However, as I mentioned, I haven't actually traced through all the behaviours yet, so I might be conflating things here.

...

Thinking about it, I probably have conflated something, because when those flags and tickets are talking about a gateway, they mean the gateway on the NAT network, i.e. the host's interface on that network, where the dockerd DNS stub server lives. That would have been a different address (172.x.x.1 I suspect) if that was involved. So whatever's going on here is possibly not related to the EnableInternalDNS flag at all, and was just something that happens when creating such networks.
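A quick way to sanity-check the "is this address even on one of my current subnets" question is to compare it against the prefixes assigned to the host's interfaces, which include the host side of the NAT network. A minimal sketch using only the standard library; `onLink` and `hostPrefixes` are names made up for this example.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
)

// onLink reports whether addr falls inside any of the given CIDR prefixes.
func onLink(addr string, prefixes []string) (bool, error) {
	a, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	for _, p := range prefixes {
		pfx, err := netip.ParsePrefix(p)
		if err != nil {
			return false, err
		}
		if pfx.Contains(a) {
			return true, nil
		}
	}
	return false, nil
}

// hostPrefixes lists the CIDR prefixes currently assigned to host interfaces,
// skipping any entry that does not parse as a prefix.
func hostPrefixes() ([]string, error) {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return nil, err
	}
	var out []string
	for _, a := range addrs {
		ipnet, ok := a.(*net.IPNet)
		if !ok {
			continue
		}
		if _, err := netip.ParsePrefix(ipnet.String()); err == nil {
			out = append(out, ipnet.String())
		}
	}
	return out, nil
}

func main() {
	prefixes, err := hostPrefixes()
	if err != nil {
		panic(err)
	}
	ok, err := onLink("192.168.1.1", prefixes)
	if err != nil {
		panic(err)
	}
	fmt.Printf("192.168.1.1 is on a directly connected subnet: %t\n", ok)
}
```

If this reports false, the address is neither the host's interface on the NAT network nor on the current LAN, which would be consistent with it being a stale or default value rather than something derived from the live network.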

I suspect if I delete the network and let either Docker or the CNI plugin recreate it, it'll be fine. A quick registry search didn't turn up a "192.168.1.1" string so if I'm right about this, then wherever that address is stored/cached, it's not trivial.
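One guess as to why a plain string search might come up empty: the address could be stored in a form the search doesn't match, such as a UTF-16 string inside a binary blob or four raw bytes. That is purely an assumption, not something confirmed about HNS. A minimal sketch of checking a blob for the common encodings of an IPv4 address, with `encodings` and `findEncodings` as made-up helper names:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"net/netip"
	"unicode/utf16"
)

// encodings returns byte patterns under which an IPv4 address might be stored.
func encodings(addr string) (map[string][]byte, error) {
	a, err := netip.ParseAddr(addr)
	if err != nil {
		return nil, err
	}
	if !a.Is4() {
		return nil, fmt.Errorf("%s is not an IPv4 address", addr)
	}
	// UTF-16LE text: each code unit is written low byte first.
	units := utf16.Encode([]rune(addr))
	wide := make([]byte, 2*len(units))
	for i, u := range units {
		binary.LittleEndian.PutUint16(wide[2*i:], u)
	}
	raw := a.As4() // the four octets in network order
	return map[string][]byte{
		"ascii":   []byte(addr),
		"utf16le": wide,
		"raw4":    raw[:],
	}, nil
}

// findEncodings reports which encodings of addr occur somewhere in data.
func findEncodings(data []byte, addr string) ([]string, error) {
	pats, err := encodings(addr)
	if err != nil {
		return nil, err
	}
	var found []string
	for _, name := range []string{"ascii", "utf16le", "raw4"} {
		if bytes.Contains(data, pats[name]) {
			found = append(found, name)
		}
	}
	return found, nil
}

func main() {
	// Synthetic blob for illustration only: some padding, then the raw octets.
	blob := []byte{0x00, 0x00, 0xC0, 0xA8, 0x01, 0x01, 0x00}
	found, err := findEncodings(blob, "192.168.1.1")
	if err != nil {
		panic(err)
	}
	fmt.Println(found)
}
```

A four-byte pattern will produce false positives in large files, so a hit on the raw form is only a hint about where to look next.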
