Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid instantiating Resolver when it is not necessary for DoH #1123

Merged
merged 2 commits into from
Aug 28, 2024

Conversation

URenko
Copy link
Contributor

@URenko URenko commented Aug 17, 2024

Currently, a Resolver with default settings is always instantiated when instantiating a DoH backend. e.g. :

class _HTTPTransport(httpx.AsyncHTTPTransport):
def __init__(
self,
*args,
local_port=0,
bootstrap_address=None,
resolver=None,
family=socket.AF_UNSPEC,
**kwargs,
):
if resolver is None:
# pylint: disable=import-outside-toplevel,redefined-outer-name
import dns.asyncresolver
resolver = dns.asyncresolver.Resolver()

The Resolver is used to resolve the IP address for the hostname of the DoH server address.

However, this is not always necessary.
For example, when bootstrap_address is present or hostname is already a IP address:

if dns.inet.is_address(host):
addresses.append(host)
elif self._bootstrap_address is not None:
addresses.append(self._bootstrap_address)
else:
timeout = _remaining(expiration)
family = self._family
if local_address:
family = dns.inet.af_for_address(local_address)
answers = await self._resolver.resolve_name(
host, family=family, lifetime=timeout
)
addresses = answers.addresses()

And it may potentially cause problems as the Resolver is instantiated with default settings.
For example, /etc/resolv.conf is not accessible in Termux (URenko/Accesser#187).

This pr partially mitigated this issue by avoiding instantiating the Resolver when bootstrap_address is present.
Further fixes are required to fully resolve this issue.

@rthalley
Copy link
Owner

If making a default resolver isn't appropriate for whatever reason, the API to https() methods has a resolver parameter. So in this case you could set resolver to dns.resolver.Resolver(configure=False), and then either set the nameservers manually or always require a bootstrap_address.

I'm considering merging this PR because it's consistent with the bootstrap_address approach and is simple, but for the other issue, namely "what if the hostname in the URL is an IP address?" I prefer the "pass in a resolver" solution rather than more complexity, as this situation is atypical.

Also if there is some other way we ought to be getting the DNS server addresses on Android, I'd consider having the resolver support that, much in the way that we get resolver info from the registry or WMI on Windows.

@URenko
Copy link
Contributor Author

URenko commented Aug 19, 2024

the API to https() methods has a resolver parameter

If we call https() directly, it will not be a problem. However, this resolver parameter is not available if we call it indirectly via Resolver.resolve or DoHNameserver.query.

The situation that the hostname in the URL is an IP address actually is a common trick to bypass the firewall in the mainland of China. And also the well-know 1.1.1.1.

@URenko URenko changed the title Avoid instantiating Resolver when bootstrap_address is present for DoH Avoid instantiating Resolver when it is not necessary for DoH Aug 19, 2024
@rthalley
Copy link
Owner

This is going further than I wanted, but still looks ok other than the issues in the CI runs ("make type") and these lines:

        if parsed.port is not None:
            port = parsed.port

I get why you unindented them, but they're going to cause problems in the case where a bootstrap address is provided, as then we will access parsed.port when we haven't parsed. I'm thinking these should be in the block doing the parsing. Also some additional work my be required to make mypy happy after fixing the thing it is currently unhappy about.

@URenko
Copy link
Contributor Author

URenko commented Aug 27, 2024

mypy seems failed to detect parsed.hostname cannot be None at that line, so I disabled it.

By the way, I feel confused that the (remote) port (it is an argument) only be used by h3, but not used by httpx?

@rthalley
Copy link
Owner

Mypy frequently misses situations where something can't be None or similar; rather than disabling I usually add an assert, e.g.

assert parsed.hostname is not None # for mypy

As this gives the info mypy needs and fixes the problem more clearly.

Re port I think the confusing thing is probably that we use it for H3 when there is a URL. AFAIK our intent was that the port is used if you specify an address and not a URL (i.e. we're going to construct the URL). In the case where there is a URL, we should probably not use port. No need for you to address this right now, I'll do something about it after I merge your PR.

@rthalley rthalley merged commit 3f4bf3d into rthalley:main Aug 28, 2024
9 checks passed
@rthalley
Copy link
Owner

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants