Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wsgi server: allow binding to ipv6 address #657

Merged
merged 1 commit into from
Feb 8, 2022

Conversation

martinetd
Copy link
Contributor

Reproduce the fix in python3.8 http.server by doing a dns lookup and
using the correct address family

Fixes #567

@martinetd
Copy link
Contributor Author

martinetd commented May 23, 2021

Not sure what policy here is with squashes, I'd rather keep the python 2 compat commit separate but ask if you want me to squash it down.

For python3.4 it's no longer installable on my distro and I cannot see circeci error without logging in, granting way more permissions that I'm comfortable with, so if someone would be kind enough to give me the error I'll try to fix it too... I tried to make it close to what's described on python doc but to no avail :/
EDIT: digged up an old VM for python3.4 and it works fine, and looking at commit history 3.4 tests have just been failing for a couple of weeks, so I'll ignore that and let someone either disable 3.4 tests or fix the error......

@csmarchbanks
Copy link
Member

Thanks, I will check this out when I get a moment, but will need to catch up on some of the context in discussed in the related issue.

Don't worry about the 3.4 failure, it is a known issue right now due to a bad version of a 3rd party library used in tests being pushed. That version was then yanked from pypi, but the tests are still pulling it for some reason. Specifically the issue is attrs 21.1.0 does not work on 3.4.

@martinetd
Copy link
Contributor Author

Thanks, no rush.

I recap'd what I had understood of the issue in the last post of the issue if that's good enough for you, but there are at least 3 other similar issues on the project so information is spread a bit everywhere with various links to python bugs that ultimately didn't really help or workarounds that aren't appropriate for merging or looked too kludgy (forcing AF_INET6 or detecting ipv6 by looking for colon character)
This PR does what python3.8 http.server does when invoked from command-line so hopefully it's good enough :)

Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work! Generally I am ok with this approach, however I ran into a couple of issues in testing (linux only so far):

  1. It does not work if I do not specify an address. It tries to look up empty string as a DNS record.
  2. If I specify 'localhost' as my address, it may bind to IPv6, which means users going to 127.0.0.1 would no longer reach the service. I wonder if we should default to IPv4 if available?

Also, if you rebase from master the test failure for 3.4 will be fixed.

def start_wsgi_server(port, addr='', registry=REGISTRY):
"""Starts a WSGI server for prometheus metrics as a daemon thread."""
app = make_wsgi_app(registry)
ThreadingWSGIServer.address_family, addr = _get_best_family(addr, port)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than update the address_family on the class, could we make a new class variable in this function and change it on that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did what http/server.py does as a good monkey but they certainly don't care about what kind of side effects they have for python -m http invocations.
I'm however not familiar with how this part of python works -- I just tried assigning the class to a new variable and that doesn't copy it e.g. this doesn't work:

      LocalServer = ThreadingWSGIServer
      LocalServer.address_family, addr = _get_best_family(addr, port)
      print(addr, LocalServer.address_family, ThreadingWSGIServer.address_family)
      httpd = make_server(addr, port, app, LocalServer, handler_class=_SilentHandler)

will print AF_INET6 twice or AF_INET twice depending on what was set.

stackoverflow has some horrible things with type() but I'm not sure this is worth the effort/risk unless you know how to do that reliably

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:(, I will try taking a look when I get a chance. My fear with this solution is if someone sets up multiple servers the address_family would be shared between them and cause issues just as you saw here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just checked though calling start_http_server with different address families in a row works fine, e.g.:

    start_http_server(8000, addr='::1')
    start_http_server(8001, addr='127.0.0.1')

will work and listen to both as expected

Since these are started off the main thread in most cases, we could just always restore the class address_family to AF_INET afer we're done creating the server -- would that be acceptable?
If subservers are created in another thread though this would be racy with whatever happens in other threads, but I don't think that's quite common.. Not much way of knowing though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a library that effectively mutates global state just does not feel right to me. Restoring the address_family might be ok, but as you point out the thread safety that worries me. Certainly it is an edge case, but would be hard to debug if something does go wrong. I will think about this some more/try some things when I have time.

@martinetd
Copy link
Contributor Author

It does not work if I do not specify an address. It tries to look up empty string as a DNS record.

Ah, good point we should make addr=0.0.0.0 by default to keep the behaviour (I think it's that currently? need to check)
Interestingly for http.server it defaults to None, which with socket.AI_PASSIVE resolves as both 0.0.0.0 and :: in that order keeping only 0.0.0.0... But passing it for python2 (which doesn't allow kwargs) would mean something like socket.getaddrinfo(address, port, 0, socket.SOCK_STREAM, 0, socket.AI_PASSIVE)... Well, I guess argument meaning shouldn't change; but just changing the default addr value seems simpler.

If I specify 'localhost' as my address, it may bind to IPv6, which means users going to 127.0.0.1 would no longer reach the service. I wonder if we should default to IPv4 if available?

Hm, that will definitely bind to ::1 on most systems yes; this is more annoying.
In general I'd really rather default to ipv6, that's a behaviour configurable at OS level and something everyone should be used to by now, having it the other way around would be surprising for new users... vs. breakage for old users (it's not just localhost, if they were using a name with both local A and AAAAs it would previously default to the A) so keeping old behaviour...
Ultimately it's up to you but I guess minimizing breakage is more important? I'll wait for your decision on this.

@csmarchbanks
Copy link
Member

Well, I guess argument meaning shouldn't change; but just changing the default addr value seems simpler.

Could we check for empty string for addr and just default to IPv4 for that case without going through _get_best_family?

Ultimately it's up to you but I guess minimizing breakage is more important? I'll wait for your decision on this.

Yeah, typically minimizing breakage is a goal I have, but if it makes a solution too complicated some breakage is ok for edge cases like this. The address_family issue in the other thread is more of a concern to me.

@martinetd
Copy link
Contributor Author

Could we check for empty string for addr and just default to IPv4 for that case without going through _get_best_family?

Just setting the default to 0.0.0.0 will keep the current behaviour, I don't see much point of avoiding _get_best_family in that case?

Yeah, typically minimizing breakage is a goal I have, but if it makes a solution too complicated some breakage is ok for edge cases like this. The address_family issue in the other thread is more of a concern to me.

Ok, let's leave it as this (ipv6 first) then. If someone has a problem specifying localhost they can restore old behaviour by making it explicit (127.0.0.1) as immediate workarond, and we can reconsider making ipv4 first at that point

Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.0.0.0 as a default sounds good to me. Thanks for all the work, I need a bit of time to think if I am ok with the potential to have some thread safety issues/if anything else could work. The address_family is the only concern I have remaining though.

def start_wsgi_server(port, addr='', registry=REGISTRY):
"""Starts a WSGI server for prometheus metrics as a daemon thread."""
app = make_wsgi_app(registry)
ThreadingWSGIServer.address_family, addr = _get_best_family(addr, port)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a library that effectively mutates global state just does not feel right to me. Restoring the address_family might be ok, but as you point out the thread safety that worries me. Certainly it is an edge case, but would be hard to debug if something does go wrong. I will think about this some more/try some things when I have time.

@martinetd
Copy link
Contributor Author

Hello,

it's been a little while and I had totally forgotten about it until I upgraded the machine I use this on and my kludge got overwritten. Is there anything I can do to help move forward?

I appreciate that modifying the global ThreadingWSGIServer class is bad from a helper position, but unless we figure a way of "deep copying" the class I really don't see any other way about it. My suggestion of restoring the original value is as good as I can do.

@csmarchbanks
Copy link
Member

I have been thinking about this, and sadly just don't feel like there is a great solution right now :(. I don't want a library to be changing classes, so I guess the recommendation as it stands would be to use something else to stand up the server for IPv6. If I think of something else I will let you know though!

@martinetd
Copy link
Contributor Author

martinetd commented Feb 6, 2022

Hello, thanks for not forgetting about this! :)

My problem is that I'm not a direct user of this library, but just using exporters which do -- I can't very well go about telling the prometheus tor exporter, dht22 exporter and all other users that they must use something else because my setup requires ipv6 and it's not useable with this client, so it's much easier to patch this (as I do on my systems).

I've just had another look and it looks like we can just instantiate a new subclass locally, and that won't affect the parent class, so this is probably a better workaround than saving the old family and restoring it. How about this new version of the patch?

I've tested running twice in a row with different families of address and making sure the previous call didn't affect the current instance version.

Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Good idea! I tested this a bit and am happy with it. One small comment just so that there is some context for future use.

prometheus_client/exposition.py Show resolved Hide resolved
Reproduce the fix in python3.8 http.server by doing a dns lookup and
using the correct address family

Fixes prometheus#567

Signed-off-by: Dominique Martinet <[email protected]>
@csmarchbanks csmarchbanks merged commit 39d2360 into prometheus:master Feb 8, 2022
@martinetd
Copy link
Contributor Author

Thanks!

@csmarchbanks
Copy link
Member

Thank you, and good idea to copy the class!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

start_http_server incapable to listen on IPv6 sockets
2 participants