Can't handle URLs with @ character in userinfo section #328

sethmlarson · 2019-09-08T22:33:46Z

>>> import httpx
>>> httpx.URL("https://[email protected]:[email protected]")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sethmlarson/Desktop/http3/httpx/models.py", line 112, in __init__
    raise InvalidURL("No host included in URL.")
httpx.exceptions.InvalidURL: No host included in URL.

This is a common use-case for proxies and websites using email addresses as usernames for authentication.

This might require a change in rfc3986 or the way we parse URLs. :)
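As a rough illustration of the parsing problem, here is a minimal sketch that uses only the standard library's `urllib.parse`, not httpx or rfc3986. The credentials and host are hypothetical placeholders. It shows two things: percent-encoding the userinfo so that the only literal `@` is the delimiter, and the fact that a parser which splits the authority on the last `@` can still find the host in the unencoded form.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical credentials and host, chosen for illustration only.
username = "user@example.com"
password = "s3cret"
host = "example.org"

# Percent-encode the userinfo so that the only literal "@" left in the
# authority is the one separating userinfo from host.
userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
encoded_url = f"https://{userinfo}@{host}/"

# urlsplit does not decode the userinfo, so unquote it explicitly.
parts = urlsplit(encoded_url)
decoded_username = unquote(parts.username)
decoded_password = unquote(parts.password)

# The standard library splits the authority on the LAST "@", so it also
# locates the host when the username contains an unencoded "@".
raw_parts = urlsplit(f"https://{username}:{password}@{host}/")

print(encoded_url)
print(parts.hostname, decoded_username, decoded_password)
print(raw_parts.hostname, raw_parts.username, raw_parts.password)
```

Whether httpx should accept the unencoded form, require the encoded form, or both is a design question for the library. The sketch only shows that the two readings can be made unambiguous.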

cansarigol · 2019-09-09T19:43:51Z

Hi @sethmlarson I've pushed a PR about this topic. Could you review it?

tomchristie · 2020-07-27T10:01:41Z

So, something I think we may want to re-review prior to 1.0 is which internal URL parsing implementation is really our best option. This issue (and perhaps also #858, although maybe that's exclusively an issue on our side?) currently means that we're a little less robust on URL parsing than the wonderfully battle-hardened urllib3, which is something we ought to resolve.

Making a change here wouldn't have any effect on our public API. That's tied down and looks the way we want. But this could potentially change which dependencies we choose to ship with.

Some options here are...

Stick with rfc3986. Because we're doing just fine really. Let's just make sure we're contributing our time to helping get this issue resolved.
Consider irl.
Switch to the urllib3 implementation, possibly by vendoring just the URL parsing into a (properly licensed and credited) slimline stand-alone package that just implements plain functions for url_parse and url_join, that only deal with super plain named-tuple-like arguments.

Of those three options, the last seems like it'd be the most robust approach, since urllib3 have been doing this for years. Having said that, I don't really have enough context about which of these three options the urllib3 team themselves would consider to be most suitable.
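To make the shape of the third option concrete, here is a minimal sketch of what plain `url_parse` and `url_join` functions over a named-tuple result could look like. Everything here is an assumption for illustration: the `ParsedURL` type and its field names are hypothetical, and the parsing is delegated to the standard library's `urllib.parse` rather than to urllib3's parser, which the option above actually proposes vendoring.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote, urljoin, urlsplit


class ParsedURL(NamedTuple):
    """Hypothetical plain container for URL components."""

    scheme: str
    username: Optional[str]
    password: Optional[str]
    host: Optional[str]
    port: Optional[int]
    path: str
    query: str
    fragment: str


def _decode(value: Optional[str]) -> Optional[str]:
    # Percent-decode a userinfo component, preserving None for "absent".
    return unquote(value) if value is not None else None


def url_parse(url: str) -> ParsedURL:
    # Delegates to urllib.parse.urlsplit (a stand-in for a vendored parser).
    parts = urlsplit(url)
    return ParsedURL(
        scheme=parts.scheme,
        username=_decode(parts.username),
        password=_decode(parts.password),
        host=parts.hostname,
        port=parts.port,
        path=parts.path,
        query=parts.query,
        fragment=parts.fragment,
    )


def url_join(base: str, reference: str) -> str:
    # Resolve a reference against a base URL using the standard library.
    return urljoin(base, reference)


parsed = url_parse("https://user%40example.com:pw@example.org:8443/a/b?x=1#top")
joined = url_join("https://example.org/a/b", "c")
print(parsed)
print(joined)
```

Keeping the interface to plain functions and a tuple-like result means the parsing backend could be swapped without touching the public `httpx.URL` API.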

No doubt @sigmavirus24 and @sethmlarson both have far more insight into this, tho I don't want to push for anyone's time here.

sethmlarson added the bug label Sep 8, 2019

yeraydiazdiaz mentioned this issue Sep 9, 2019

Implement HTTP proxies and config on Client #259

Merged

cansarigol mentioned this issue Sep 30, 2019

fixed userinfo regex for @ character in userinfo python-hyper/rfc3986#59

Closed

tomchristie mentioned this issue Jan 14, 2020

Handle @ characters in the userinfo section. python-hyper/rfc3986#62

Closed

tomchristie mentioned this issue Mar 11, 2020

Accept url-encoded characters in proxy authentication strings #858

Closed

tomchristie mentioned this issue Jul 27, 2020

Version 1.0, working notes. #1092

Closed

11 tasks

tomchristie mentioned this issue Aug 10, 2020

Handle URL quoting username and password components. #1159

Merged

tomchristie closed this as completed in #1159 Aug 11, 2020
