Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --no-proxy Option #13051

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

martinezlc99
Copy link

Fixes #5378.

This PR was originally done in #12011, which I cancelled due to age and associated pain of rebasing.

Originally this PR stalled in addressing #12011 (comment), but since then I have noticed the --proxy=PROXY option also sets session.trust_env = False as well, so the behavior is consistent.

@martinezlc99
Copy link
Author

Hi everyone - am I missing anything on this PR?

Also, I am thinking about logging a message if both --proxy=<PROXY> and --no-proxy options are simultaneously set.

@notatallshaw
Copy link
Member

I'm going to put it on the 25.0 milestone so it doesn't get forgotten, but it will be up to a pip maintainer or release manager to review and decide whether it stays on there.

I don't have a lot of experience reviewing this type of PR but it looks simple and your note about existing use of session.trust_env makes sense to me (though I still think this should be documented somewhere, it at least keeps in line with existing behavior).

@notatallshaw notatallshaw added this to the 25.0 milestone Nov 7, 2024
@jdlangs
Copy link

jdlangs commented Nov 8, 2024

@martinezlc99, I just happened to be investigating this behavior today and it was a pleasant surprise to see this was active just yesterday. Thanks for driving this forward!

I did want to ask, wouldn't the ideal behavior be for the --no-proxy flag to take a value similar to the no_proxy env var, instead of turning off proxies entirely? I thought the general use case was people want to provide specific configuration scenarios of which index URLs require proxy use, just without the awkwardness of dynamically changing env vars. It also would be confusing IMO for a no_proxy option to be a boolean given how it's typically used.

Unfortunately, there seems to be buggy behavior in requests that means simple passing on a no_proxy value in session.proxies won't work today but that seems like the ideal behavior for me and my use case.

@notatallshaw
Copy link
Member

notatallshaw commented Nov 8, 2024

@jdlangs I think what this PR has going for it is its simplicity. Generally pip should rely on sourcing behavior from the standard or vendored libraries.

I notice that uv has not implemented --proxy and instead points users to environmental variables, which is probably the right call as having these CLI flags means pip needs to make its own choices.

There is no standard definition of what no_proxy or NO_PROXY should do: https://about.gitlab.com/blog/2021/01/27/we-need-to-talk-no-proxy/. So any implementation would break someone's expectations. IMO I think pip should be pointing people towards using http_proxy / HTTP_PROXY / https_proxy / HTTPS_PROXY / no_proxy / NO_PROXY and let the http library pip vendors deal with figuring out how to interpret that.

@jdlangs
Copy link

jdlangs commented Nov 8, 2024

Yeah, just using environment variables does seem like the most standard and robust method, but if pip provides a proxy argument, I think being able to configure no_proxy also seems reasonable.

I think what this PR has going for it is its simplicity

The simplest PR would actually be only passing a no_proxy value on to requests, I think, but unfortunately it seems that won't work until it's fixed upstream.

There is no standard definition of what no_proxy or NO_PROXY should do

Point taken, but is there any tool that treats no_proxy as a boolean flag? Maybe for this PR I'm just asking the name be reconsidered, especially if there's any chance in the future it would be reused to be a true no_proxy configuration

@notatallshaw
Copy link
Member

notatallshaw commented Nov 8, 2024

Point taken, but is there any tool that treats no_proxy as a boolean flag? Maybe for this PR I'm just asking the name be reconsidered, especially if there's any chance in the future it would be reused to be a true no_proxy configuration

Yeah, I think it's an unfortunate naming collision , maybe --no-proxy-env would be more accurate? I don't want to bikeshed this though, there was discussion in the previous PR #12011, I would personally leave it up to @martinezlc99 to decide, but ultimately a pip maintainer needs to review this.

@martinezlc99
Copy link
Author

martinezlc99 commented Nov 9, 2024

@notatallshaw @jdlangs my initial intent, and my understanding based on the original issue, was to bypass the proxy completely, so the boolean option made sense. But, I agree the --no-proxy option name is counter-intuitive to how the traditional no_proxy environmental variable is used.

I did a quick search to see how other tools handle this situation and found there is not a consensus:

  • curl has a --noproxy option, which appends to the no_proxy environmental variable:

--noproxy <no-proxy-list> List of hosts which do not use proxy

  • Chromium has a --no-proxy-server boolean option, which bypasses the proxy completely:

--no-proxy-server This tells Chrome not to use a Proxy. It overrides any other proxy settings provided.

The Chromium --no-proxy-server option name is at least distinct from the no_proxy environmental variable so this design is noteworthy. Chromium also has a --proxy-bypass-list which "tells chrome to bypass any specified proxy for the given semi-colon-separated list of hosts. This flag must be used (or rather, only has an effect) in tandem with --proxy-server.".

Chromium also supports --proxy-server=direct:// which will cause all connections to not use a proxy.

A nice medium may be adding support for --proxy=direct:// to completely bypass all proxies, and having --no-proxy behave inline with no_proxy environmental variable, with the requests limitation pointed out by @jdlangs .

Open to thoughts/suggestions.

@notatallshaw
Copy link
Member

notatallshaw commented Nov 9, 2024

my initial intent, and my understanding based on the original issue, was to bypass the proxy completely, so the boolean option made sense.

But in this implementation you still take the proxy options provided to pip (which I assume can be either CLI or config), so it doesn't bypass the proxy options completely. Which is fine, there is precedence for this in the existing options --no-binary :all: can be overridden with --only-binary <specific package> and vice versa.

I don't know how much other CLIs here are that informative as it's not exactly the same behavior. I would prefer to keep it something simple, but preferably a little more accurate, naming is hard, I'd stick to the current name unless something is obviously better. I really don't think pip should be trying to do it's own emulation of big complex tools, e.g. --proxy=direct://.

@martinezlc99
Copy link
Author

Thanks @notatallshaw - agree with all your points...forgot this implementation will take the proxy from the CLI as well 😎

So I think the name of the option is the only outstanding item. I like your suggestion of --no-proxy-env or even no-proxy-config. I have no preference, so I am fine with --no-proxy-env if its agreed to be the most clear/concise.

@pichuzhkin
Copy link

Hello everyone!
I'm also a corporate environment user and today noticed another issue with no_proxy, but not the candidate option - the environment variable.

When using pip and *.companydomain.com is in no_proxy env var - pip doesn't respect it upon trying to access host-i-need.companydomain.com and still runs it via proxy.
I finally added host-i-need.companydomain.com to no_proxy explicitly and it worked.

Maybe there are some PRs on this topic or should I submit one myself?

@notatallshaw
Copy link
Member

When using pip and *.companydomain.com is in no_proxy env var - pip doesn't respect it upon trying to access host-i-need.companydomain.com and still runs it via proxy.
I finally added host-i-need.companydomain.com to no_proxy explicitly and it worked.

This is unrelated, pip is using requests to make HTTP calls, which in turn largely relies on the standard library implementation of NO_PROXY, which does not use * for sub-domains: psf/requests#6343 (comment), you must specify it as .mydomain.com.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add option to bypass http proxy
4 participants