Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't list packages in current directory #7955

Merged
merged 1 commit into from
Apr 5, 2020

Conversation

deveshks
Copy link
Contributor

@deveshks deveshks commented Apr 1, 2020

Fixes and closes #7731, closes #2926, closes #3710 , closes #6558 and closes #7971

Also added tests which does the following:

Test 1

  1. Create a package with setup.py
  2. Run python setup.py egg_info from the package directory.
  3. Run python -m pip list from the package directory.
  4. Assert that the package is not present in the output

Test 2

  1. Create a package with setup.py
  2. Run python setup.py egg_info from the package directory.
  3. Add current directory to PYTHONPATH
  4. Run python -m pip list from the package directory.
  5. Assert that the package is present in the output

@deveshks deveshks requested a review from sbidoul April 2, 2020 07:36
@pradyunsg
Copy link
Member

Following on from what @sbidoul said, I think we should investigate why we're getting the absolute path on sys.path occasionally.

@brettcannon maybe you happen to have some idea about this?

@brettcannon
Copy link
Member

Depends on how you are executing pip and whether the site package is being run. Basically if you run pip the command and it's in the local directory it will look like the CWD is absolute when in fact it's just happens that the entry point/script location matches CWD. Otherwise with -m you get "". And the site package is what makes all entries other than "" absolute on sys.path.

@deveshks
Copy link
Contributor Author

deveshks commented Apr 3, 2020

Thanks for the response @brettcannon and @pradyunsg .

From the discussion, I can sense that we now have an understanding as to why cwd included in sys.path. So is the implementation of the PR fine, taking that in account, or are there changes needed in unit tests/logic to improve it?

@sbidoul
Copy link
Member

sbidoul commented Apr 3, 2020

I have two concerns with this PR:

  • unconditionally removing the current directory from sys.path will change the behavior when it has been added on purpose by the user, e.g. via PYTHONPATH
  • for consistency reasons, if we change something in this area, it should be done for all commands (list, freeze, check, and maybe more)

Maybe doing this in pip/__main__.py and only ever removing sys.path[0] if it is the same as os.cwd() is more robust. Maybe.

@deveshks
Copy link
Contributor Author

deveshks commented Apr 3, 2020

Hi @sbidoul

I just found out that we are inserting a os.path.dirname(os.path.dirname(__file__)) in sys.path in https://github.com/pypa/pip/blob/master/src/pip/__main__.py#L14 . Can that be the reason we see the cwd in sys.path in list, check etc, or that flow doesn't come to picture here?

I think you are mentioning something like this in pip/__main__.py

if __name__ == '__main__':
    if sys.path[0] == os.getcwd():
        sys.path.pop(0)
    sys.exit(_main())

Also there is a PR for freeze #7810 which does something similar as well. I think generalizing it will avoid such a scenario as well?

@uranusjr
Copy link
Member

uranusjr commented Apr 3, 2020

@deveshks The code block you see is irrelevant. It is only executed if you execute pip directly as a source directory, e.g. python src/pip. It does not effect python -m pip or pip (entry point), because in those cases __package__ would be 'pip', not an empty string.

@sbidoul
Copy link
Member

sbidoul commented Apr 3, 2020

Also there is a PR for freeze #7810 which does something similar as well. I think generalizing it will avoid such a scenario as well?

I have the same concerns for #7810 indeed.

I think you are mentioning something like this in pip/main.py

Yes.

@deveshks
Copy link
Contributor Author

deveshks commented Apr 3, 2020

Hi @uranusjr and @sbidoul

Thank you for the comments. I will try out that suggestion, but I am not sure if it will affect any other functionality of pip.

Is there any other specific place further down the code path which I can try:

if sys.path[0] == os.getcwd(): 
    sys.path.pop(0)

Or do you have any other suggestions on what I can try to do here?

@uranusjr
Copy link
Member

uranusjr commented Apr 4, 2020

I doubt anything later than __main__.py would work, since pkg_resources builds the default working set on import time, and modifying sys.path after that will not affect it.

This reminds me #7705 though. Maybe what we should do is to add a compatibility layer over pkg_resources, similar to how we use appdirs, and always use that instead of calling pkg_resources directly. This way we can build our own WorkingSet that does not contain cwd and be consistent everywhere. It would also be a good start to gradually switch to importlib-metadata.

@deveshks
Copy link
Contributor Author

deveshks commented Apr 4, 2020

Hi @uranusjr

So in that case, should I stick to making the change in __main__.py for this PR, and perhaps we can address the suggestion of adding a compatibility layer over pkg_resources in another PR?

Also I think I didn't follow your suggestion on how to add a compatibility layer, could you give me some more detailed pointers on what is it and where can I start looking for it?

@gutsytechster
Copy link
Contributor

Following the conversation, I initially tried to exclude the '' from sys.path in the vendor pkg_resource. The commit referring to that change can be see here. This also solved the problem for other commands as well since it is the nexus for all other commands. However, respecting this review comment #7810 (comment), I undo those changes.

I think @uranusjr is suggesting to perform those changes outside of pkg_resource but before actually calling get_installed_distributions.

@sbidoul
Copy link
Member

sbidoul commented Apr 4, 2020

Since the issue occurs only when running pip via python -m pip, I think doing this in __main__.py may be adequate. In that case, it seems that sys.path[0] is the current directory and that PYTHONPATH entries come after [1]. So popping sys.path[0] there if it is an empty string or os.cwd() should be safe.

@deveshks deveshks force-pushed the pip-list-not-contain-pkg-work-dir branch from 0e208e0 to 2536b19 Compare April 4, 2020 13:21
@deveshks
Copy link
Contributor Author

deveshks commented Apr 4, 2020

Thanks @sbidoul

I was by mistakenly removing sys.path[0] at run time by putting it within if __name__ == '__main__': instead of at import time.

I have made changes to the code according to your suggestion now, please have a look :)

Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this fix and this also closes the corresponding issue w/ pip freeze (if we haven't merged that already) by stemming the problem at the source.

I do feel like think we should only attempt this when run with "-m" but I don't think there's a good way to detect that.

@deveshks
Copy link
Contributor Author

deveshks commented Apr 4, 2020

I do feel like think we should only attempt this when run with "-m" but I don't think there's a good way to detect that.

Hi @pradyunsg

I tried adding a print statement right after the fix in src/pip/__main__.py, and saw that the print statement only gets exectured when using python -m pip <command> and not when we do pip <command>.

I am not sure about the reason of the behavior, but I believe that the import statements don't get called when using pip <command> and we hit the if __name__ == '__main__': directly. Which means that the added code will only be called when using -m.

@deveshks deveshks requested a review from pradyunsg April 4, 2020 18:53
Copy link
Member

@xavfernandez xavfernandez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the reason of the behavior, but I believe that the import statements don't get called when using pip <command> and we hit the if __name__ == '__main__': directly. Which means that the added code will only be called when using -m.

pip <command> directly calls pip._internal.cli.main.main function while python -m pip calls pip/__main__.py

src/pip/__main__.py Outdated Show resolved Hide resolved
@deveshks deveshks force-pushed the pip-list-not-contain-pkg-work-dir branch from 2536b19 to 38eaa4d Compare April 5, 2020 04:11
@deveshks deveshks requested a review from xavfernandez April 5, 2020 04:20
Copy link
Member

@sbidoul sbidoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor comments, otherwise LGTM.

news/7731.bugfix Outdated Show resolved Hide resolved
src/pip/__main__.py Outdated Show resolved Hide resolved
@gutsytechster
Copy link
Contributor

I can confirm that it also resolves #2926 :)

@deveshks deveshks force-pushed the pip-list-not-contain-pkg-work-dir branch from 38eaa4d to 56d50b2 Compare April 5, 2020 09:04
@deveshks deveshks requested a review from sbidoul April 5, 2020 09:11
@sbidoul
Copy link
Member

sbidoul commented Apr 5, 2020

Actually even the install command is affected by this.

@deveshks
Copy link
Contributor Author

deveshks commented Apr 5, 2020

Actually even the install command is affected by this.

HI @sbidoul ,

So I will mention both show and install in the NEWSFILE and the comment. But could you please expand on how install will be affected?

Also am I correct in assuming that these commands are not negatively affected by this change?

@sbidoul
Copy link
Member

sbidoul commented Apr 5, 2020

See also #6558 that talks about this in the context of pip install.

@sbidoul
Copy link
Member

sbidoul commented Apr 5, 2020

One more thing: we may want to add a test with a PYTHONPATH env var that includes the current directory. I that case, an .egg-info in current directory must be considered as an installed package. That test would have failed with the initial approach of this PR which unconditionally removed os.cwd from sys.path.

@deveshks
Copy link
Contributor Author

deveshks commented Apr 5, 2020

See also #6558 that talks about this in the context of pip install.

Got it @sbidoul , seems like both show and install will be helped by this behaviour.

@deveshks
Copy link
Contributor Author

deveshks commented Apr 5, 2020

One more thing: we may want to add a test with a PYTHONPATH env var that includes the current directory. I that case, an .egg-info in current directory must be considered as an installed package. That test would have failed with the initial approach of this PR which unconditionally removed os.cwd from sys.path.

Hi @sbidoul

So in this test, I will run the install command as python setup.py egg_info, and then verify with list that the package is indeed being shown. Is that the correct interpretation?

@deveshks deveshks force-pushed the pip-list-not-contain-pkg-work-dir branch from 56d50b2 to c7102c4 Compare April 5, 2020 11:09
@deveshks deveshks requested a review from sbidoul April 5, 2020 11:10
@deveshks deveshks closed this Apr 5, 2020
@deveshks deveshks reopened this Apr 5, 2020
@deveshks deveshks closed this Apr 5, 2020
@deveshks deveshks reopened this Apr 5, 2020
@deveshks deveshks force-pushed the pip-list-not-contain-pkg-work-dir branch from c7102c4 to 40dd286 Compare April 5, 2020 14:07
@deveshks
Copy link
Contributor Author

deveshks commented Apr 5, 2020

Once this is merged, I will also create another PR to add test cases for the behaviour of the other commands which will change based on whether the cwd is added to PYTHONPATH or not. Don't want to crowd this PR and have more iterations here.

tests/functional/test_list.py Outdated Show resolved Hide resolved
@deveshks deveshks force-pushed the pip-list-not-contain-pkg-work-dir branch from 40dd286 to feac595 Compare April 5, 2020 14:29
@deveshks
Copy link
Contributor Author

deveshks commented Apr 5, 2020

Thanks @sbidoul for the approval.

Hi @pradyunsg , @xavfernandez , could I get this PR merged as well :)

@deveshks
Copy link
Contributor Author

deveshks commented Apr 5, 2020

Hi @sbidoul , @pradyunsg

As mentioned in a comment above, I have created #7987 to add unit tests for the remaining 4 commands affected by this on the same principle. Would really appreciate if you can take a look at the same and share your thoughts 😊

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label May 5, 2020
@lock lock bot locked as resolved and limited conversation to collaborators May 5, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation
Projects
None yet
8 participants