Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python3] Bootstrap pip with ${CURRENT_INSTALLED_DIR} embedded #22386

Closed
wants to merge 1 commit into from
Closed

[python3] Bootstrap pip with ${CURRENT_INSTALLED_DIR} embedded #22386

wants to merge 1 commit into from

Conversation

mkhon
Copy link
Contributor

@mkhon mkhon commented Jan 6, 2022

[python3] Do not bootstrap pip: bootstrapped pip executables
have ${CURRENT_PACKAGES_DIR} embedded so that pip.exe even if
executed from ${CURRENT_INSTALLED_DIR} installs packages into
${CURRENT_PACKAGES_DIR}

pip should be bootstrapped in post-install step (which vcpkg
does not currently implement)

Users can always bootstrap pip manually by runing "python -m ensurepip"

Describe the pull request

  • What does your PR fix?

None

  • Which triplets are supported/not supported? Have you updated the CI baseline?

all

Yes

  • If you have added/updated a port: Have you run ./vcpkg x-add-version --all and committed the result?

Yes

If you are still working on the PR, open it as a Draft: https://github.blog/2019-02-14-introducing-draft-pull-requests/

@JackBoosY
Copy link
Contributor

I'm not sure about that.
@Hoikas what do you think about that?

@JackBoosY JackBoosY added the category:port-bug The issue is with a library, which is something the port should already support label Jan 7, 2022
@mkhon
Copy link
Contributor Author

mkhon commented Jan 7, 2022

Installing packages into ${CURRENT_PACKAGES_DIR} is incorrect. But pip.exe and friends has ${CURRENT_PACKAGES_DIR} built into it

@JackBoosY
Copy link
Contributor

CURRENT_PACKAGES_DIR is not the final destination, nor is CURRENT_INSTALLED_DIR.
We hope that all included paths are relative so that users can export and use them.

@mkhon
Copy link
Contributor Author

mkhon commented Jan 7, 2022

ensurepip embeds installation directory into the resulting pip.exe
that's why doing ensurepip during install phase into $CURRENT_PACKAGES_DIR is incorrect

@Hoikas
Copy link
Contributor

Hoikas commented Jan 7, 2022

I understand why deferring ensurepip might be desirable to avoid the wrong path being embedded into pip.exe. But, this breaks the workflow of the port - when Python3 is requested by find_package(Python3 COMPONENTS Interpreter Development), the python.exe returned should be functionally identical to one installed from python.org. In this way, you can have a complete python stack built from vcpkg for your project, if needed. If pip is not available, that contract is broken.

It would be better, in my opinion, to do as @JackBoosY suggests and fix pip.exe to be built with relative paths as a opposed to simply deleting pip.

@mkhon
Copy link
Contributor Author

mkhon commented Jan 11, 2022

The problem is actually not in Python code, but in pip: it hardcodes sys.executable (full path to current python executable, at the time when ensurepip is executed) into the resulting pip3.exe binary.

I suggested a fix upstream (pip) to use current python interpreter (as found in PATH) instead of hardcoded (during build phase) but they say current behavior is intentional:

See also pypa/pip#10442 pypa/pip#10700

Please also note that ensurepip ships with pre-built pip wheel, so modifying pip used by ensurepip (or building a patched pip wheel and not using ensurepip) is rather complicated and does not seem to be adequate during python build.

@JackBoosY
Copy link
Contributor

According to the maintainer's comment:

This is by design. When an executable wrapper is contained (either for pip, or for any other package that pip installs) we hard code the exact path to the Python interpreter that the package is installed into. This is to ensure that the command is run using the Python interpreter that has the necessary dependencies installed.
If you use the executable wrapper for pip (rather than the recommended approach of using py -m pip) then you should always use the wrapper installed in the Python installation you want to run pip against.

This shows that the pipe doesn't support "removability", so I think we can point the absolute path it uses to CURRENT_INSTALLED_DIR or CURRENT_HOST_INSTALLED_DIR?

@mkhon
Copy link
Contributor Author

mkhon commented Jan 11, 2022

@JackBoosY the absolute path is embedded during pip installation into the resulting .exe

While we can patch it to point to $CURRENT_INSTALLED_DIR instead of $CURRENT_PACKAGES_DIR, please note that python3 port does not execute "python -m ensurepip" during installation (so on Linux we don't have pip installed)

@Hoikas what do you think?

@mkhon
Copy link
Contributor Author

mkhon commented Jan 11, 2022

The new version enforces installed pip executables to have $CURRENT_INSTALLED_DIR embedded

@mkhon mkhon changed the title [python3] Do not bootstrap pip [python3] Bootstrap pip into ${CURRENT_INSTALLED_DIR} Jan 11, 2022
@mkhon mkhon changed the title [python3] Bootstrap pip into ${CURRENT_INSTALLED_DIR} [python3] Bootstrap pip with ${CURRENT_INSTALLED_DIR} embedded Jan 11, 2022
@dg0yt
Copy link
Contributor

dg0yt commented Jan 11, 2022

But CURRENT_INSTALLED_DIR at python build time is not CURRENT_INSTALLED_DIR when python is used, when installed from cache on another installation.
The binary artifacts must be relocateable, or the bootstrapping must be postponed to usage time.

@Hoikas
Copy link
Contributor

Hoikas commented Jan 11, 2022

please note that python3 port does not execute "python -m ensurepip" during installation (so on Linux we don't have pip installed)

The unix codepath passes --with-ensurepip, which, as I understand it, executes ensurepip as part of the build process. I really hope it's possible to use relative paths for pip.exe -- the Python for Windows installer somehow manages to not have this problem, so, to my mind, simply deleting pip from the port is not optimal.

@mkhon
Copy link
Contributor Author

mkhon commented Jan 12, 2022

@Hoikas

  1. oh, I haven't noticed --with-ensurepip also does the installation because pip executables are missing in Linux build for some reason

  2. Python for Windows installer installs pip the same way (using ensurepip) so the resulting pip binaries have python.exe location hardcoded in them (with the absolute path inside - you can check by looking at pip.exe - at the end of the file). There's no magic here

@mkhon
Copy link
Contributor Author

mkhon commented Jan 12, 2022

But CURRENT_INSTALLED_DIR at python build time is not CURRENT_INSTALLED_DIR when python is used, when installed from cache on another installation.

Yes, but I wanted to keep the contract at least during the build (when python3 is a dependency)

The binary artifacts must be relocateable, or the bootstrapping must be postponed to usage time.

That's what my original patch did. And that's what standard python installer does - it runs ensurepip at post-install phase

@JackBoosY
Copy link
Contributor

Python for Windows installer installs pip the same way (using ensurepip) so the resulting pip binaries have python.exe location hardcoded in them (with the absolute path inside - you can check by looking at pip.exe - at the end of the file). There's no magic here

We should find a way to patch this.

@mkhon
Copy link
Contributor Author

mkhon commented Jan 25, 2022

Can we push this fix as currently pip is not usable at all and more over - it installs modules into ${CURRENT_PACKAGES_DIR} (at the time of building)

Patching pip is not an easy thing to do and should be a subject of a different PR.

Also, please remove requires:author-response tag.

@JackBoosY JackBoosY added depends:vcpkg-team This PR depends on the vcpkg team to solve something and removed requires:author-response labels Jan 26, 2022
@JackBoosY
Copy link
Contributor

@strega-nil-ms @BillyONeal @ras0219-msft Can you please take a look at this PR?

Thanks.

@mkhon
Copy link
Contributor Author

mkhon commented May 5, 2022

Can this one be reviewed?

@ras0219-msft
Copy link
Contributor

In this way, you can have a complete python stack built from vcpkg for your project, if needed. If pip is not available, that contract is broken.

This is not necessarily the contract we should expose -- for example, if pip cannot be relocated, we cannot provide pip.

Based on the above comment thread, following the links, and doing separate research, I don't think we can provide pip in the package: it is fundamentally unrelocatable.

However, the python we install does have venv, which makes a pip-enabled virtual environment very easy to obtain:

$ ls ../installed/x64-linux/tools/python3/pip
ls: cannot access '../installed/x64-linux/tools/python3/pip': No such file or directory
$ ../installed/x64-linux/tools/python3/python3.10 -m venv .
$ ls bin
activate  activate.csh  activate.fish  Activate.ps1  pip  pip3  pip3.10  python  python3  python3.10
$ bin/pip --version
pip 21.2.4 from /workspaces/vcpkg/venv/lib/python3.10/site-packages/pip (python 3.10)

Because vcpkg is not a general purpose application package manager, I think this is the right contact for us to fulfill. Users should never mutate the vcpkg installed directory -- they should never install additional packages directly into vcpkg. Therefore a virtual environment is mandatory anyway.

I've opened #24906 which takes this alternate approach and instead documents how users can use venv to get their environment set up with a fully functional pip.

@ras0219-msft
Copy link
Contributor

I think with #24906 this can be closed? Thanks!

@JackBoosY JackBoosY removed the depends:vcpkg-team This PR depends on the vcpkg team to solve something label Jun 1, 2022
@mkhon mkhon deleted the fix/python3-pip branch September 28, 2022 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:port-bug The issue is with a library, which is something the port should already support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants