
feat: (WIP) Add pypi_install() rule #1728

Draft
wants to merge 114 commits into main
Conversation

@philsc (Contributor) commented Jan 29, 2024

Adds an example/pypi_install directory to showcase multi-arch py binaries.

Caveats:

  • Doesn't use bzlmod yet.
  • Doesn't support multiple Python versions.
  • Need to re-lock requirements.txt files with the hermetic toolchain.
  • Probably more that I can't think of.

aignas added a commit to aignas/rules_python that referenced this pull request Mar 10, 2024
This is a variant of bazelbuild#1625 and was inspired by bazelbuild#1788. In bazelbuild#1625, we
attempt to parse the simple API HTML files in the same `pip.parse`
extension, which brings the following challenges:

* `pip.parse` cannot easily be used in `isolated` mode, and the isolation
  may be difficult to implement if bazelbuild/bazel#20186 moves forward.
* Splitting the `pypi_index` out of the `pip.parse` allows us to accept
  the location of the parsed simple API artifacts encoded as a bazel
  label.
* Separating the logic makes it easy to use the downloader for
  cross-platform wheels.
* The `whl` `METADATA` might not be exposed through older versions of
  Artifactory, so having the complexity hidden in this single extension
  allows us to not increase the complexity and scope of `pip.parse` too
  much.
* The repository structure can be reused for `pypi_install` extension
  from bazelbuild#1728.

TODO:
- [ ] Add unit tests for functions in `pypi_index.bzl` bzlmod extension if
  the design looks good.
- [ ] Changelog.

Out of scope of this PR:
- Further usage of the downloaded artifacts to implement something
  similar to bazelbuild#1625 or bazelbuild#1744. This needs bazelbuild#1750 and bazelbuild#1764.
- Making the lock file the same on all platforms - We would need
  to fully parse the requirements file.
- Support for different dependency versions in the `pip.parse` hub repos
  based on each platform - we would need to be able to interpret
  platform markers in some way, but `pypi_index` should be good already.
- Implementing the parsing of METADATA to detect dependency cycles.
- Support for `requirements` files that are not created via
  `pip-compile`.
- Support for other lock formats, though that would be reasonably
  trivial to add.
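As context for the platform-marker point above: interpreting markers means evaluating expressions like `sys_platform == "linux"` against a target configuration. A minimal sketch in Python (the marker subset, function names, and `env` dict here are illustrative, not the extension's actual API; real PEP 508 markers support `and`/`or`, `in`, and more names):

```python
# Hypothetical sketch: evaluating a tiny subset of PEP 508 environment
# markers of the form `<name> <op> "<value>"`, the kind of interpretation
# the `pip.parse` hub repos would need to pick per-platform dep versions.
import operator
import re

_OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
        "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def _key(name, value):
    # Compare version-like values numerically, everything else as strings,
    # so that "3.11" correctly sorts above "3.9".
    if name.endswith("version"):
        return tuple(int(p) for p in value.split("."))
    return value

def evaluate_marker(marker, env):
    m = re.fullmatch(r'\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*"([^"]*)"\s*', marker)
    if not m:
        raise ValueError("unsupported marker: " + marker)
    name, op, value = m.groups()
    return _OPS[op](_key(name, env[name]), _key(name, value))

env = {"sys_platform": "linux", "python_version": "3.11"}
print(evaluate_marker('sys_platform == "linux"', env))  # True
print(evaluate_marker('python_version < "3.9"', env))   # False
```

The version-aware comparison matters: naive string comparison would claim `"3.11" < "3.9"`, which is exactly the kind of subtlety that makes full marker support non-trivial.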

Open questions:
- Support for VCS dependencies in requirements files - We should
  probably handle them as `overrides` in the `pypi_index` extension and
  treat them in `pip.parse` just as an `sdist`, but I am not sure it
  would work without any issues.
@aignas (Collaborator) left a comment:
Left a few comments for posterity.

@@ -0,0 +1,99 @@
load("@rules_python//python:defs.bzl", "py_library")
I think it would be great to have a test case like this in tests/pypi/circular-deps, since handling circular dependencies is something we don't do well yet on the main branch.
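To illustrate why circular dependencies need a dedicated test case: a naive per-package repository rule recurses forever on a cycle, so the tooling first has to detect one. A minimal cycle-detection sketch (the function and the dependency graph below are illustrative; the sphinx/sphinxcontrib pairing is a well-known real-world cycle, but the exact edges here are simplified):

```python
# Hypothetical sketch: find one cycle in a PyPI dependency graph via
# depth-first search with white/gray/black coloring.
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {pkg: WHITE for pkg in deps}
    stack = []

    def visit(pkg):
        color[pkg] = GRAY
        stack.append(pkg)
        for dep in deps.get(pkg, []):
            if color.get(dep, WHITE) == GRAY:
                # Back edge: everything from `dep` onward is the cycle.
                return stack[stack.index(dep):] + [dep]
            if color.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[pkg] = BLACK
        return None

    for pkg in deps:
        if color[pkg] == WHITE:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None

deps = {
    "sphinx": ["sphinxcontrib-serializinghtml"],
    "sphinxcontrib-serializinghtml": ["sphinx"],
}
print(find_cycle(deps))
# ['sphinx', 'sphinxcontrib-serializinghtml', 'sphinx']
```

Once a cycle is found, the usual workaround (as with `experimental_requirement_cycles`) is to collapse its members into a single group so Bazel's acyclic target graph is preserved.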


kwargs[arg_name] = select(select_dict)

def _accumulate_transitive_deps_inner(intermediate, configs, package, already_accumulated):

This would be useful to have: accumulating transitive dependencies here would be a better way to solve experimental_requirement_cycles. However, we need to be able to download METADATA, and currently that is only possible from PyPI (and potentially very few enterprise artifact registries). We could have other ways to do this, like vendoring the METADATA files in the repo or something similar.

This would not be an issue if the lock file provided the deps extracted from the METADATA, so maybe by the time we have time to look into this, uv will have a universal lock file with the necessary info.
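The accumulation itself is straightforward once the per-package direct deps are available. A sketch of what a helper like `_accumulate_transitive_deps` could compute (function name and the example graph are illustrative; the real implementation would also key deps by configuration):

```python
# Hypothetical sketch: collect the transitive closure of one package's
# deps via BFS, given direct deps as parsed from each wheel's METADATA.
from collections import deque

def accumulate_transitive_deps(direct_deps, package):
    """Return the set of all transitive deps of `package`."""
    seen = set()
    queue = deque(direct_deps.get(package, []))
    while queue:
        dep = queue.popleft()
        if dep in seen:
            continue
        seen.add(dep)
        queue.extend(direct_deps.get(dep, []))
    return seen

metadata_deps = {
    "requests": ["urllib3", "idna", "charset-normalizer", "certifi"],
    "urllib3": [],
    "idna": [],
    "charset-normalizer": [],
    "certifi": [],
}
print(sorted(accumulate_transitive_deps(metadata_deps, "requests")))
# ['certifi', 'charset-normalizer', 'idna', 'urllib3']
```

Because the `seen` set guards against revisits, the traversal also terminates cleanly when METADATA contains dependency cycles, which is what makes this approach attractive for the cycle problem.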

deps_dict = _accumulate_transitive_deps(intermediate, all_configs, package)
deps = select({config: to_alias_refs(alias_repo_name, deps) for config, deps in deps_dict.items()})

py_wheel_library(
Just a note about performance: because the extraction of the wheel happens at build time, the action must be re-executed whenever the build configuration changes. This means a cross-platform wheel (*-py3-none-any.whl) gets extracted once per distinct configuration, so if your repo builds a docker image with linux_x86_64 as the target platform and later runs tests using py_test, you end up with two copies of the extracted wheel, which is somewhat unfortunate.

2 participants