feat(bzlmod): support patching 'whl' distributions (#1393)
Previously, users who needed to modify wheel dependencies had to patch the
actual wheel files and upload them as different versions to internal
artifact stores. This is very common when breaking dependency cycles in
the `pytorch` or `apache-airflow` packages. With this feature we can
support patching external PyPI dependencies via the `pip.override` tag
class to fix package dependencies and/or a broken `RECORD` metadata file.

Overall design:
* Split the `whl_installer` CLI into two parts: downloading and extracting.
  Merged in #1487.
* Add a Starlark function which extracts the downloaded wheel, applies
  patches, and repackages the wheel (so that the extraction part works as
  before).
* Add an `override` tag_class to the `pip` extension and allow users to pass
  patches to be applied to specific wheel files.
* Only the root module is allowed to apply patches. This avoids far-away
  modules modifying the code of other modules, and avoids conflicts between
  modules and their patches.

Patches have to be in `unified-diff` format.

Related #1076, #1166, #1120
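
The extract, patch, repackage flow from the design notes can be sketched in plain Python. This is an illustration only, not the commit's `repack_whl.py`: the function name `repack_wheel` and the `edit` callback (a stand-in for applying the unified-diff patches) are hypothetical, and the sketch leaves `RECORD` untouched, so a real patch set would still need to update it.

```python
import os
import tempfile
import zipfile


def repack_wheel(src_whl, dst_whl, edit):
    """Extract src_whl, let edit(root_dir) modify the tree, then write dst_whl.

    `edit` is a hypothetical callback standing in for patch application.
    """
    with tempfile.TemporaryDirectory() as root:
        # Step 1: unpack the downloaded wheel (a wheel is a zip archive).
        with zipfile.ZipFile(src_whl) as zf:
            zf.extractall(root)

        # Step 2: modify the extracted files in place.
        edit(root)

        # Step 3: zip the tree back up so later steps can treat it as a wheel.
        with zipfile.ZipFile(dst_whl, "w", zipfile.ZIP_DEFLATED) as out:
            for dirpath, dirnames, filenames in os.walk(root):
                dirnames.sort()  # deterministic traversal order
                for name in sorted(filenames):
                    full = os.path.join(dirpath, name)
                    arcname = os.path.relpath(full, root).replace(os.sep, "/")
                    out.write(full, arcname)
```

Because the output is again a wheel, the existing extraction step does not need to know whether patches were applied.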
aignas authored Oct 20, 2023
1 parent 327b4e3 commit c0e18ed
Showing 19 changed files with 593 additions and 10 deletions.
4 changes: 2 additions & 2 deletions .bazelrc
@@ -3,8 +3,8 @@
# This lets us glob() up all the files inside the examples to make them inputs to tests
# (Note, we cannot use `common --deleted_packages` because the bazel version command doesn't support it)
# To update these lines, run tools/bazel_integration_test/update_deleted_packages.sh
build --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/proto,tests/compile_pip_requirements,tests/compile_pip_requirements_test_from_external_workspace,tests/ignore_root_user_error,tests/pip_repository_entry_points
query --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/proto,tests/compile_pip_requirements,tests/compile_pip_requirements_test_from_external_workspace,tests/ignore_root_user_error,tests/pip_repository_entry_points
build --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/proto,tests/compile_pip_requirements,tests/compile_pip_requirements_test_from_external_workspace,tests/ignore_root_user_error,tests/pip_repository_entry_points
query --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/proto,tests/compile_pip_requirements,tests/compile_pip_requirements_test_from_external_workspace,tests/ignore_root_user_error,tests/pip_repository_entry_points

test --test_output=errors

5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -46,6 +46,11 @@ Breaking changes:
* (py_wheel) Produce deterministic wheel files and make `RECORD` file entries
follow the order of files written to the `.whl` archive.

### Added

* (bzlmod) Added `.whl` patching support via `patches` and `patch_strip`
arguments to the new `pip.override` tag class.

## [0.26.0] - 2023-10-06

### Changed
13 changes: 13 additions & 0 deletions examples/bzlmod/MODULE.bazel
@@ -113,6 +113,19 @@ pip.parse(
"@whl_mods_hub//:wheel.json": "wheel",
},
)

# You can add patches that will be applied on the whl contents.
#
# The patches have to be in the unified-diff format.
pip.override(
file = "requests-2.25.1-py2.py3-none-any.whl",
patch_strip = 1,
patches = [
"@//patches:empty.patch",
"@//patches:requests_metadata.patch",
"@//patches:requests_record.patch",
],
)
use_repo(pip, "pip")

bazel_dep(name = "other_module", version = "", repo_name = "our_other_module")
4 changes: 4 additions & 0 deletions examples/bzlmod/patches/BUILD.bazel
@@ -0,0 +1,4 @@
exports_files(
srcs = glob(["*.patch"]),
visibility = ["//visibility:public"],
)
Empty file.
12 changes: 12 additions & 0 deletions examples/bzlmod/patches/requests_metadata.patch
@@ -0,0 +1,12 @@
diff --unified --recursive a/requests-2.25.1.dist-info/METADATA b/requests-2.25.1.dist-info/METADATA
--- a/requests-2.25.1.dist-info/METADATA 2020-12-16 19:37:50.000000000 +0900
+++ b/requests-2.25.1.dist-info/METADATA 2023-09-30 20:31:50.079863410 +0900
@@ -1,7 +1,7 @@
Metadata-Version: 2.1
Name: requests
Version: 2.25.1
-Summary: Python HTTP for Humans.
+Summary: Python HTTP for Humans. Patched.
Home-page: https://requests.readthedocs.io
Author: Kenneth Reitz
Author-email: [email protected]
11 changes: 11 additions & 0 deletions examples/bzlmod/patches/requests_record.patch
@@ -0,0 +1,11 @@
--- a/requests-2.25.1.dist-info/RECORD
+++ b/requests-2.25.1.dist-info/RECORD
@@ -17,7 +17,7 @@
requests/structures.py,sha256=msAtr9mq1JxHd-JRyiILfdFlpbJwvvFuP3rfUQT_QxE,3005
requests/utils.py,sha256=_K9AgkN6efPe-a-zgZurXzds5PBC0CzDkyjAE2oCQFQ,30529
requests-2.25.1.dist-info/LICENSE,sha256=CeipvOyAZxBGUsFoaFqwkx54aPnIKEtm9a5u2uXxEws,10142
-requests-2.25.1.dist-info/METADATA,sha256=RuNh38uN0IMsRT3OwaTNB_WyGx6RMwwQoMwujXfkUVM,4168
+requests-2.25.1.dist-info/METADATA,sha256=fRSAA0u0Bi0heD4zYq91wdNUTJlbzhK6_iDOcRRNDx4,4177
requests-2.25.1.dist-info/WHEEL,sha256=Z-nyYpwrcSqxfdux5Mbn_DQ525iP7J2DG3JgGvOYyTQ,110
requests-2.25.1.dist-info/top_level.txt,sha256=fMSVmHfb5rbGOo6xv-O_tUX6j-WyixssE-SnwcDRxNQ,9
requests-2.25.1.dist-info/RECORD,,
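
Each `RECORD` line has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the SHA-256 hash with the trailing `=` padding removed. That is why the patch above changes both fields for `METADATA`: the appended ` Patched.` adds 9 bytes, taking the size from 4168 to 4177. A minimal sketch of computing such an entry follows; the helper name `record_entry` is hypothetical and not part of this commit.

```python
import base64
import hashlib


def record_entry(path, data):
    """Return a RECORD line for an archive member with the given bytes."""
    digest = hashlib.sha256(data).digest()
    # RECORD uses URL-safe base64 with the '=' padding stripped.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "{},sha256={},{}".format(path, encoded, len(data))
```

Running this over the patched `METADATA` bytes is one way to produce the replacement line that a `RECORD` patch needs.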
9 changes: 9 additions & 0 deletions examples/bzlmod/whl_mods/appended_build_content.BUILD
@@ -5,3 +5,12 @@ write_file(
out = "generated_file.txt",
content = ["Hello world from requests"],
)

filegroup(
name = "whl_orig",
srcs = glob(
["*.whl"],
allow_empty = False,
exclude = ["*-patched-*.whl"],
),
)
3 changes: 3 additions & 0 deletions python/pip_install/BUILD.bazel
@@ -30,6 +30,7 @@ bzl_library(
"//python/pip_install/private:srcs_bzl",
"//python/private:bzlmod_enabled_bzl",
"//python/private:normalize_name_bzl",
"//python/private:patch_whl_bzl",
"//python/private:render_pkg_aliases_bzl",
"//python/private:toolchains_repo_bzl",
"//python/private:which_bzl",
@@ -97,6 +98,8 @@ filegroup(
srcs = [
"//python/pip_install/tools/dependency_resolver:py_srcs",
"//python/pip_install/tools/wheel_installer:py_srcs",
"//python/private:repack_whl.py",
"//tools:wheelmaker.py",
],
visibility = ["//python/pip_install/private:__pkg__"],
)
25 changes: 25 additions & 0 deletions python/pip_install/pip_repository.bzl
@@ -22,6 +22,7 @@ load("//python/pip_install/private:generate_whl_library_build_bazel.bzl", "gener
load("//python/pip_install/private:srcs.bzl", "PIP_INSTALL_PY_SRCS")
load("//python/private:bzlmod_enabled.bzl", "BZLMOD_ENABLED")
load("//python/private:normalize_name.bzl", "normalize_name")
load("//python/private:patch_whl.bzl", "patch_whl")
load("//python/private:render_pkg_aliases.bzl", "render_pkg_aliases")
load("//python/private:toolchains_repo.bzl", "get_host_os_arch")
load("//python/private:which.bzl", "which_with_fail")
@@ -44,6 +45,7 @@ def _construct_pypath(rctx):
Args:
rctx: Handle to the repository_context.
Returns: String of the PYTHONPATH.
"""

@@ -542,6 +544,22 @@ def _whl_library_impl(rctx):
if not rctx.delete("whl_file.json"):
fail("failed to delete the whl_file.json file")

if rctx.attr.whl_patches:
patches = {}
for patch_file, json_args in rctx.attr.whl_patches.items():
patch_dst = struct(**json.decode(json_args))
if whl_path.basename in patch_dst.whls:
patches[patch_file] = patch_dst.patch_strip

whl_path = patch_whl(
rctx,
python_interpreter = python_interpreter,
whl_path = whl_path,
patches = patches,
quiet = rctx.attr.quiet,
timeout = rctx.attr.timeout,
)

result = rctx.execute(
args + ["--whl-file", whl_path],
environment = environment,
@@ -635,6 +653,13 @@ whl_library_attrs = {
mandatory = True,
doc = "Python requirement string describing the package to make available",
),
"whl_patches": attr.label_keyed_string_dict(
doc = """a label-keyed-string dict that has
json.encode(struct(whls = [whl_file], patch_strip = patch_strip)) as values. This
is to maintain flexibility and correct bzlmod extension interface
until we have a better way to define whl_library and move whl
patching to a separate place. INTERNAL USE ONLY.""",
),
"_python_path_entries": attr.label_list(
# Get the root directory of these rules and keep them as a default attribute
# in order to avoid unnecessary repository fetching restarts.
2 changes: 2 additions & 0 deletions python/pip_install/private/srcs.bzl
@@ -13,4 +13,6 @@ PIP_INSTALL_PY_SRCS = [
"@rules_python//python/pip_install/tools/wheel_installer:namespace_pkgs.py",
"@rules_python//python/pip_install/tools/wheel_installer:wheel.py",
"@rules_python//python/pip_install/tools/wheel_installer:wheel_installer.py",
"@rules_python//python/private:repack_whl.py",
"@rules_python//tools:wheelmaker.py",
]
14 changes: 13 additions & 1 deletion python/private/BUILD.bazel
@@ -88,6 +88,17 @@ bzl_library(
srcs = ["normalize_name.bzl"],
)

bzl_library(
name = "patch_whl_bzl",
srcs = ["patch_whl.bzl"],
deps = [":parse_whl_name_bzl"],
)

bzl_library(
name = "parse_whl_name_bzl",
srcs = ["parse_whl_name.bzl"],
)

bzl_library(
name = "py_cc_toolchain_bzl",
srcs = [
@@ -239,13 +250,14 @@ bzl_library(
exports_files(
[
"coverage.patch",
"repack_whl.py",
"py_cc_toolchain_rule.bzl",
"py_package.bzl",
"py_wheel.bzl",
"py_wheel_normalize_pep440.bzl",
"reexports.bzl",
"stamp.bzl",
"util.bzl",
"py_cc_toolchain_rule.bzl",
],
visibility = ["//:__subpackages__"],
)
68 changes: 66 additions & 2 deletions python/private/bzlmod/pip.bzl
@@ -26,6 +26,7 @@ load(
load("//python/pip_install:requirements_parser.bzl", parse_requirements = "parse")
load("//python/private:full_version.bzl", "full_version")
load("//python/private:normalize_name.bzl", "normalize_name")
load("//python/private:parse_whl_name.bzl", "parse_whl_name")
load("//python/private:version_label.bzl", "version_label")
load(":pip_repository.bzl", "pip_repository")

@@ -78,7 +79,7 @@ You cannot use both the additive_build_content and additive_build_content_file a
whl_mods = whl_mods,
)

-def _create_whl_repos(module_ctx, pip_attr, whl_map):
+def _create_whl_repos(module_ctx, pip_attr, whl_map, whl_overrides):
python_interpreter_target = pip_attr.python_interpreter_target

# if we do not have the python_interpreter set in the attributes
@@ -131,6 +132,10 @@ def _create_whl_repos(module_ctx, pip_attr, whl_map):
repo = pip_name,
repo_prefix = pip_name + "_",
annotation = annotation,
whl_patches = {
p: json.encode(args)
for p, args in whl_overrides.get(whl_name, {}).items()
},
python_interpreter = pip_attr.python_interpreter,
python_interpreter_target = python_interpreter_target,
quiet = pip_attr.quiet,
@@ -217,6 +222,35 @@ def _pip_impl(module_ctx):
# Build all of the wheel modifications if the tag class is called.
_whl_mods_impl(module_ctx)

_overriden_whl_set = {}
whl_overrides = {}

for module in module_ctx.modules:
for attr in module.tags.override:
if not module.is_root:
fail("overrides are only supported in root modules")

if not attr.file.endswith(".whl"):
fail("Only whl overrides are supported at this time")

whl_name = normalize_name(parse_whl_name(attr.file).distribution)

if attr.file in _overriden_whl_set:
fail("Duplicate module overrides for '{}'".format(attr.file))
_overriden_whl_set[attr.file] = None

for patch in attr.patches:
if whl_name not in whl_overrides:
whl_overrides[whl_name] = {}

if patch not in whl_overrides[whl_name]:
whl_overrides[whl_name][patch] = struct(
patch_strip = attr.patch_strip,
whls = [],
)

whl_overrides[whl_name][patch].whls.append(attr.file)

# Used to track all the different pip hubs and the spoke pip Python
# versions.
pip_hub_map = {}
@@ -261,7 +295,7 @@ def _pip_impl(module_ctx):
else:
pip_hub_map[pip_attr.hub_name].python_versions.append(pip_attr.python_version)

-_create_whl_repos(module_ctx, pip_attr, hub_whl_map)
+_create_whl_repos(module_ctx, pip_attr, hub_whl_map, whl_overrides)

for hub_name, whl_map in hub_whl_map.items():
pip_repository(
@@ -381,6 +415,35 @@ cannot have a child module that uses the same `hub_name`.
}
return attrs

# NOTE: the naming of 'override' is taken from the bzlmod native
# 'archive_override', 'git_override' bzlmod functions.
_override_tag = tag_class(
attrs = {
"file": attr.string(
doc = """\
The Python distribution file name which needs to be patched. This will be
applied to all repositories that setup this distribution via the pip.parse tag
class.""",
mandatory = True,
),
"patch_strip": attr.int(
default = 0,
doc = """\
The number of leading path segments to be stripped from the file name in the
patches.""",
),
"patches": attr.label_list(
doc = """\
A list of patches to apply to the repository *after* 'whl_library' is extracted
and BUILD.bazel file is generated.""",
mandatory = True,
),
},
doc = """\
Apply any overrides (e.g. patches) to a given Python distribution defined by
other tags in this extension.""",
)

def _extension_extra_args():
args = {}

@@ -412,6 +475,7 @@ the BUILD files for wheels.
""",
implementation = _pip_impl,
tag_classes = {
"override": _override_tag,
"parse": tag_class(
attrs = _pip_parse_ext_attrs(),
doc = """\
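
The override bookkeeping in `_pip_impl` and the per-wheel filtering in `_whl_library_impl` can be mirrored in plain Python to see the data shapes involved. This is a sketch under assumptions: `collect_overrides`, `encode_whl_patches`, and `select_patches` are hypothetical names, tags are modelled as plain dicts, and the distribution name comes from a simplified stand-in for `normalize_name(parse_whl_name(...).distribution)`.

```python
import json


def collect_overrides(overrides):
    """Group override tags as {distribution: {patch: {patch_strip, whls}}}."""
    whl_overrides = {}
    seen_files = set()
    for tag in overrides:
        if tag["file"] in seen_files:
            raise ValueError("Duplicate module overrides for '{}'".format(tag["file"]))
        seen_files.add(tag["file"])

        # Simplified stand-in: take the first dash-separated field, lowercased.
        dist = tag["file"].split("-", 1)[0].lower()

        for patch in tag["patches"]:
            entry = whl_overrides.setdefault(dist, {}).setdefault(
                patch, {"patch_strip": tag["patch_strip"], "whls": []}
            )
            entry["whls"].append(tag["file"])
    return whl_overrides


def encode_whl_patches(whl_overrides, dist):
    """Build the string-valued dict passed through the `whl_patches` attribute."""
    return {p: json.dumps(args) for p, args in whl_overrides.get(dist, {}).items()}


def select_patches(whl_patches, whl_basename):
    """Keep only the patches that target the wheel that was downloaded."""
    patches = {}
    for patch_file, json_args in whl_patches.items():
        args = json.loads(json_args)
        if whl_basename in args["whls"]:
            patches[patch_file] = args["patch_strip"]
    return patches
```

Encoding the arguments as JSON strings keeps the attribute a simple label-keyed string dict while still carrying both the strip level and the list of target wheel files.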
72 changes: 72 additions & 0 deletions python/private/parse_whl_name.bzl
@@ -0,0 +1,72 @@
# Copyright 2023 The Bazel Authors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
A starlark implementation of a Wheel filename parsing.
"""

def parse_whl_name(file):
"""Parse whl file name into a struct of constituents.
Args:
file (str): The file name of a wheel
Returns:
A struct with the following attributes:
distribution: the distribution name
version: the version of the distribution
build_tag: the build tag for the wheel. None if there was no
build_tag in the given string.
python_tag: the python tag for the wheel
abi_tag: the ABI tag for the wheel
platform_tag: the platform tag
"""
if not file.endswith(".whl"):
fail("not a valid wheel: {}".format(file))

file = file[:-len(".whl")]

# Parse the following
# {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
#
# For more info, see the following standards:
# https://packaging.python.org/en/latest/specifications/binary-distribution-format/#binary-distribution-format
# https://packaging.python.org/en/latest/specifications/platform-compatibility-tags/
head, _, platform_tag = file.rpartition("-")
if not platform_tag:
fail("cannot extract platform tag from the whl filename: {}".format(file))
head, _, abi_tag = head.rpartition("-")
if not abi_tag:
fail("cannot extract abi tag from the whl filename: {}".format(file))
head, _, python_tag = head.rpartition("-")
if not python_tag:
fail("cannot extract python tag from the whl filename: {}".format(file))
head, _, version = head.rpartition("-")
if not version:
fail("cannot extract version from the whl filename: {}".format(file))
distribution, _, maybe_version = head.partition("-")

if maybe_version:
version, build_tag = maybe_version, version
else:
build_tag = None

return struct(
distribution = distribution,
version = version,
build_tag = build_tag,
python_tag = python_tag,
abi_tag = abi_tag,
platform_tag = platform_tag,
)
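
Because Starlark is close to Python, the parser above can be exercised outside Bazel with a near line-for-line port. The `WheelName` tuple, the single combined validity check, and the use of `ValueError` in place of `fail` are choices made for this sketch, not part of the commit.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build_tag: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_whl_name(file):
    """Split a wheel file name into its dash-separated components."""
    if not file.endswith(".whl"):
        raise ValueError("not a valid wheel: {}".format(file))
    stem = file[: -len(".whl")]

    # Peel fields off the right: platform, ABI, python tag, then version.
    head, _, platform_tag = stem.rpartition("-")
    head, _, abi_tag = head.rpartition("-")
    head, _, python_tag = head.rpartition("-")
    head, _, version = head.rpartition("-")
    if not (platform_tag and abi_tag and python_tag and version):
        raise ValueError("cannot parse the whl filename: {}".format(file))

    # Anything left after the distribution means a build tag was present,
    # so the field parsed as "version" above was really the build tag.
    distribution, _, maybe_version = head.partition("-")
    build_tag = None
    if maybe_version:
        version, build_tag = maybe_version, version

    return WheelName(distribution, version, build_tag, python_tag, abi_tag, platform_tag)
```

Parsing from the right works because only the optional build tag makes the field count vary, and it sits between the version and the python tag.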