Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor/fix: store dists in parse_requirements output #1917

Merged
merged 25 commits into from
Jun 1, 2024

Conversation

aignas
Copy link
Collaborator

@aignas aignas commented May 23, 2024

This moves some of the code out of the pip.bzl extension and changes
the layout of the code to prepare for multi-platform whl support.

Summary:

  • parse_requirements: add whls and sdists attribute, so that we can use
    a function to populate the lists. Not sure if there is a better way to
    do this.
  • parse_requirements: add an extra code to ensure that we are handling
    the target platform filtering correctly.
  • select_whl: split the select_whl into select_whls, which filters
    the whls (this can be used later in multi-platform selects) and
    select_whl , which just is used get the most appropriate whl for the
    host platform.
  • Additionally fix the logic in select_whl, which would result in
    Python 3.12 wheels being selected on Python 3.11 interpreters because
    we were not taking into account the interpreter tag when doing the
    filtering.

Fixes #1930

@aignas aignas marked this pull request as ready for review May 23, 2024 08:49
@aignas aignas requested a review from rickeylev as a code owner May 23, 2024 08:49
@aignas aignas changed the title refactor: store dists in parse_requirements output refactor/fix: store dists in parse_requirements output May 30, 2024
@aignas aignas self-assigned this May 30, 2024
@dougthor42
Copy link
Contributor

I tested #1930 this with the latest commit 552c526. Got error:

ERROR: Traceback (most recent call last):
        File "/usr/local/google/home/dthor/.cache/bazel/_bazel_dthor/dbe74c4144b5c9a438d84a119652bef9/external/rules_python~/python/private/bzlmod/pip.bzl", line 444, column 30, in _pip_impl
                _create_whl_repos(module_ctx, pip_attr, hub_whl_map, whl_overrides, hub_group_map, simpleapi_cache)
        File "/usr/local/google/home/dthor/.cache/bazel/_bazel_dthor/dbe74c4144b5c9a438d84a119652bef9/external/rules_python~/python/private/bzlmod/pip.bzl", line 197, column 37, in _create_whl_repos
                parse_requirements_add_dists(
        File "/usr/local/google/home/dthor/.cache/bazel/_bazel_dthor/dbe74c4144b5c9a438d84a119652bef9/external/rules_python~/python/private/parse_requirements_add_dists.bzl", line 61, column 31, in parse_requirements_add_dists
                whls = select_whls(
        File "/usr/local/google/home/dthor/.cache/bazel/_bazel_dthor/dbe74c4144b5c9a438d84a119652bef9/external/rules_python~/python/private/whl_target_platforms.bzl", line 172, column 27, in select_whls
                _, _, whl = sorted(any_whls)[-1]
Error in sorted: unsupported comparison: struct <=> struct
ERROR: Analysis of target '//src/pyle/dataking/system_optimization/snake_optimizer:snake_metrics_test' failed; build aborted: error evaluating module extension pip in @@rules_python~//python/extensions:pip.bzl

With this we should not get failures, but we would get different
wheels, which from the user point of view might be better experience.

I have also rewrote the tests to better test things and be more succinct.

The complex thing here is that we want to do some filtering of the whls
but we should not do too much a we are doing what the select should be
doing.
@dougthor42
Copy link
Contributor

With 348ca06 it looks like the credential helper is no longer working:

===== stdout start =====                                                                                                                                                                                                                      
Looking in indexes: https://[redacted]/simple, https://pypi.python.org/simple                                                                                                                                       
User for us-west2-python.pkg.dev:                                                                                                                                                                                                             
===== stdout end =====

I would only see this prompt if my cred helper script was broken or I remove common --credential_helper=... from my .bazelrc.

@aignas
Copy link
Collaborator Author

aignas commented May 31, 2024

Well credential helper not working is weird, this is definitely unintended. I thought I haven't touched the code there.

@dougthor42, could you try bazel clean to ensure that we have no PyPI index cache?

@dougthor42
Copy link
Contributor

For the record, all I'm doing to test these things is changing the git_override commit in my MODULE.bazel, followed by bazel build //... and if that passes, bazel test //.... The 8791cbb commit below is the commit that I've been doing all my bazel-ifying with and it works wonderfully.

git_override(
    module_name = "rules_python",
    commit = "8791cbbaa2336e24555a6b577ed6e19df24d7d88",
    remote = "https://github.com/bazelbuild/rules_python",
)

git_override(
    module_name = "rules_python_gazelle_plugin",
    commit = "8791cbbaa2336e24555a6b577ed6e19df24d7d88",
    remote = "https://github.com/bazelbuild/rules_python",
    strip_prefix = "gazelle",
)

At first I thought there might be something else between 8791cbb..348ca06 that caused the cred helper to fail, but I didn't have the issue in the previous commit 552c526 so it was definitely a change in 348ca06.


I forgot to mention it, but going from 8791cbb to 348ca06 also caused a bunch of debug statements to show up for various packages. Example:

DEBUG: /usr/local/google/home/dthor/.cache/bazel/_bazel_dthor/dbe74c4144b5c9a438d84a119652bef9/external/rules_python~/python/private/bzlmod/pip.bzl:285:22: WARNING: falling back to pip for installing the right file for tensorboard==2.16.2     --hash=sha256:9f2b4e7dad86667615c0e5cd072f1ea8403fc032a299f0072d6f74855775cc45

run bazel clean

Sure. Same result for 348ca06 (debug messages and credential helper failure)

For 0261471:

  • no debug message during build 🎉
  • No credential helper failure during build 🎉

I'm running test now, but a fresh test can take like 30 minutes (which is why we need to bazel-ify our code!)

@aignas
Copy link
Collaborator Author

aignas commented May 31, 2024

Ah, yeah, the DEBUG statements is what I was looking for, but was not sure if they were present. Thanks for debugging, really appreciate it, I must have broken something here.

@aignas aignas marked this pull request as draft May 31, 2024 05:27
@aignas
Copy link
Collaborator Author

aignas commented May 31, 2024

Added extra params to pip.parse:

    quiet = False,
    verbosity = "TRACE",

You can select the verbosity from INFO, DEBUG and TRACE to better get an understanding why the wheels are getting dropped.

Maybe you could later upload some of the trace/debug logs if they are not sensitive.

Copy link
Collaborator

@rickeylev rickeylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think. Nothing in particular stuck out to me as needing significant changes.

re: conversation about cred helper failing: I don't see why this PR would cause that. Maybe the index-url vs extra-index-url? IIRC, Doug has a private index, so maybe that is getting mixed up somewhere, somehow.

want_abis(list[str]): A list of ABIs that are supported.
want_platform(str): The target platform.
want_platforms(str): The platforms

Returns:
None or a struct with `url`, `sha256` and `filename` attributes for the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think the doc is now "returns a list of structs ..."


Args:
whls(list[struct]): A list of candidates.
want_version(str): An optional parameter to filter whls by version. Defaults to '3.0'.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this version referring to python version? If so, please clarify in the doc or arg name.

if tag.startswith("cp3") or tag.startswith("py3"):
version = int(tag[len("..3"):] or 0)
else:
# tag.startswith("cp2") or tag.startswith("py2")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented out debug code, or cryptic comment? In any case, please delete or clarify


platform_tags = list({_LEGACY_ALIASES.get(p, p): True for p in parsed.platform_tag.split(".")})
supported_implementations = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, is this dict acting like a set? Just drop a comment. I really wish starlark had a set type

Suggested change
supported_implementations = {}
# dict, but used as a set.
supported_implementations = {}

candidates = {
parse_whl_name(w.filename).platform_tag: w
for w in whls
if "musllinux_" not in w.filename
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing the musllinux gets special cased because we don't have way to model this yet?

want_platform = repository_platform,
) or sdist
) or (requirement.sdists[0] if requirement.sdists else None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the 0th sdist special? Or is this just picking an arbitrary element to have something valid?


load(":whl_target_platforms.bzl", "select_whls")

def parse_requirements_add_dists(requirements_by_platform, index_urls, python_version):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be better put in parse_requirements.bzl? It just seems closely coupled to it.

@@ -261,11 +285,22 @@ def _create_whl_repos(module_ctx, pip_attr, whl_map, whl_overrides, group_map, s
whl_library_args.update({k: v for k, (v, default) in maybe_args_with_default.items() if v == default})

if requirement.whls or requirement.sdists:
logger.debug("Selecting a compatible dist for {} from dists:\n{}".format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some extra format() calls probably aren't a big deal, but serializing a json object as part of a debug call is a bit much. Perhaps passing a lambda instead, to defer evaluation?

Also, repo_utils.debug_print exists to base it upon an env var

@dougthor42
Copy link
Contributor

Doug has a private repo

Correct, we use a private repo on Google Artifact Registry and then also public PyPI. I'm still trying to get the Ops people to use a virtual repo that is a union of public pypi and private... but I digress.

Here's our current pip.parse:

pip.parse(
    experimental_extra_index_urls = ["REDACTED/simple"],
    experimental_index_url = "https://pypi.org/simple",
    hub_name = "pypi",
    python_version = "3.11",
    requirements_lock = "//:requirements.txt",
)

upload some of the trace/debug logs

That should be doable, albeit with some s/foo/REDACTED/g haha. However, it looks like things are working so do you still want them?

In general, those new params to pip.parse will be really helpful!

latest commit 098835f

Looks good! build and test ran successfully and the installed version of foobar is indeed for 3.11 as it should be.

aignas added 6 commits June 1, 2024 12:46
decided to move this to the same file and rewire the dependency injection so that the struct stays immutable, removing the need for explanations
@aignas
Copy link
Collaborator Author

aignas commented Jun 1, 2024

Great to hear, thanks for being the tester here! I think once this is merged we can start looking into #1837. I have locally stacked it on top of the 1837 and it seems to work for the usecases I have so it will be interesting to hear if I missed something there.

@aignas aignas marked this pull request as ready for review June 1, 2024 04:28
@aignas aignas enabled auto-merge June 1, 2024 04:29
@aignas aignas added this pull request to the merge queue Jun 1, 2024
Merged via the queue into bazelbuild:main with commit b4b52fc Jun 1, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pip.parse is installing a Py3.12 version of a (private-index) wheel even though we're using Py3.11
3 participants