Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rules/python: Add a coverage_tool attribute to py_runtime. #15590

Closed

Conversation

adam-azarchs
Copy link
Contributor

This allows users to specify a target providing the coveragepy tool (and its dependencies). This is essential for hermetic python builds, where an absolute path will not really work. It's also superior to other potential methods using environment variables because the runfiles dependency on the coverage tool and its files is only incurred when building with coverage enabled.

This also builds on the work @TLATER began with #14677 to integrate with coveragepy's lcov support, with an additional step of at least attempting to convert the absolute paths which coveragepy uses in the lcov output into the relative paths which the rest of bazel can actually consume.

This is my first time touching Java code professionally, so I'll admit to mostly cargo-culting those parts, and would welcome any feedback on how to improve things there. I also would have no objections to someone else taking over this PR to get it over the finish line. I've tested this out with our own team's internal monorepo, and have successfully generated a full combined coverage report for most of our python and go code. There's still a bunch of things which don't quite work, in particular when it comes to compiled extension modules or executables run from within python tests, but those will need to be addressed separately, and this is already a giant step forward for our team.

Closes #14436.

[Coveragepy recently gained support for lcov
output][nedbat/coveragepy#1289], which allows implementing full
support for python coverage without relying on a downstream fork of
that project

Coveragepy actually must be invoked twice; One generating a
`.coverage` database file, the other exporting the data. This means
that it is incompatible with the old implementation.

The fork of coveragepy previously intended for use with bazel
circumvented this by just changing how coveragepy works, and never
outputting that database - just outputting the lcov directly
instead. If we'd like to use upstream coveragepy, this is of course
not possible.

The stub_template seems to be written with the idea of supporting
other coverage tooling in mind, however it still hard-codes arguments
specific to coveragepy. Instead, we think it makes sense to properly
support one of them for now, and to rethink a more generic interface
later - it will probably take specific scripting for each
implementation of coverage in python anyway.

As such, this patch rewrites the python stub template to fully support
upstream coveragepy as a coverage tool, and reworks some of the logic
around invoking python to do so more cleanly.

Additional notes:

  - Python coverage will only work with Python 3.7+ with upstream
    coveragepy, since the first release with lcov support does not
    support earlier Python versions - this is unfortunate, but there
    is not much we can do downstream short of forking to resolve
    that. The stub template itself should still work with Python 2.4+.

  - Comments in the code claim to use `os.execv` for performance
    reasons. There may be a small overhead to `subprocess.call`, but
    it shouldn't be too impactful, especially considering the overhead
    in logic (written in Python) this involves - if this is indeed for
    performance reasons, this is probably a somewhat premature
    optimization.

    A colleauge helped dig through some history, finding
    3bed4af as the source of this -
    if that commit is to believed, this is actually to resolve issues
    with signal handling, however that seems odd as well, since this
    calls arbitrary python applications, which in turn may use
    subprocesses again as well, and therefore break what that commit
    seems to attempt to fix.

    It's completely opaque to me why we put so much effort into trying
    to ensure we use `os.execv`. I've replicated the behavior and
    comments assuming it was correct previously, but the patch
    probably shouldn't land as-is - the comment explaining the use of
    `os.execv` is most likely misleading.

---

[nedbat/coveragepy#1289]: nedbat/coveragepy#1289

Co-authored-by: Bradley Burns <[email protected]>
ret_code = subprocess.call(
[python_program, coverage_tool, "lcov", "-o", output_filename] + args,
env=env,
cwd=workspace
)
# Coveragepy uses absolute file names in the output. We need to strip off
Copy link

@TLATER TLATER May 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative may be asking coveragepy to report relative paths. This issue suggests there already is some sort of support for that, albeit incomplete: nedbat/coveragepy#963

If the lcov backend doesn't support it yet, I suggest we fix that upstream over hacking downstream, especially given that your comments claim there are edge cases that don't work already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be preferable, though I'd note that first of all "relative to what?" still applies; we'd probably still need to strip off at least the workspace name. This approach here was mostly a "it's Friday evening and I just want to see if I can get a useful coverage report before going to bed" solution - I'm definitely not thrilled with it, but it does work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that when it resolves the symlinks in a runfiles tree then it ends up getting a path in either the original source tree or $(bazel info execution_root)/{external,bazel-out}, none of which are are useful when made relative to the current working directory. The problem with my hacky solution to this problem is that it will only work for one of the above roots.

From my read of the code (I haven't tried it, yet) it does look like if coveragepy is configured to use relative paths then it doesn't resolve symlinks, which would fix the problem here. The problem is that behavior seems to only be controllable through the config file rather than a command line flag, which can be tricky to get working properly in a bazel sandbox environment. Various solutions to that problem:

  1. Generate a config file in a temp directory for each test invocation.
  2. Include a config file probably as a target somewhere in bazel_tools so it could be shared by all test instantiations
  3. Add a way for the user to add specify a config file in toolchain configuration. This has some advantages, but is also prone to causing problems.

I don't love any of those but would probably go with the first approach. I'll try it out on Tuesday.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it seems like coveragepy's relative_paths feature doesn't actually work, as it still canonicalizes paths in the lcov report (as that bug you link pointed out), though it does at least seem to use relative paths in the intermediate data and xml output. That said, there is a better way to do what I was doing, which I'll push up here once I've tested it properly.

Copy link
Contributor Author

@adam-azarchs adam-azarchs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like I'll need to fix up a bunch of unit tests. A bit annoying because bazel test //... doesn't actually pass for me on master, which isn't too surprising given all of the various configuration and edge cases which it's testing, but I should still be able to track down everything that needs updating.

ret_code = subprocess.call(
[python_program, coverage_tool, "lcov", "-o", output_filename] + args,
env=env,
cwd=workspace
)
# Coveragepy uses absolute file names in the output. We need to strip off
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that when it resolves the symlinks in a runfiles tree then it ends up getting a path in either the original source tree or $(bazel info execution_root)/{external,bazel-out}, none of which are are useful when made relative to the current working directory. The problem with my hacky solution to this problem is that it will only work for one of the above roots.

From my read of the code (I haven't tried it, yet) it does look like if coveragepy is configured to use relative paths then it doesn't resolve symlinks, which would fix the problem here. The problem is that behavior seems to only be controllable through the config file rather than a command line flag, which can be tricky to get working properly in a bazel sandbox environment. Various solutions to that problem:

  1. Generate a config file in a temp directory for each test invocation.
  2. Include a config file probably as a target somewhere in bazel_tools so it could be shared by all test instantiations
  3. Add a way for the user to add specify a config file in toolchain configuration. This has some advantages, but is also prone to causing problems.

I don't love any of those but would probably go with the first approach. I'll try it out on Tuesday.

@sgowroji sgowroji added the team-Rules-Python Native rules for Python label May 30, 2022
@adam-azarchs adam-azarchs force-pushed the azarchs/py_runtime-coverage branch 4 times, most recently from 6be4b7e to 359d984 Compare June 1, 2022 03:38
@adam-azarchs adam-azarchs force-pushed the azarchs/py_runtime-coverage branch from 359d984 to db14416 Compare June 1, 2022 07:26
@adam-azarchs
Copy link
Contributor Author

Well, in its current state this works fine for our internal setup, which uses a hermetic python but with additional python packages installed (as required) into that python's search path, rather than some separate PYTHONPATH entry. Unfortunately when I tried setting up a test that would work the way most people are more likely to have things set up, it doesn't work so well:

  1. The pip_install rule depends on having a configured python toolchain, which results in a circular dependency if you try to use it to fetch the coverage package.
  2. You can fetch coverage with e.g.
http_archive(
    name = "coverage_linux_x64",
    build_file_content = """
filegroup(
    name = "coverage",
    srcs = ["coverage/__main__.py"],
    data = glob(["coverage/**"]),
    visibility = ["//visibility:public"],
)
""",
    sha256 = "84631e81dd053e8a0d4967cedab6db94345f1c36107c71698f746cb2636c63e3",
    type = "zip",
    urls = [
        "https://files.pythonhosted.org/packages/74/0d/0f3c522312fd27c32e1abe2fb5c323b583a5c108daf2c26d6e8dfdd5a105/coverage-6.4.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    ],
)

The problem with that being that you get

  File "/.../my_test.runfiles/coverage_linux_x64/coverage/__main__.py", line 7, in <module>
    from coverage.cmdline import main
ModuleNotFoundError: No module named 'coverage'

The obvious solution would be to check if the coverage_tool attribute has a PyInfo provider and, if so, all it to augment the PYTHONPATH (e.g. with the imports attribute on py_library). There are two problems with that approach:

  1. Ideally we wouldn't be adding stuff to the test's PYTHONPATH for this. But I suppose that's unavoidable.
  2. py_library doesn't have a way to declare an executable entry point. So we'd need to either declare a separate attribute on py_runtime for the entry point, or execute it as python -m coverage .... The latter has some advantages, but also some significant disadvantages. If we prepend the PYTHONPATH for coverage, we might cause unexpected behavior for the package under test, and if we append it to the end then we might not get the coverage tool we expect (though, maybe packages which put a module named coverage into the global namespace kind of deserve broken behavior).

Using py_binary for this would obviously be a different kind of problematic.

@adam-azarchs
Copy link
Contributor Author

Never mind, the problem is a different one: it's not including the runfiles for coverage_tool, though it needs to.

@adam-azarchs adam-azarchs force-pushed the azarchs/py_runtime-coverage branch 2 times, most recently from d0d892f to 9db7401 Compare June 7, 2022 01:48
@adam-azarchs
Copy link
Contributor Author

Ok, with the addition of an integration test, I now feel like this is about ready to transition out of draft status. @TLATER would appreciate it if you have any further feedback to offer before I do that.

@adam-azarchs
Copy link
Contributor Author

Well, the integration test passes locally for me. Honestly I don't actually know what the best practice would be for making it work on the builders where it's failing. I don't think people would be terribly happy about vendoring in an entire python runtime + coverage wheel (or really several, since if I was being less lazy about it, it would be including targets for macos and aarch64 as well), especially given that they're not even including a complete version of rules_python. Then again, maybe it would be a good idea just to have better coverage for the hermetic python use case. This would all be a lot simpler if py_runtime was defined in starlark rather than java so that this change could be made entirely in rules_python without directly impacting the main bazel repo, though the fact that the stub lives in the main bazel repo makes that challenging.

I'm not sure what the best way forward is in the mean time - I don't like the idea of proposing this change without a CI-passing test case.

Copy link

@TLATER TLATER left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks pretty good to me, from what I remember of my foray into this part of Bazel all those months ago. Obviously still needs review from someone with actual authority around here, though :)

@adam-azarchs adam-azarchs marked this pull request as ready for review June 9, 2022 17:22
@lberki lberki requested review from comius and removed request for lberki June 10, 2022 07:46
@comius
Copy link
Contributor

comius commented Jul 21, 2022

cc @c-mita

@comius comius requested review from c-mita and removed request for gregestren, fweikert and comius July 21, 2022 10:55
@c-mita
Copy link
Member

c-mita commented Sep 20, 2022

I'm not familiar with the Python rules or how coverage works for them, but with that caveat, LGTM. ;)

@c-mita c-mita added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels Sep 20, 2022
@adam-azarchs
Copy link
Contributor Author

Oops, accidentally pushed a button on the github UI that made it merge from master. Not that that's a bad thing exactly, but I'm going to assume that that clobbers the approvals.

Related note, I'm not sure how stuff gets from having the awaiting-PR-merge label to actually being merged?

@rickeylev
Copy link
Contributor

It says "changed approved" still, so I think the approval is still good. Thanks for bumping the thread -- if something lingers for more than a day or so please tag me.

@c-mita Can you please merge? I don't have permission to do so.

@rickeylev
Copy link
Contributor

Oh, wait, I think I see now. "After your review is complete, a Bazel maintainer applies your patch to Google's internal version control system."

I'll see if I can figure out how to do that internally.

@Wyverald
Copy link
Member

Related note, I'm not sure how stuff gets from having the awaiting-PR-merge label to actually being merged?

A member of the Bazel support team will import it into Google. This label is constantly monitored by them. @sgowroji just in case this fell off your radar, could you import this please? Thanks :)

@sgowroji
Copy link
Member

Am already on it waiting for approval on internal CL from @c-mita.

@sgowroji sgowroji removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Sep 23, 2022
@adam-azarchs adam-azarchs deleted the azarchs/py_runtime-coverage branch September 23, 2022 18:43
aiuto pushed a commit to aiuto/bazel that referenced this pull request Oct 12, 2022
This allows users to specify a target providing the coveragepy tool (and its dependencies).  This is essential for hermetic python builds, where an absolute path will not really work.  It's also superior to other potential methods using environment variables because the runfiles dependency on the coverage tool and its files is only incurred when building with coverage enabled.

This also builds on the work @TLATER began with bazelbuild#14677 to integrate with `coveragepy`'s `lcov` support, with an additional step of at least attempting to convert the absolute paths which `coveragepy` uses in the lcov output into the relative paths which the rest of bazel can actually consume.

This is my first time touching Java code professionally, so I'll admit to mostly cargo-culting those parts, and would welcome any feedback on how to improve things there.  I also would have no objections to someone else taking over this PR to get it over the finish line.  I've tested this out with our own team's internal monorepo, and have successfully generated a full combined coverage report for most of our python and go code.  There's still a bunch of things which don't quite work, in particular when it comes to compiled extension modules or executables run from within python tests, but those will need to be addressed separately, and this is already a giant step forward for our team.

Closes bazelbuild#14436.

Closes bazelbuild#15590.

PiperOrigin-RevId: 476314433
Change-Id: I4be4d10e0af741f4ba1a7b5367c6f7a338a3c43d
TLATER pushed a commit to TLATER/engflow-example that referenced this pull request Nov 11, 2022
Currently WIP because:

- `$(location)` and `$(rootpath)` make vars aren't working correctly
  - It substitutes with an `external` directory prefixed, which does
    not actually exist in the sandbox.
- Python imports are wonky
  - Seems to be a bazel issue, perhaps specific to this version?
- Not tested in remote execution
  - Since we're using `pip_install`, this requires a pip on the host
    executing the WORKSPACE
- Does not use the new toolchain feature that supports adding
  coveragepy directly to the python toolchain
  - Added together with the new coveragepy support in
    bazelbuild/bazel#15590, and is probably better to use than the
    `$(rootpath)` and `pip_install` solution we're showing here
- Requires bumping Bazel to a version with bazelbuild/bazel#15590
  merged, which currently is only available in prereleases
TLATER pushed a commit to TLATER/engflow-example that referenced this pull request Nov 11, 2022
Currently WIP because:

- `$(location)` and `$(rootpath)` make vars aren't working correctly
  - It substitutes with an `external` directory prefixed, which does
    not actually exist in the sandbox.
- Python imports are wonky
  - Seems to be a bazel issue, perhaps specific to this version?
- Not tested in remote execution
  - Since we're using `pip_install`, this requires a pip on the host
    executing the WORKSPACE
- Does not use the new toolchain feature that supports adding
  coveragepy directly to the python toolchain
  - Added together with the new coveragepy support in
    bazelbuild/bazel#15590, and is probably better to use than the
    `$(rootpath)` and `pip_install` solution we're showing here
- Requires bumping Bazel to a version with bazelbuild/bazel#15590
  merged, which currently is only available in prereleases
TLATER pushed a commit to TLATER/engflow-example that referenced this pull request Nov 11, 2022
Currently WIP because:

- `$(location)` and `$(rootpath)` make vars aren't working correctly
  - This is ultimately what we tried to fix with the
  - `PYTHON_COVERAGE_TARGET` variable, which was ultimately not
    accepted. See also next point.
- Does not use the new toolchain feature that supports adding
  coveragepy directly to the python toolchain
  - Added together with the new coveragepy support in
    bazelbuild/bazel#15590, and is probably better to use than the
    `$(rootpath)` and `pip_install` solution we're showing here
  - While I'd love to rewrite the blog post to show this instead, I'd
    need a bit more time for that
- Python imports are wonky
  - For some reason, the sibling module can only be imported under a
    parent `python` module. I confirmed this by looking through
    `PYTHON_PATH`
  - Seems to be a Bazel regression? Strangely nobody is complaining
    about this yet, we should probably raise an issue
- Not tested in remote execution
  - No access to an engflow setup to test this right this instant
  - Since we're not using a toolchain (and even if we were we're using
    `pip_install`), this requires a pip on the host executing the
    WORKSPACE, so there's a decent chance it won't work
- Requires bumping Bazel to a version with bazelbuild/bazel#15590
  merged, which currently is only available in prereleases
TLATER pushed a commit to TLATER/engflow-example that referenced this pull request Nov 11, 2022
Currently WIP because:

- `$(location)` and `$(rootpath)` make vars aren't working correctly
  - This is what we tried to fix with the `PYTHON_COVERAGE_TARGET`
    variable, which was ultimately not accepted. See also next point.
- Does not use the new toolchain feature that supports adding
  coveragepy directly to the python toolchain
  - Added together with the new coveragepy support in
    bazelbuild/bazel#15590, and is probably better to use than the
    `$(rootpath)` and `pip_install` solution we're showing here
  - While I'd love to rewrite the blog post to show this instead, I'd
    need a bit more time for that
- Python imports are wonky
  - For some reason, the sibling module can only be imported under a
    parent `python` module. I confirmed this by looking through
    `PYTHON_PATH`
  - Seems to be a Bazel regression? Strangely nobody is complaining
    about this yet, we should probably raise an issue
- Not tested in remote execution
  - No access to an engflow setup to test this right this instant
  - Since we're not using a toolchain (and even if we were we're using
    `pip_install`), this requires a pip on the host executing the
    WORKSPACE, so there's a decent chance it won't work
- Requires bumping Bazel to a version with bazelbuild/bazel#15590
  merged, which currently is only available in prereleases
TLATER pushed a commit to TLATER/engflow-example that referenced this pull request Nov 11, 2022
Currently WIP because:

- `$(location)` and `$(rootpath)` make vars aren't working correctly
  - This is what we tried to fix with the `PYTHON_COVERAGE_TARGET`
    variable, which was ultimately not accepted. See also next point.
- Does not use the new toolchain feature that supports adding
  coveragepy directly to the python toolchain
  - Added together with the new coveragepy support in
    bazelbuild/bazel#15590, and is probably better to use than the
    `$(rootpath)` and `pip_install` solution we're showing here
  - While I'd love to rewrite the blog post to show this instead, I'd
    need a bit more time for that
- Python imports are wonky
  - For some reason, the sibling module can only be imported under a
    parent `python` module. I confirmed this by looking through
    `PYTHON_PATH`
  - Seems to be a Bazel regression? Strangely nobody is complaining
    about this yet, we should probably raise an issue
- Not tested in remote execution
  - No access to an engflow setup to test this right this instant
  - Since we're not using a toolchain (and even if we were we're using
    `pip_install`), this requires a python on the host, so there's a
    decent chance it won't work
- Requires bumping Bazel to a version with bazelbuild/bazel#15590
  merged, which currently is only available in prereleases
rickeylev added a commit to rickeylev/bazel that referenced this pull request Feb 28, 2023
…ries.

PR bazelbuild#15590 accidentally introduced a bug where zip-based binaries weren't
cleaning up the temporary runfiles directory they extracted their zip
into when they were run *outside* of a test invocation.

To fix, explicitly keep track of when the module space needs to be
deleted or not, and pass that long to the code that decides how to
execute the program and how to clean it up afterwards.

Fixes bazelbuild#17342
copybara-service bot pushed a commit that referenced this pull request Mar 1, 2023
PR #15590 accidentally introduced a bug where zip-based binaries weren't
cleaning up the temporary runfiles directory they extracted their zip into when
they were run outside of a test invocation.

To fix, explicitly keep track of when the module space needs to be deleted or
not, and pass that long to the code that decides how to execute the program and
how to clean it up afterwards.

Fixes #17342

PiperOrigin-RevId: 513256904
Change-Id: I8c3d322248f734a6290a8a7ee5c8725fae5203dc
ShreeM01 pushed a commit to ShreeM01/bazel that referenced this pull request Mar 13, 2023
PR bazelbuild#15590 accidentally introduced a bug where zip-based binaries weren't
cleaning up the temporary runfiles directory they extracted their zip into when
they were run outside of a test invocation.

To fix, explicitly keep track of when the module space needs to be deleted or
not, and pass that long to the code that decides how to execute the program and
how to clean it up afterwards.

Fixes bazelbuild#17342

PiperOrigin-RevId: 513256904
Change-Id: I8c3d322248f734a6290a8a7ee5c8725fae5203dc
ShreeM01 added a commit that referenced this pull request Mar 13, 2023
…17764)

PR #15590 accidentally introduced a bug where zip-based binaries weren't
cleaning up the temporary runfiles directory they extracted their zip into when
they were run outside of a test invocation.

To fix, explicitly keep track of when the module space needs to be deleted or
not, and pass that long to the code that decides how to execute the program and
how to clean it up afterwards.

Fixes #17342

PiperOrigin-RevId: 513256904
Change-Id: I8c3d322248f734a6290a8a7ee5c8725fae5203dc

Co-authored-by: Richard Levasseur <[email protected]>
fweikert pushed a commit to fweikert/bazel that referenced this pull request May 25, 2023
PR bazelbuild#15590 accidentally introduced a bug where zip-based binaries weren't
cleaning up the temporary runfiles directory they extracted their zip into when
they were run outside of a test invocation.

To fix, explicitly keep track of when the module space needs to be deleted or
not, and pass that long to the code that decides how to execute the program and
how to clean it up afterwards.

Fixes bazelbuild#17342

PiperOrigin-RevId: 513256904
Change-Id: I8c3d322248f734a6290a8a7ee5c8725fae5203dc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Rules-Python Native rules for Python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

To run coverage with python, setting PYTHON_COVERAGE in every test env is required
7 participants