Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: compile source files at build time #1902

Merged
merged 27 commits into from
May 18, 2024

Conversation

rickeylev
Copy link
Collaborator

This implements precompiling: performing Python source to byte code compilation at build time. This allows improved program startup time by allowing the byte code compilation step to be skipped at runtime.

Precompiling is disabled by default, for now. A subsequent release will enable it by default. This allows the necessary flags and attributes to become available so users can opt-out prior to it being enabled by default. Similarly, //python:features.bzl is introduced to allow feature detection.

This implementation is made to serve a variety of use cases, so there are several attributes and flags to control behavior. The main use cases being served are:

  • Large mono-repos that need to incrementally enable/disable precompiling.
  • Remote execution builds, where persistent workers aren't easily available.
  • Environments where toolchains are custom defined instead of using the ones created by rules_python.

To that end, there are several attributes and flags to control behavior, and the toolchains allow customizing the tools used.

Fixes #1761

This implements precompiling: performing Python source to
byte code compilation at build time. This allows improved program
startup time by allowing the byte code compilation step to be skipped
at runtime.

Precompiling is disabled by default, for now. A subsequent release will
enable it by default. This allows the necessary flags and attributes
to become available so users can opt-out prior to it being enabled by
default. Similarly, `//python:features.bzl` is introduced to allow
feature detection.

This implementation is made to serve a variety of use cases, so there
are several attributes and flags to control behavior. The main use cases
being served are:
* Large mono-repos that need to incrementally enable/disable
  precompiling.
* Remote execution builds, where persistent workers aren't easily
  available.
* Environments where toolchains are custom defined instead of using
  the ones created by rules_python.

To that end, there are several attributes and flags to control behavior,
and the toolchains allow customizing the tools used.

Fixes todo: omitted to prevent github spam
@rickeylev rickeylev requested a review from aignas May 15, 2024 21:52
@rickeylev rickeylev requested a review from f0rmiga as a code owner May 15, 2024 21:52
@rickeylev
Copy link
Collaborator Author

My apologies for the very large PR. The core logic in common_bazel.bzl#maybe_precompile isn't too big, but there's a lot of supporting code to define a toolchain, flags, attributes, precompiler tools, and etc.

I'll work on the CI failures. It looks like I'll need to add extra logic to gate these behind bazel 7 and pystar.

Copy link
Collaborator

@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall structure makes sense. I gave it a once-over and I could not see anything blocking. I like the enum, features.bzl plumbing that I might be able to reuse them in pip.

python/features.bzl Outdated Show resolved Hide resolved
# See the License for the specific language governing permissions and
# limitations under the License.

"""A simple precompiler to generate deterministic pyc files for Bazel."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a substantial amount of code, what do you think about adding a ruff pre-commit hook and ensuring that this passes it? I wonder what are your thoughts on having type hints everywhere? I guess we have to make them compatible with 3.8 upwards, so we might need the old style annotations, but ruff could be configured to ensure that we do 3.8-compatible type hints.

It's fine to introduce ruff as a separate PR though since it would be scope creep.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like all the ideas :). That said:

  • Yes, I think adding ruff is out of scope. +1 on adding linting as a check though
  • Yes, I like type annotations. I added them. But, because the imports are deferred (deferring them makes non-worker executions twice as fast), I don't think it's going to play well with type checking. If/when we can move the worker impl to another file, then we can probably get a type checker running over it. But I don't want to block this PR on figuring that out. The type annotations are a bit anemic, too, because most of the objects being interacted with are "dynamicly defined" (argparse Namespace, json request/response object), or rely on side effects (e.g. compile writes to the input file paths). But it's a start.
  • Compatibility with older Python versions is a good point. The non-worker and serial worker impls will probably work fine going back pretty far. I'm not sure about the asyncio based one, though. In any case, there are toggles that can help work around this situation, and it's initially disabled.

Depending on how much trouble CI gives me, I'll see if I have time to add some basic tests of the precompiler (a little can go a long way).

@rickeylev rickeylev enabled auto-merge May 18, 2024 16:24
@rickeylev rickeylev added this pull request to the merge queue May 18, 2024
Merged via the queue into bazelbuild:main with commit 3730803 May 18, 2024
4 checks passed
@rickeylev rickeylev deleted the precompile.toolchain branch May 18, 2024 16:50
Comment on lines +8 to +9
Precompiling is enabled by default, so there typically isn't anything special
you must do to use it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR message and changelog say it's disabled by default?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, thanks for catching that. I've sent a PR to fix the docs here. It is disabled by default for now. After the next release, we can consider enabling it by default

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rules/python: Option to generate .pyc files for py_library.
3 participants