Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use compileall.compile_file instead of compileall.compile_dir #8540

Closed
wants to merge 1 commit into from

Conversation

chrahunt
Copy link
Member

@chrahunt chrahunt commented Jul 4, 2020

We want to move towards having more control over the generation of pyc
files, which will allow us to provide deterministic installs and
generate pyc files without relying on an already-extracted wheel.

To that end, here we are stripping away one layer of abstraction,
compileall.compile_dir. compileall.compile_dir essentially recurses
through the provided directories and passes the files and args verbatim
to compileall.compile_file, so removing that layer means that we
directly call compileall.compile_file.

We make the assumption that we can successfully walk over the
source file tree, since we just wrote it, and omit the per-directory
traversal error handling done by compileall.compile_dir.

Progresses #6030 and #7808.

We want to move towards having more control over the generation of pyc
files, which will allow us to provide deterministic installs and
generate pyc files without relying on an already-extracted wheel.

To that end, here we are stripping away one layer of abstraction,
`compileall.compile_dir`. `compileall.compile_dir` essentially recurses
through the provided directories and passes the files and args verbatim
to `compileall.compile_file`, so removing that layer means that we
directly call `compileall.compile_file`.

We make the assumption that we can successfully walk over the
source file tree, since we just wrote it, and omit the per-directory
traversal error handling done by `compileall.compile_dir`.
@chrahunt chrahunt added type: refactor Refactoring code skip news Does not need a NEWS file entry (eg: trivial changes) labels Jul 4, 2020
@chrahunt chrahunt marked this pull request as ready for review July 4, 2020 17:30
@pfmoore
Copy link
Member

pfmoore commented Jul 4, 2020

What's the advantage of deliberately not using a stdlib function but replicating its behaviour ourselves? This seems like we're taking on extra complexity for no obvious reason. You say

We want to move towards having more control over the generation of pyc
files, which will allow us to provide deterministic installs and
generate pyc files without relying on an already-extracted wheel.

but I don't see how this will enable that. I suggest postponing this change until it's needed for that work, so that the benefit can be seen in context. At the moment this feels very much like YAGNI, or at least a case of splitting PRs up too finely...

@chrahunt
Copy link
Member Author

chrahunt commented Jul 4, 2020

compileall.compile_dir takes a single top-level directory and recurses through it, byte-compiling any .py files it finds along the way. It does not have a non-file interface, so the source file must exist on disk and the pyc file will be output to disk.

Currently, 'pyc' files are only generated for the wheel we are installing because we run compileall.compile_dir on the temporary directory in which we extracted the wheel. We don't want this temporary directory to exist anymore (because we want to install directly from the wheel), but we still need files on disk to communicate with compileall.compile_dir.

The approach I intend to take is to use the source files after they are installed. Using compileall.compile_dir would catch unintended files, which may exist alongside the installed source files for the current wheel in the install destination. compileall.compile_file can be used to compile 1 file at a time, so we can pass it each file that we knew we installed.

at least a case of splitting PRs up too finely...

That may be fair. My reason for submitting this as a separate PR was to give reviewers a chance to look at compileall.compile_dir and compileall.compile_file and convince themselves that we aren't missing some detail for our use case, since we shouldn't just take my word for it. That already seems like a great deal of effort, and I didn't want to add behavior changes into the mix.

If it would help I can submit a PR with the next steps.

@chrahunt
Copy link
Member Author

chrahunt commented Jul 4, 2020

I put the next steps that depend on this one in #8541, so it's clearer how this will be used. I'd love any feedback on how this could be broken up more effectively for review. :)

@pfmoore
Copy link
Member

pfmoore commented Jul 4, 2020

My reason for submitting this as a separate PR was to give reviewers a chance to look at compileall.compile_dir and compileall.compile_file and convince themselves that we aren't missing some detail for our use case

Personally, I'm not going to dig into how compileall.compile_dir works, to make sure you replicated its behaviour. If I'm reviewing a PR, what I want to see is what effect pip needs, and does the PR deliver that. If the PR (in isolation) doesn't need to stop using compile_dir, then IMO it's too limited in scope, and should be combined with the PR that motivates the change.

Of course, that's just my opinion - and I don't particularly mind larger PRs, so maybe I'm unusual here.

I put the next steps that depend on this one in #8541, so it's clearer how this will be used.

Thanks - that's a much more reasonable level to review IMO. To be clear, #8541 replaces this PR, doesn't it? I had a quick look and I think I have a few comments, but I'll look at it properly tomorrow.

(Note - I'm somewhat ambivalent about the whole question of "reproducible builds", so I'm more interested in progressing #6030 than #7808. But I don't think that affects my view on this change).

@chrahunt
Copy link
Member Author

chrahunt commented Jul 5, 2020

Thank you for your insight! In general I'm trying to optimize for lower time needed to review, since that's our most limited resource. It helps that goal a lot to get your perspective on it. :)

To be clear, #8541 replaces this PR, doesn't it?

Yes I would consider this superseded by #8541.

@chrahunt chrahunt closed this Jul 5, 2020
@chrahunt chrahunt deleted the refactor/use-compile-file branch July 5, 2020 12:55
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
skip news Does not need a NEWS file entry (eg: trivial changes) type: refactor Refactoring code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants