Splitting RequirementPreparer #7049
Comments
I agree with this approach in general - currently a lot of what makes things complicated are
We had talked before about having the objects advance themselves (as opposed to being transformed from the outside). I played a little with that, which you can see for example
Yep yep. I was mostly just hinting at how these operations would then be the same as the transitions we'd want in that model. The exact interface for how the transitions happen would likely stay the same as what we'd discussed.
I haven't seen the full design, but this seems a little too fancy to me. In general, I would go for decoupling things, use simple, straightforward patterns, and favor being explicit over cleverness or implicit behavior.
Looking at this again, there are three things here:
I think step 2 is going to be the most involved, since that's what involves moving code around -- the rest seems to be renaming things.
I think the thing that makes step 2 complicated is the "partial fetch" logic for
If we get rid of that, we'd basically have a much simpler workflow, where the download and metadata generation steps are separate and not intertwined in any way.
We should probably get PEP 658 in first, but that might not be trivial either with the current structure. Not quite sure, I only did a very brief investigation a while ago and may very well have missed obvious approaches.
There's a PR implementing PEP 658, thanks to @cosmicexplorer -- #11111
I hid my previous comment since it wasn't feasible, but wanted to restate that I think removing fast-deps handling until it can be shown to improve performance is probably a good idea if it simplifies any of this. I'll continue experimenting with that technique on my own, but that shouldn't block any of this work.
A huge % of wheels will have the entire
I might be misunderstanding you, but this exact realization indeed led directly to the current
@cosmicexplorer the lazier lazy-wheel makes many fewer requests! Check it out! We avoid the Less-lazy lazy wheel wastes a This work was part of https://github.com/conda-incubator/conda-package-streaming, we can do a great job on
Folks, I'd like to suggest moving further discussion on lazy wheels to #8670 -- it's unrelated to the refactoring that is on-topic for this issue. :)
Welcome to another edition of Pradyun dumps his thoughts in an issue to get inputs from actually smart humans!
Currently, RequirementPreparer has two "jobs" - fetch+unpack a requirement and generate metadata.
It does this using:
- `unpack_url` from `pip._internal.download`
- `Distribution` objects from `pip._internal.distributions`
There is some additional functionality that it provides. Mainly, it calls `req.archive`, which, well, is used in `pip download`. I think there's benefit to splitting out all these operations.
Given that `InstallRequirement.archive` skips the generated `pip-egg-info` directory, I think it's reasonable to move the archive generation logic so that it runs before metadata is generated.
This would result in a behavior change that I'm okay with -- we'd create an archive before calling egg_info, when using pip download. I'm not sure if this affects setuptools_scm, but based on my rudimentary understanding, it shouldn't. And if we really care a lot, I think we'd be better off moving the logic for the archiving into pip download, so that we can maintain the separation between these stages. We should probably be doing that anyway since in the more-correct resolver model, we'd only want to archive whatever is in the final set. Anyway, I'm calling it "not required" right now so I won't be making that change.
With that change, all the fetching-related logic would happen before metadata generation. That'll allow splitting `RequirementPreparer` into `RequirementFetcher` and `MetadataGenerator`s. This in turn would make it so that we can also introduce abstraction-style objects between these stages if we want to. I'm open to exploring that based on how the refactor here goes.
My understanding is that we can get away with making the MetadataGenerators just functions. In future, we could make them transform some kind of FetchedCandidate into a Distribution object. For now, they'll consume an InstallRequirement, do whatever we're doing today and return the same object. This change also leaves me unsure about what we'd want to be doing with Distribution objects (they have the build isolation code and call the metadata generation code in InstallRequirement) but, hey, one step at a time. I'll look into that once this "stage" is done with.
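A minimal sketch of what that split could look like, with the caveat that none of this is pip's actual API. `Requirement` is a hypothetical stand-in for `InstallRequirement`, the `unpack` callable stands in for `unpack_url`, and `prepare` is an illustrative name for the glue. The only ideas taken from the proposal are that fetching finishes before metadata generation starts, and that the metadata step is a plain function that consumes and returns the same object.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Requirement:
    """Hypothetical stand-in for pip's InstallRequirement."""
    name: str
    source_dir: Optional[str] = None  # filled in by the fetch stage
    metadata: Optional[dict] = None   # filled in by the metadata stage


class RequirementFetcher:
    """Stage 1: fetch + unpack only; never touches metadata."""

    def __init__(self, unpack: Callable[[str], str]) -> None:
        # Hypothetical hook standing in for unpack_url.
        self._unpack = unpack

    def fetch(self, req: Requirement) -> Requirement:
        req.source_dir = self._unpack(req.name)
        return req


def generate_metadata(req: Requirement) -> Requirement:
    """Stage 2: a plain function that returns the object it was given."""
    if req.source_dir is None:
        raise RuntimeError(f"{req.name} must be fetched before metadata generation")
    # Placeholder for "whatever we're doing today".
    req.metadata = {"name": req.name, "location": req.source_dir}
    return req


def prepare(req: Requirement, fetcher: RequirementFetcher) -> Requirement:
    """Glue that runs the two stages strictly in order."""
    return generate_metadata(fetcher.fetch(req))
```

Keeping the metadata step as a function means a later change could swap its input for some kind of FetchedCandidate and its output for a Distribution without touching the fetcher.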