Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better freeze of distributions installed from direct URL references #7612

Merged
merged 8 commits into from
Apr 1, 2020

Conversation

sbidoul
Copy link
Member

@sbidoul sbidoul commented Jan 19, 2020

Closes #609 via PEP 610.

In a nutshell, before it did this:

$ pip install git+https://github.com/pypa/packaging
....
$ pip freeze
packaging==20.1.dev0
pyparsing==2.4.6
six==1.14.0

Now it does this:

$ pip install git+https://github.com/pypa/packaging
....
$ pip freeze
packaging @ git+https://github.com/pypa/packaging@d49fdc500a7c057b71851847ff8d7cc92865bcf2
pyparsing==2.4.6
six==1.14.0
  • implement DirectUrl, a model for PEP 610
    • unit tests
  • helper to convert a DirectUrl to PEP 440 requirement string
    • unit tests
  • helper to obtain a DirectUrl from a Distribution if direct_url.json metadata is present
    • unit tests (normal path tested via freeze test)
  • helper to create a DirectUrl from a Link
    • unit tests
  • record direct_url.json at installation of non-editable direct URL requirements
    • add DirectUrl support to install_wheel (incl. unit test)
    • add direct_url to InstallRequirement and pass it to install_wheel via install
    • for VCS links, obtain commit_id when the wheel was retrieved from cache => it's the requested revision because otherwise it would not have been cached
    • tests
  • pip freeze exploits direct_url.json metadata
    • test

@sbidoul sbidoul force-pushed the pip609-v2-sbi branch 8 times, most recently from 7a22e23 to b8afe21 Compare January 25, 2020 16:56
@sbidoul sbidoul force-pushed the pip609-v2-sbi branch 2 times, most recently from 6480ecf to 1be708f Compare January 26, 2020 12:07
@sbidoul sbidoul marked this pull request as ready for review January 26, 2020 12:36
@sbidoul sbidoul force-pushed the pip609-v2-sbi branch 2 times, most recently from ea038be to 5f60f6f Compare January 27, 2020 10:59
@sbidoul
Copy link
Member Author

sbidoul commented Jan 28, 2020

@pypa/pip-committers I have a question following #7650 (comment). Should pip freeze emit PEP 440 direct URL references requirements (name @ URL), or should it emit pip-style (URL#egg=name) requirements?

This PR currently emits PEP 440 style requirements.

@pfmoore
Copy link
Member

pfmoore commented Jan 28, 2020

IMO it should use PEP 440 (so that requirements files are less tied to pip-specific details, and are more standards based). But that does mean that we need to make sure we support PEP 440 for all the URL styles we need - and #7650 suggests we have some tidying up to do there.

@sbidoul
Copy link
Member Author

sbidoul commented Jan 28, 2020

Good. Converting pip URLs to PEP 440 works just fine, because URL parsing is done by the Link class and get_url_rev_and_auth (for VCS specific stuff) in both cases. The git+file issue in #7650 is due to an over-zealous check being done in packaging.

def _get(d, expected_type, key, default=None):
# type: (Dict[str, Any], Type[T], str, Optional[T]) -> Optional[T]
"""Get value from dictionary and verify expected type."""
if key not in d:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try:
    value = d[key]
except KeyError:
    return default

is more Pythonic IMO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. Is it? I've used both over time and I now try to avoid exceptions for flow control.


def _exactly_one_of(infos):
# type: (Iterable[Optional[InfoType]]) -> InfoType
infos = [info for info in infos if info is not None]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
infos = [info for info in infos if info is not None]
infos = list(filter(None, infos))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter(None, iterable) removes falsy entries, not just None. It’s probably not a problem here, but I prefer the current comprehension implementation for clarity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also find list comprehension more readable, as its more explicit. With filter I have to think for a fraction of a second what None means as a filter function.

def _exactly_one_of(infos):
# type: (Iterable[Optional[InfoType]]) -> InfoType
infos = [info for info in infos if info is not None]
if len(infos) == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if len(infos) == 0:
if not infos:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be too magical?

iterator = (info for info in infos if info is not None)
try:
    info = next(iterator)
except StopIteration:
    # missing one
if next(iterator, None):
    # more than one
return info

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be too magical?

I like it, although I would use iterator = filter(None, infos) here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be too magical?

Yes it would be, IMO. That requires me quite more than a fraction of a second to understand what this does compared to the couple of ifs. Also I tend to avoid using exceptions for flow control purposes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid exceptions here you can always do next(iterator, None). But that doesn’t help with the magical aspect.

else None,
vcs_info=self.info.to_dict()
if isinstance(self.info, VcsInfo)
else None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would we keep a name class member on the individual types which holds the key name and just use that instead of these individual checks

requested_revision = None # type: Optional[str]
commit_id = None # type: str
resolved_revision = None # type: Optional[str]
resolved_revision_type = None # type: Optional[str]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this and the other classes can we specify these in an __init__ so it's clearer that these are to be used as object members and not class members?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. TIL that mypy understands that.

@@ -598,6 +611,7 @@ def install_wheel(
pycompile=True, # type: bool
warn_script_location=True, # type: bool
_temp_dir_for_testing=None, # type: Optional[str]
direct_url=None, # type: DirectUrl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type should be Optional[DirectUrl] - strict-optional is disabled in this file otherwise it would've been caught.

)
with open(temp_direct_url_path, 'wb') as temp_direct_url_file:
temp_direct_url_file.write(direct_url.to_json().encode("utf-8"))
shutil.move(temp_direct_url_path, direct_url_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make an issue to use a safer write-then-move approach like we do here? If we don't fix it everywhere in this module before this PR gets merged then I'd prefer to keep this as-is so that they can all be fixed at once.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue created: #7699

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The corresponding issue now has an approved PR (#7929); so I think it'd be nice if we could adopt that approach -- it'd probably need that PR to merge but I don't think there's any reason it can't be merged within this week. :)

def test_std_install_with_direct_url(self, data, tmpdir):
self.prep(data, tmpdir)
direct_url = DirectUrl()
direct_url.url = "file:///home/user/archive.tgz"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance it seems kind of random that we're setting a different url than we would actually expect. I assume this is to ensure that install_wheel isn't incorrectly using the wheel path argument? Can we put a comment to that effect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a comment. direct_url.url is typically the original requested link, while wheelpath is the result of a local build.

"""Returns a link to a cached item if it exists, otherwise returns the
passed link. Also returns a flag indicating if the cached link is
in the persistent cache (True), in the ephemeral cache (False),
or not in cache (None).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would it look to return something enum-like instead?

src/pip/_internal/operations/freeze.py Show resolved Hide resolved
@pypa-bot
Copy link

pypa-bot commented Feb 6, 2020

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will eligible for code review and hopefully merging!

@sbidoul
Copy link
Member Author

sbidoul commented Mar 31, 2020

Rebased to resolve conflict following de populate_link move in #7801

Still ready for another review round. @xavfernandez maybe? @chrahunt it would be great if you could update your review, I think the changes your requested are done.

@pradyunsg
Copy link
Member

pradyunsg commented Apr 1, 2020

Other than a minor comment (#7612 (comment)), I think this PR looks pretty good.

I do think it'd be a good idea to merge this in ASAP, and iterate on this as we go; instead of trying to implement rest of the optional / nuanced functionality as part of this PR.

src/pip/_internal/models/direct_url.py Outdated Show resolved Hide resolved
@pradyunsg
Copy link
Member

Wait, no. Uhhhh. I didn't intend to actually do the merge.

Hopefully "use both changes" was the correct thing here. Otherwise, apologies for the extra work @sbidoul. 🙈

@sbidoul
Copy link
Member Author

sbidoul commented Apr 1, 2020

Thanks @xavfernandez for the review.
No worries, @pradyunsg, I rebased, and applied adjacent_temp_file to direct_url.json too.

@pradyunsg pradyunsg merged commit 4b159c5 into pypa:master Apr 1, 2020
@pradyunsg
Copy link
Member

Wheeee! I got to do the green button clicky stuff! ^.^

@pradyunsg
Copy link
Member

Many thanks and congratulations to @sbidoul! Thanks everyone who contributed to or reviewed this PR! ^>^

@bmartinn
Copy link

bmartinn commented Apr 1, 2020

@sbidoul well done !!!
Guys this is a great feature we have been wanting for a while now... Thank you!
Can't wait to test the first RC release 🎉 🎊

@sbidoul
Copy link
Member Author

sbidoul commented Apr 2, 2020

Thanks for merging @pradyunsg. Big thank you to everyone who helped with this in one way or another!

@bmartinn Thank you for the kind words. Note pip does not do RC releases, but I encourage you to test now by installing the master branch with pip install -U "pip @ https://github.com/pypa/pip/tarball/master" (in an activated virtualenv).

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label May 5, 2020
@lock lock bot locked as resolved and limited conversation to collaborators May 5, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation C: freeze 'pip freeze' related PEP implementation Involves some PEP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pip freeze does not list repo url