New resolver reports version conflict on github repo URL variations #9229

Closed
fiendish opened this issue Dec 4, 2020 · 21 comments

@fiendish

fiendish commented Dec 4, 2020

The new resolver appears to treat projectname @ git+https://git@github.com/org/repo.git as different from projectname @ git+https://github.com/org/repo.git, even though they point to the same repository and the only difference is the "git@" in the URL. If a project is required both directly and indirectly (via another requirement) using slightly different but equally valid URL variations, pip fails with a version conflict error.

pip install -r requirements.txt on a requirements.txt containing:

project-abc @ git+https://github.com/org/project-abc.git
project-def @ git+https://github.com/org/project-def.git

Where project-def has a requirements.txt containing the "git@" URL variant:

project-abc @ git+https://git@github.com/org/project-abc.git

Produces

ERROR: Cannot install -r requirements.txt (line 2) ... because these package
versions have conflicting dependencies.

The conflict is caused by:
    The user requested project-abc 0.1.0 (from git+https://github.com/org/project-abc.git)
    project-def 0.1.2 depends on project-abc 0.1.0 (from git+https://****@github.com/org/project-abc.git)

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

Removing the "user@" from the second requirements.txt avoids the error, but this setup worked before the resolver upgrade, and the fix may require changes to packages upstream of the user.

Expected Behavior

Working URL variations for the same package should not affect version resolution.

Environment details

pip 20.3.1
macOS Mojave 10.14.6
Python 3.8.6

@fiendish fiendish changed the title Resolver reports version conflict on github repo URL variations New resolver reports version conflict on github repo URL variations Dec 4, 2020
@fiendish
Author

fiendish commented Dec 5, 2020

Simplified demo using a random real python repo and no dependency chaining:

$ pip install git+https://github.com/fiendish/xlrd git+https://git@github.com/fiendish/xlrd

Collecting git+https://github.com/fiendish/xlrd
  Cloning https://github.com/fiendish/xlrd to /private/var/folders/7r/zzgsf_917vn98xy97_274d24z3zxz5/T/pip-req-build-z3za5_mk
Collecting git+https://****@github.com/fiendish/xlrd
  Cloning https://****@github.com/fiendish/xlrd to /private/var/folders/7r/zzgsf_917vn98xy97_274d24z3zxz5/T/pip-req-build-6x7m6gbw
ERROR: Cannot install xlrd 1.2.0 (from git+https://****@github.com/fiendish/xlrd) and xlrd 1.2.0 (from git+https://github.com/fiendish/xlrd) because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested xlrd 1.2.0 (from git+https://github.com/fiendish/xlrd)
    The user requested xlrd 1.2.0 (from git+https://****@github.com/fiendish/xlrd)

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies

@uranusjr
Member

uranusjr commented Dec 5, 2020

I believe this is expected; two URLs with different login information are treated as different sources since they are technically different URLs and can, in theory, return entirely different contents. pip did not actually check this previously (even if the URLs differed entirely); it simply picked the first URL it saw and ignored the second one. The new resolver is stricter about this and requires the URLs to be equivalent.
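
A minimal sketch of why the two spellings count as different URLs, using only the standard library. This is not pip's implementation; the helper name url_parts is hypothetical, and the org/project-abc URLs are the placeholders from the original report.

```python
from urllib.parse import urlsplit


def url_parts(url: str):
    """Split a VCS URL into (userinfo, host, path) for side-by-side comparison."""
    parts = urlsplit(url)
    # username is None when the URL carries no "user@" prefix
    return parts.username or "", parts.hostname, parts.path


plain = "git+https://github.com/org/project-abc.git"
with_user = "git+https://git@github.com/org/project-abc.git"

# The strings differ, so a strict equivalence check sees two sources.
print(plain == with_user)

# Host and path agree; only the userinfo component differs.
print(url_parts(plain))
print(url_parts(with_user))
```

Whether a differing userinfo component leads to the same content is something only the server knows, which is the gap described above.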

@fiendish
Author

fiendish commented Dec 5, 2020

can return entirely different contents in theory

Except that they didn't. Pip should know this because the conflict message appears after both are cloned.

two URLs with different log in information are treated as different sources since they are technically different URLs

Also I'm pretty sure that's not how github works?

@uranusjr
Member

uranusjr commented Dec 5, 2020

A clone is not sufficient, since Git’s clone is not reproducible. pip would need to actually build the package to tell whether the two package contents are indeed the same, and even that has problems due to technical details, e.g. #5648. So pip opts for the strictest route and rejects different URLs (ideally pip should not even clone, but that’s an implementation quirk that should be improved).

Also I'm pretty sure that's not how github works?

It is how GitHub works for humans now, but pip does not know that. pip does not contain information specific to a given index (and should not IMO); so the user needs to help it fill these gaps.

@fiendish
Author

fiendish commented Dec 5, 2020

A clone is not sufficient, since Git’s clone is not reproducible.

Can you explain this part? If I clone both versions to adjacent locations, and then diff shows the cloned contents are identical, which is what's happening here, then the clones are the same even if by some quirk of the phase of the moon or neutrino interference they would result in different builds. What scenario is this protecting?

so the user needs to help it fill these gaps.

This puts a weird burden on unrelated distributed projects to agree to all coordinate to use the same URL format, which may not even be possible, in order to prevent something that seems in my ignorance to be slightly far fetched.

@uranusjr
Member

uranusjr commented Dec 5, 2020

Can you explain this part? If I clone both versions to adjacent locations, and then diff shows the cloned contents are identical, which is what's happening here, then they're the same.

IIUC the content of .git is subject to change and not reliable. That is usually not a part of a package and thus not relevant, but again pip does not know that, which is why a build step would be needed.

This puts a weird burden on unrelated distributed projects to agree to all coordinate to use the same URL format.

And on the flip side, not having that burden on you means more burden to us, because we get to maintain the functionality that satisfies your convenience 🙂

@fiendish
Author

fiendish commented Dec 5, 2020

the content of .git is subject to change

Modification times or something else? Do you have a resource where I can learn about what kinds of changes we're talking about? I'm only able to find references for modification times or changes between different computers (seemingly not relevant here).

@fiendish
Author

fiendish commented Dec 5, 2020

(I feel like I should also add that the formulation where local and remote requirements.txt files used different URL format for the same project leads to the seemingly infinite install looping behavior already reported elsewhere. I didn't originally want to put it in this report because it's already on your radar, but now I'm second guessing that.)

@blaiseli

blaiseli commented Dec 9, 2020

I think I just experienced a variation on this issue while re-building a singularity container of mine that starts by updating pip, then git clones a repo that contains a requirements.txt file and tries to use it:

ERROR: Cannot install -r requirements.txt (line 52) and libworkflows 0.3 (from git+https://gitlab.pasteur.fr/bli/libworkflows.git@2b10cc3e5ab61284853463d2950a157295ad7f16) because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested libworkflows 0.3 (from git+https://gitlab.pasteur.fr/bli/libworkflows.git@2b10cc3e5ab61284853463d2950a157295ad7f16)
    libhts 0.3 depends on libworkflows 0.3 (from git+https://gitlab.pasteur.fr/bli/libworkflows.git)

My requirements.txt pulls libhts, which has libworkflows as a dependency, and also libworkflows directly.

The indirect dependency is specified in the install_requires section of the setup.py of libhts, without a version, just the git repo address.

The direct dependency is specified with a version constraint in requirements.txt.
The specified version is the last commit in master, so it is actually the same as the indirect dependency. In any case, it would make sense that the same git repo URL, except with no commit hash specified, should be considered as compatible with the repo URL containing the extra commit hash.

@blaiseli

I'm trying to figure out how I could solve my issue with spurious incompatibilities between the same package installed via git but one via requirements.txt and the other via the install_requires section of a setup.py.

I was wondering whether I would have to remove the git commit hash from the requirements.txt file or add it in the install_requires section of the setup.py of the package that depends on it. However, after reading https://packaging.python.org/discussions/install-requires-vs-requirements/, it seems that the recommended good practice is to pin specific versions in requirements.txt but not in the install_requires section of a setup.py.

@uranusjr Is there a way I can follow the above recommendations, while being able to use the new resolver?

@uranusjr
Member

From pip's perspective, any URL spec is pinning the requirement. So it does not really matter; by using a URL in setup.py, you're already pinning.
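
To see what actually differs between the two libworkflows URLs in the error above, here is a small standard-library sketch. It is not pip's code: the helper split_revision is hypothetical, and it assumes the revision, when present, follows the last "@" in the URL path.

```python
from urllib.parse import urlsplit


def split_revision(url: str):
    """Return (url_without_revision, revision_or_None) for a VCS URL.

    Assumption: the revision is whatever follows the last "@" in the path,
    as in "git+https://host/org/repo.git@<rev>".
    """
    parts = urlsplit(url)
    path, sep, rev = parts.path.rpartition("@")
    if not sep:
        return url, None  # no revision in the path
    return parts._replace(path=path).geturl(), rev


pinned = (
    "git+https://gitlab.pasteur.fr/bli/libworkflows.git"
    "@2b10cc3e5ab61284853463d2950a157295ad7f16"
)
unpinned = "git+https://gitlab.pasteur.fr/bli/libworkflows.git"

print(split_revision(pinned))
print(split_revision(unpinned))
# Same repository URL, but the full strings differ, so the two requirements
# are two different pins as far as a strict URL comparison is concerned.
print(pinned == unpinned)
```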

@blaiseli

blaiseli commented Dec 10, 2020

Wouldn't it be possible to modify pip so that, when two URLs differ only by the presence or absence of a commit hash, the one without the commit hash is considered satisfied by the one with it?

@uranusjr
Copy link
Member

It is possible, but would require a feature request and substantial discussion, since the current behaviour (use the revision pointed to by the default branch) is well-established and likely depended on by many.

Variants of a Python package are ultimately identified by their versions, and this mechanism is built into many areas of Python. A PEP 508 URL is a means of specifying where pip should download things from to satisfy a package version request. By giving semantics to a Git URL’s revision information, we’d essentially introduce an alternative versioning scheme parallel to the package’s version metadata, which would not play well with other parts of packaging. Not that the revision can’t be used as a package version (Go’s packaging solution does basically that), but the scheme does not work well in Python packaging without substantial design work to fit it into the current version system.

@blaiseli

Thanks for the explanations. I'll probably just add commit hashes in setup.py, then.

@uranusjr
Member

Is there anything left here? Feel free to close if you think this is resolved.

@no-response

no-response bot commented Dec 27, 2020

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.

@no-response no-response bot closed this as completed Dec 27, 2020
jrdh added a commit to NaturalHistoryMuseum/ckanext-statistics that referenced this issue Feb 2, 2021
Having some issues with pip conflicts where other ckan extensions
have the same dependencies but without the .git part. This causes
a version conflict. Joy! See here: pypa/pip#9229
@Syntactical01

Syntactical01 commented Sep 14, 2021

We are still facing this issue. We have a requirements.txt file generated from a Poetry lock file, so it lists ALL requirements of all packages, which puts us in a situation similar to the OP's above.

So the requirements file generated from the lock contains:

A @ git+https://github.com/org/A.git@master
B @ git+https://github.com/org/B.git@master

but down the chain of requirements for A you will eventually find the following in a setup.py file:

B @ git+https://github.com/org/B.git@master

and all of a sudden pip can't do anything because:

Cannot install -r /build/transformer/__package_saved_reqs.txt (line 19) and B 0.1 (from git+https://github.com/org/B@master) because these package versions have conflicting dependencies.

Please advise, because I don't see how to get around this issue. This seems like a massive flaw in the new implementation of pip.


For whoever it may help: calling pip "strict" is an understatement. I solved it by reading my requirements character by character. They must match perfectly or nothing works. (Not a great update IMO).

The error I was getting:

The conflict is caused by:
    The user requested aws-cloudformation 0.9 (from git+ssh://****@github.com/ORG/aws-cloudformation.git@master)
    pe-ec2 0.0.1.7 depends on aws-cloudformation 0.9 (from git+ssh://****@github.com/org/aws-cloudformation.git@master)

Note the ORG vs org. One file was using caps, one wasn't. This doesn't matter for a normal pip install, but pip treated them as different. Is this really desired functionality?
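
Rather than comparing requirements by eye, a short script can do the character-by-character check. The following is only a sketch under stated assumptions: it handles "name @ url" lines only, the helper find_url_mismatches is hypothetical, and the URLs are copied from the error output above (including pip's "****" redaction). Project names are normalized the way PEP 503 describes; URLs are compared as exact strings, which mirrors the strictness described in this thread.

```python
import re
from collections import defaultdict


def find_url_mismatches(lines):
    """Return {project: [urls]} for projects requested with more than one URL."""
    urls_by_name = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, and requirements that are not "name @ url".
        if not line or line.startswith("#") or " @ " not in line:
            continue
        name, url = line.split(" @ ", 1)
        # PEP 503 name normalization: runs of "-", "_", "." become "-", lowercased.
        key = re.sub(r"[-_.]+", "-", name.strip()).lower()
        urls_by_name[key].add(url.strip())
    return {k: sorted(v) for k, v in urls_by_name.items() if len(v) > 1}


requirements = [
    "aws-cloudformation @ git+ssh://****@github.com/ORG/aws-cloudformation.git@master",
    "aws-cloudformation @ git+ssh://****@github.com/org/aws-cloudformation.git@master",
    "A @ git+https://github.com/org/A.git@master",
]

for project, urls in find_url_mismatches(requirements).items():
    print(project)
    for url in urls:
        print("   ", url)
```

In practice the input lines would be gathered from every requirements.txt and install_requires entry in the dependency chain; collecting them is left out of this sketch.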

@uranusjr
Member

Yes, because pip has no way to tell the two are actually the same repository.

@Syntactical01

Syntactical01 commented Sep 14, 2021

@uranusjr GitHub organization and repository names are not case-sensitive, though, when it comes to checking for duplicates: foobar and FooBar cannot exist as repository names at the same time, even if one is under org and the other is under ORG; those two could only coexist if one is under org and the other is under org2. Take for example a link to this very ticket: https://github.com/PyPa/piP/issues/9229. So pip should not care about case here; otherwise you get issues like mine, where the URL is 100% valid but pip thinks the two are different.

@pfmoore
Member

pfmoore commented Sep 14, 2021

That's a GitHub-specific rule. URLs in general may or may not be case sensitive, and pip has no way of knowing whether a particular VCS URL is case sensitive or not. And before you ask, no, we're not going to hard-code GitHub's rules - that way lies madness 🙁
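
As a rough illustration of the general rule, here is a standard-library sketch (not pip's code; the helper same_host_and_path is hypothetical, and the capitalized host in the first URL is an assumption added for the demonstration). Python's urlsplit lowercases the host name, which is safe because host names are case-insensitive, but it leaves the path untouched because only the server knows how paths are matched.

```python
from urllib.parse import urlsplit


def same_host_and_path(url_a: str, url_b: str):
    """Return (hosts_equal, paths_equal) for two URLs."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    # .hostname is lowercased by urlsplit; .path is returned as written.
    return a.hostname == b.hostname, a.path == b.path


hosts_equal, paths_equal = same_host_and_path(
    "https://GitHub.com/PyPa/piP/issues/9229",
    "https://github.com/pypa/pip/issues/9229",
)
print(hosts_equal, paths_equal)  # True False
```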

@Syntactical01

" we're not going to hard-code github's rules - that way lies madness"

Yeah, I can see that. It would be nice to be able to turn off strict mode, or to have it enabled only via a --strict flag, but after half a day of debugging and a bunch of PRs, at least it's working now. Hopefully there will be no more breaking changes from pip in the near future.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 15, 2021
@pradyunsg pradyunsg removed the S: awaiting response Waiting for a response/more information label Mar 17, 2023