Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pip-compile forcefully converts package names to lowercase on second run #431

Closed
IlyaSemenov opened this issue Dec 29, 2016 · 29 comments
Closed

Comments

@IlyaSemenov
Copy link

On a first run, pip-compile creates requirements.txt where package names are properly cased. However on subsequent runs, it converts them all to lower case.

Steps to replicate

Please see the full log below:

semenov@daedra ~ $ pyenv virtualenv 3.5.2 piptools-test                                                                                         "53e50feeIgnoring indexes: https://pypi.python.org/simple
Requirement already satisfied (use --upgrade to upgrade): setuptools in /Users/semenov/.pyenv/versions/3.5.2/envs/piptools-test/lib/python3.5/site-packages
Requirement already satisfied (use --upgrade to upgrade): pip in /Users/semenov/.pyenv/versions/3.5.2/envs/piptools-test/lib/python3.5/site-packages
semenov@daedra ~ $ mkdir -p tmp/piptools-test
semenov@daedra ~ $ cd tmp/piptools-test/
semenov@daedra ~/tmp/piptools-test $ pyenv local piptools-test
(piptools-test) semenov@daedra ~/tmp/piptools-test $ pip install pip-tools
Collecting pip-tools
  Using cached pip_tools-1.8.0-py2.py3-none-any.whl
Collecting first (from pip-tools)
  Using cached first-2.0.1-py2.py3-none-any.whl
Collecting six (from pip-tools)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting click>=6 (from pip-tools)
  Using cached click-6.6-py2.py3-none-any.whl
Installing collected packages: first, six, click, pip-tools
Successfully installed click-6.6 first-2.0.1 pip-tools-1.8.0 six-1.10.0
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(piptools-test) semenov@daedra ~/tmp/piptools-test $ echo 'Django>=1.9,<1.10' > requirements.in
(piptools-test) semenov@daedra ~/tmp/piptools-test $ pip-compile
#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file requirements.txt requirements.in
#
Django==1.9.12
(piptools-test) semenov@daedra ~/tmp/piptools-test $ pip-compile
#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file requirements.txt requirements.in
#
django==1.9.12
Expected result

On second run, the dependency in requirements.txt is still "Django" not "django".

Actual result

On second run, pip-compile converts "Django" to "django".

@IlyaSemenov
Copy link
Author

This repeats on macOS Sierra (case-insensitive HFS) and Ubuntu 16.04 (case-sensitive zfs), so I believe it's not filesystem-related.

@hyperknot
Copy link

hyperknot commented Jan 2, 2017

It also converts underscores to dash inconsistently.

@hyperknot
Copy link

The upper/lower case seems to depend on using -U or not in my case.

@hyperknot
Copy link

Can confirm, first run possibly-upper case, second run: always lower case.

@omarkohl
Copy link

omarkohl commented Jan 5, 2017

To me it is unclear when it is converted to upper and when to lower case but it is definitely not consistent and changes every few runs. This is quite annoying because the diffs in version control get a lot bigger than necessary.

I'm fine with either lower or upper case but please consistently :-)

As a workaround I'm considering forcing the whole file to be lower case (including comments etc.) like this:

tr '[:upper:]' '[:lower:]' < requirements.txt > temp.txt && mv temp.txt requirements.txt

@barrywhart
Copy link
Contributor

I wonder, does pip-compile maintain a persistent cache of library dependencies? Just speculating as to the possible cause of this issue (which I've seen as well). I've noticed pip-compile seems faster on subsequent runs, which supports the idea of there being some sort of cache. And lower-casing of names is a common way to allow case insensitive lookup from a cache or other key-value data structure.

I've occasionally seen a different issue (possibly related), where libraries appear in requirements.txt without the usual "via" comment, even though they do not appear in requirements.in. E.g. vine, which is required by newer versions of celery, but not older ones. This happened when I was experimenting with different versions of celery (3.1.25 vs 4.0.2), so I could see how a cache-related bug could cause this behavior.

@davidovich
Copy link
Contributor

@barrywhart Yes, you are right, pip-compile has a cache, and naming seems to be dependent on this cache...

Using the --rebuild flag outputs a case-consistent (lowercased and normalized) set of requirements. I believe a cache flush should fix these problems and future ones because the latest versions of pip do a better package name normalizing.

pip-compile --rebuild

@omarkohl
Copy link

It seems to be fairly consistent here:

  • pip-compile --upgrade -> Some package names end up in upper case (such as SQLAlchemy)
  • pip-compile -> Package names end up lower case (e.g. sqlalchemy)

The --rebuild option makes no difference in either case.

Using pip-tools 1.8.0 and pip 9.0.1

@basvdheuvel
Copy link

So, the problem is sort of established. Is anybody working on a PR?

@basvdheuvel
Copy link

Okay. I have been digging around in the source for a while now. One thing I can conclude is that underscores changing into dashes comes from Setuptools. This is part of a normalization process and it doesn't seem to be inconsistent or causing any problems.

As for the capitalization problem. Do we want to just normalize to lowercase? This is easy to implement. Or do we want to keep the original capitalization? This is a little more work, since the original capitalization needs to be stored in the cache. Since all the examples in the README show lowercase, I'm going to assume that we want just lowercase.

@omarkohl
Copy link

omarkohl commented Feb 3, 2017

lowercase is fine with me

@hyperknot
Copy link

hyperknot commented Feb 3, 2017 via email

@jdufresne
Copy link
Member

jdufresne commented Mar 8, 2017

As for the capitalization problem. Do we want to just normalize to lowercase? This is easy to implement. Or do we want to keep the original capitalization? This is a little more work, since the original capitalization needs to be stored in the cache.

I recently noticed this change while working on master. To be honest, I was a bit surprised by this and thought there was a new bug converting "Django" -> "django". In my opinion, I think a project's specified name on PyPI should be preserved as-is instead of trying to force it to some other convention. I understand .lower() is easier than some other solution, but it seems odd to choose a name other than what a 3rd party package has specified.

@basvdheuvel
Copy link

I agree, if it were the case that pip actually cared about this. It doesn't. This is a wrapper tool for pip. Why should we care?

@jdufresne
Copy link
Member

When reviewing the output of pip-compile, the fewer name transforms and mangling, the easier it is to check for correctness and compare to a verifiable source, which I see as a very important feature of pip-compile. A particular capitalization is how a 3rd party package chose to register and spell their package. We should honor that.

@davidovich
Copy link
Contributor

@jdufresne while there is still some mystery about name transformations throughout the download/caching of packages, I think that lower casing is the least surprising.

The alternative is to have package names change case on odd compilation runs, which is more annoying.

The problem exists in the tooling around pip, and I believe there is no general API to normalize a package name. Note that this is not exclusive to pip-tools, the pipenv project has had a lot or problems with exactly this situation.

@IlyaSemenov
Copy link
Author

I think that lower casing is the least surprising

How so? The least surprising is to have the original, package maintainer supplied capitalization. Any mangling whatsoever is more surprising.

The alternative is to have package names change case on odd compilation runs

No, this is a false dilemma. The real alternative is to keep package names always in the original, package maintainer supplied capitalization.

The problem exists in the tooling around pip

I don't see how this could be used as an argument. The fact that the problem exists around and is not exclusive to pip-tools doesn't mean it should be ignored, it's actually the opposite. Someone will need to break the vicious circle at some point.

That said, I agree that lowercasing everything did fix the original problem that I reported, and I believe it's a good compromise between minimizing efforts and maximizing result. I just don't understand why false arguments are being used to put this as the most appropriate and comprehensive solution.

@jdufresne
Copy link
Member

The problem exists in the tooling around pip

Are you aware of any upstream issues, PRs, or commits that identify or approach this problem? If so I'd like to track the issue there as well. I was unable to find any.

@barrywhart
Copy link
Contributor

I would prefer that case be preserved, and would be willing to spend some time to try and find a solution which does so. Would it make sense to hold off on merging this while we explore potential alternative?

@davidovich
Copy link
Contributor

@barrywhart this issue is resolved by #452, and merged in 1.8.1rc already. Have you had time to explore the case preserving scenario ?

I think stabilizing compilation output brings more value right now, and we could revisit later?

@barrywhart
Copy link
Contributor

I haven't had time yet to look at preserving case. I agree, this is definitely a useful improvement as-is. Let's go ahead and include it.

@davidovich
Copy link
Contributor

@IlyaSemenov I appreciate the thoroughness of the analyse of my answer, but the problem is a compound one and is hard to pin down right now, given that I do not have all the answers concerning interactions in the pip-tools, pip and setuptools code bases. All these can introduce naming variations.

I think that lower casing is the least surprising
How so? The least surprising is to have the original, package maintainer supplied capitalization. Any mangling whatsoever is more surprising.

From the compile viewpoint, it is less surprising that two runs produce a stable output. I agree that it can be surprising that the case is not preserved, and that the ideal case is to maintain it.

No, this is a false dilemma. The real alternative is to keep package names always in the original, package maintainer supplied capitalization.

Agreed. The current observable implementation is not exact, and results in varying case in the output. Comments on this issue seem to agree that lowercasing is an acceptable solution which is why #452 was merged in.

I believe it's a good compromise between minimizing efforts and maximizing result.

I also think so.

@davidovich
Copy link
Contributor

@jdufresne I monitored issues surrounding proper_case (https://github.com/kennethreitz/pipenv/search?q=proper_case&type=Issues&utf8=%E2%9C%93) in the pipenv project, and that enforced my current thought that it is not a simple fix.

@ALL Are we ok with the lowercasing?

@jdufresne
Copy link
Member

I agree that while name transforms may not be ideal, lower casing is the best compromise given the limitations of upstream tools. Without the hard work to get to the root of the issue, I see no reason to have this hold back a release. Thanks for hearing me out and the informed thoughtful responses.

@hyperknot
Copy link

I totally agree that lower case is an almost perfect solution! Let's use it!

@hyperknot
Copy link

hyperknot commented Apr 18, 2017

The new lower case is perfect, but semantic_version still gets converted into semantic-version.

@basvdheuvel
Copy link

@hyperknot As I mentioned before:

One thing I can conclude is that underscores changing into dashes comes from Setuptools. This is part of a normalization process and it doesn't seem to be inconsistent or causing any problems.

@hyperknot
Copy link

@klaplong as long as it's consistent there is no problem.

jfly added a commit to jfly/pipenv that referenced this issue May 2, 2018
This fixes pypa#2113.

This bug was introduced in
<jazzband/pip-tools#452> as a band-aid fix to
<jazzband/pip-tools#431>. Pipenv then copied
that code in <pypa@2553ebc#diff-b56b95ccea8595a0f6f24ea753842976>, and inherited this latent bug.

Maybe the right fix is for pypa/packaging to lowercase the name? There's
a comment here
<https://github.com/pypa/packaging/blob/16.8/packaging/requirements.py#L86>
about normalizing the requirement's name, which might be what this is
referring to.
jfly added a commit to jfly/pipenv that referenced this issue May 7, 2018
This fixes pypa#2113.

This bug was introduced in
<jazzband/pip-tools#452> as a band-aid fix to
<jazzband/pip-tools#431>. Pipenv then copied
that code in <pypa@2553ebc#diff-b56b95ccea8595a0f6f24ea753842976>, and inherited this latent bug.

Maybe the right fix is for pypa/packaging to lowercase the name? There's
a comment here
<https://github.com/pypa/packaging/blob/16.8/packaging/requirements.py#L86>
about normalizing the requirement's name, which might be what this is
referring to.
jfly added a commit to jfly/pipenv that referenced this issue May 19, 2018
This fixes pypa#2113.

This bug was introduced in
<jazzband/pip-tools#452> as a band-aid fix to
<jazzband/pip-tools#431>. Pipenv then copied
that code in <pypa@2553ebc#diff-b56b95ccea8595a0f6f24ea753842976>, and inherited this latent bug.

Maybe the right fix is for pypa/packaging to lowercase the name? There's
a comment here
<https://github.com/pypa/packaging/blob/16.8/packaging/requirements.py#L86>
about normalizing the requirement's name, which might be what this is
referring to.
jfly added a commit to jfly/pipenv that referenced this issue May 23, 2018
This fixes pypa#2113.

This bug was introduced in
<jazzband/pip-tools#452> as a band-aid fix to
<jazzband/pip-tools#431>. Pipenv then copied
that code in <pypa@2553ebc#diff-b56b95ccea8595a0f6f24ea753842976>, and inherited this latent bug.

Maybe the right fix is for pypa/packaging to lowercase the name? There's
a comment here
<https://github.com/pypa/packaging/blob/16.8/packaging/requirements.py#L86>
about normalizing the requirement's name, which might be what this is
referring to.
jfly added a commit to jfly/pipenv that referenced this issue May 23, 2018
This fixes pypa#2113.

This bug was introduced in
<jazzband/pip-tools#452> as a band-aid fix to
<jazzband/pip-tools#431>. Pipenv then copied
that code in <pypa@2553ebc#diff-b56b95ccea8595a0f6f24ea753842976>, and inherited this latent bug.

Maybe the right fix is for pypa/packaging to lowercase the name? There's
a comment here
<https://github.com/pypa/packaging/blob/16.8/packaging/requirements.py#L86>
about normalizing the requirement's name, which might be what this is
referring to.

To test this, I invented a new, very simple python package called
`depends-on-marked-package`. The setup.py for this package is just:

```python
import setuptools

setuptools.setup(
    name="depends-on-marked-package",
    version="0.0.1",
    packages=setuptools.find_packages(),
    install_requires=['pytz; platform_python_implementation=="CPython"'],
)
```

This is a simplified version of gevent's setup.py's install_requires upon greenlet.
@Stannislav
Copy link

Not sure I understand why so many are excited about changing the original package spelling. This is not even compatible with pip freeze nor with what's listed on PyPI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants