[2020-resolver] Pip downloads lots of different versions of the same package #8713

Closed
jcugat opened this issue Aug 5, 2020 · 54 comments
Labels: state: needs discussion, UX

Comments

@jcugat

jcugat commented Aug 5, 2020

First of all, apologies if this was already reported or it's expected behaviour. I tried searching previous issues but couldn't find anything; the only similar issue might be #8683, but its output is different.

What did you want to do?

From a completely empty virtualenv:

$ python -V
Python 3.7.6
$ pip -V
pip 20.2.1 from /Users/josepcugat/.pyenv/versions/3.7.6/envs/albus/lib/python3.7/site-packages/pip (python 3.7)
$ pip install --use-feature=2020-resolver "aiobotocore>=1.0.7" "boto3~=1.10"

Output

Collecting aiobotocore>=1.0.7
  Using cached aiobotocore-1.0.7-py3-none-any.whl (42 kB)
Collecting botocore<1.15.33,>=1.15.32
  Using cached botocore-1.15.32-py2.py3-none-any.whl (6.0 MB)
Collecting aioitertools>=0.5.1
  Using cached aioitertools-0.7.0-py3-none-any.whl (20 kB)
Collecting typing_extensions>=3.7
  Using cached typing_extensions-3.7.4.2-py3-none-any.whl (22 kB)
Processing /Users/josepcugat/Library/Caches/pip/wheels/62/76/4c/aa25851149f3f6d9785f6c869387ad82b3fd37582fa8147ac6/wrapt-1.12.1-cp37-cp37m-macosx_10_15_x86_64.whl
Collecting docutils<0.16,>=0.10
  Using cached docutils-0.15.2-py3-none-any.whl (547 kB)
Collecting jmespath<1.0.0,>=0.7.1
  Using cached jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting aiohttp>=3.3.1
  Using cached aiohttp-3.6.2-cp37-cp37m-macosx_10_13_x86_64.whl (642 kB)
Collecting async-timeout<4.0,>=3.0
  Using cached async_timeout-3.0.1-py3-none-any.whl (8.2 kB)
Collecting attrs>=17.3.0
  Using cached attrs-19.3.0-py2.py3-none-any.whl (39 kB)
Collecting chardet<4.0,>=2.0
  Using cached chardet-3.0.4-py2.py3-none-any.whl (133 kB)
Collecting multidict<5.0,>=4.5
  Using cached multidict-4.7.6-cp37-cp37m-macosx_10_14_x86_64.whl (48 kB)
Collecting yarl<2.0,>=1.0
  Using cached yarl-1.5.1-cp37-cp37m-macosx_10_14_x86_64.whl (127 kB)
Collecting idna>=2.0
  Using cached idna-2.10-py2.py3-none-any.whl (58 kB)
Collecting urllib3<1.26,>=1.20
  Using cached urllib3-1.25.10-py2.py3-none-any.whl (127 kB)
Collecting python-dateutil<3.0.0,>=2.1
  Using cached python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)
Collecting six>=1.5
  Using cached six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.35-py2.py3-none-any.whl (129 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.34-py2.py3-none-any.whl (129 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.33-py2.py3-none-any.whl (129 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.32-py2.py3-none-any.whl (129 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.31-py2.py3-none-any.whl (129 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.30-py2.py3-none-any.whl (129 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.29-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.28-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.27-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.26-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.25-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.24-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.23-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.22-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.21-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.20-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.19-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.18-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.17-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.16-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.15-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.14-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.13-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.12-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.11-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.10-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.9-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.8-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.7-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.6-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.5-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.4-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.3-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.2-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.1-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.14.0-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.26-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.25-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.24-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.23-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.22-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.21-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.20-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.19-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.18-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.17-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.16-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.15-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.14-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.13-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.12-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.11-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.10-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.9-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.8-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.7-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.6-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.5-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.4-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.3-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.2-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.1-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.13.0-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.49-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.48-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.47-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.46-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.45-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.44-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.43-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.42-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.41-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.40-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.39-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.38-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.37-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.36-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.35-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.34-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.33-py2.py3-none-any.whl (128 kB)
Collecting boto3~=1.10
  Using cached boto3-1.12.32-py2.py3-none-any.whl (128 kB)
Collecting s3transfer<0.4.0,>=0.3.0
  Using cached s3transfer-0.3.3-py2.py3-none-any.whl (69 kB)
Installing collected packages: six, urllib3, typing-extensions, python-dateutil, multidict, jmespath, idna, docutils, yarl, chardet, botocore, attrs, async-timeout, wrapt, s3transfer, aioitertools, aiohttp, boto3, aiobotocore
Successfully installed aiobotocore-1.0.7 aiohttp-3.6.2 aioitertools-0.7.0 async-timeout-3.0.1 attrs-19.3.0 boto3-1.12.32 botocore-1.15.32 chardet-3.0.4 docutils-0.15.2 idna-2.10 jmespath-0.10.0 multidict-4.7.6 python-dateutil-2.8.1 s3transfer-0.3.3 six-1.15.0 typing-extensions-3.7.4.2 urllib3-1.25.10 wrapt-1.12.1 yarl-1.5.1

Additional information

I was not expecting to download all those different versions of boto3, since previously pip only downloaded a single one:

$ pip install "aiobotocore>=1.0.7" "boto3~=1.10"
Collecting aiobotocore>=1.0.7
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/a8/91/deb864d92c5ca332d897b521072f6078992b46e2c67da8365a4ee5b9cd47/aiobotocore-1.0.7-py3-none-any.whl (42 kB)
Collecting boto3~=1.10
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/cf/50/e9c3b7a5b0e06f9b3818074400f83482063c582bd2f5af799adecbd0b0cd/boto3-1.14.35-py2.py3-none-any.whl (129 kB)
Collecting botocore<1.15.33,>=1.15.32
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/49/86/6448bb5ab4b0c169f379fce589e568e798907b569eaeb012c720a4dd9ca2/botocore-1.15.32-py2.py3-none-any.whl (6.0 MB)
Collecting aioitertools>=0.5.1
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/83/42/90df27c516ce54fa26964bc4a632ecaf352c7e99574b515255e48b4a7cc7/aioitertools-0.7.0-py3-none-any.whl (20 kB)
Collecting aiohttp>=3.3.1
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/40/fd/3a595d6467eb31f7b69eb980778567e764b5d93990b4ceb8ddf6079dd776/aiohttp-3.6.2-cp37-cp37m-macosx_10_13_x86_64.whl (642 kB)
Processing /Users/josepcugat/Library/Caches/pip/wheels/22/57/67/bb23e07497606e6e48717206afde74034f2eba5c43a0903d6f/wrapt-1.12.1-cp37-cp37m-macosx_10_15_x86_64.whl
Collecting jmespath<1.0.0,>=0.7.1
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/07/cb/5f001272b6faeb23c1c9e0acc04d48eaaf5c862c17709d20e3469c6e0139/jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting s3transfer<0.4.0,>=0.3.0
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/69/79/e6afb3d8b0b4e96cefbdc690f741d7dd24547ff1f94240c997a26fa908d3/s3transfer-0.3.3-py2.py3-none-any.whl (69 kB)
Collecting docutils<0.16,>=0.10
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/22/cd/a6aa959dca619918ccb55023b4cb151949c64d4d5d55b3f4ffd7eee0c6e8/docutils-0.15.2-py3-none-any.whl (547 kB)
Collecting python-dateutil<3.0.0,>=2.1
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/d4/70/d60450c3dd48ef87586924207ae8907090de0b306af2bce5d134d78615cb/python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)
Collecting urllib3<1.26,>=1.20; python_version != "3.4"
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/9f/f0/a391d1463ebb1b233795cabfc0ef38d3db4442339de68f847026199e69d7/urllib3-1.25.10-py2.py3-none-any.whl (127 kB)
Collecting typing_extensions>=3.7
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/0c/0e/3f026d0645d699e7320b59952146d56ad7c374e9cd72cd16e7c74e657a0f/typing_extensions-3.7.4.2-py3-none-any.whl (22 kB)
Collecting chardet<4.0,>=2.0
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133 kB)
Collecting attrs>=17.3.0
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/a2/db/4313ab3be961f7a763066401fb77f7748373b6094076ae2bda2806988af6/attrs-19.3.0-py2.py3-none-any.whl (39 kB)
Collecting multidict<5.0,>=4.5
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/ce/d7/bb8c9cd3189b1698ec1fa60c4862dc0be49cfda665fa402f54f5721cc284/multidict-4.7.6-cp37-cp37m-macosx_10_14_x86_64.whl (48 kB)
Collecting async-timeout<4.0,>=3.0
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/e1/1e/5a4441be21b0726c4464f3f23c8b19628372f606755a9d2e46c187e65ec4/async_timeout-3.0.1-py3-none-any.whl (8.2 kB)
Collecting yarl<2.0,>=1.0
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/0c/69/5f1e593fb13b9c989980e1bc051d9e85e094d456a32c08b5accd62670d09/yarl-1.5.1-cp37-cp37m-macosx_10_14_x86_64.whl (127 kB)
Collecting six>=1.5
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/ee/ff/48bde5c0f013094d729fe4b0316ba2a24774b3ff1c52d924a8a4cb04078a/six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting idna>=2.0
  Using cached https://artifactory.skyscannertools.net/artifactory/api/pypi/pypi/packages/packages/a2/38/928ddce2273eaa564f6f50de919327bf3a00f091b5baba8dfa9460f3a8a8/idna-2.10-py2.py3-none-any.whl (58 kB)
Installing collected packages: docutils, six, python-dateutil, jmespath, urllib3, botocore, typing-extensions, aioitertools, chardet, attrs, multidict, async-timeout, idna, yarl, aiohttp, wrapt, aiobotocore, s3transfer, boto3
ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.

We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.

boto3 1.14.35 requires botocore<1.18.0,>=1.17.35, but you'll have botocore 1.15.32 which is incompatible.
Successfully installed aiobotocore-1.0.7 aiohttp-3.6.2 aioitertools-0.7.0 async-timeout-3.0.1 attrs-19.3.0 boto3-1.14.35 botocore-1.15.32 chardet-3.0.4 docutils-0.15.2 idna-2.10 jmespath-0.10.0 multidict-4.7.6 python-dateutil-2.8.1 s3transfer-0.3.3 six-1.15.0 typing-extensions-3.7.4.2 urllib3-1.25.10 wrapt-1.12.1 yarl-1.5.1

I also tested this with the latest master version of pip and the same issue happens.

Output from pipdeptree:

aiobotocore==1.0.7
  - aiohttp [required: >=3.3.1, installed: 3.6.2]
    - async-timeout [required: >=3.0,<4.0, installed: 3.0.1]
    - attrs [required: >=17.3.0, installed: 19.3.0]
    - chardet [required: >=2.0,<4.0, installed: 3.0.4]
    - multidict [required: >=4.5,<5.0, installed: 4.7.6]
    - yarl [required: >=1.0,<2.0, installed: 1.5.1]
      - idna [required: >=2.0, installed: 2.10]
      - multidict [required: >=4.0, installed: 4.7.6]
      - typing-extensions [required: >=3.7.4, installed: 3.7.4.2]
  - aioitertools [required: >=0.5.1, installed: 0.7.0]
    - typing-extensions [required: >=3.7, installed: 3.7.4.2]
  - botocore [required: >=1.15.32,<1.15.33, installed: 1.15.32]
    - docutils [required: >=0.10,<0.16, installed: 0.15.2]
    - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
    - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
      - six [required: >=1.5, installed: 1.15.0]
    - urllib3 [required: >=1.20,<1.26, installed: 1.25.10]
  - wrapt [required: >=1.10.10, installed: 1.12.1]
boto3==1.12.32
  - botocore [required: >=1.15.32,<1.16.0, installed: 1.15.32]
    - docutils [required: >=0.10,<0.16, installed: 0.15.2]
    - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
    - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
      - six [required: >=1.5, installed: 1.15.0]
    - urllib3 [required: >=1.20,<1.26, installed: 1.25.10]
  - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
  - s3transfer [required: >=0.3.0,<0.4.0, installed: 0.3.3]
    - botocore [required: >=1.12.36,<2.0a.0, installed: 1.15.32]
      - docutils [required: >=0.10,<0.16, installed: 0.15.2]
      - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
      - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
        - six [required: >=1.5, installed: 1.15.0]
      - urllib3 [required: >=1.20,<1.26, installed: 1.25.10]
pipdeptree==1.0.0
  - pip [required: >=6.0.0, installed: 20.3.dev0]
setuptools==49.2.1
wheel==0.34.2
@pradyunsg
Member

pradyunsg commented Aug 13, 2020

Thanks for filing this issue @jcugat!

pip does indeed try to use multiple versions of the same package. This is because of conflicting requirements in the dependency graph it's working with, and is a part of the proper dependency resolution process (the old resolver did not do things correctly). It basically tries to use a specific version of boto3, and then, when it realizes that version creates a conflict, it backtracks that choice and tries the next version.
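
As a rough illustration of the backtracking described above, here is a minimal single-package sketch. It is not pip's or resolvelib's actual code: the `INDEX` dictionary and the `resolve` function are hypothetical, and only the botocore ranges for boto3 1.14.35 and 1.12.32 are taken from the output in this issue (the other two entries are made up). The `fetches` list stands in for the downloads needed to read each candidate's dependencies.

```python
# Toy index: boto3 version -> (minimum, exclusive maximum) botocore version.
# Only the 1.14.35 and 1.12.32 ranges come from this issue; the rest are
# invented for illustration.
INDEX = {
    (1, 14, 35): ((1, 17, 35), (1, 18, 0)),
    (1, 13, 0): ((1, 16, 0), (1, 17, 0)),
    (1, 12, 33): ((1, 15, 33), (1, 16, 0)),
    (1, 12, 32): ((1, 15, 32), (1, 16, 0)),
}


def resolve(pinned_botocore, index):
    """Try candidates newest-first; drop a choice as soon as it conflicts."""
    fetches = []  # every candidate whose metadata had to be "downloaded"
    for version in sorted(index, reverse=True):
        fetches.append(version)  # dependencies are unknown until fetched
        low, high = index[version]
        if low <= pinned_botocore < high:
            return version, fetches  # first compatible candidate wins
        # Conflict: discard this choice and move on to the next-older one.
    return None, fetches


# aiobotocore 1.0.7 pins botocore to 1.15.32 in the report above.
chosen, fetches = resolve((1, 15, 32), INDEX)
print(chosen)        # (1, 12, 32)
print(len(fetches))  # 4
```

Every rejected candidate still appears in `fetches`, which is the same effect as the long run of "Collecting boto3~=1.10" lines in the report.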

We are working on what would be a good way to convey this behavior change, and figuring out a good way to present this to the users -- would you have any suggestions/inputs to that end?

@pradyunsg pradyunsg added C: new resolver UX User experience related state: needs discussion This needs some more discussion labels Aug 13, 2020
@jcugat
Author

jcugat commented Aug 13, 2020

What I found really surprising is that pip needs to download all those packages to do the backtracking. Doesn't it have enough info from the required versions to go directly from boto3-1.14.35 to boto3-1.12.32 without downloading all the intermediate ones?

@pfmoore
Member

pfmoore commented Aug 13, 2020

Unfortunately no, that's not how the algorithms work. In theory, intermediate versions could have different dependencies that alter the possibilities - and we have to download to find the dependencies.

It's one of the frustrating "this might happen so we have to allow for it, even though in practice nobody¹ ever does this, so it's a waste of time" features of Python packaging that it's difficult to explain well to people who don't have to deal with the silly edge cases...

Obligatory XKCD: https://xkcd.com/1172/

¹ Except that one guy with a package with a really weird workflow, who yells when you assume you can simplify things 🙁

@McSinyx
Contributor

McSinyx commented Aug 13, 2020

@jcugat, pip downloads the distributions only to retrieve their dependency information. Say some project has a dependency requirement (e.g. spam>42) that conflicts with that of boto3>1.12.32 (e.g. spam<42): there's no way pip can know which spam versions boto3 1.12.33 through 1.14.34 require just by looking at boto3 1.14.35. It seems that there's no way to lower the complexity of dependency resolution (which is NP-hard).

Concerning each download, however, there could be a faster way than downloading whole distributions (e.g. fetching only the metadata part of a wheel, as with --use-feature=fast-deps, or getting such info from the package index, i.e. pypi/warehouse#8254), but I think complex backtracking will still take time.
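
To make "the metadata part of a wheel" concrete, here is a small sketch of where the dependency information lives inside a wheel. It reads a wheel that is already in memory and makes no attempt to reproduce how pip or fast-deps fetch it; `wheel_requirements`, `make_demo_wheel` and the "demo" package are hypothetical names used only for this example.

```python
import io
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_bytes):
    """Return the Requires-Dist entries stored in a wheel's METADATA file."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as wheel:
        # A wheel keeps its metadata in <name>-<version>.dist-info/METADATA.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        text = wheel.read(metadata_name).decode("utf-8")
    # METADATA uses RFC 822-style headers, so the email parser can read it.
    return Parser().parsestr(text).get_all("Requires-Dist") or []


def make_demo_wheel():
    """Build a tiny in-memory stand-in for a wheel (hypothetical package)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\n"
            "Name: demo\n"
            "Version: 1.0\n"
            "Requires-Dist: botocore (<1.16.0,>=1.15.32)\n"
            "Requires-Dist: jmespath (<1.0.0,>=0.7.1)\n",
        )
    return buffer.getvalue()


requirements = wheel_requirements(make_demo_wheel())
print(requirements)
```

The METADATA file is tiny compared with the rest of a large wheel, which is why reading only that part is attractive.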

@pfmoore
Member

pfmoore commented Aug 13, 2020

I have a vague intention to one day look at pip maintaining some sort of persistent cache of dependency data. But it's low priority (and that guy I mentioned would no doubt come along with a use case that invalidates the idea 😉).
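
One way to picture such a persistent cache is sketched below. This is not something pip implements: `DependencyCache` and `fake_fetch` are hypothetical, and the sketch assumes a given name and version always has the same dependencies, which is exactly the assumption that may not hold for every package.

```python
import json
import os
import tempfile


class DependencyCache:
    """Remember each (name, version)'s requirements in a JSON file."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                self.data = json.load(handle)

    def get(self, name, version, fetch):
        """Return cached requirements, calling fetch(name, version) on a miss."""
        key = f"{name}=={version}"
        if key not in self.data:
            self.data[key] = fetch(name, version)
            with open(self.path, "w", encoding="utf-8") as handle:
                json.dump(self.data, handle)
        return self.data[key]


calls = []


def fake_fetch(name, version):
    """Hypothetical stand-in for downloading a distribution to read metadata."""
    calls.append((name, version))
    return ["botocore<1.16.0,>=1.15.32"]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "deps.json")
    first = DependencyCache(cache_file).get("boto3", "1.12.32", fake_fetch)
    # A fresh instance simulates a later run reading the same file.
    second = DependencyCache(cache_file).get("boto3", "1.12.32", fake_fetch)

print(first == second, len(calls))  # True 1
```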

As @McSinyx says, though, the problem of dependency resolution is fundamentally hard, so there are limits to how much we can do (if we exclude the option of "get the wrong answers", which is what the old resolver did, and which proved not to be what people wanted 😉).

@uranusjr
Member

I wondered a while ago whether Core Metadata could add a field for packages to declare they follow Semantic Versioning. pip could backtrack much more efficiently with that assumption. But then the question would be what do we do if a package declaring it does not actually follow the rules.

@jcugat
Author

jcugat commented Aug 13, 2020

Ok, that makes a lot of sense now, and I see the complexity and what pip is trying to do in those cases. But from a user's perspective it was very surprising at first, as if pip were stuck in an infinite loop downloading all possible versions of boto3. The issue could be solved with better output from pip during dependency resolution, similar to what's being tracked in #8683 or #8346.

@pradyunsg
Member

From a graph exploration perspective -- the algorithm in resolvelib is not the most optimized and we can certainly do a lot more of "tree trimming" tricks there. The issue is that it's gonna be non-trivial for us to implement those and that we're trying to do this work with limited funded developer time.

I'd love to spend a few months pulling my hair out trying to figure out why my implementation of some optimization doesn't work BUT I'm pretty sure it is a bigger priority to get something good-enough out of the door instead of getting to perfect.

@smuhit

smuhit commented Aug 21, 2020

@pradyunsg: I don't have a problem with the resolver needing to introspect the dependencies for several versions of a package before deciding on a version to install. In fact, I agree with the reasoning behind it.

What I do have a problem with is having to download the full versions to figure out the dependencies. Consider running the following: pip install --use-feature=2020-resolver "py4j<0.10.5" pyspark
If no version of pyspark is already in the cache, this will download (at the time of writing) 15 different versions of pyspark to satisfy the py4j requirement. Since the pyspark packages are around 200 MB each, that's roughly 3 GB it needs to download before figuring out which one to use.
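
The estimate above works out as follows; both inputs are the approximate figures quoted in this comment, not measured values.

```python
versions_tried = 15  # pyspark versions downloaded, per the comment above
size_mb = 200        # approximate size of each pyspark package, in MB

total_mb = versions_tried * size_mb
print(f"{total_mb} MB is about {total_mb / 1000:g} GB")  # 3000 MB is about 3 GB
```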

Granted, this is an extreme example. But considering that there are systems with limited resources (a raspberry pi for instance), this could very well be a problem, especially if there are several libraries that need to be installed all with various interdependencies between them.

Ideally, the dependencies would be retrievable as metadata about the package version, and pip would only download the full package if that piece of metadata is missing. But considering that adding it as metadata probably isn't trivial, it's probably a future "nice to have".

I don't have an easy solution to this, just thought you should keep it in mind.

Also, I did want to say thank you for working on the resolver. It sorely needed updating.

@uranusjr
Member

uranusjr commented Aug 21, 2020

That’s unfortunately how Python packaging works. There are proposals for better methods, but pip does not have a choice at the current time. Feel free to join the conversations on discuss.python.org if you are interested in improving the situation.

@pradyunsg
Member

pradyunsg commented Aug 21, 2020

Indeed. It's definitely something that we do have in mind. There are planned changes that would address those concerns. I don't think we can make any promises on the timeline of those since they're volunteer-driven.

Two that come to mind are:

In particular, I think the important parts for this issue, in the short term, are that we get the output correct, and communicate about these changes as best as possible (through documentation of the changes + workarounds, signal boosting etc).

@tomer-dev

tomer-dev commented Oct 15, 2020

It happens here as well, please take a look!
Thank you for the resolver!

/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (3.0.12)
Requirement already satisfied: zipp==0.5.1 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.5.1)
Requirement already satisfied: pluggy==0.12.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.12.0)
Requirement already satisfied: py==1.8.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (1.8.0)
Requirement already satisfied: filelock==3.0.12 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (3.0.12)
Requirement already satisfied: zipp==0.5.1 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.5.1)
Requirement already satisfied: pluggy==0.12.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.12.0)
Requirement already satisfied: py==1.8.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (1.8.0)
Requirement already satisfied: filelock==3.0.12 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (3.0.12)
Requirement already satisfied: zipp==0.5.1 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.5.1)
Requirement already satisfied: pluggy==0.12.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.12.0)
Requirement already satisfied: py==1.8.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (1.8.0)
Requirement already satisfied: filelock==3.0.12 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (3.0.12)
Requirement already satisfied: zipp==0.5.1 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.5.1)
Requirement already satisfied: pluggy==0.12.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.12.0)
Requirement already satisfied: py==1.8.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (1.8.0)
Requirement already satisfied: filelock==3.0.12 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (3.0.12)
Requirement already satisfied: zipp==0.5.1 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.5.1)
Requirement already satisfied: pluggy==0.12.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.12.0)
Requirement already satisfied: py==1.8.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (1.8.0)
Requirement already satisfied: filelock==3.0.12 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (3.0.12)
Requirement already satisfied: zipp==0.5.1 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.5.1)
Requirement already satisfied: pluggy==0.12.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.12.0)
Requirement already satisfied: py==1.8.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (1.8.0)
Requirement already satisfied: filelock==3.0.12 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (3.0.12)
Requirement already satisfied: zipp==0.5.1 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.5.1)
Requirement already satisfied: pluggy==0.12.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (0.12.0)
Requirement already satisfied: py==1.8.0 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (1.8.0)
Requirement already satisfied: filelock==3.0.12 in /usr/local/lib/python3.7/site-packages (from x_project~=2.0->-r requirements.txt (line 6)) (3.0.12)

@ssbarnea
Contributor

Another example of an endless loop caused by the new resolver: https://github.com/ansible-community/molecule/blob/master/Dockerfile -- if you don't see PIP_USE_FEATURE=2020-resolver in the file, it means I succeeded in removing it.

ssbarnea added a commit to ansible/molecule that referenced this issue Oct 19, 2020
Apparently new resolved can get into endless loops of building
packages.

Related: pypa/pip#8713
@ei8fdb
Contributor

ei8fdb commented Oct 21, 2020

I've opened this discussion in discuss.python to suggest possible "low-tech" improvements to the user's experience in this situation.

TL;DR - my suggestion is based on the presumption that these interim packages are no longer important to the user, therefore pip should "clean up" - this can be a prompt to the user, a message printed explaining how the user can do this, or it being done automatically.

I don't believe the solution here is only to bring the user to a documentation page explaining why pip has done this. What is needed is, in order of priority, 1) enable the user to remove these packages (if it does not affect their environment), 2) explain to them why this has happened.

@sinoroc
Contributor

sinoroc commented Oct 21, 2020

Has something like the following been considered?

(Apologies if this idea is based on wrong assumptions about how pip does its dependency resolution.)

@pfmoore
Member

pfmoore commented Oct 21, 2020

That was the point behind @uranusjr's comment here and @pradyunsg's follow-up.

Basically, we could do a lot better with infrastructure/standards changes, and we are making (slow - we're all volunteers) progress with such changes. Once they happen, pip should take advantage of them (issues like this will be reminders to do that). The question right now is what we can do in pip in the context of current infrastructure.

@dstufft
Member

dstufft commented Oct 21, 2020

pypi/warehouse#8254

@cpoptic

cpoptic commented Mar 21, 2021

Hilarious. Also totally agree with @c7hm4r

Why does PIP not delete the previously downloaded packages at the moment when it decides that it needs another version?

That would surely be more efficient than filling up your disk with dozens and dozens of totally unnecessary previous versions of a package.

@ssbarnea
Contributor

In fact, keeping them is not so bad. Think about using pip on 100-200 projects, all tested with multiple versions of Python, some with maintenance branches. You end up installing a wide range of versions of the same package.

IMHO what it needs to do is to index their metadata and store the last use of a package. If every week it would remove the oldest packages (30d?) it would be ok.
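
A rough sketch of that kind of age-based cleanup is shown below. It is not an existing pip feature: `prune_unused` is a hypothetical helper, the 30-day limit is the figure floated above, and the file's access time is used as a stand-in for "last use" (on file systems that do not update access times, a separate record of last use would be needed).

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 24 * 60 * 60


def prune_unused(cache_dir, max_age_days=30, now=None):
    """Delete files under cache_dir not accessed within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            # st_atime approximates "last use" in this sketch.
            if os.stat(path).st_atime < cutoff:
                os.remove(path)
                removed.append(path)
    return sorted(removed)


# Demo on a throwaway directory; the file names and ages are made up.
with tempfile.TemporaryDirectory() as cache_dir:
    now = time.time()
    for name, age_days in [("old.whl", 40), ("recent.whl", 1)]:
        path = os.path.join(cache_dir, name)
        open(path, "w").close()
        stamp = now - age_days * SECONDS_PER_DAY
        os.utime(path, (stamp, stamp))  # set (atime, mtime) explicitly
    removed_names = [
        os.path.basename(p) for p in prune_unused(cache_dir, now=now)
    ]
    remaining = sorted(os.listdir(cache_dir))

print(removed_names, remaining)  # ['old.whl'] ['recent.whl']
```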

@c7hm4r

c7hm4r commented Mar 24, 2021 via email

@ssbarnea
Contributor

AFAIK, pip's download cache is per user, not per system or per project. My cache is now ~9.5GB 🤔; you can find yours by running pip cache dir. The stuff happening in /tmp is likely the building of binary packages, which I suspect is cleaned up after each build, with the built wheel kept in the cache.

I would indeed be worried about cluttering /tmp, but not so much about the cache. AFAIK, /tmp should never be used for caching things.

I do consider this bug fixed for now, and I think it would be better to create a new one that is very clear about the problem. We are already going in weird directions, and I would be sad to see the maintainers having to lock the topic to reduce noise. There are 3 different issues being debated in the last messages: the footprint of the cache, the footprint of /tmp, and the number of downloads from the index. While the strategy for one affects the others, a bug should be about only one specific issue.

@glacials

Thank you to the maintainers for explaining the decision in calm and kind words and for all your hard volunteer work on this project 🙏🏻

I'd like to reinforce what a couple people mentioned above that while the behavior today is overall a good thing and the more extreme cases like boto3 and azure will tend to come out in the wash, the messaging could be improved. The fact that we all arrived here after googling due to our confusion about pip's behavior is proof that it's unclear what pip is doing and why.

As an example, rather than hundreds of lines like

  Downloading boto3-1.7.40-py2.py3-none-any.whl (128 kB)
  Downloading boto3-1.7.39-py2.py3-none-any.whl (128 kB)
  Downloading boto3-1.7.38-py2.py3-none-any.whl (128 kB)
  Downloading boto3-1.7.37-py2.py3-none-any.whl (128 kB)
  Downloading boto3-1.7.36-py2.py3-none-any.whl (128 kB)
  Downloading boto3-1.7.35-py2.py3-none-any.whl (128 kB)
  Downloading boto3-1.7.34-py2.py3-none-any.whl (128 kB)

it could be nice to see a single line like

Backtracking boto3 to find a compatible version...
# Later...
Backtracking boto3 to find a compatible version... using boto3-1.7.34-py2.py3-none-any.whl
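
As a rough illustration of that idea, the sketch below post-processes pip's existing output rather than touching the resolver. The function name `collapse_downloads` and the summary wording are invented here. It only assumes that wheel lines look like the `Downloading <name>-<version>-...whl` lines quoted above; sdist lines are left untouched.

```python
import re
from itertools import groupby

# Matches lines such as "  Downloading boto3-1.7.40-py2.py3-none-any.whl (128 kB)".
DOWNLOAD_RE = re.compile(
    r"^\s*Downloading (?P<name>[A-Za-z0-9_.]+)-(?P<version>\d[^-\s]*)-"
)


def _project(line):
    """Return the project name for a wheel download line, else None."""
    match = DOWNLOAD_RE.match(line)
    return match.group("name") if match else None


def collapse_downloads(lines):
    """Collapse consecutive wheel downloads of one project into a summary line."""
    out = []
    for name, group in groupby(lines, key=_project):
        group = list(group)
        if name is None or len(group) == 1:
            out.extend(group)  # other lines and single downloads pass through
            continue
        last = DOWNLOAD_RE.match(group[-1]).group("version")
        out.append(
            f"Backtracking {name}: tried {len(group)} versions, last tried {last}"
        )
    return out
```

The summary reports the last version tried, not the version finally chosen, because that cannot be known from the download lines alone.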

@pfmoore
Member

pfmoore commented Mar 31, 2021

the messaging could be improved

We are definitely in agreement with that. However, it's hard to work out how to do that - the resolver mechanism is complex (both in terms of the code, and algorithmically) and working out both how to capture the necessary progress information, and when to report it and how to summarise it, is not easy.

If anyone wants to look at the code and come up with suggestions, that would be most welcome. Otherwise be assured that we do want to improve the messages, but we can't promise how soon we'll be able to do so.

@pradyunsg
Member

it could be nice to see a single line like

We already log the messaging you're suggesting. Copying from an example above:

INFO: pip is looking at multiple versions of azure-storage-blob to determine which version is compatible with other requirements. This could take a while.

@ssbarnea
Contributor

Both AWS and Azure managed to paint themselves into a corner by using questionable packaging methods for their PyPI uploads. It is clearly not pip's fault that these packages create installation problems.

In fact, I do like displaying each version on a separate line: it raises awareness that dependencies are too loosely specified somewhere in the chain. The trick is to narrow down the ranges in order to speed up the process.

@howardjones

I'm 8 hours into a "pip install" run - does it ever actually give up? :-)
(trying to build an ansible execution container with azure-cli and azure.azcollection inside)

@cidermole

Would it be possible to cache the dependency graph of all versions of a package on the pypi side, and provide it to pip clients as one file?

@pfmoore
Member

pfmoore commented Apr 24, 2021

Would it be possible to cache the dependency graph of all versions of a package on the pypi side

PyPI doesn't have that information. For wheels it could (but doesn't currently) extract it from the wheel. For sdists, though, it can't know the information without running a build, and there's no infrastructure set up to do that.

Also, it's possible for the dependency graph to change based on the target architecture, so "dependencies for all versions" isn't enough, you need "dependencies for all versions on all architectures".
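
To make that cost concrete, here is a toy model. It is not pip's resolver, and the index contents, project names, and version numbers are all invented. The one assumption it encodes is the situation described above: the only way to learn a candidate's dependencies is to "download" it, so a newest-first search fetches every candidate it ends up rejecting.

```python
# Hypothetical index: {project: {version: {dependency: allowed versions}}}.
# Versions are plain integers to keep the example small.
INDEX = {
    "app": {1: {"lib": {3}}},
    "sdk": {5: {"lib": {5}}, 4: {"lib": {4}}, 3: {"lib": {3}}},
    "lib": {5: {}, 4: {}, 3: {}},
}

downloads = []  # every (project, version) whose metadata had to be fetched


def fetch_dependencies(project, version):
    """Stand-in for downloading a file just to read its dependency metadata."""
    downloads.append((project, version))
    return INDEX[project][version]


def resolve(todo, pinned, allowed):
    """Depth-first, newest-first backtracking search over the toy index."""
    if not todo:
        return pinned
    project, rest = todo[0], todo[1:]
    if project in pinned:
        return resolve(rest, pinned, allowed)
    for version in sorted(INDEX[project], reverse=True):
        if version not in allowed.get(project, set(INDEX[project])):
            continue  # already ruled out, no download needed
        deps = fetch_dependencies(project, version)
        new_allowed = dict(allowed)
        conflict = False
        for dep, versions in deps.items():
            merged = new_allowed.get(dep, set(INDEX[dep])) & versions
            if not merged or (dep in pinned and pinned[dep] not in merged):
                conflict = True
                break
            new_allowed[dep] = merged
        if conflict:
            continue  # backtrack: try the next-older version
        result = resolve(
            rest + list(deps), {**pinned, project: version}, new_allowed
        )
        if result is not None:
            return result
    return None


print(resolve(["app", "sdk"], {}, {}))
print(downloads)
```

On this toy index the search fetches sdk 5 and sdk 4 before settling on sdk 3, because each one's requirement on lib is only visible after the fetch. If the dependency data were available up front, those two fetches could be skipped.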

@pradyunsg
Member

For wheels it could (but doesn't currently) extract it from the wheel.

And, I should note that this is part of a PyPI revamp that the PSF's Packaging-WG is seeking funding for. https://github.com/psf/fundable-packaging-improvements/blob/master/FUNDABLES.md#revamp-pypi-api

@dolfinus

This issue can become a nightmare after merging #9631.

@ifokeev

ifokeev commented Apr 29, 2021

same here, a real nightmare

@safijari

safijari commented Jul 8, 2021

Just faceplanted into this when I found my CI running for 3 hours... I like how #9631's reasoning for let's not merge this is "people will be angry" and not "we'll break a whole bunch of shit without giving a good alternative"

@mtylerpreston

I'm 45 hours into a RUN python -m pip install . on Tensorflow's object detection docker build...is it madness to let it keep going? Or is there something else I should do?

@McSinyx
Contributor

McSinyx commented Jul 19, 2021 via email

@mtylerpreston

mtylerpreston commented Jul 19, 2021

Thank you McSinyx!

Do you have access to the output? Is it stuck at resolution computation or still downloading?

Here is the output:
=> [11/11] RUN python -m pip install . 231074.1s
=> => # Downloading packaging-14.3-py2.py3-none-any.whl (16 kB)
=> => # INFO: pip is looking at multiple versions of kaggle to determine which
=> => # version is compatible with other requirements. This could take a whil
=> => # e.
=> => # Collecting kaggle>=1.3.9
=> => # Downloading kaggle-1.5.10.tar.gz (59 kB)

If you could figure out which package causes the conflict, you can contact upstream for better compatibility.

It seems that numerous versions of this package took up the most time, although there was plenty of time spent on others before it: packaging-14.3-py2.py3-none-any.whl. How would I contact upstream?

It seems to get slower and slower with each package/version that it tries. Is that right? And if so, why? Thanks again!

FYI - I still haven't stopped it because why not. At 64 hours now...[laughs to keep from crying] 😆

@marzhar

marzhar commented Jul 19, 2021

Are you trying to install the Object Detection API? If so, the reason you're stuck at Downloading kaggle-1.5.10.tar.gz (59 kB) is that there's a problem with your installation of Protocol Buffers (protoc). Make sure you correctly add Google protobuf to your environment path, on your User path. I experienced the exact same problem, and adding Google protobuf to the path correctly solved it.

@McSinyx
Contributor

McSinyx commented Jul 19, 2021 via email

@mtylerpreston

Are you trying to install the Object Detection API? If so, the reason you're stuck at Downloading kaggle-1.5.10.tar.gz (59 kB) is that there's a problem with your installation of Protocol Buffers (protoc). Make sure you correctly add Google protobuf to your environment path, on your User path. I experienced the exact same problem, and adding Google protobuf to the path correctly solved it.

Thank you for your help. I am indeed trying to install Tensorflow's Object Detection API (/models/research/object_detection/). I will certainly see what I can do to address the protobuf installation/path issue. But given that I'm using the docker build that was put together by Tensorflow (they set up the Dockerfile and everything), I would imagine they would set it up to work properly. But maybe they didn't, and I should really be bringing this to them at this point...

@marzhar

marzhar commented Jul 20, 2021

Here's a good reference for you: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/install.html

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 25, 2021