Intermittent Hash Validation Failures #7881

Open
4 tasks done
rohanp-eiq opened this issue May 5, 2023 · 15 comments
Labels
area/solver Related to the dependency resolver kind/bug Something isn't working as expected status/triage This issue needs to be triaged

Comments

@rohanp-eiq

  • I am on the latest stable Poetry version, installed using a recommended method.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have consulted the FAQ and blog for any relevant entries or release notes.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option) and have included the output below.

Issue

Hi, we are running into an issue where libraries intermittently fail to install with the command poetry install -vvv --with dev,test --sync, producing the following error (torch is one example; this has failed on other libraries too):


  7  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:280 in _execute_operation
      278│ 
      279│             try:
    → 280│                 result = self._do_execute_operation(operation)
      281│             except EnvCommandError as e:
      282│                 if e.e.returncode == -2:

  6  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:382 in _do_execute_operation
      380│             return 0
      381│ 
    → 382│         result: int = getattr(self, f"_execute_{method}")(operation)
      383│ 
      384│         if result != 0:

  5  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:502 in _execute_install
      500│ 
      501│     def _execute_install(self, operation: Install | Update) -> int:
    → 502│         status_code = self._install(operation)
      503│ 
      504│         self._save_url_reference(operation)

  4  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:540 in _install
      538│             archive = self._download_link(operation, Link(package.source_url))
      539│         else:
    → 540│             archive = self._download(operation)
      541│ 
      542│         operation_message = self.get_operation_message(operation)

  3  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:715 in _download
      713│             self._yanked_warnings.append(message)
      714│ 
    → 715│         return self._download_link(operation, link)
      716│ 
      717│     def _download_link(self, operation: Install | Update, link: Link) -> Path:

  2  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:754 in _download_link
      752│ 
      753│         # Use the original archive to provide the correct hash.
    → 754│         self._populate_hashes_dict(original_archive, package)
      755│ 
      756│         return archive

  1  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:760 in _populate_hashes_dict
      758│     def _populate_hashes_dict(self, archive: Path, package: Package) -> None:
      759│         if package.files and archive.name in {f["file"] for f in package.files}:
    → 760│             archive_hash = self._validate_archive_hash(archive, package)
      761│             self._hashes[package.name] = archive_hash
      762│ 

  RuntimeError

  Hash for torch (2.0.0) from archive torch-2.0.0-cp310-cp310-manylinux1_x86_64.whl not found in known hashes (was: sha256:1056dbd19648e16b410f610ae6e556a783ed566b18ddcb49c8af688c70748e48)

  at ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:769 in _validate_archive_hash
      765│         archive_hash: str = "sha256:" + get_file_hash(archive)
      766│         known_hashes = {f["hash"] for f in package.files if f["file"] == archive.name}
      767│ 
      768│         if archive_hash not in known_hashes:
    → 769│             raise RuntimeError(
      770│                 f"Hash for {package} from archive {archive.name} not found in"
      771│                 f" known hashes (was: {archive_hash})"
      772│             )
      773│ 

We have seen this mostly occur in GitHub Actions. The relevant steps are:

  - name: Install Poetry
    run: pipx install poetry

  - uses: actions/[email protected]
    with:
      python-version: "${{ env.PYTHON_VERSION }}" # this is 3.10.6

  - name: install deps
    run: |
      poetry install --with dev,test --sync

We see intermittent issues with different libraries failing on hash checks, and retrying with no changes ends up fixing the problem.

We've tried adding steps like: poetry cache clear . --all and

rm -rf ~/.cache/pypoetry/cache
rm -rf ~/.cache/pypoetry/artifacts
poetry lock --no-update

but haven't had any luck. Additionally, we've verified that the runner isn't caching data in a way that would be causing issues.

Thank you for any help in advance!

@rohanp-eiq rohanp-eiq added kind/bug Something isn't working as expected status/triage This issue needs to be triaged labels May 5, 2023
@dimbleby
Contributor

dimbleby commented May 5, 2023

Well indeed that's not the correct hash for that file - see https://pypi.org/project/torch/#copy-hash-modal-6e25311f-3fc9-4584-a403-74f06dd31ce3. poetry is protecting you from installing the wrong thing.

perhaps you are short of disk space and are getting incomplete downloads.
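
The check in _validate_archive_hash shown in the traceback above comes down to hashing the downloaded file and looking that value up among the hashes recorded for the package. For checking a suspect archive by hand, a minimal Python sketch of the same comparison follows. The names file_sha256 and check_archive are hypothetical helpers, not Poetry APIs, and the known hash is assumed to be copied from PyPI or from poetry.lock.

```python
import hashlib
from pathlib import Path
from typing import Set


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_archive(path: Path, known_hashes: Set[str]) -> bool:
    """Return True if the archive's 'sha256:<hex>' value is among known_hashes."""
    return f"sha256:{file_sha256(path)}" in known_hashes
```

If check_archive returns False for a file whose name matches the one on PyPI, the bytes on disk differ from what was published, which is the situation the RuntimeError reports.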

@rohanp-eiq
Author

Thanks for the help!

We checked the Kubernetes pods that we're running this on, and we have 100GB+ of free disk space mounted to the paths that poetry would install to. We were able to dig a bit deeper and saw that one of our dependencies, tensorflow, downloaded a wheel that was ~100MB (the package on PyPI is ~585MB). That explains why the hash is wrong, but we still don't understand why the download is being cut short or what could be contributing to it.
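
A wheel is a zip archive, and the zip index sits at the end of the file, so a download that was cut off partway can usually be detected even without the expected hash. A rough sketch using only the Python standard library follows; wheel_problem is a hypothetical helper name, and the check is a heuristic for truncation, not a substitute for the hash comparison.

```python
import zipfile
from pathlib import Path
from typing import Optional


def wheel_problem(path: Path) -> Optional[str]:
    """Describe what is wrong with a wheel file, or return None if it looks intact."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() CRC-checks every member and returns the first bad name, if any.
            bad_member = zf.testzip()
    except zipfile.BadZipFile as exc:
        # A truncated download normally lands here because the index at the end is missing.
        return f"not a readable zip archive: {exc}"
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return None
```

Printing path.stat().st_size next to the result makes it easy to compare against the file size listed on PyPI.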

@micahjsmith

micahjsmith commented Jun 7, 2023

Also experiencing this in Docker on macOS (poetry 1.5.1, python 3.11)

RuntimeError                                    
                                                    
  Hash for scipy (1.10.1) from archive scipy-1.10.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl not found in known hashes (was: sha256:b323423e76ae0f6fb4aace295d5a95d9c49ac735bf0b324b327b7db472619490)                                              
                                                    
  at /usr/local/lib/python3.11/site-packages/poetry/installation/executor.py:818 in _validate_archive_hash                                                                                                         
      814│         archive_hash: str = "sha256:" + get_file_hash(archive)                                
      815│         known_hashes = {f["hash"] for f in package.files if f["file"] == archive.name}        
      816│                                        
      817│         if archive_hash not in known_hashes:                                                  
    → 818│             raise RuntimeError(                                                               
      819│                 f"Hash for {package} from archive {archive.name} not found in"                
      820│                 f" known hashes (was: {archive_hash})"                                        
      821│             )                            
      822│                                                                              
RuntimeError                          
                                                    
  Hash for torch (2.0.1) from archive torch-2.0.1-cp311-cp311-manylinux1_x86_64.whl not found in known hashes (was: sha256:c3ba63617c35ff58a95e6a4f7e9fb5fea3153cfacc8573bf392d01c08d24f129)                       
                                                    
  at /usr/local/lib/python3.11/site-packages/poetry/installation/executor.py:818 in _validate_archive_hash                                                                                                         
      814│         archive_hash: str = "sha256:" + get_file_hash(archive)                                
      815│         known_hashes = {f["hash"] for f in package.files if f["file"] == archive.name}        
      816│                      
      817│         if archive_hash not in known_hashes:                                                  
    → 818│             raise RuntimeError(
      819│                 f"Hash for {package} from archive {archive.name} not found in"                
      820│                 f" known hashes (was: {archive_hash})"                                        
      821│             )                  
      822│                       

Other failed packages include pydantic and numpy. pydantic is a very small distribution, so an incomplete download seems less likely to be to blame here.

Not able to reproduce in a Linux VM or in Docker on a Linux VM.

@dimbleby
Contributor

dimbleby commented Jun 7, 2023

Again, poetry is correct to refuse to install an archive with the wrong hash.

You'll maybe want to inspect the faulty archives - probably available in the poetry cache - and see if you can figure out what's wrong with them and why.

But unless you can find a way in which poetry is doing Something Wrong during download - unlikely since it works for almost everyone and after all requests is pretty widely used - this should likely just be closed. It seems unlikely that there's anything this repository can do about whatever it is you're seeing.

@micahjsmith

@dimbleby I definitely agree that poetry is doing the Right Thing (TM) regarding the faulty archives.

I guess the theory is it is a storage or network issue? I would expect the download to fail outright rather than the hashes to be missing later in the install process? Either way would you please be able to link to docs or suggest a way to identify the actual archive file? Contents of ~/.cache/pypoetry are typically just hashes and don't seem to contain the faulty archives.

@dimbleby
Contributor

dimbleby commented Jun 7, 2023

.whl files can also be found somewhere in that directory
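
One way to locate them, as a sketch: walk the cache directory for *.whl files and print each one's size and SHA-256 so they can be compared with the values on PyPI. The default path below is the ~/.cache/pypoetry location mentioned earlier in this thread and will differ on other platforms; cached_wheels is a hypothetical helper name, not part of Poetry.

```python
import hashlib
from pathlib import Path
from typing import Iterator, Tuple


def cached_wheels(cache_root: Path) -> Iterator[Tuple[Path, int, str]]:
    """Yield (path, size in bytes, 'sha256:<hex>') for every .whl under cache_root."""
    if not cache_root.is_dir():
        return
    for wheel in sorted(cache_root.rglob("*.whl")):
        digest = hashlib.sha256()
        with wheel.open("rb") as fh:
            # Hash in 1 MiB chunks; some wheels are hundreds of megabytes.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        yield wheel, wheel.stat().st_size, f"sha256:{digest.hexdigest()}"


if __name__ == "__main__":
    # Assumed Linux cache location taken from this thread; adjust for your platform.
    root = Path.home() / ".cache" / "pypoetry"
    for path, size, digest in cached_wheels(root):
        print(f"{size:>12}  {digest}  {path}")
```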

@rbebb

rbebb commented Jun 8, 2023

I would recommend deleting the cache and artifacts folders, deleting your .venv, and running poetry lock. That solved it for me.

@Flova

Flova commented Jun 12, 2023

Deleting the poetry artifacts and cache in ~/.cache/pypoetry worked.

rm -rf ~/.cache/pypoetry/cache/
rm -rf ~/.cache/pypoetry/artifacts

so the remote archive (torch for cpu) seems fine in my case.

@silverwind

silverwind commented Jul 4, 2023

I'm seeing such intermittent hash failures on CI, which is a completely fresh environment, no cache involved. Happens about 1 in 10 runs it seems, and always on the same package mysql-connector-python for me:

  Hash for mysql-connector-python (8.0.33) from archive mysql_connector_python-8.0.33-cp310-cp310-manylinux1_x86_64.whl not found in known hashes (was: sha256:29d15124ce60ee6801fa3ac92927fca06e07636440dbd5b34cb78c3febd682f3)

Poetry 1.5.1 (also tested 1.3.1, same issue)
Python 3.10.6
Ubuntu 22.04

@byt3bl33d3r

Experiencing this with Docker on macOS (Intel) with the torch and numpy packages with both Poetry 1.5.1 and 1.4.0 (Poetry installed via pipx) and Python 3.11.4:

 RuntimeError

  Hash for torch (2.0.1) from archive torch-2.0.1-cp311-cp311-manylinux1_x86_64.whl not found in known hashes (was: sha256:895e5689bf7f80726b0a84d33c8222a17f55c608802c091498a0206c163cf96a)

  at /usr/local/py-utils/venvs/poetry/lib/python3.11/site-packages/poetry/installation/executor.py:754 in _validate_archive_hash
      750│         archive_hash: str = "sha256:" + get_file_hash(archive)
      751│         known_hashes = {f["hash"] for f in package.files}
      752│ 
      753│         if archive_hash not in known_hashes:
    → 754│             raise RuntimeError(
      755│                 f"Hash for {package} from archive {archive.name} not found in"
      756│                 f" known hashes (was: {archive_hash})"
      757│             )
      758│ 

@dimbleby
Contributor

#8235 establishes that incomplete-downloads-with-no-apparent-error (caused by flaky connections) is fixed in the next release, via urllib3 2.0

probably safe to assume that's the primary cause here, close this out, and invite reporting of any further issues after poetry 1.6.0

@apenney

apenney commented Sep 22, 2023

We still see this heavily with 1.6.1. My engineers constantly run into this with Docker builds, and just rerunning the build magically fixes it. At this point I'm going to write a custom wrapper that tries to run poetry 3 times before dying, but the ideal would be a flag that retries a download if the hash doesn't match.

I agree that this is probably not something poetry is responsible for, but it would be a huge help for those of us suffering this if poetry could handle a few retries on hash failures.
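
A wrapper along those lines could look roughly like the sketch below. It is not a Poetry feature: run_with_retries is a hypothetical helper, and the attempt count and delay are arbitrary choices. It reruns the whole command rather than retrying a single download.

```python
import subprocess
import sys
import time
from typing import Sequence


def run_with_retries(cmd: Sequence[str], attempts: int = 3, delay: float = 5.0) -> int:
    """Run cmd until it exits with 0 or attempts run out; return the last exit code."""
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            break
        print(
            f"attempt {attempt}/{attempts} failed with exit code {returncode}",
            file=sys.stderr,
        )
        if attempt < attempts:
            # Brief pause before the next try; the value is a guess, not a tuned setting.
            time.sleep(delay)
    return returncode
```

In a build script this would be called as sys.exit(run_with_retries(["poetry", "install", "--with", "dev,test", "--sync"])), using the install command from the original report.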

@jiangying000

jiangying000 commented Jul 10, 2024

Still seeing this error

(mygpt-py3.11) PS C:\Users\64478\code\jiangying000\gb-be1> python --version   
Python 3.11.9


(mygpt-py3.11) PS C:\Users\64478\code\jiangying000\gb-be1> poetry --version
Poetry (version 1.8.3)


(mygpt-py3.11) PS C:\Users\64478\code\jiangying000\gb-be1> poetry install     
Installing dependencies from lock file

Package operations: 144 installs, 0 updates, 0 removals

  - Installing scipy (1.12.0): Failed

  RuntimeError

  Hash for scipy (1.12.0) from archive scipy-1.12.0-cp311-cp311-win_amd64.whl not found in known hashes (was: sha256:efeee7d6414d1b4ae5f3025dd5c193b5a8aa03c3c1a0a47132ae07d45859325f)

  at ~\pipx\venvs\poetry\Lib\site-packages\poetry\installation\executor.py:812 in _validate_archive_hash
      808│ 
      809│         archive_hash = f"{hash_type}:{get_file_hash(archive, hash_type)}"
      810│
      811│         if archive_hash not in known_hashes:
    → 812│             raise RuntimeError(
      813│                 f"Hash for {package} from archive {archive.name} not found in"
      814│                 f" known hashes (was: {archive_hash})"
      815│             )
      816│

Cannot install scipy.

(mygpt-py3.11) PS C:\Users\64478\code\jiangying000\gb-be1> 

I can solve this by adding

[[tool.poetry.source]]
name = "mirrors"
url = "https://pypi.tuna.tsinghua.edu.cn/simple/"
priority = "primary"

and running poetry install again

@dimbleby
Contributor

the hash that poetry reports is indeed the wrong value, so it is correct not to install the wheel

probably at some point in the past you have downloaded a partial or corrupt version of the file and that is now in your cache and you should fix by clearing your cache

most likely that download happened before #8235 and there is no current bug

@jiangying000

I'm not sure, but will keep an eye on it

@Secrus Secrus added the area/solver Related to the dependency resolver label Oct 13, 2024
10 participants