Resolve sdist builds can race and fail. #2078

Closed
jsirois opened this issue Mar 4, 2023 · 0 comments · Fixed by #2080

jsirois commented Mar 4, 2023

The failure looks like:

...
pex: Building pex :: Resolving distributions (boto3==1.16.7 enigma-namedframes~=1.0.2 enigma-pyspark-commons matplotlib==3.4.2 mlflow==1.20.2 numpy<1.24,>=1.20 pandas==1.2.4 plotly==5.1.0 protobuf==3.17.2 pyarrow==4.0.0 pyspark-test pyspark==3.1.2 pytest scikit-learn==0.24.1 scipy~=1.6.0 types-setuptools) :: Resolving requirements from lock file build-support/databricks_lock.txt :: Categorizing 78 downloaded artifacts
pex: Building pex :: Resolving distributions (boto3==1.16.7 enigma-namedframes~=1.0.2 enigma-pyspark-commons matplotlib==3.4.2 mlflow==1.20.2 numpy<1.24,>=1.20 pandas==1.2.4 plotly==5.1.0 protobuf==3.17.2 pyarrow==4.0.0 pyspark-test pyspark==3.1.2 pytest scikit-learn==0.24.1 scipy~=1.6.0 types-setuptools) :: Resolving requirements from lock file build-support/databricks_lock.txt :: Building 3 artifacts and installing 78
pex: Building pex :: Resolving distributions (boto3==1.16.7 enigma-namedframes~=1.0.2 enigma-pyspark-commons matplotlib==3.4.2 mlflow==1.20.2 numpy<1.24,>=1.20 pandas==1.2.4 plotly==5.1.0 protobuf==3.17.2 pyarrow==4.0.0 pyspark-test pyspark==3.1.2 pytest scikit-learn==0.24.1 scipy~=1.6.0 types-setuptools) :: Resolving requirements from lock file build-support/databricks_lock.txt :: Building 3 artifacts and installing 78 :: Building distributions for:
  BuildRequest(target=LocalInterpreter(id='usr.bin.python3.8', platform=Platform(platform='manylinux_2_27_x86_64', impl='cp', version='3.8.0', version_info=(3, 8, 0), abi='cp38'), marker_environment=MarkerEnvironment(implementation_name='cpython', implementation_version='3.8.0', os_name='posix', platform_machine='x86_64', platform_python_implementation='CPython', platform_release='5.15.49-linuxkit', platform_system='Linux', platform_version='#1 SMP Tue Sep 13 07:51:46 UTC 2022', python_full_version='3.8.0', python_version='3.8', sys_platform='linux'), interpreter=PythonInterpreter('/usr/bin/python3.8', PythonIdentity('/usr/bin/python3.8', 'cp38', 'cp38', 'manylinux_2_27_x86_64', (3, 8, 0)))), source_path='/root/.cache/pants/named_caches/pex_root/downloads/5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65/pyspark-3.1.2.tar.gz', fingerprint='5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65')
  BuildRequest(target=LocalInterpreter(id='usr.bin.python3.8', platform=Platform(platform='manylinux_2_27_x86_64', impl='cp', version='3.8.0', version_info=(3, 8, 0), abi='cp38'), marker_environment=MarkerEnvironment(implementation_name='cpython', implementation_version='3.8.0', os_name='posix', platform_machine='x86_64', platform_python_implementation='CPython', platform_release='5.15.49-linuxkit', platform_system='Linux', platform_version='#1 SMP Tue Sep 13 07:51:46 UTC 2022', python_full_version='3.8.0', python_version='3.8', sys_platform='linux'), interpreter=PythonInterpreter('/usr/bin/python3.8', PythonIdentity('/usr/bin/python3.8', 'cp38', 'cp38', 'manylinux_2_27_x86_64', (3, 8, 0)))), source_path='/root/.cache/pants/named_caches/pex_root/downloads/791a5686953c4b366d3228c5377196db2f534475bb38d26f70eb69668efd9028/alembic-1.4.1.tar.gz', fingerprint='791a5686953c4b366d3228c5377196db2f534475bb38d26f70eb69668efd9028')
  BuildRequest(target=LocalInterpreter(id='usr.bin.python3.8', platform=Platform(platform='manylinux_2_27_x86_64', impl='cp', version='3.8.0', version_info=(3, 8, 0), abi='cp38'), marker_environment=MarkerEnvironment(implementation_name='cpython', implementation_version='3.8.0', os_name='posix', platform_machine='x86_64', platform_python_implementation='CPython', platform_release='5.15.49-linuxkit', platform_system='Linux', platform_version='#1 SMP Tue Sep 13 07:51:46 UTC 2022', python_full_version='3.8.0', python_version='3.8', sys_platform='linux'), interpreter=PythonInterpreter('/usr/bin/python3.8', PythonIdentity('/usr/bin/python3.8', 'cp38', 'cp38', 'manylinux_2_27_x86_64', (3, 8, 0)))), source_path='/root/.cache/pants/named_caches/pex_root/downloads/bc0c4dd082f033cb6d7978cacaca5261698efe3a4c70f52f98762c38db925ce0/databricks-cli-0.17.4.tar.gz', fingerprint='bc0c4dd082f033cb6d7978cacaca5261698efe3a4c70f52f98762c38db925ce0')
pex: Using cached build of /root/.cache/pants/named_caches/pex_root/downloads/5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65/pyspark-3.1.2.tar.gz at /root/.cache/pants/named_caches/pex_root/built_wheels/sdists/pyspark-3.1.2.tar.gz/5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65/cp38-cp38-manylinux_2_27_x86_64
Traceback (most recent call last):
  File "/root/.cache/pants/named_caches/pex_root/installed_wheels/d672450bf975731d91fefe6a308e399df21fc804a25b9ad253a52eb4ae6bb6be/pex-2.1.125-py2.py3-none-any.whl/pex/result.py", line 105, in catch
    return func(*args, **kwargs)
  File "/root/.cache/pants/named_caches/pex_root/installed_wheels/d672450bf975731d91fefe6a308e399df21fc804a25b9ad253a52eb4ae6bb6be/pex-2.1.125-py2.py3-none-any.whl/pex/bin/pex.py", line 856, in do_main
    pex_builder = build_pex(
  File "/root/.cache/pants/named_caches/pex_root/installed_wheels/d672450bf975731d91fefe6a308e399df21fc804a25b9ad253a52eb4ae6bb6be/pex-2.1.125-py2.py3-none-any.whl/pex/bin/pex.py", line 689, in build_pex
    resolve_from_lock(
  File "/root/.cache/pants/named_caches/pex_root/installed_wheels/d672450bf975731d91fefe6a308e399df21fc804a25b9ad253a52eb4ae6bb6be/pex-2.1.125-py2.py3-none-any.whl/pex/resolve/lock_resolver.py", line 422, in resolve_from_lock
    installed_distributions = build_and_install_request.install_distributions(
  File "/root/.cache/pants/named_caches/pex_root/installed_wheels/d672450bf975731d91fefe6a308e399df21fc804a25b9ad253a52eb4ae6bb6be/pex-2.1.125-py2.py3-none-any.whl/pex/resolver.py", line 757, in install_distributions
    build_results = self._wheel_builder.build_wheels(
  File "/root/.cache/pants/named_caches/pex_root/installed_wheels/d672450bf975731d91fefe6a308e399df21fc804a25b9ad253a52eb4ae6bb6be/pex-2.1.125-py2.py3-none-any.whl/pex/resolver.py", line 598, in build_wheels
    build_requests, build_results = self._categorize_build_requests(
  File "/root/.cache/pants/named_caches/pex_root/installed_wheels/d672450bf975731d91fefe6a308e399df21fc804a25b9ad253a52eb4ae6bb6be/pex-2.1.125-py2.py3-none-any.whl/pex/resolver.py", line 555, in _categorize_build_requests
    build_results[build_request.source_path].add(build_result.finalize_build())
  File "/root/.cache/pants/named_caches/pex_root/installed_wheels/d672450bf975731d91fefe6a308e399df21fc804a25b9ad253a52eb4ae6bb6be/pex-2.1.125-py2.py3-none-any.whl/pex/resolver.py", line 316, in finalize_build
    raise AssertionError(
AssertionError: Build of BuildRequest(target=LocalInterpreter(id='usr.bin.python3.8', platform=Platform(platform='manylinux_2_27_x86_64', impl='cp', version='3.8.0', version_info=(3, 8, 0), abi='cp38'), marker_environment=MarkerEnvironment(implementation_name='cpython', implementation_version='3.8.0', os_name='posix', platform_machine='x86_64', platform_python_implementation='CPython', platform_release='5.15.49-linuxkit', platform_system='Linux', platform_version='#1 SMP Tue Sep 13 07:51:46 UTC 2022', python_full_version='3.8.0', python_version='3.8', sys_platform='linux'), interpreter=PythonInterpreter('/usr/bin/python3.8', PythonIdentity('/usr/bin/python3.8', 'cp38', 'cp38', 'manylinux_2_27_x86_64', (3, 8, 0)))), source_path='/root/.cache/pants/named_caches/pex_root/downloads/5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65/pyspark-3.1.2.tar.gz', fingerprint='5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65') produced 2 artifacts; expected 1:
0. cp38-cp38-manylinux_2_27_x86_64.3ec80d51ce26438a86c97b71d562e96e
1. pyspark-3.1.2-py2.py3-none-any.whl

The issue here is in the oldest use of AtomicDirectory in the code base. In that use, multiple processes race to build the sdist and AtomicDirectory ensures just one wins. All other uses of AtomicDirectory (save for InstallResult in the same file, which does not have a similar collection problem) use the atomic_directory context manager with an exclusive lock, so there is no racing. The racing itself isn't the issue here, though: it is safe, if CPU wasteful. The issue is the collection code here:
https://github.com/pantsbuild/pex/blob/1ebd92bcfa344e5f895a670c71c771b7d92b7bdb/pex/resolver.py#L298-L314

That code does not account for the temporary work dirs of racing processes that may be visible (the cp38-cp38-manylinux_2_27_x86_64.3ec80d51ce26438a86c97b71d562e96e dir in the failure backtrace).

Ideally sdist builds would use the more modern atomic_directory exclusive lock mechanism, but for now, narrowing the build result search to "*.whl" should suffice to fix this.
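
To make the failure mode and the proposed narrowing concrete, here is a minimal sketch. It is not the Pex code: the helper names (collect_all, collect_wheels, make_racy_dist_dir) and the flat directory layout are assumptions for illustration, chosen only to mirror the two entries listed in the AssertionError above.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def collect_all(dist_dir: Path) -> List[str]:
    # Naive collection: every directory entry is treated as a build artifact,
    # so a racing process's temporary work dir gets counted too.
    return sorted(os.listdir(str(dist_dir)))


def collect_wheels(dist_dir: Path) -> List[str]:
    # Narrowed collection: only entries matching "*.whl" are considered.
    return sorted(path.name for path in dist_dir.glob("*.whl"))


def make_racy_dist_dir(root: Path) -> Path:
    # Hypothetical layout: one finished wheel next to the still-visible work
    # dir of another process that is racing to build the same sdist.
    dist_dir = root / "dist"
    dist_dir.mkdir()
    (dist_dir / "pyspark-3.1.2-py2.py3-none-any.whl").touch()
    (dist_dir / "cp38-cp38-manylinux_2_27_x86_64.3ec80d51ce26438a86c97b71d562e96e").mkdir()
    return dist_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dist_dir = make_racy_dist_dir(Path(tmp))
        print("all entries:", collect_all(dist_dir))
        print("wheels only:", collect_wheels(dist_dir))
```

With the naive listing an "expected 1 artifact" check sees two entries and fails. With the "*.whl" filter it sees only the wheel, regardless of how many racing work dirs are present.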

jsirois self-assigned this Mar 4, 2023
This was referenced Mar 4, 2023
jsirois added a commit to jsirois/pex that referenced this issue Mar 4, 2023
Narrow the search for a result build artifact to "*.whl" files to avoid
any racing work directories that may be present when collecting the
wheel build artifact.

Fixes pex-tool#2078
jsirois added a commit that referenced this issue Mar 4, 2023
Narrow the search for a result build artifact to "*.whl" files to avoid
any racing work directories that may be present when collecting the
wheel build artifact.

Fixes #2078