[C++] Boost download still fails #40756

Open
helly25 opened this issue Mar 22, 2024 · 11 comments

@helly25

helly25 commented Mar 22, 2024

Describe the bug, including details regarding any error messages, version, and platform.

This is very much ongoing for 15.0.0, 15.0.1, 15.0.2, 16.0.0.dev... from which I conclude that either the system that handles hashes is not working or that none of the download targets is working. I get the same as benz0li for sourceforge, and strange behavior with jfrog, where it reliably fails from a Frankfurt datacenter but seems to work in Zurich. I tried boost version 1.81.0 (from source), 1.84.0 (seems to be pretty late) and 1.75.0 (referenced in this bug). So my conclusion is that this is something systemic. Unfortunately the boost version for arrow is a stripped-down version (which is awesome to save network bandwidth and download time), so we cannot simply point to a local version, even if that was simple. Any other hacks to try in the meantime?

_Originally referenced by @benz0li in #34675 (comment)_

Component(s)

C++

Details

Building from 15.0.2 I get the following log:

-- ARROW_BOOST_BUILD_VERSION: 1.81.0
-- ARROW_BOOST_BUILD_SHA256_CHECKSUM: 9e0ffae35528c35f90468997bc8d99500bf179cbae355415a89a600c38e13574

[...]

[ 14%] Built target zstd_ep
CMake Error at boost_ep-stamp/boost_ep-download-RELEASE.cmake:37 (message):
  Command failed: 1

   '/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/external/cmake-3.23.2-linux-x86_64/bin/cmake' '-Dmake=' '-Dconfig=' '-P' '/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_ep-stamp/boost_ep-download-RELEASE-impl.cmake'

  See also

    /home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_ep-stamp/boost_ep-download-*.log


-- stdout output is:
...skipping to end...
omplete]
-- [download 49% complete]

[...]

-- [download 100% complete]
-- verifying file...
       file='/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_1_81_0.tar.gz'
-- SHA256 hash of
    /home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_1_81_0.tar.gz
  does not match expected value
    expected: '9e0ffae35528c35f90468997bc8d99500bf179cbae355415a89a600c38e13574'
      actual: '205666dea9f6a7cfed87c7a6dfbeb52a2c1b9de55712c9c1a87735d7181452b6'
-- Hash mismatch, removing...
-- Using src='https://apache.jfrog.io/artifactory/arrow/thirdparty/7.0.0/boost_1_81_0.tar.gz'
-- [download 100% complete]
-- [download 9% complete]
-- [download 22% complete]
-- [download 34% complete]
-- [download 46% complete]
-- [download 58% complete]
-- [download 70% complete]
-- [download 82% complete]
-- [download 94% complete]
-- [download 100% complete]
-- verifying file...
       file='/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_1_81_0.tar.gz'
-- SHA256 hash of
    /home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_1_81_0.tar.gz
  does not match expected value
    expected: '9e0ffae35528c35f90468997bc8d99500bf179cbae355415a89a600c38e13574'
      actual: 'f799db17e37a963a08674fe3a565b4acb07681de084d26eb14c305d654caef66'
-- Hash mismatch, removing...
-- Using src='https://boostorg.jfrog.io/artifactory/main/release/1.81.0/source/boost_1_81_0.tar.gz'
-- [download 0% complete]
-- [download 1% complete]

[...]

-- [download 100% complete]
-- verifying file...
       file='/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_1_81_0.tar.gz'
-- SHA256 hash of
    /home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_1_81_0.tar.gz
  does not match expected value
    expected: '9e0ffae35528c35f90468997bc8d99500bf179cbae355415a89a600c38e13574'
      actual: '205666dea9f6a7cfed87c7a6dfbeb52a2c1b9de55712c9c1a87735d7181452b6'
-- Hash mismatch, removing...
-- Using src='https://sourceforge.net/projects/boost/files/boost/1.81.0/boost_1_81_0.tar.gz'
-- [download 100% complete]
-- [download 0% complete]
-- [download 1% complete]

[...]

-- [download 100% complete]
-- verifying file...
       file='/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_1_81_0.tar.gz'
-- SHA256 hash of
    /home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_1_81_0.tar.gz
  does not match expected value
    expected: '9e0ffae35528c35f90468997bc8d99500bf179cbae355415a89a600c38e13574'
      actual: '205666dea9f6a7cfed87c7a6dfbeb52a2c1b9de55712c9c1a87735d7181452b6'
-- Hash mismatch, removing...

-- stderr output is:
CMake Error at boost_ep-stamp/download-boost_ep.cmake:170 (message):
  Each download failed!

    
    


CMake Error at boost_ep-stamp/boost_ep-download-RELEASE-impl.cmake:9 (message):
  Command failed (1):

   '/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/external/cmake-3.23.2-linux-x86_64/bin/cmake' '-P' '/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir/boost_ep-prefix/src/boost_ep-stamp/download-boost_ep.cmake'



CMake Error at boost_ep-stamp/boost_ep-download-RELEASE.cmake:47 (message):
  Stopping after outputting logs.


gmake[2]: *** [CMakeFiles/boost_ep.dir/build.make:99: boost_ep-prefix/src/boost_ep-stamp/boost_ep-download] Error 1
gmake[2]: Leaving directory '/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir'
gmake[1]: *** [CMakeFiles/Makefile2:862: CMakeFiles/boost_ep.dir/all] Error 2
gmake[1]: Leaving directory '/home/marcus/.cache/bazel/_bazel_marcus/a6d0f072dd138e2ef57898fd8d473593/sandbox/linux-sandbox/5/execroot/__main__/bazel-out/k8-opt/bin/external/arrow/arrow.build_tmpdir'
gmake: *** [Makefile:146: all] Error 2

Note that one download yields a different hash from the others, but none of them matches the expected value.
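
To check a downloaded archive by hand, outside of CMake, a minimal sketch along these lines may help. The helper names `sha256_of` and `matches_expected` are made up for illustration and are not part of the Arrow build; only Python's standard `hashlib` is used.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large tarballs need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `matches_expected("boost_1_81_0.tar.gz", "9e0ffae3...")` with the full expected value from the log would reproduce the comparison that CMake reports as a hash mismatch.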

@migurski

Confirmed for both 15.0.0 and 15.0.2, with this error message:

-- stderr output is:
CMake Error at boost_ep-stamp/download-boost_ep.cmake:170 (message):
  Each download failed!

    error: downloading 'https://apache.jfrog.io/artifactory/arrow/thirdparty/7.0.0/boost_1_81_0.tar.gz' failed
          status_code: 22
          status_string: "HTTP response code said error"
          log:
          --- LOG BEGIN ---
          timeout on name lookup is not supported
    Trying 18.232.172.199:443...

  Connected to apache.jfrog.io (18.232.172.199) port 443 (#0)

@helly25
Author

helly25 commented Mar 23, 2024

After patching in the reported hash I can build arrow again.

diff --git a/cpp/thirdparty/versions.txt b/cpp/thirdparty/versions.txt
index 18bb6c9b6..aebdf28e5 100644
--- a/cpp/thirdparty/versions.txt
+++ b/cpp/thirdparty/versions.txt
@@ -57,7 +57,7 @@ ARROW_AWSSDK_BUILD_SHA256_CHECKSUM=2d552fb1a84bef4a9b65e34aa7031851ed2aef5319e02
 ARROW_AZURE_SDK_BUILD_VERSION=azure-core_1.10.3
 ARROW_AZURE_SDK_BUILD_SHA256_CHECKSUM=dd624c2f86adf474d2d0a23066be6e27af9cbd7e3f8d9d8fd7bf981e884b7b48
 ARROW_BOOST_BUILD_VERSION=1.81.0
-ARROW_BOOST_BUILD_SHA256_CHECKSUM=9e0ffae35528c35f90468997bc8d99500bf179cbae355415a89a600c38e13574
+ARROW_BOOST_BUILD_SHA256_CHECKSUM=205666dea9f6a7cfed87c7a6dfbeb52a2c1b9de55712c9c1a87735d7181452b6
 ARROW_BROTLI_BUILD_VERSION=v1.0.9
 ARROW_BROTLI_BUILD_SHA256_CHECKSUM=f9e8d81d0405ba66d181529af42a3354f838c939095ff99930da6aa9cdf6fe46
 ARROW_BZIP2_BUILD_VERSION=1.0.8

I think there are multiple issues here:

  • The hash is currently wrong. Though why would it ever change?
  • The hash cannot be controlled from the build environment but must rather be patched in.
  • The build system takes two separate boost versions (normal and trimmed) into account, but there is only one hash, so only one version can ever be working.
  • Source 'https://apache.jfrog.io/artifactory/arrow/thirdparty/7.0.0/boost_1_81_0.tar.gz' is currently broken (hence the different hash).
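
For anyone who needs to apply the same workaround in an automated build, the patch above could be scripted roughly as follows. This is only a sketch: `set_checksum` is a hypothetical helper, and it assumes the file keeps the plain `KEY=VALUE` line layout shown in the diff.

```python
import re


def set_checksum(versions_text, key, new_hash):
    """Return `versions_text` with the `key=...` line replaced by `key=new_hash`.

    Raises KeyError if the key does not appear at the start of any line.
    """
    pattern = re.compile(rf"^{re.escape(key)}=.*$", re.MULTILINE)
    if not pattern.search(versions_text):
        raise KeyError(key)
    # A callable replacement avoids any backslash interpretation in new_hash.
    return pattern.sub(lambda _match: f"{key}={new_hash}", versions_text, count=1)
```

Reading `cpp/thirdparty/versions.txt`, passing its contents through `set_checksum(text, "ARROW_BOOST_BUILD_SHA256_CHECKSUM", "<hash>")` and writing the result back gives the same effect as the diff, without touching the neighbouring entries.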

@benz0li

benz0li commented Mar 24, 2024

Hits my builds, too:

@assignUser There seems to be an issue with the apache jfrog instance. What is the expected timeframe for resolving this issue?

@assignUser
Member

assignUser commented Mar 24, 2024

ASF Infra is in contact with jfrog, that's all I can say. I know this is very disruptive and I'm sorry. We will be looking for alternatives/fallbacks to avoid this happening again, see #40760

@assignUser
Member

@helly25 you are right. Our hash differs from the release hash of Boost due to being a trimmed version. We do provide fallback URLs in case the artifactory is down and a way to override the URL as well but lack the same functionality in regards to the hash (outside of patching versions.txt as you did).

@helly25
Author

helly25 commented Mar 24, 2024

> @helly25 you are right. Our hash differs from the release hash of Boost due to being a trimmed version. We do provide fallback URLs in case the artifactory is down and a way to override the URL as well but lack the same functionality in regards to the hash (outside of patching versions.txt as you did).

Being able to also override the hash would be a fast way for anyone to repack the boost library themselves.

Further, I wonder if it would make sense to make the necessary libraries available on GitHub as additional artifacts of the arrow version they were introduced with. Then you could point to GitHub as an additional location to try.

Last but not least, if it was possible to provide the version.txt file, then ppl could very easily provide their own locations and hashes. So that ability would actually be the most powerful solution.

@assignUser
Member

> provide the version.txt

So provide a path to an alternate versions.txt file via env or cmake var? I like it!

@helly25
Author

helly25 commented Mar 24, 2024

> > provide the version.txt
>
> So provide a path to an alternate versions.txt file via env or cmake var? I like it!

Like that.

The versions file would probably be better split in two: a top part that can be controlled via env or cmake, and a second part that contains the DEPENDENCIES. Only the overridable file would have three vars per target: ..._BUILD_VERSION, ..._BUILD_SHA256 and ..._BUILD_URL.

Currently the lower part looks like this:

DEPENDENCIES=(
  "ARROW_ABSL_URL absl-${ARROW_ABSL_BUILD_VERSION}.tar.gz https://github.com/abseil/abseil-cpp/archive/${ARROW_ABSL_BUILD_VERSION}.tar.gz"

So the configurable versions.txt would only have entry sets like the following:

ARROW_ABSL_BUILD_VERSION=20211102.0
ARROW_ABSL_BUILD_SHA256_CHECKSUM=dcf71b9cba8dc0ca9940c4b316a0c796be8fab42b070bb6b7cab62b48f0e66c4
ARROW_ABSL_BUILD_URL=https://github.com/abseil/abseil-cpp/archive/${ARROW_ABSL_BUILD_VERSION}.tar.gz

And the non configurable part would be something like:

DEPENDENCIES=(
  "ARROW_ABSL_URL absl-${ARROW_ABSL_BUILD_VERSION}.tar.gz ${ARROW_ABSL_BUILD_URL}"

Maybe the file also needs to be configurable or be constructed from the URL.
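
To make the proposal a bit more concrete, here is a rough sketch of how such an overridable file could be read, with `${...}` references expanded from earlier entries. Everything here is an assumption for illustration: `parse_versions` is a hypothetical helper, and the `..._BUILD_URL` keys do not exist in the current versions.txt.

```python
import re

# Matches ${NAME} references inside a value.
_VAR_REF = re.compile(r"\$\{(\w+)\}")
# Matches simple KEY=VALUE lines with upper-case keys.
_ENTRY = re.compile(r"^([A-Z][A-Z0-9_]*)=(.*)$")


def parse_versions(text):
    """Parse KEY=VALUE lines and expand ${KEY} references to earlier keys.

    Array openers such as `DEPENDENCIES=(`, comments and blank lines are
    skipped. References to unknown keys are left untouched.
    """
    values = {}
    for raw in text.splitlines():
        match = _ENTRY.match(raw.strip())
        if not match or match.group(2).startswith("("):
            continue
        key, value = match.groups()
        values[key] = _VAR_REF.sub(
            lambda ref: values.get(ref.group(1), ref.group(0)), value
        )
    return values
```

Fed the three ABSL lines above, the resulting dictionary would hold the version, the checksum and a fully expanded URL, which is all that the fixed DEPENDENCIES part would need to consume.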

@pitrou
Member

pitrou commented Mar 25, 2024

I'll add two questions here:

  1. can we have a quick fix for the currently failing CI jobs?
  2. why do we use bundled Boost on some CI jobs? Usually a Boost install is provided by the CI vendor and, if using a Docker container, we can install Boost in the image build phase instead of downloading it on the fly every time.

@kou

@benz0li

benz0li commented Mar 25, 2024

The apache jfrog instance seems to be back online.

@danepitkin
Member

danepitkin commented Mar 26, 2024

IMO we need two calls to ExternalProject_Add in CMake: one for the trimmed Boost archive and one for the official Boost archive to fall back to. Since they are different archives, each needs its own call configured with the correct hash, e.g.

ARROW_BOOST_BUILD_VERSION=1.81.0
ARROW_BOOST_BUILD_TRIMMED_SHA256_CHECKSUM=9e0ffae35528c35f90468997bc8d99500bf179cbae355415a89a600c38e13574
ARROW_BOOST_BUILD_SHA256_CHECKSUM=205666dea9f6a7cfed87c7a6dfbeb52a2c1b9de55712c9c1a87735d7181452b6
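
The idea of pairing every download location with its own checksum can be sketched outside of CMake as well. The following is an illustration only, not how Arrow's build works today: `Source`, `boost_sources` and `first_verified` are hypothetical names, the URL patterns are taken from the log at the top of this issue, and the `fetch` callable is injected so the logic can be exercised without network access.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    url: str
    sha256: str


def boost_sources(version, trimmed_sha256, official_sha256):
    """List candidate downloads, each paired with the hash of its own archive."""
    name = f"boost_{version.replace('.', '_')}.tar.gz"
    return [
        # Trimmed archive hosted for Arrow.
        Source(f"https://apache.jfrog.io/artifactory/arrow/thirdparty/7.0.0/{name}", trimmed_sha256),
        # Official Boost release archives used as fallbacks.
        Source(f"https://boostorg.jfrog.io/artifactory/main/release/{version}/source/{name}", official_sha256),
        Source(f"https://sourceforge.net/projects/boost/files/boost/{version}/{name}", official_sha256),
    ]


def first_verified(sources, fetch):
    """Return the first source whose fetched bytes match its own checksum.

    `fetch(url)` must return the archive bytes, or None if the download failed.
    """
    for source in sources:
        data = fetch(source.url)
        if data is not None and hashlib.sha256(data).hexdigest() == source.sha256:
            return source
    return None
```

With this shape a broken or unreachable primary mirror no longer dooms the fallbacks, because the official archives are checked against the official hash rather than the trimmed one.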
