ARROW-17051: [C++] Link Flight/gRPC/Protobuf consistently #13599

Merged: kou merged 7 commits into apache:master on Jul 26, 2022

lidavidm
Member

If Protobuf/gRPC are used statically, Flight must be as well, or else we can get odd runtime behavior due to the global state in those libraries when Flight SQL is involved (as Flight SQL would then bundle a second copy of Protobuf into its shared library).

@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

if(NOT ARROW_BUILD_STATIC)
  message(STATUS "If static Protobuf or gRPC are used, Arrow must be built statically")
  message(STATUS "(These libraries have global state, and linkage must be consistent)")
  message(FATAL_ERROR "Must build Arrow statically to link Flight tests statically")
endif()
Member

I like this.

- ARROW_BUILD_STATIC: "OFF"
+ # ARROW-17051: this build uses static Protobuf, so we must also
+ # use static Arrow to run Flight/Flight SQL tests
+ ARROW_BUILD_STATIC: "ON"
Member

How about moving this to ci/docker/ubuntu-*-cpp.dockerfile? If a newer Ubuntu provides a recent enough Protobuf as libprotobuf-dev, we can disable the static build only for that newer Ubuntu.

Member Author

@lidavidm Jul 14, 2022

I moved it. 22.04 is also too old (3.12, we want 3.15): https://packages.ubuntu.com/jammy/libprotobuf-dev
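
The version constraint mentioned above could be expressed as a small CMake check. This is only a sketch under stated assumptions: the variable names are hypothetical and are not Arrow's actual options, and the two version numbers are the ones quoted in the comment (3.12 available, 3.15 wanted).

```cmake
# Stand-alone sketch; can be run with `cmake -P protobuf_version_gate.cmake`.
# All variable names are hypothetical, not Arrow's real build options.
set(SYSTEM_PROTOBUF_VERSION "3.12")  # version reported above for Ubuntu 22.04
set(WANTED_PROTOBUF_VERSION "3.15")  # version the comment above says is wanted

if(SYSTEM_PROTOBUF_VERSION VERSION_LESS WANTED_PROTOBUF_VERSION)
  # Assumption: a too-old system package means falling back to a static
  # Protobuf, which then requires a static Arrow for the Flight tests.
  set(NEED_STATIC_ARROW ON)
else()
  set(NEED_STATIC_ARROW OFF)
endif()

message(STATUS
        "System Protobuf ${SYSTEM_PROTOBUF_VERSION}: NEED_STATIC_ARROW=${NEED_STATIC_ARROW}")
```

`VERSION_LESS` compares the strings component by component, so 3.12 sorts before 3.15 here even though a plain string comparison would also happen to agree.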

@@ -1214,6 +1216,8 @@ services:
shm_size: *shm-size
environment:
<<: *ccache
+ ARROW_BUILD_STATIC: 'ON'
Member

Do we need this? We don't need to build static library when we don't build tests.

Member Author

Ah, right. Removed, also updated Flight's CMakeLists.txt to not perform the check added above if we're not building tests.
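
A rough sketch of what skipping the check when tests are disabled might look like follows. The option names `ARROW_BUILD_TESTS` and `ARROW_BUILD_STATIC` come from this thread; `USE_STATIC_PROTOBUF_OR_GRPC` is a made-up stand-in for however the real build detects static Protobuf/gRPC, and the values are examples only.

```cmake
# Stand-alone sketch; can be run with `cmake -P check_linkage.cmake`.
# USE_STATIC_PROTOBUF_OR_GRPC is hypothetical; the values below are examples.
set(ARROW_BUILD_TESTS OFF)
set(ARROW_BUILD_STATIC OFF)
set(USE_STATIC_PROTOBUF_OR_GRPC ON)

# Only enforce consistent linkage when the Flight tests are actually built.
if(ARROW_BUILD_TESTS AND USE_STATIC_PROTOBUF_OR_GRPC AND NOT ARROW_BUILD_STATIC)
  message(STATUS "If static Protobuf or gRPC are used, Arrow must be built statically")
  message(FATAL_ERROR "Must build Arrow statically to link Flight tests statically")
else()
  message(STATUS "Linkage check skipped or satisfied")
endif()
```

With `ARROW_BUILD_TESTS` set to `OFF` as above, the script takes the `else()` branch instead of failing; flipping it to `ON` would trigger the fatal error.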

Member

@raulcd left a comment

Should we also change the travis jobs that fail?

-- If static Protobuf or gRPC are used, Arrow must be built statically
-- (These libraries have global state, and linkage must be consistent)
CMake Error at src/arrow/flight/CMakeLists.txt:43 (message):
  Must build Arrow statically to link Flight tests statically

https://github.com/apache/arrow/blob/master/.travis.yml#L96 and https://github.com/apache/arrow/blob/master/.travis.yml#L148

@lidavidm
Member Author

> Should we also change the travis jobs that fail?

Updated, thanks for catching that

@lidavidm
Member Author

As described at https://issues.apache.org/jira/browse/ARROW-16919?focusedCommentId=17566864&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17566864, I think this could also fix ARROW-16919, which appears to be due to a double-free of a static variable caused by linking libarrow both statically and dynamically.

@lidavidm
Member Author

@github-actions crossbow submit -g nightly

@github-actions

Revision: 488b70c

Submitted crossbow builds: ursacomputing/crossbow @ actions-e15d0e5aae

Task Status
almalinux-8-amd64 Github Actions
almalinux-8-arm64 TravisCI
almalinux-9-amd64 Github Actions
almalinux-9-arm64 TravisCI
amazon-linux-2-amd64 Github Actions
amazon-linux-2-arm64 TravisCI
centos-7-amd64 Github Actions
centos-8-stream-amd64 Github Actions
centos-8-stream-arm64 TravisCI
centos-9-stream-amd64 Github Actions
centos-9-stream-arm64 TravisCI
conan-maximum Github Actions
conan-minimum Github Actions
conda-clean Azure
conda-linux-gcc-py310-arm64 Azure
conda-linux-gcc-py310-cpu Azure
conda-linux-gcc-py310-cuda Azure
conda-linux-gcc-py310-ppc64le Azure
conda-linux-gcc-py37-arm64 Azure
conda-linux-gcc-py37-cpu-r40 Azure
conda-linux-gcc-py37-cpu-r41 Azure
conda-linux-gcc-py37-cuda Azure
conda-linux-gcc-py37-ppc64le Azure
conda-linux-gcc-py38-arm64 Azure
conda-linux-gcc-py38-cpu Azure
conda-linux-gcc-py38-cuda Azure
conda-linux-gcc-py38-ppc64le Azure
conda-linux-gcc-py39-arm64 Azure
conda-linux-gcc-py39-cpu Azure
conda-linux-gcc-py39-cuda Azure
conda-linux-gcc-py39-ppc64le Azure
conda-osx-arm64-clang-py310 Azure
conda-osx-arm64-clang-py38 Azure
conda-osx-arm64-clang-py39 Azure
conda-osx-clang-py310 Azure
conda-osx-clang-py37-r40 Azure
conda-osx-clang-py37-r41 Azure
conda-osx-clang-py38 Azure
conda-osx-clang-py39 Azure
conda-win-vs2017-py310 Azure
conda-win-vs2017-py37-r40 Azure
conda-win-vs2017-py37-r41 Azure
conda-win-vs2017-py38 Azure
conda-win-vs2017-py39 Azure
debian-bookworm-amd64 Github Actions
debian-bookworm-arm64 TravisCI
debian-bullseye-amd64 Github Actions
debian-bullseye-arm64 TravisCI
example-cpp-minimal-build-static Github Actions
example-cpp-minimal-build-static-system-dependency Github Actions
example-python-minimal-build-fedora-conda Github Actions
example-python-minimal-build-ubuntu-venv Github Actions
homebrew-cpp Github Actions
homebrew-r-autobrew Github Actions
homebrew-r-brew Github Actions
java-jars Github Actions
nuget Github Actions
python-sdist Github Actions
test-build-cpp-fuzz Github Actions
test-build-vcpkg-win Github Actions
test-conda-cpp Github Actions
test-conda-cpp-valgrind Azure
test-conda-python-3.10 Github Actions
test-conda-python-3.7 Github Actions
test-conda-python-3.7-hdfs-2.9.2 Github Actions
test-conda-python-3.7-hdfs-3.2.1 Github Actions
test-conda-python-3.7-kartothek-latest Github Actions
test-conda-python-3.7-kartothek-master Github Actions
test-conda-python-3.7-pandas-0.24 Github Actions
test-conda-python-3.7-pandas-latest Github Actions
test-conda-python-3.7-spark-v3.1.2 Github Actions
test-conda-python-3.8 Github Actions
test-conda-python-3.8-hypothesis Github Actions
test-conda-python-3.8-pandas-latest Github Actions
test-conda-python-3.8-pandas-nightly Github Actions
test-conda-python-3.8-spark-v3.2.0 Github Actions
test-conda-python-3.9 Github Actions
test-conda-python-3.9-dask-latest Github Actions
test-conda-python-3.9-dask-master Github Actions
test-conda-python-3.9-pandas-master Github Actions
test-conda-python-3.9-spark-master Github Actions
test-debian-10-cpp-amd64 Github Actions
test-debian-10-cpp-i386 Github Actions
test-debian-11-cpp-amd64 Github Actions
test-debian-11-cpp-i386 Github Actions
test-debian-11-go-1.16 Azure
test-debian-11-python-3 Azure
test-debian-c-glib Github Actions
test-debian-ruby Github Actions
test-fedora-35-cpp Github Actions
test-fedora-35-python-3 Azure
test-fedora-r-clang-sanitizer Azure
test-r-arrow-backwards-compatibility Github Actions
test-r-depsource-bundled Azure
test-r-depsource-system Github Actions
test-r-dev-duckdb Github Actions
test-r-devdocs Github Actions
test-r-gcc-11 Github Actions
test-r-gcc-12 Github Actions
test-r-install-local Github Actions
test-r-linux-as-cran Github Actions
test-r-linux-rchk Github Actions
test-r-linux-valgrind Azure
test-r-minimal-build Azure
test-r-offline-maximal Github Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-gcc-release-latest Azure
test-r-rocker-r-base-latest Azure
test-r-rstudio-r-base-4.1-opensuse153 Azure
test-r-rstudio-r-base-4.2-centos7-devtoolset-8 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 Github Actions
test-r-versions Github Actions
test-skyhook-integration Github Actions
test-ubuntu-18.04-cpp Github Actions
test-ubuntu-18.04-cpp-release Github Actions
test-ubuntu-18.04-cpp-static Github Actions
test-ubuntu-18.04-r-sanitizer Azure
test-ubuntu-20.04-cpp Github Actions
test-ubuntu-20.04-cpp-14 Github Actions
test-ubuntu-20.04-cpp-17 Github Actions
test-ubuntu-20.04-cpp-bundled Github Actions
test-ubuntu-20.04-cpp-thread-sanitizer Github Actions
test-ubuntu-20.04-python-3 Azure
test-ubuntu-22.04-cpp Github Actions
test-ubuntu-c-glib Github Actions
test-ubuntu-default-docs Azure
test-ubuntu-ruby Github Actions
ubuntu-bionic-amd64 Github Actions
ubuntu-bionic-arm64 TravisCI
ubuntu-focal-amd64 Github Actions
ubuntu-focal-arm64 TravisCI
ubuntu-jammy-amd64 Github Actions
ubuntu-jammy-arm64 TravisCI
verify-rc-source-cpp-linux-almalinux-8-amd64 Github Actions
verify-rc-source-cpp-linux-conda-latest-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-cpp-macos-amd64 Github Actions
verify-rc-source-cpp-macos-arm64 Github Actions
verify-rc-source-cpp-macos-conda-amd64 Github Actions
verify-rc-source-csharp-linux-almalinux-8-amd64 Github Actions
verify-rc-source-csharp-linux-conda-latest-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-csharp-macos-amd64 Github Actions
verify-rc-source-csharp-macos-arm64 Github Actions
verify-rc-source-go-linux-almalinux-8-amd64 Github Actions
verify-rc-source-go-linux-conda-latest-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-go-macos-amd64 Github Actions
verify-rc-source-go-macos-arm64 Github Actions
verify-rc-source-integration-linux-almalinux-8-amd64 Github Actions
verify-rc-source-integration-linux-conda-latest-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-integration-macos-amd64 Github Actions
verify-rc-source-integration-macos-arm64 Github Actions
verify-rc-source-integration-macos-conda-amd64 Github Actions
verify-rc-source-java-linux-almalinux-8-amd64 Github Actions
verify-rc-source-java-linux-conda-latest-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-java-macos-amd64 Github Actions
verify-rc-source-js-linux-almalinux-8-amd64 Github Actions
verify-rc-source-js-linux-conda-latest-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-js-macos-amd64 Github Actions
verify-rc-source-js-macos-arm64 Github Actions
verify-rc-source-python-linux-almalinux-8-amd64 Github Actions
verify-rc-source-python-linux-conda-latest-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-python-macos-amd64 Github Actions
verify-rc-source-python-macos-arm64 Github Actions
verify-rc-source-python-macos-conda-amd64 Github Actions
verify-rc-source-ruby-linux-almalinux-8-amd64 Github Actions
verify-rc-source-ruby-linux-conda-latest-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-ruby-macos-amd64 Github Actions
verify-rc-source-ruby-macos-arm64 Github Actions
verify-rc-source-windows Github Actions
wheel-macos-big-sur-cp310-arm64 Github Actions
wheel-macos-big-sur-cp310-universal2 Github Actions
wheel-macos-big-sur-cp38-arm64 Github Actions
wheel-macos-big-sur-cp39-arm64 Github Actions
wheel-macos-big-sur-cp39-universal2 Github Actions
wheel-macos-high-sierra-cp310-amd64 Github Actions
wheel-macos-high-sierra-cp37-amd64 Github Actions
wheel-macos-high-sierra-cp38-amd64 Github Actions
wheel-macos-high-sierra-cp39-amd64 Github Actions
wheel-macos-mavericks-cp310-amd64 Github Actions
wheel-macos-mavericks-cp37-amd64 Github Actions
wheel-macos-mavericks-cp38-amd64 Github Actions
wheel-macos-mavericks-cp39-amd64 Github Actions
wheel-manylinux2014-cp310-amd64 Github Actions
wheel-manylinux2014-cp310-arm64 TravisCI
wheel-manylinux2014-cp37-amd64 Github Actions
wheel-manylinux2014-cp37-arm64 TravisCI
wheel-manylinux2014-cp38-amd64 Github Actions
wheel-manylinux2014-cp38-arm64 TravisCI
wheel-manylinux2014-cp39-amd64 Github Actions
wheel-manylinux2014-cp39-arm64 TravisCI
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

@lidavidm marked this pull request as ready for review July 15, 2022 12:37
@lidavidm
Member Author

Ok, I think this is ready. The CI failures seem to be incidental or addressed in other issues. Thanks Raúl and Kou for all the help here.

@kou
Member

kou commented Jul 15, 2022

It seems that -ldl is missing in Travis CI builds:

e.g.: https://app.travis-ci.com/github/apache/arrow/jobs/576693121#L2786

FAILED: debug/plasma-store-server 
: && /usr/bin/c++  -Wno-noexcept-type -Wno-subobject-linkage  -fdiagnostics-color=always -ggdb -O0  -Wall -Wno-conversion -Wno-sign-conversion -Wunused-result -Werror -fno-semantic-interposition  -fPIC -g   src/plasma/CMakeFiles/plasma-store-server.dir/Unity/unity_0_cxx.cxx.o src/plasma/CMakeFiles/plasma-store-server.dir/Unity/unity_0_c.c.o src/plasma/CMakeFiles/plasma-store-server.dir/dlmalloc.cc.o  -o debug/plasma-store-server  debug/libplasma.a  debug/libarrow.a  /usr/lib/s390x-linux-gnu/libgflags.so.2.2.2  /usr/lib/s390x-linux-gnu/libssl.so  /usr/lib/s390x-linux-gnu/libcrypto.so  /usr/lib/s390x-linux-gnu/libbrotlienc.so  /usr/lib/s390x-linux-gnu/libbrotlidec.so  /usr/lib/s390x-linux-gnu/libbrotlicommon.so  /usr/lib/s390x-linux-gnu/libbz2.so  /usr/lib/s390x-linux-gnu/liblz4.so  /usr/lib/s390x-linux-gnu/libsnappy.so.1.1.8  /usr/lib/s390x-linux-gnu/libz.so  /usr/lib/s390x-linux-gnu/libzstd.so  opentelemetry_ep-install/lib/libopentelemetry_exporter_ostream_span.a  opentelemetry_ep-install/lib/libopentelemetry_exporter_otlp_http.a  opentelemetry_ep-install/lib/libopentelemetry_otlp_recordable.a  opentelemetry_ep-install/lib/libopentelemetry_trace.a  opentelemetry_ep-install/lib/libopentelemetry_resources.a  opentelemetry_ep-install/lib/libopentelemetry_common.a  opentelemetry_ep-install/lib/libopentelemetry_exporter_otlp_http_client.a  opentelemetry_ep-install/lib/libopentelemetry_proto.a  protobuf_ep-install/lib/libprotobuf.a  opentelemetry_ep-install/lib/libopentelemetry_http_client_curl.a  /usr/lib/s390x-linux-gnu/libcurl.so  /usr/lib/s390x-linux-gnu/libutf8proc.so  /usr/lib/s390x-linux-gnu/libre2.so  jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a  -pthread  -lrt  -lpthread && :
/usr/bin/ld: debug/libarrow.a(unity_6_cxx.cxx.o): undefined reference to symbol 'dlsym@@GLIBC_2.2'
/usr/bin/ld: /lib/s390x-linux-gnu/libdl.so.2: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status

Could you try this patch, since cpp/src/arrow/io/hdfs_internal.cc uses dlsym()?

diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt
index b67f90e0bd..945ff7b6f8 100644
--- a/cpp/CMakeLists.txt
+++ b/cpp/CMakeLists.txt
@@ -862,7 +862,7 @@ add_dependencies(arrow_test_dependencies toolchain-tests)
 
 if(ARROW_STATIC_LINK_LIBS)
   add_dependencies(arrow_dependencies ${ARROW_STATIC_LINK_LIBS})
-  if(ARROW_ORC)
+  if(ARROW_HDFS OR ARROW_ORC)
     if(NOT MSVC_TOOLCHAIN)
       list(APPEND ARROW_STATIC_LINK_LIBS ${CMAKE_DL_LIBS})
       list(APPEND ARROW_STATIC_INSTALL_INTERFACE_LIBS ${CMAKE_DL_LIBS})
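
To make the effect of the changed condition easier to see in isolation, here is a minimal script-mode sketch. It is not Arrow's build code: `DL_LIBS` is a hypothetical stand-in for `CMAKE_DL_LIBS`, `STATIC_LINK_LIBS` stands in for `ARROW_STATIC_LINK_LIBS`, and the option values are chosen to mimic a build with HDFS enabled and ORC disabled.

```cmake
# Stand-alone sketch; can be run with `cmake -P dl_libs_condition.cmake`.
set(ARROW_HDFS ON)        # example value: HDFS support enabled
set(ARROW_ORC OFF)        # example value: ORC support disabled
set(DL_LIBS "dl")         # hypothetical stand-in for CMAKE_DL_LIBS
set(STATIC_LINK_LIBS "")  # hypothetical stand-in for ARROW_STATIC_LINK_LIBS

# Old condition: if(ARROW_ORC) would skip the append for this configuration.
# Patched condition: either feature pulls in the dl library.
if(ARROW_HDFS OR ARROW_ORC)
  list(APPEND STATIC_LINK_LIBS ${DL_LIBS})
endif()

message(STATUS "Static link libs: ${STATIC_LINK_LIBS}")
```

Under the old `if(ARROW_ORC)` condition the list would stay empty for this configuration, which matches the missing `-ldl` in the failing link line above.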

@lidavidm
Member Author

Done - thanks for catching that!

@pitrou
Member

pitrou commented Jul 19, 2022

Remotely related: substrait-io/substrait#249

Member

@raulcd left a comment

Thanks @lidavidm

@pitrou requested a review from kou July 21, 2022 20:16
.travis.yml Outdated
- -e ARROW_BUILD_STATIC=OFF
+ -e ARROW_BUILD_STATIC=ON
Member

Can we remove this because we have ARROW_BUILD_STATIC=OFF in Dockerfile?

.travis.yml Outdated
@@ -145,7 +145,7 @@ jobs:
# aws-sdk-cpp.
DOCKER_RUN_ARGS: >-
"
- -e ARROW_BUILD_STATIC=OFF
+ -e ARROW_BUILD_STATIC=ON
Member

Can we remove this because we have ARROW_BUILD_STATIC=OFF in Dockerfile?

@@ -1214,6 +1213,7 @@ services:
shm_size: *shm-size
environment:
<<: *ccache
+ ARROW_BUILD_TESTS: 'OFF'
Member

Is this needed?
ci/docker/linux-apt-r.dockerfile has ARROW_BUILD_TESTS=OFF.

@lidavidm
Member Author

Seems CI is still pending, but any other feedback here? (Thanks Kou for the comments so far.)

Member

@kou left a comment

+1

@kou merged commit bbf249e into apache:master Jul 26, 2022
@ursabot

ursabot commented Jul 26, 2022

Benchmark runs are scheduled for baseline = 87cefe8 and contender = bbf249e. bbf249e is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Failed ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.31% ⬆️0.03%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.5% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Failed] bbf249e0 ec2-t3-xlarge-us-east-2
[Failed] bbf249e0 test-mac-arm
[Finished] bbf249e0 ursa-i9-9960x
[Finished] bbf249e0 ursa-thinkcentre-m75q
[Failed] 87cefe80 ec2-t3-xlarge-us-east-2
[Finished] 87cefe80 test-mac-arm
[Finished] 87cefe80 ursa-i9-9960x
[Finished] 87cefe80 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

kszucs pushed a commit that referenced this pull request Jul 27, 2022
If Protobuf/gRPC are used statically, Flight must be as well, or else we can get odd runtime behavior due to the global state in those libraries when Flight SQL is involved (as Flight SQL would then bundle a second copy of Protobuf into its shared library).

Authored-by: David Li <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>