ARROW-17868: [C++][Python] Restore the ARROW_PYTHON CMake option #14273

Merged on Oct 1, 2022 (9 commits)

@kou kou commented Sep 29, 2022

Restore the option, but mark it as deprecated, because the Python component in Apache Arrow C++ was moved to PyArrow by ARROW-16340. It will be removed in a future release.

Users should use CMake presets instead of ARROW_PYTHON, but CMake presets require CMake 3.19 or later.
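
Because presets need CMake 3.19 or later, a build script may want to check the installed CMake version before choosing between a preset and the deprecated option. The following is a minimal sketch in POSIX shell, not part of this pull request; `version_at_least` and `choose_configure_style` are hypothetical helper names introduced only for illustration.

```shell
# Hypothetical helper: succeeds when a "MAJOR.MINOR[.PATCH]" version string
# is at least the required MAJOR ($2) and MINOR ($3).
version_at_least() {
  actual_major=${1%%.*}
  rest=${1#*.}
  actual_minor=${rest%%.*}
  [ "$actual_major" -gt "$2" ] ||
    { [ "$actual_major" -eq "$2" ] && [ "$actual_minor" -ge "$3" ]; }
}

# Hypothetical helper: prints which configuration style a given CMake
# version can use. CMake presets were introduced in CMake 3.19.
choose_configure_style() {
  if version_at_least "$1" 3 19; then
    echo "preset"
  else
    echo "ARROW_PYTHON"
  fi
}

# Example: inspect the installed CMake, if there is one.
if command -v cmake >/dev/null 2>&1; then
  cmake_version=$(cmake --version | sed -n '1s/^cmake version //p')
  if [ -n "$cmake_version" ]; then
    choose_configure_style "$cmake_version"
  fi
fi
```

With a result of `preset`, the script would go on to call `cmake --preset <name>`; otherwise it would fall back to passing `-DARROW_PYTHON=ON` while that deprecated option still exists.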

@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@kou
Member Author

kou commented Sep 29, 2022

@github-actions crossbow submit -g nightly-tests -g nightly-release -g nightly-packaging

ARROW_FLIGHT=OFF \
ARROW_GANDIVA=OFF \
ARROW_HDFS=ON \
ARROW_JSON=ON \
Member

Is all this required for R? AFAIR, Python is only used to test PyArrow-R interoperability here.

Member Author

@nealrichardson Do you know what components are needed here?

Member Author

It seems that we can disable ARROW_HDFS. The other components seem necessary. I'll try it.

Member

@jorisvandenbossche left a comment

Thanks! Added a few comments

@@ -96,25 +96,27 @@ cmake \
-DARROW_BUILD_SHARED=ON \
-DARROW_BUILD_STATIC=OFF \
-DARROW_BUILD_TESTS=OFF \
-DARROW_COMPUTE=ON \
-DARROW_CSV=ON \
-DARROW_DATASET=${ARROW_DATASET} \
Member

@jorisvandenbossche Sep 29, 2022

Should we just hardcode this to ON? (the ARROW_PYTHON=ON would have ensured that) I don't think we want to create wheels without dataset enabled?

Member Author

> I don't think we want to create wheels without dataset enabled

I agree, but ${ARROW_DATASET} is also used for export PYARROW_WITH_DATASET=${ARROW_DATASET} below, so I think we should use ${ARROW_DATASET} here for consistency. (ARROW_DATASET is initialized by : ${ARROW_DATASET:=ON}.)
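
The default-assignment idiom mentioned above can be sketched as follows. This is only an illustration of how one variable can feed both the CMake flag and the PyArrow build setting; `cmake_flag` is a hypothetical variable name, and the real script passes the flag directly on the cmake command line.

```shell
# Assign the default only when ARROW_DATASET is unset or empty;
# a value provided by the caller is left untouched.
: "${ARROW_DATASET:=ON}"

# Both consumers read the same variable, so they cannot disagree.
cmake_flag="-DARROW_DATASET=${ARROW_DATASET}"   # hypothetical name
export PYARROW_WITH_DATASET="${ARROW_DATASET}"

echo "${cmake_flag}"
echo "PYARROW_WITH_DATASET=${PYARROW_WITH_DATASET}"
```

Hardcoding `-DARROW_DATASET=ON` would work for the C++ build, but a caller who sets `ARROW_DATASET=OFF` would then get a C++ library and a PyArrow build that are configured differently.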

@@ -129,9 +131,9 @@ cmake \
-DCMAKE_INSTALL_PREFIX=${build_dir}/install \
-DCMAKE_OSX_ARCHITECTURES=${CMAKE_OSX_ARCHITECTURES} \
-DCMAKE_UNITY_BUILD=${CMAKE_UNITY_BUILD} \
-DOPENSSL_USE_STATIC_LIBS=ON \
Member

This seems unrelated to the other changes. Can you give a short reasoning for it?

Member Author

Yes. This is unrelated to ARROW_PYTHON. Sorry for including it in this pull request.

OPENSSL_USE_STATIC_LIBS is redundant here because this command line already has -DARROW_DEPENDENCY_USE_SHARED=OFF. OPENSSL_USE_STATIC_LIBS is set automatically in https://github.com/apache/arrow/blob/master/cpp/cmake_modules/FindOpenSSLAlt.cmake#L48-L52 .

I found this while sorting the CMake options, so I mixed the change into this pull request. Sorry.

Member

No need to be sorry. I was just wondering about the reasoning

Comment on lines 97 to 98
-DARROW_DATASET=${ARROW_DATASET} \
-DARROW_DATASET=ON \
Member

Suggested change
-DARROW_DATASET=${ARROW_DATASET} \
-DARROW_DATASET=ON \
-DARROW_DATASET=ON \

Member Author

Thanks, but I chose -DARROW_DATASET=${ARROW_DATASET} for consistency.

"ARROW_COMPUTE": "ON",
"ARROW_CSV": "ON",
"ARROW_FILESYSTEM": "ON",
"ARROW_JSON": "ON"
Member

I haven't used the presets myself yet, but shall we include ARROW_DATASET here as well? (It was included in the previous features-python preset.)
Maybe you can also keep the exact name so that it keeps working for people who were using those presets?

cc @wjones127

Member

I think I'm fine with not including datasets if it's not required for the Python build. In most workflows with CMakePresets.json, I think we expect developers to create their own presets in CMakeUserPresets.json, which inherit from the provided ones. (See an example here.)

Once this is merged, I can send a notice to the mailing list with instructions on how to transition to the new presets. (I need to update my blog post anyways.)
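
A user preset that inherits from a provided one might look roughly like the file written by the sketch below. Everything in it is an assumption made for illustration: the preset name `my-python-debug`, the generator, the build directory, and the schema version are invented here. `features-python` is the preset name mentioned earlier in this thread, and whether it is available to inherit from depends on the CMakePresets.json in your checkout. The script writes into a scratch directory so it cannot overwrite a real file.

```shell
# Write the example into a scratch directory rather than a real checkout.
presets_dir=$(mktemp -d)

# The quoted 'EOF' keeps the shell from expanding ${sourceDir},
# which is a macro that CMake itself expands.
cat > "${presets_dir}/CMakeUserPresets.json" <<'EOF'
{
  "version": 1,
  "cmakeMinimumRequired": {"major": 3, "minor": 19, "patch": 0},
  "configurePresets": [
    {
      "name": "my-python-debug",
      "inherits": "features-python",
      "displayName": "Local debug build for PyArrow development",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/build/my-python-debug",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Debug",
        "ARROW_DATASET": "ON"
      }
    }
  ]
}
EOF

echo "Wrote ${presets_dir}/CMakeUserPresets.json"
```

Placed next to the project's CMakePresets.json, a file like this would let `cmake --preset my-python-debug` pick up the inherited cache variables and add dataset support on top, without changing the presets shipped with the project. The `version` value may need to be raised to match what your CMake release and the project's preset file use.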

Member Author

> Maybe you can also keep the exact name to keep it working for people that were using those presets?

That makes sense. I added versions without the -minimal/-maximal suffix.

..
$ make -j4
$ make install
$ cmake --build . --target install --config Debug
Member

For my understanding: is the --config Debug needed if you already use CMAKE_BUILD_TYPE=Debug above?

And will this line also build in parallel?

Member

Yeah the --config Debug is redundant.

In my experience, it seems to automatically build in parallel, but you can also add the option explicitly.

Member Author

Ah, sorry. I added this accidentally. I'll revert it.

I found that -j4 isn't suitable here. We can use -j$(nproc) on Linux and -j$(sysctl -n hw.ncpu) on macOS, but I thought it would be better to use Ninja here. I started rewriting this for Ninja, then stopped because that change doesn't belong in this pull request. This line was left over from that attempt.
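
The platform-dependent job count described above could be wrapped in a small helper. This is a sketch under stated assumptions rather than what the documentation ended up using; `detect_jobs` and `n_jobs` are hypothetical names, and the commands are printed instead of executed.

```shell
# Hypothetical helper: print the number of CPUs to use for parallel builds.
detect_jobs() {
  if command -v nproc >/dev/null 2>&1; then
    nproc                  # Linux (GNU coreutils)
  elif sysctl -n hw.ncpu >/dev/null 2>&1; then
    sysctl -n hw.ncpu      # macOS
  else
    echo 1                 # conservative fallback
  fi
}

n_jobs=$(detect_jobs)

# Print the commands instead of running them, since this is only a sketch.
echo "make -j${n_jobs}"
echo "cmake --build . --parallel ${n_jobs}"
```

The second form relies on the `--parallel` option of `cmake --build`, available since CMake 3.12, which forwards the job count to whichever build tool the generator uses. With the Ninja generator the explicit count is usually unnecessary, because Ninja builds in parallel by default.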

kou and others added 8 commits September 30, 2022 05:55
Restore the option, but mark it as deprecated, because the Python
component in Apache Arrow C++ was moved to PyArrow by ARROW-16340.
It will be removed in a future release.

Users should use CMake presets instead of ARROW_PYTHON, but CMake
presets require CMake 3.19 or later.
Co-authored-by: Antoine Pitrou <[email protected]>
@kou
Member Author

kou commented Sep 29, 2022

@github-actions crossbow submit -g nightly-tests -g nightly-release -g nightly-packaging

@github-actions

Revision: 954f1ad

Submitted crossbow builds: ursacomputing/crossbow @ actions-a153f6e909

Task Status
almalinux-8-amd64 Github Actions
almalinux-8-arm64 TravisCI
almalinux-9-amd64 Github Actions
almalinux-9-arm64 TravisCI
amazon-linux-2-amd64 Github Actions
amazon-linux-2-arm64 TravisCI
centos-7-amd64 Github Actions
centos-8-stream-amd64 Github Actions
centos-8-stream-arm64 TravisCI
centos-9-stream-amd64 Github Actions
centos-9-stream-arm64 TravisCI
conan-maximum Github Actions
conan-minimum Github Actions
conda-clean Azure
conda-linux-gcc-py310-arm64 Azure
conda-linux-gcc-py310-cpu Azure
conda-linux-gcc-py310-cuda Azure
conda-linux-gcc-py310-ppc64le Azure
conda-linux-gcc-py37-arm64 Azure
conda-linux-gcc-py37-cpu-r40 Azure
conda-linux-gcc-py37-cpu-r41 Azure
conda-linux-gcc-py37-cuda Azure
conda-linux-gcc-py37-ppc64le Azure
conda-linux-gcc-py38-arm64 Azure
conda-linux-gcc-py38-cpu Azure
conda-linux-gcc-py38-cuda Azure
conda-linux-gcc-py38-ppc64le Azure
conda-linux-gcc-py39-arm64 Azure
conda-linux-gcc-py39-cpu Azure
conda-linux-gcc-py39-cuda Azure
conda-linux-gcc-py39-ppc64le Azure
conda-osx-arm64-clang-py310 Azure
conda-osx-arm64-clang-py38 Azure
conda-osx-arm64-clang-py39 Azure
conda-osx-clang-py310 Azure
conda-osx-clang-py37-r40 Azure
conda-osx-clang-py37-r41 Azure
conda-osx-clang-py38 Azure
conda-osx-clang-py39 Azure
conda-win-vs2019-py310 Azure
conda-win-vs2019-py37-r40 Azure
conda-win-vs2019-py37-r41 Azure
conda-win-vs2019-py38 Azure
conda-win-vs2019-py39 Azure
debian-bookworm-amd64 Github Actions
debian-bookworm-arm64 TravisCI
debian-bullseye-amd64 Github Actions
debian-bullseye-arm64 TravisCI
example-cpp-minimal-build-static Github Actions
example-cpp-minimal-build-static-system-dependency Github Actions
example-python-minimal-build-fedora-conda Github Actions
example-python-minimal-build-ubuntu-venv Github Actions
homebrew-cpp Github Actions
homebrew-r-autobrew Github Actions
homebrew-r-brew Github Actions
java-jars Github Actions
nuget Github Actions
python-sdist Github Actions
r-binary-packages Github Actions
test-alpine-linux-cpp Github Actions
test-build-cpp-fuzz Github Actions
test-build-vcpkg-win Github Actions
test-conda-cpp Github Actions
test-conda-cpp-valgrind Azure
test-conda-python-3.10 Github Actions
test-conda-python-3.7 Github Actions
test-conda-python-3.7-hdfs-2.9.2 Github Actions
test-conda-python-3.7-hdfs-3.2.1 Github Actions
test-conda-python-3.7-kartothek-latest Github Actions
test-conda-python-3.7-kartothek-master Github Actions
test-conda-python-3.7-pandas-0.24 Github Actions
test-conda-python-3.7-pandas-latest Github Actions
test-conda-python-3.7-spark-v3.1.2 Github Actions
test-conda-python-3.8 Github Actions
test-conda-python-3.8-hypothesis Github Actions
test-conda-python-3.8-pandas-latest Github Actions
test-conda-python-3.8-pandas-nightly Github Actions
test-conda-python-3.8-spark-v3.2.0 Github Actions
test-conda-python-3.9 Github Actions
test-conda-python-3.9-dask-latest Github Actions
test-conda-python-3.9-dask-master Github Actions
test-conda-python-3.9-pandas-master Github Actions
test-conda-python-3.9-spark-master Github Actions
test-debian-10-cpp-amd64 Github Actions
test-debian-10-cpp-i386 Github Actions
test-debian-11-cpp-amd64 Github Actions
test-debian-11-cpp-i386 Github Actions
test-debian-11-go-1.17 Azure
test-debian-11-python-3 Azure
test-debian-c-glib Github Actions
test-debian-ruby Github Actions
test-fedora-35-cpp Github Actions
test-fedora-35-python-3 Azure
test-fedora-r-clang-sanitizer Azure
test-r-arrow-backwards-compatibility Github Actions
test-r-depsource-bundled Azure
test-r-depsource-system Github Actions
test-r-dev-duckdb Github Actions
test-r-devdocs Github Actions
test-r-gcc-11 Github Actions
test-r-gcc-12 Github Actions
test-r-install-local Github Actions
test-r-linux-as-cran Github Actions
test-r-linux-rchk Github Actions
test-r-linux-valgrind Azure
test-r-minimal-build Azure
test-r-offline-maximal Github Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-gcc-release-latest Azure
test-r-rocker-r-base-latest Azure
test-r-rstudio-r-base-4.1-opensuse153 Azure
test-r-rstudio-r-base-4.2-centos7-devtoolset-8 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 Github Actions
test-r-versions Github Actions
test-skyhook-integration Github Actions
test-ubuntu-18.04-cpp Github Actions
test-ubuntu-18.04-cpp-release Github Actions
test-ubuntu-18.04-cpp-static Github Actions
test-ubuntu-18.04-r-sanitizer Azure
test-ubuntu-20.04-cpp Github Actions
test-ubuntu-20.04-cpp-17 Github Actions
test-ubuntu-20.04-cpp-bundled Github Actions
test-ubuntu-20.04-cpp-thread-sanitizer Github Actions
test-ubuntu-20.04-python-3 Azure
test-ubuntu-22.04-cpp Github Actions
test-ubuntu-c-glib Github Actions
test-ubuntu-default-docs Azure
test-ubuntu-ruby Github Actions
ubuntu-bionic-amd64 Github Actions
ubuntu-bionic-arm64 TravisCI
ubuntu-focal-amd64 Github Actions
ubuntu-focal-arm64 TravisCI
ubuntu-jammy-amd64 Github Actions
ubuntu-jammy-arm64 TravisCI
verify-rc-source-cpp-linux-almalinux-8-amd64 Github Actions
verify-rc-source-cpp-linux-conda-latest-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-cpp-macos-amd64 Github Actions
verify-rc-source-cpp-macos-arm64 Github Actions
verify-rc-source-cpp-macos-conda-amd64 Github Actions
verify-rc-source-csharp-linux-almalinux-8-amd64 Github Actions
verify-rc-source-csharp-linux-conda-latest-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-csharp-macos-amd64 Github Actions
verify-rc-source-csharp-macos-arm64 Github Actions
verify-rc-source-go-linux-almalinux-8-amd64 Github Actions
verify-rc-source-go-linux-conda-latest-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-go-macos-amd64 Github Actions
verify-rc-source-go-macos-arm64 Github Actions
verify-rc-source-integration-linux-almalinux-8-amd64 Github Actions
verify-rc-source-integration-linux-conda-latest-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-integration-macos-amd64 Github Actions
verify-rc-source-integration-macos-arm64 Github Actions
verify-rc-source-integration-macos-conda-amd64 Github Actions
verify-rc-source-java-linux-almalinux-8-amd64 Github Actions
verify-rc-source-java-linux-conda-latest-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-java-macos-amd64 Github Actions
verify-rc-source-js-linux-almalinux-8-amd64 Github Actions
verify-rc-source-js-linux-conda-latest-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-js-macos-amd64 Github Actions
verify-rc-source-js-macos-arm64 Github Actions
verify-rc-source-python-linux-almalinux-8-amd64 Github Actions
verify-rc-source-python-linux-conda-latest-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-python-macos-amd64 Github Actions
verify-rc-source-python-macos-arm64 Github Actions
verify-rc-source-python-macos-conda-amd64 Github Actions
verify-rc-source-ruby-linux-almalinux-8-amd64 Github Actions
verify-rc-source-ruby-linux-conda-latest-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-ruby-macos-amd64 Github Actions
verify-rc-source-ruby-macos-arm64 Github Actions
verify-rc-source-windows Github Actions
wheel-macos-big-sur-cp310-arm64 Github Actions
wheel-macos-big-sur-cp310-universal2 Github Actions
wheel-macos-big-sur-cp38-arm64 Github Actions
wheel-macos-big-sur-cp39-arm64 Github Actions
wheel-macos-big-sur-cp39-universal2 Github Actions
wheel-macos-mojave-cp310-amd64 Github Actions
wheel-macos-mojave-cp37-amd64 Github Actions
wheel-macos-mojave-cp38-amd64 Github Actions
wheel-macos-mojave-cp39-amd64 Github Actions
wheel-manylinux2014-cp310-amd64 Github Actions
wheel-manylinux2014-cp310-arm64 TravisCI
wheel-manylinux2014-cp37-amd64 Github Actions
wheel-manylinux2014-cp37-arm64 TravisCI
wheel-manylinux2014-cp38-amd64 Github Actions
wheel-manylinux2014-cp38-arm64 TravisCI
wheel-manylinux2014-cp39-amd64 Github Actions
wheel-manylinux2014-cp39-arm64 TravisCI
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

@kou
Member Author

kou commented Sep 29, 2022

I think that this is ready to merge.
Build failures aren't related to this change.

@kou kou merged commit 89c0214 into apache:master Oct 1, 2022
@kou kou deleted the cpp-cmake-python branch October 1, 2022 11:46
@ursabot

ursabot commented Oct 1, 2022

Benchmark runs are scheduled for baseline = 4dfa617 and contender = 89c0214. 89c0214 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.31% ⬆️0.0%] test-mac-arm
[Failed ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.11% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 89c0214f ec2-t3-xlarge-us-east-2
[Finished] 89c0214f test-mac-arm
[Failed] 89c0214f ursa-i9-9960x
[Finished] 89c0214f ursa-thinkcentre-m75q
[Finished] 4dfa6176 ec2-t3-xlarge-us-east-2
[Failed] 4dfa6176 test-mac-arm
[Failed] 4dfa6176 ursa-i9-9960x
[Finished] 4dfa6176 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

fatemehp pushed a commit to fatemehp/arrow that referenced this pull request Oct 17, 2022
…che#14273)

Restore the option, but mark it as deprecated, because the Python component in Apache Arrow C++ was moved to PyArrow by ARROW-16340. It will be removed in a future release.

Users should use CMake presets instead of ARROW_PYTHON, but CMake presets require CMake 3.19 or later.

Lead-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>