ARROW-13199: [R] add ubuntu 21.04 to nightly builds #10611

Closed
wants to merge 17 commits into from


jonkeane
Member

No description provided.

@github-actions

@jonkeane
Member Author

@github-actions crossbow submit test-r-rocker-r-edge-latest

@github-actions

Revision: e342c87

Submitted crossbow builds: ursacomputing/crossbow @ actions-521

Task Status
test-r-rocker-r-edge-latest Azure

@kszucs
Member

kszucs commented Jun 29, 2021

@jonkeane the change itself looks good to me, though the build is failing.

@jonkeane
Member Author

Thanks for taking a look at this. I actually think that I might change the implementation and manage a docker file that does something similar in our repo (while this one is published to rocker, it looks like quite a bit of a one-off).

The failure is accurate: we are not currently able to build on gcc11. I've tried bumping both aws-c-common and aws-sdk-cpp versions. The first resolves one error, though the second one doesn't build due to changes in the way that aws-sdk-cpp handles dependencies. I'm still trying to get it, but it looks like they've transitioned to a setup where they require using git (+ submodules) to grab the dependencies.

@jonkeane
Member Author

@github-actions crossbow submit test-r-ubuntu-21.04 test-r-gcc-11

@github-actions

Revision: 910cd0f

Submitted crossbow builds: ursacomputing/crossbow @ actions-535

Task Status
test-r-gcc-11 Github Actions
test-r-ubuntu-21.04 Github Actions

@jonkeane
Member Author

Ok, I've split these out into two different jobs:

  • one that is 21.04 directly (which fails when building R with a segfault on loading)
  • one that is testing gcc 11 and that shows issues building the aws-sdk

@jonkeane
Member Author

I'm going to try to resolve the 21.04 issue now, but we might need to punt on gcc11 support for now, especially in light of https://issues.apache.org/jira/browse/ARROW-13134, which is pinning to a version of the aws-sdk that won't build cleanly on gcc11.

@pitrou
Member

pitrou commented Jun 30, 2021

Or you can disable S3 on that build for now?

@jonkeane
Member Author

@github-actions crossbow submit -g r

@github-actions

Revision: 42b54cf

Submitted crossbow builds: ursacomputing/crossbow @ actions-539

Task Status
conda-linux-gcc-py36-cpu-r40 Azure
conda-linux-gcc-py37-cpu-r41 Azure
conda-osx-clang-py36-r40 Azure
conda-osx-clang-py37-r41 Azure
conda-win-vs2017-py36-r40 Azure
conda-win-vs2017-py37-r41 Azure
homebrew-r-autobrew Github Actions
test-r-devdocs Github Actions
test-r-gcc-11 Github Actions
test-r-install-local Github Actions
test-r-linux-as-cran Github Actions
test-r-linux-rchk Github Actions
test-r-linux-valgrind Azure
test-r-minimal-build Azure
test-r-rhub-ubuntu-gcc-release-latest Azure
test-r-rocker-r-base-latest Azure
test-r-rstudio-r-base-3.6-bionic Azure
test-r-rstudio-r-base-3.6-centos7-devtoolset-8 Azure
test-r-rstudio-r-base-3.6-centos8 Azure
test-r-rstudio-r-base-3.6-opensuse15 Azure
test-r-rstudio-r-base-3.6-opensuse42 Azure
test-r-rtools-35 Github Actions
test-r-ubuntu-21.04 Github Actions
test-r-version-compatibility Github Actions
test-r-versions Github Actions
test-r-without-arrow Azure
test-ubuntu-18.04-r-sanitizer Azure

@jonkeane jonkeane force-pushed the ARROW-13199-r-ubuntu-21.04 branch from 42b54cf to 9e932c0 Compare July 1, 2021 02:11
@jonkeane
Member Author

jonkeane commented Jul 1, 2021

@github-actions crossbow submit -g r

@github-actions

github-actions bot commented Jul 1, 2021

Revision: 9e932c0

Submitted crossbow builds: ursacomputing/crossbow @ actions-541

Task Status
conda-linux-gcc-py36-cpu-r40 Azure
conda-linux-gcc-py37-cpu-r41 Azure
conda-osx-clang-py36-r40 Azure
conda-osx-clang-py37-r41 Azure
conda-win-vs2017-py36-r40 Azure
conda-win-vs2017-py37-r41 Azure
homebrew-r-autobrew Github Actions
test-r-devdocs Github Actions
test-r-gcc-11 Github Actions
test-r-install-local Github Actions
test-r-linux-as-cran Github Actions
test-r-linux-rchk Github Actions
test-r-linux-valgrind Azure
test-r-minimal-build Azure
test-r-rhub-ubuntu-gcc-release-latest Azure
test-r-rocker-r-base-latest Azure
test-r-rstudio-r-base-3.6-bionic Azure
test-r-rstudio-r-base-3.6-centos7-devtoolset-8 Azure
test-r-rstudio-r-base-3.6-centos8 Azure
test-r-rstudio-r-base-3.6-opensuse15 Azure
test-r-rstudio-r-base-3.6-opensuse42 Azure
test-r-rtools-35 Github Actions
test-r-ubuntu-21.04 Github Actions
test-r-version-compatibility Github Actions
test-r-versions Github Actions
test-r-without-arrow Azure
test-ubuntu-18.04-r-sanitizer Azure

@jonkeane
Member Author

jonkeane commented Jul 1, 2021

@github-actions crossbow submit test-r-ubuntu-21.04 test-r-gcc-11

@github-actions

github-actions bot commented Jul 1, 2021

Revision: 973f4b0

Submitted crossbow builds: ursacomputing/crossbow @ actions-548

Task Status
test-r-gcc-11 Github Actions
test-r-ubuntu-21.04 Github Actions

@@ -19,6 +19,9 @@ ARG base
FROM ${base}
ARG arch

+ARG arrow_build_static="OFF"
Member

Please remove these.

@@ -84,7 +97,7 @@ COPY python/requirements-build.txt /arrow/python/
RUN pip install -r arrow/python/requirements-build.txt

ENV \
-ARROW_BUILD_STATIC=OFF \
+ARROW_BUILD_STATIC=${arrow_build_static} \
Member

Restore the default values instead of using a build argument.

@@ -95,7 +108,7 @@ ENV \
ARROW_PARQUET=ON \
ARROW_PLASMA=OFF \
ARROW_PYTHON=ON \
-ARROW_S3=ON \
+ARROW_S3=${arrow_s3} \
Member

Same as above.

UBUNTU: 21.04
CLANG_TOOLS: 9 # can remove this when >=9 is the default
GCC_VERSION: 11
ARROW_S3: OFF # S3 support is not buildable with gcc11 right now
Member

Add:

env: ...
flags: "-e ARROW_S3=OFF"

@@ -1006,6 +1008,8 @@ services:
arch: ${ARCH}
r: ${R}
base: ${REPO}:${ARCH}-ubuntu-${UBUNTU}-cpp
+gcc_version: ${GCC_VERSION}
+arrow_s3: ${ARROW_S3}
Member

Remove arrow_s3.

@@ -1019,6 +1023,38 @@ services:
/arrow/ci/scripts/python_build.sh /arrow /build &&
/arrow/ci/scripts/r_test.sh /arrow"

ubuntu-r-static:
Member

@kszucs kszucs Jul 2, 2021

ubuntu-r-static ideally should be removed and called with:

archery docker run -e ARROW_BUILD_STATIC=ON -e ANOTHER_ENV_VAR=ETC ubuntu-r

Member

See an example of a static C++ build using the ubuntu-cpp image.

Member Author

Ah, ha! That's what I was missing. Thanks for that pointer

Member

@kszucs kszucs left a comment

Build time variables

The variables in the .env files are build-time variables: they are used by the docker build commands, in other words to produce the docker images. We pass them as environment variables to the archery command itself:

UBUNTU=21.04 archery docker run ubuntu-cpp

Runtime variables

The "runtime" variables, by contrast, are passed to the docker container to alter the behavior of the command we run inside the image:

/bin/bash -c "
        /arrow/ci/scripts/cpp_build.sh /arrow /build &&
        /arrow/ci/scripts/python_build.sh /arrow /build &&
        /arrow/ci/scripts/r_test.sh /arrow"

We typically list these runtime variables at the end of the Dockerfiles and in the docker-compose.yml as ENV and environment variables (not build arguments). We can pass custom environment variables to the containers using:

archery docker run --env ARROW_PARQUET=OFF -e ARROW_BUILD_TESTS=ON ubuntu-cpp
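
To make the distinction above concrete, here is a minimal Python sketch of how the two kinds of variables could be assembled into one invocation. The helper name `archery_docker_run` and its parameters are hypothetical and only for illustration; the only things taken from this thread are the `archery docker run` command, its `-e` flag, and the variable names.

```python
import os
import shlex


def archery_docker_run(service, build_vars=None, runtime_vars=None):
    """Return (argv, env) for an `archery docker run` call.

    Hypothetical helper, not part of archery.
    build_vars go into the environment of the archery process itself,
    runtime_vars become `-e NAME=VALUE` flags passed to the container.
    """
    # Build-time variables: set in the environment of the archery process.
    env = dict(os.environ)
    env.update(build_vars or {})

    # Runtime variables: forwarded to the container with -e.
    argv = ["archery", "docker", "run"]
    for name, value in (runtime_vars or {}).items():
        argv += ["-e", f"{name}={value}"]
    argv.append(service)
    return argv, env


argv, env = archery_docker_run(
    "ubuntu-cpp",
    build_vars={"UBUNTU": "21.04"},
    runtime_vars={"ARROW_PARQUET": "OFF", "ARROW_BUILD_TESTS": "ON"},
)
print(shlex.join(argv))
```

The returned pair could then be handed to something like `subprocess.run(argv, env=env, check=True)`; the sketch deliberately stops short of running anything.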

@jonkeane
Member Author

jonkeane commented Jul 2, 2021

@github-actions crossbow submit test-r-ubuntu-21.04 test-r-gcc-11

@github-actions

github-actions bot commented Jul 2, 2021

Revision: b209a48

Submitted crossbow builds: ursacomputing/crossbow @ actions-561

Task Status
test-r-gcc-11 Github Actions
test-r-ubuntu-21.04 Github Actions

@jonkeane
Member Author

jonkeane commented Jul 2, 2021

Thanks for the help @kszucs , I've made the changes that implement what you suggested (I'll push them in a bit).

One difference when using the same ubuntu-r docker layer for both non-static and static builds is that the non-static one will build a version of the C++ library that we then ignore later when R is being installed, since we're running the same command:

    command: >
      /bin/bash -c "
        /arrow/ci/scripts/cpp_build.sh /arrow /build &&
        /arrow/ci/scripts/python_build.sh /arrow /build &&
        /arrow/ci/scripts/r_test.sh /arrow"

That wouldn't be a huge deal, but I'm running into an issue with the GCC11 build: if we run /arrow/ci/scripts/cpp_build.sh /arrow /build first, the R package fails to build the C++ library when it needs to: https://github.com/ursacomputing/crossbow/runs/2973848414#step:7:6488 (note: it's possible this is resolvable with a different change; I haven't dug too deeply into it, and I don't want to delay the other fixes in this ticket by tracking down this particular failure).

If I change this command to be:

    command: >
      /bin/bash -c "
        /arrow/ci/scripts/r_test.sh /arrow"

it passes just fine. Is there a way with archery to override the docker command (e.g. from the tasks.yml file) (like I can do with the docker-compose commands below)? I've looked and tried a few things, but nothing seemed to work.

Another alternative would be to create another docker-compose service like the following (and we might even move some of the content in flags here too), though this is similar to what I had before and that we've since changed:

  ubuntu-r-static:
    extends: ubuntu-r
    command: /bin/bash -c "/arrow/ci/scripts/r_test.sh /arrow"

@jonkeane
Member Author

jonkeane commented Jul 2, 2021

Hmm, now that I read the source, it looks like supplying a custom command should work out of the box, but this is what I get when I try it with this command:

UBUNTU=21.04 ARROW_R_DEV=TRUE CLANG_TOOLS=9 GCC_VERSION=11 docker-compose run -e ARROW_SOURCE_HOME="/arrow" -e FORCE_BUNDLED_BUILD=TRUE -e LIBARROW_BUILD=TRUE -e ARROW_S3=OFF ubuntu-r "/bin/bash -c '/arrow/ci/scripts/r_test.sh /arrow'"
Creating arrow_ubuntu-r_run ... done
Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: exec: "/bin/bash -c '/arrow/ci/scripts/r_test.sh /arrow'": stat /bin/bash -c '/arrow/ci/scripts/r_test.sh /arrow': no such file or directory: unknown
ERROR: 1
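
A likely reading of the error above: the quoted command reaches Docker as one single argument, so Docker looks for an executable whose name is the entire string. The Python sketch below only illustrates that difference in argument lists under this assumption; it does not run docker-compose, and whether the split form resolves the GCC 11 issue is not verified here.

```python
import shlex

service = "ubuntu-r"
command = "/bin/bash -c '/arrow/ci/scripts/r_test.sh /arrow'"

# Quoting the whole command in the shell yields ONE argument after the
# service name, which is then treated as the path of the executable.
as_single_word = ["docker-compose", "run", service, command]

# Splitting it with shell rules yields the executable and its arguments
# as separate entries.
as_separate_words = ["docker-compose", "run", service, *shlex.split(command)]

print(as_single_word[3:])
print(as_separate_words[3:])
```

In the second form the executable is `/bin/bash`, followed by `-c` and the script invocation as its own argument.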

@jonkeane
Member Author

jonkeane commented Jul 2, 2021

@github-actions crossbow submit test-r-ubuntu-21.04 test-r-gcc-11

@github-actions

github-actions bot commented Jul 2, 2021

Revision: 5ad97c6

Submitted crossbow builds: ursacomputing/crossbow @ actions-563

Task Status
test-r-gcc-11 Github Actions
test-r-ubuntu-21.04 Github Actions

@pachadotdev
Contributor

I moved to 21.04 on my laptop to avoid driver problems... and now I'm solving problems installing Arrow, like moving to a better circle in Dante's Inferno.

Here are the installation steps I used

git clone -b ARROW-13199-r-ubuntu-21.04 git@github.com:jonkeane/arrow.git ARROW-13199-r-ubuntu-21.04

cd ARROW-13199-r-ubuntu-21.04/r
ln -s '/home/pacha/github/arrow-setup/02-install-arrow.sh'
bash 02-install-arrow.sh

02-install-arrow.sh corresponds to https://github.com/pachadotdev/arrow-setup/blob/main/02-install-arrow.sh, and I installed using the ninja configuration from https://github.com/pachadotdev/arrow-setup/blob/main/01-install-dependencies.sh

This is the output after building successfully from this branch:

> arrow_info()
Arrow package version: 4.0.1.9000

Capabilities:
               
dataset    TRUE
parquet    TRUE
s3         TRUE
utf8proc   TRUE
re2        TRUE
snappy     TRUE
gzip       TRUE
brotli     TRUE
zstd       TRUE
lz4        TRUE
lz4_frame  TRUE
lzo       FALSE
bz2        TRUE
jemalloc   TRUE
mimalloc  FALSE

Memory:
                  
Allocator jemalloc
Current    0 bytes
Max        0 bytes

Runtime:
                        
SIMD Level          avx2
Detected SIMD Level avx2

Build:
                                                             
C++ Library Version                            5.0.0-SNAPSHOT
C++ Compiler                                              GNU
C++ Compiler Version                                   10.3.0
Git ID               5ad97c6c43b936a64c7124f0407799a768a5acae

@jonkeane
Member Author

jonkeane commented Jul 5, 2021

I've created https://issues.apache.org/jira/browse/ARROW-13261 to remove the additional service once GCC 11 building is fixed (either via https://issues.apache.org/jira/browse/ARROW-13241 or another ticket)

@jonkeane jonkeane closed this in 41c4143 Jul 5, 2021
@kszucs
Member

kszucs commented Jul 6, 2021

@jonkeane Sorry, just seen the notification. Just wanted to confirm that your solution looks good to me.

@jonkeane
Member Author

jonkeane commented Jul 6, 2021

No problem, thanks for the confirmation!

@kou
Member

kou commented Jul 6, 2021

@jonkeane It seems that we can't use extends in docker-compose.yml.

$ archery docker run python-wheel-manylinux-2014 bash
...
ValueError: Found errors with docker-compose:
 - The Compose file '/home/kou/work/cpp/arrow.kou/docker-compose.yml' is invalid because:
 - Unsupported config option for services.ubuntu-r-only-r: 'extends'

@jonkeane
Member Author

jonkeane commented Jul 7, 2021

Out of curiosity, what version of docker-compose are you running? It looks like extends was removed from the v3 specification, but then added back in. I'm running version 1.29.1 and it seems to work for me.

Regardless of it being supported in new versions, we might want to revert this to the prior full but more repetitive specification I had earlier in the PR process (or we could figure out / fix overriding the command issue I ran into).

@kou
Member

kou commented Jul 7, 2021

Thanks. I'm using 1.25.0 installed by apt: https://packages.debian.org/search?keywords=docker-compose

I'll use newer docker-compose.
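
For anyone wanting to check this up front, a minimal Python sketch of the version comparison follows. The 1.27.0 threshold is an assumption taken from the name of the follow-up branch in this thread (require-docker-compose-1.27.0-or-later), not something verified here, and the function names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.29.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Assumed minimum, taken from the follow-up branch name in this thread.
MINIMUM = parse_version("1.27.0")


def supports_extends(version_text):
    """Return True if the given docker-compose version meets the assumed minimum."""
    return parse_version(version_text) >= MINIMUM


for version in ("1.25.0", "1.29.1"):
    print(version, supports_extends(version))
```

With the two versions mentioned in this thread, 1.25.0 falls below the assumed minimum and 1.29.1 meets it, which matches what each of us is seeing.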

kou added a commit to kou/arrow that referenced this pull request Jul 8, 2021
kou added a commit that referenced this pull request Jul 8, 2021
We need it for "extends".

See also:

  * https://issues.apache.org/jira/browse/ARROW-13199
  * #10611
  * docker/compose#7588

Closes #10681 from kou/require-docker-compose-1.27.0-or-later

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>