ARROW-3366: [R] Dockerfile for docker-compose setup #2770
I don't really speak docker, but that LGTM
@jameslamb great, thanks for implementing it! I've recently changed the semantics of the docker-compose containers to optimize the build times. We need to adjust the image a little bit.
Is this ready to go?
@wesm @romainfrancois sorry for the delay, no, this is not ready to go yet. I will update it this weekend and
@kszucs I got past the install issue (by updating Running
Makes it through building the C++ package and compiling the source code in the R package, but fails to install the R library with this error:
I found the same problem reported in #1319, so I'm going to look into that diff.
Oh also I rebased to
Codecov Report
@@            Coverage Diff            @@
##           master    #2770     +/-   ##
=========================================
+ Coverage   86.63%   86.63%   +<.01%
=========================================
  Files         491      491
  Lines       69438    69438
=========================================
+ Hits        60157    60158       +1
+ Misses       9189     9188       -1
  Partials       92       92
I put the R build instructions (like But still stuck on this undefined symbol error. Any advice would be appreciated!
@jameslamb it looks like one part of the toolchain used
@kszucs thanks for the help with this! IMHO it wouldn't be worthwhile to cache the built R package, since checking whether it can be recompiled (given some change in the As for the test errors...I can replicate them inside the container but they don't show up on my Mac or on Travis, so definitely something I need to look into. I don't think we need to worry about the portability warning in the
I have added #2888 to deal with one problem with the R package (an undeclared dependency), but I'm still struggling. The current error I get when trying to install the package:
I'm actually able to install the package with
And can technically library it in
But that undefined symbol error comes up when I run I'm not sure what to do next. I've looked through the Travis setup for R (was succeeding at least as of this build) but nothing in there is striking me as materially different from what I've done.
coming back to this tonight. I'm still stuck on that undefined symbol error. Will report back if I make progress.
You have a build toolchain problem (compiler flags). One component is using the gcc5 ABI while another is not.
In Travis CI I think we are using gcc < 5 so this can't happen there
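For readers unfamiliar with the ABI mismatch described above, here is a minimal sketch of how a translation unit can report which libstdc++ ABI it was compiled with. It assumes the mismatch in question is the libstdc++ "dual ABI" controlled by the `_GLIBCXX_USE_CXX11_ABI` macro; the helper names `libstdcxx_abi` and `describe_abi` are hypothetical and not part of Arrow or this PR.

```cpp
#include <string>  // any libstdc++ header makes _GLIBCXX_USE_CXX11_ABI visible

// Returns 1 for the new (gcc5) ABI, 0 for the old ABI, and -1 when the macro
// is not defined (for example, gcc < 5 or a non-libstdc++ standard library).
constexpr int libstdcxx_abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
  return _GLIBCXX_USE_CXX11_ABI;
#else
  return -1;
#endif
}

// Human-readable label for the value returned by libstdcxx_abi().
inline const char* describe_abi(int abi) {
  switch (abi) {
    case 1:  return "new (gcc5) ABI";
    case 0:  return "old (pre-gcc5) ABI";
    default: return "unknown (macro not defined)";
  }
}
```

Compiling the same file once with and once without `-D_GLIBCXX_USE_CXX11_ABI=0` and comparing `describe_abi(libstdcxx_abi())` is one way to confirm whether a flag actually reaches a given compilation step.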
oh interesting ok! Looking into it.
ahhhh I see this in
And confirmed I'm running gcc 7.x in the container. Ok this is progress!
See my comment above for the flag you need to pass to the part that is being built with the newer ABI
I'm already using this:
And I can tell from the R logs that R is respecting that flag. Is that the flag you're talking about? The symbol it's complaining about is in Is there some other environment variable I need to be setting to avoid trying to build the torch stuff? It's not obvious to me looking at
So in the error message
This means that whatever shared library is linking to libarrow.so has inlined some functions from the Arrow headers with the gcc5 ABI. Unmangled, the symbol is
so in some compilation unit the CXXFLAGS aren't being propagated
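As an illustration of the unmangling step mentioned above, the following sketch demangles a symbol name and checks whether it refers to new-ABI entities. It assumes a GCC or Clang toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>`; the helpers `demangle` and `uses_cxx11_abi` are hypothetical names, and the symbol in the usage note is a generic `std::string` constructor, not the symbol from this thread.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <string>

// Demangle a symbol name; returns the input unchanged if demangling fails.
inline std::string demangle(const std::string& mangled) {
  int status = 0;
  char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) return mangled;
  std::string result(out);
  std::free(out);  // the buffer is malloc-allocated by __cxa_demangle
  return result;
}

// True if the symbol mentions new-ABI names (std::__cxx11 or [abi:cxx11]).
inline bool uses_cxx11_abi(const std::string& mangled) {
  const std::string d = demangle(mangled);
  return d.find("__cxx11") != std::string::npos ||
         d.find("[abi:cxx11]") != std::string::npos;
}
```

For example, `uses_cxx11_abi("_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1Ev")` reports a new-ABI symbol, which is the kind of evidence that points at a compilation unit built without the old-ABI flag. The `c++filt` command-line tool performs the same demangling.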
Here's what I'm seeing running this locally: https://gist.github.com/wesm/ac8f6f8072620e0bb4c9de07f7deb5f3 So the extension appears to link fine. In this part
"checking whether package 'arrow' can be installed ... OK" took a LONG time. Is it compiling the extension again? I speculate that it's at this point that the CXXFLAGS in the environment is being cleansed. It seems logical to me that R CMD CHECK would want to be robust to snowflake-y environment variables.
@romainfrancois is there a way to force a certain value in CXXFLAGS to be appended without relying on an environment variable?
@wesm I'm going to try setting it directly in Will report back shortly
That wouldn't be a long term solution (since this variable needs to match libarrow.so, which might have been built with the newer ABI), but at least it will unblock us here for now
yeah I mean at least if that works, we've at least isolated the problem and I can open an issue documenting it for someone to pick up in the future.
I ALMOST GOT IT!!!! I think I'm one failing unit test away from this working. This test (https://github.com/apache/arrow/blob/master/r/tests/testthat/test-RecordBatch.R#L140) on I have a few ideas, testing now. (also, btw, I force pushed because I rebased to as-of-5-minutes-ago master). Updates to follow.
@@ -137,7 +137,6 @@ test_that("read_record_batch handles various streams (ARROW-3450, ARROW-3505)",
   batch4 <- read_record_batch(mmap_file)
   batch5 <- read_record_batch(bytes)
   batch6 <- read_record_batch(buf_reader)
-  expect_error(read_record_batch(buf_reader))
@romainfrancois I believe this PR is now working! However, I want to talk about this test.
I had to remove this to get R CMD CHECK or devtools::test() to complete. I think the issue is that expect_error cannot trap C++ exceptions. When running with this test restored, testing immediately stops with this error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted
Have you ever seen that issue with expect_error() before? I tried googling around and nothing obvious jumped out at me. I have no idea how this could be working on Travis but not in my setup on docker :/.
Created issue for it: https://issues.apache.org/jira/browse/ARROW-3833
Another question is why an exception is being thrown at all
That’s what Rcpp::stop does, but this is supposed to be caught by the generated code.
I’ll have a look but this should not happen.
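The idea of generated code catching a C++ exception before it reaches the caller can be sketched generically as follows. This is not Rcpp's actual generated wrapper; `CallResult` and `guarded_call` are hypothetical names used only to show the pattern of converting an exception into an error value at a language boundary. If an exception escapes such a boundary uncaught, the C++ runtime calls `std::terminate`, which matches the "terminate called after throwing" output quoted earlier.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical result type standing in for an error reported to the caller.
struct CallResult {
  bool ok;
  std::string error;
};

// Run `body` and translate any C++ exception into an error value instead of
// letting it propagate past the boundary.
inline CallResult guarded_call(const std::function<void()>& body) {
  try {
    body();
    return {true, ""};
  } catch (const std::exception& e) {
    return {false, e.what()};
  } catch (...) {
    return {false, "unknown C++ exception"};
  }
}
```

A call such as `guarded_call([] { throw std::runtime_error("boom"); })` returns a result with `ok == false` and `error == "boom"` rather than aborting the process.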
LGTM Thanks @jameslamb !
Thank you to @romainfrancois and others who have pushed forward the R side of this project!
This PR is my attempt to address ARROW-3366, providing a testing container for the R package.
This follows up on work done by @kszucs in #2572 in an R-specific way.
NOTE: This PR is a WIP.
R CMD INSTALL currently fails because it cannot find wherever I installed arrow to. But I felt that this is far enough along to put up for review.