ARROW-5488: [R] Workaround when C++ lib not available
This is very much a work in progress. The idea is to replace the code generation that is usually done by `Rcpp::compileAttributes()` with something custom. This is driven by the `data-raw/codegen.R` file, which I'll polish. I'm opening this right now for feedback opportunities.

All of the `.cpp` files are guarded by:

```
#if defined(ARROW_R_WITH_ARROW)
...
#endif
```

`ARROW_R_WITH_ARROW` is defined by the configure script, and only when the Arrow C++ library was actually found and used for the build.
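As a rough illustration of the guard, here is a minimal, self-contained sketch. `ArrowAvailable()` and `BackendDescription()` are hypothetical names, not functions from the package, and a standard exception stands in for `Rcpp::stop()`. Only the `ARROW_R_WITH_ARROW` macro comes from this PR.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports at run time whether this translation unit
// was compiled with ARROW_R_WITH_ARROW defined (i.e. against the C++ library).
bool ArrowAvailable() {
#if defined(ARROW_R_WITH_ARROW)
  return true;
#else
  return false;
#endif
}

#if defined(ARROW_R_WITH_ARROW)
// Real branch: in the package, this is where code would call into libarrow.
std::string BackendDescription() { return "arrow C++ library"; }
#else
// Stub branch: same signature, so callers still compile and link,
// but any call fails with a clear error.
std::string BackendDescription() {
  throw std::runtime_error("arrow C++ library not available");
}
#endif
```

Compiled without `-DARROW_R_WITH_ARROW`, `ArrowAvailable()` returns `false` and `BackendDescription()` throws. With the flag, both take the real branch.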

For example, this function:

```cpp
// [[arrow::export]]
std::shared_ptr<arrow::Column> ipc___feather___TableReader__GetColumn(
    const std::unique_ptr<arrow::ipc::feather::TableReader>& reader, int i) {
  std::shared_ptr<arrow::Column> column;
  STOP_IF_NOT_OK(reader->GetColumn(i, &column));
  return column;
}
```

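triggers generation of a wrapper like this in the `generated.cpp` file (`arrowExports.cpp` in the final version of this PR). The wrapper shown here is the one for the sibling function `ipc___feather___TableReader__GetColumnName`: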

```cpp
#if defined(ARROW_R_WITH_ARROW)
std::string ipc___feather___TableReader__GetColumnName(const std::unique_ptr<arrow::ipc::feather::TableReader>& reader, int i);
SEXP _arrow_ipc___feather___TableReader__GetColumnName(SEXP reader_sexp, SEXP i_sexp){
  BEGIN_RCPP
  Rcpp::traits::input_parameter<const std::unique_ptr<arrow::ipc::feather::TableReader>&>::type reader(reader_sexp);
  Rcpp::traits::input_parameter<int>::type i(i_sexp);
  Rcpp::Shield<SEXP> rcpp_result_gen(Rcpp::wrap(ipc___feather___TableReader__GetColumnName(reader, i)));
  return rcpp_result_gen;
  END_RCPP
}
#else
SEXP _arrow_ipc___feather___TableReader__GetColumnName(SEXP reader_sexp, SEXP i_sexp){
  BEGIN_RCPP
  Rcpp::stop("arrow C++ library not available");
  END_RCPP
}
#endif
```

So the generated `SEXP`-level R API functions only call the real implementation when the C++ library is available; otherwise they just throw an error.

The same annotation also generates this in the `generated.R` file (`arrowExports.R` in the final version of this PR):

```r
ipc___feather___TableReader__GetColumnName <- function(reader, i) {
  .Call(`_arrow_ipc___feather___TableReader__GetColumnName`, reader, i)
}
```
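The templating step described above can be sketched as follows. The real generator is the R script `data-raw/codegen.R`; this is only a C++ illustration of the same idea, and `ExportedFunction`, `JoinParams`, `GenerateCppWrapper` and `GenerateRWrapper` are hypothetical names. The emitted text follows the shape of the examples above but is not guaranteed to match the real generator's output character for character, and cases such as `void` return types are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one `// [[arrow::export]]` function:
// its C++ return type, its name, and its (type, name) parameter pairs.
struct ExportedFunction {
  std::string return_type;
  std::string name;
  std::vector<std::pair<std::string, std::string>> params;
};

// Joins the parameters with ", ", formatting each one via `format`.
template <typename Format>
std::string JoinParams(const ExportedFunction& fn, Format format) {
  std::string out;
  for (std::size_t i = 0; i < fn.params.size(); ++i) {
    if (i > 0) out += ", ";
    out += format(fn.params[i]);
  }
  return out;
}

// Emits the guarded C++ wrapper pair: the real call when ARROW_R_WITH_ARROW
// is defined, and an erroring stub with the same signature otherwise.
std::string GenerateCppWrapper(const ExportedFunction& fn) {
  assert(!fn.name.empty());
  using Param = std::pair<std::string, std::string>;
  const std::string sexp_args =
      JoinParams(fn, [](const Param& p) { return "SEXP " + p.second + "_sexp"; });
  const std::string decl_args =
      JoinParams(fn, [](const Param& p) { return p.first + " " + p.second; });
  const std::string call_args =
      JoinParams(fn, [](const Param& p) { return p.second; });
  const std::string signature = "SEXP _arrow_" + fn.name + "(" + sexp_args + ")";

  std::ostringstream out;
  out << "#if defined(ARROW_R_WITH_ARROW)\n"
      << fn.return_type << " " << fn.name << "(" << decl_args << ");\n"
      << signature << " {\n"
      << "  BEGIN_RCPP\n";
  for (const Param& p : fn.params) {
    out << "  Rcpp::traits::input_parameter<" << p.first << ">::type "
        << p.second << "(" << p.second << "_sexp);\n";
  }
  out << "  Rcpp::Shield<SEXP> rcpp_result_gen(Rcpp::wrap(" << fn.name << "("
      << call_args << ")));\n"
      << "  return rcpp_result_gen;\n"
      << "  END_RCPP\n"
      << "}\n"
      << "#else\n"
      << signature << " {\n"
      << "  BEGIN_RCPP\n"
      << "  Rcpp::stop(\"arrow C++ library not available\");\n"
      << "  END_RCPP\n"
      << "}\n"
      << "#endif\n";
  return out.str();
}

// Emits the matching R wrapper, which forwards to the C entry point via .Call().
std::string GenerateRWrapper(const ExportedFunction& fn) {
  using Param = std::pair<std::string, std::string>;
  const std::string args =
      JoinParams(fn, [](const Param& p) { return p.second; });
  std::ostringstream out;
  out << fn.name << " <- function(" << args << ") {\n"
      << "  .Call(`_arrow_" << fn.name << "`";
  if (!args.empty()) out << ", " << args;
  out << ")\n}\n";
  return out.str();
}
```

Because the `#else` stub has the same `SEXP` signature as the real wrapper, the R side can register and call the same symbol whether or not the C++ library was present at build time.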

This also needed some extra care in test functions so that the tests only run if Arrow is available.

@wesm's suggestion in https://issues.apache.org/jira/browse/ARROW-5488 (quoted below) might be more practical, and closer to what @jjallaire mentioned about how the `RcppParallel` package handles Intel TBB:
https://github.com/RcppCore/RcppParallel/blob/master/R/hooks.R

> One possibility is to bundle the Arrow header files with the CRAN package and build against them, but do not include libarrow and libparquet when linking. When the library is loaded, the libraries must be loaded in-process via dlopen before loading the Rcpp extensions. The C++ libraries can be installed then after the fact

Author: Romain Francois <[email protected]>
Author: Romain François <[email protected]>

Closes #4471 from romainfrancois/ARROW-5488/workaround and squashes the following commits:

d13dbfbe <Romain Francois> update error message to mention arrow::install_arrow()
1b6a737c <Romain Francois> Merge branch 'ARROW-5488/workaround' of https://github.com/romainfrancois/arrow into ARROW-5488/workaround
418fc2d0 <Romain Francois> mention codegen in the README
9a1b3e66 <Romain Francois> glue() not needed here
496eaf52 <Romain Francois> when brew is available but apache-arrow is not installed, install it
b8a6576b <Romain Francois> no need for stringr
238fa14d <Romain Francois> rm message about ARROW_R_DEV not being set
97888116 <Romain François> Merge branch 'master' into ARROW-5488/workaround
346978d0 <Romain Francois> update docker files.
bb9eed2d <Romain Francois> not showing diff for generated files
1df1e395 <Romain Francois> phrase shim message more positively
3b043696 <Romain Francois> use `\dontrun{}` in examples
562670df <Romain Francois> RcppExport all the things
6c06581e <Romain Francois> Using arrow:::arrow_available()
75b0751a <Romain Francois> Move symbols.h back into arrow_types.h
bd7f30d7 <Romain Francois> 🐀
e8ac3ea9 <Romain Francois> using arrowExports.(R,cpp). update configure script
84fa149f <Romain Francois> lint
54ee38d4 <Romain Francois> 🐀
d9e4f944 <Romain Francois> added a second R job to install and check the package on a system without libarrow
0479a753 <Romain Francois> code generation without rap package, using purrr instead
008eaf4a <Romain Francois> update generated code, fix merge conflicts
f9575045 <Romain Francois> update test_that shim
1170236b <Romain Francois> not necessary
1ed55966 <Romain Francois> Workaround so that the R package still checks without the C++ library.
78f69af6 <Romain Francois> move symbols declaration to their own file, might end up be generated automatically later somehow.
romainfrancois committed Jun 12, 2019
1 parent 2ec4583 commit beff97c
Showing 4 changed files with 20 additions and 5 deletions.
2 changes: 2 additions & 0 deletions .gitattributes
@@ -1,4 +1,6 @@
  r/R/RcppExports.R linguist-generated=true
+ r/R/arrowExports.R linguist-generated=true
  r/src/RcppExports.cpp linguist-generated=true
+ r/src/arrowExports.cpp linguist-generated=true
  r/man/*.Rd linguist-generated=true

17 changes: 16 additions & 1 deletion .travis.yml
@@ -365,7 +365,22 @@ matrix:
- pushd ${TRAVIS_BUILD_DIR}/r
after_success:
- Rscript ../ci/travis_upload_r_coverage.R

+ - name: R_no_libarrow
+ language: r
+ cache: packages
+ latex: false
+ dist: xenial
+ before_install:
+ # Have to copy-paste this here because of how R's build steps work
+ - eval `python $TRAVIS_BUILD_DIR/ci/detect-changes.py`
+ - if [ $ARROW_CI_R_AFFECTED != "1" ]; then exit; fi
+ - |
+ if [ $TRAVIS_OS_NAME == "linux" ]; then
+ sudo bash -c "echo -e 'Acquire::Retries 10; Acquire::http::Timeout \"20\";' > /etc/apt/apt.conf.d/99-travis-retry"
+ sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test
+ sudo apt-get update -qq
+ fi
+ - pushd ${TRAVIS_BUILD_DIR}/r

after_failure:
- |
2 changes: 0 additions & 2 deletions ci/docker_build_r.sh
@@ -23,8 +23,6 @@ export ARROW_HOME=$CONDA_PREFIX
# Build arrow
pushd /arrow/r

- rm src/RcppExports*
- Rscript -e "Rcpp::compileAttributes()"
R CMD build --keep-empty-dirs .
R CMD INSTALL $(ls | grep arrow_*.tar.gz)

4 changes: 2 additions & 2 deletions dev/release/rat_exclude_files.txt
@@ -193,8 +193,8 @@ csharp/test/Directory.Build.props
*.svg
*.devhelp2
*.scss
- r/R/RcppExports.R
- r/src/RcppExports.cpp
+ r/R/arrowExports.R
+ r/src/arrowExports.cpp
r/DESCRIPTION
r/LICENSE.md
r/NAMESPACE