Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-34105: [R] Provide extra output for failed builds #37727

Merged
merged 6 commits into from
Sep 15, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 16 additions & 8 deletions r/tools/nixlibs.R
Original file line number Diff line number Diff line change
Expand Up @@ -473,17 +473,25 @@ build_libarrow <- function(src_dir, dst_dir) {
env_vars <- env_vars_as_string(env_var_list)

cat("**** arrow", ifelse(quietly, "", paste("with", env_vars)), "\n")
status <- suppressWarnings(system(
paste(env_vars, "inst/build_arrow_static.sh"),
ignore.stdout = quietly, ignore.stderr = quietly

build_log_path <- tempfile(fileext = ".log")
status <- suppressWarnings(system2(
"bash",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure you need this, we didn't before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think my earlier approach (which is what @amoeba has done an excellent job of fixing here!) which saves stuff to an R vector is not great, as it leaves a lot of room for R weirdness. Perhaps @nealrichardson 's suggestion from Zulip might be worth trying instead:

Probably a better way would be to redirect the build output to a file, and print the file name if the build fails so that you can inspect it (though an installation of a source package would put that in a temp dir that would probably be removed on exit)

Could also print the last N lines of that output on failure (would have to experiment to find a reasonable N), that might be enough guidance to debug something

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on using a file and cating that in full if build fails. With the nature of CMake errors it is not always the case that the error is in the last lines (due to parallelism) and the issue with non-portable flag false positives is not an issue this way as the build has already failed...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My experience with build failures matches what @assignUser refers to with having to scroll up (sometimes quite far) to find the source of the failure. One trick I use to bring the error down is to re-build without parallelism but I'm not sure how best to implement that here.

So I'd be in favor of (1) writing to a file instead of capturing and (2) cating the full file to hedge against the tempfile disappearing before someone can grab it. I'll do that now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nealrichardson, Re:

I'm not sure you need this, we didn't before.

Were you referring to how I broke up the args in the system2 call? Calling the script the old way does work with system but we switched to system2 to capture stdin/stdout and calling system2 the same way resulted in this error,

sh: SOURCE_DIR='../cpp' BUILD_DIR='/var/folders/vm/bbcxtl3s63l7gn9h1tvfh1mm0000gn/T//RtmpoTYnF4/file166a06be38857' DEST_DIR='libarrow/arrow-13.0.0.9000' CMAKE='/opt/homebrew/bin/cmake' EXTRA_CMAKE_FLAGS='' CC='clang -arch arm64' CXX='clang++ -arch arm64 -std=gnu++17' LDFLAGS='-L/opt/R/arm64/lib' N_JOBS='2' ARROW_S3='OFF' ARROW_GCS='OFF' inst/build_arrow_static.sh: No such file or directory

I think this is because of what the man page for system2 says, which is:

Unlike system, command is always quoted by shQuote, so it must be a single command without arguments.

So I felt like my change to the args made sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, with 4246503, we now redirect stdout/stderr to a tempfile and print that file out on a build failure. The way I set it up, it looks like

**** Error building Arrow C++.
**** Printing contents of build log because the build failed while ARROW_R_DEV was set to FALSE
/* contents of build log */
**** Complete build log is available at /var/folders/vm/bbcxtl3s63l7gn9h1tvfh1mm0000gn/T//RtmpKBAT6R/file75d2a7f76ba.log

I printed the build log last in case we print so much information in the prior step that the user's terminal truncates the first part of the output so the user can still find the file (if it exists at this point).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to the next line, looks like the mobile app attached the comment to the wrong line. I'm not sure that directly calling "bash" is necessary or a good idea--we didn't before, the script has a shebang at the top, and bash might not be the shell that's available, a point that CRAN has been pedantic about in the past.

Copy link
Member Author

@amoeba amoeba Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, that makes sense now. I thought the pre-existing reliance on bash in nixlibs.R was odd, and we actually have this bit:

arrow/r/tools/nixlibs.R

Lines 625 to 627 in d9ee3a7

if (!isTRUE(Sys.which("bash") != "")) {
stop("nixlibs.R requires bash to be installed and available in your PATH")
}

and another spot where we make the command bash,

deps_bash_success <- system2("bash", deps_bash, stdout = csv_tempfile) == 0

So this change adds some consistency. I'd be happy to changes things if/when CRAN has any issues. Thanks for catching it.

"inst/build_arrow_static.sh",
env = env_vars,
stdout = ifelse(quietly, build_log_path, ""),
stderr = ifelse(quietly, build_log_path, "")
))

if (status != 0) {
# It failed :(
cat(
"**** Error building Arrow C++.",
ifelse(env_is("ARROW_R_DEV", "true"), "", "Re-run with ARROW_R_DEV=true for debug information."),
"\n"
)
cat("**** Error building Arrow C++.", "\n")
if (quietly) {
cat("**** Printing contents of build log because the build failed",
"while ARROW_R_DEV was set to FALSE\n")
cat(readLines(build_log_path), sep = "\n")
cat("**** Complete build log may still be present at", build_log_path, "\n")
}
}
invisible(status)
}
Expand Down