Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20197][SPARKR] CRAN check fail with package installation #17516

Closed
wants to merge 2 commits into from

Conversation

felixcheung
Copy link
Member

@felixcheung felixcheung commented Apr 2, 2017

What changes were proposed in this pull request?

Test failed because SPARK_HOME is not set before Spark is installed.

@SparkQA
Copy link

SparkQA commented Apr 2, 2017

Test build #75469 has finished for PR 17516 at commit c89380e.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member Author

felixcheung commented Apr 2, 2017

unlike in Jenkins, R CMD check looks to be running each test R file with the current directory set to its directory (eg. spark/R/pkg/inst/tests/testthat/ for test_sparkSQL.R)

when running as R CMD check, SPARK_HOME = ~/.cache/spark/spark* whereas spark-warehouse and metastore_db are actually created under R/inst/tests/testthat (where the actual test .R files are)

unless there is a good way to run all tests the exact same way as R CMD check, we will need to revisit what we are checking here.

@shivaram
Copy link
Contributor

shivaram commented Apr 3, 2017

Looking at this now

@shivaram
Copy link
Contributor

shivaram commented Apr 3, 2017

So does the current patch pass R CMD check when run on the master branch ?

@felixcheung felixcheung changed the title [SPARK-20197][SPARKR][WIP] improve check for directory to cover differences with R CMD check [SPARK-20197][SPARKR] CRAN check fail with package installation Apr 7, 2017
@SparkQA
Copy link

SparkQA commented Apr 7, 2017

Test build #75589 has finished for PR 17516 at commit a3e8b35.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

shivaram commented Apr 7, 2017

Don't we also need the skip if cran statement ?

@felixcheung
Copy link
Member Author

felixcheung commented Apr 7, 2017 via email

@shivaram
Copy link
Contributor

shivaram commented Apr 7, 2017

The test passes even when we run the R CMD check --as-cran from a different directory ? I thought the fix in branch-2.1 was to get around that (my understanding could be wrong)

@felixcheung
Copy link
Member Author

felixcheung commented Apr 7, 2017

There are two parts to the branch-2.1 fix

First, the reason why the test failed was because SPARK_HOME was not set before calling spark.install() when running as a package. This would not be a problem when in Jenkins, but only when running with R CMD check SparkR*.tgz. The fix was to move spark.install to earlier.

Second, even after the change, while testing it, I found that R CMD check was getting spark-warehouse etc in the testthat directory, NOT in SPARK_HOME - therefore that test would be essentially a no-op or always passes anyway. I made the call to disable it (with skip_if_cran), but that had the unintended effect of also turning off that test in Jenkins, as we are testing with --as-cran

And so the attempt here in this PR to fix this for real in master. Since we are rolling our RC anytime, I don't want to delay the first fix (install.spark) only to sort out the 2nd part, which could come a bit later.

If you feel that's safer, we could also add skip_if_cran to this test in master - just know that it will also turn off this test in Jenkins. Since with R CMD check the spark-warehouse and metastore_db are not written to SPARK_HOME, but to testthat, this test will pass during the package test with R CMD check - so long as we merge this PR to move install.spark first

@shivaram
Copy link
Contributor

shivaram commented Apr 7, 2017

Got it. LGTM. Thanks for explanation. I'm fine with merging this to master !

@felixcheung
Copy link
Member Author

thanks, I find it rather odd but probably by design that the current directory is different when running R CMD check .tgz. will need to look at the more

@felixcheung
Copy link
Member Author

merged to master

@asfgit asfgit closed this in 8feb799 Apr 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants