This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

SparkR Support #507

Conversation

@ifilonenko (Member) commented Sep 24, 2017

What changes were proposed in this pull request?

Initial Spark R support

How was this patch tested?

  • Initial submission step
  • Unit Tests
  • Docker files (tested)
  • Integration Tests

@@ -71,6 +76,7 @@ private[spark] class DriverConfigurationStepsOrchestrator(
.map(_.split(","))
.getOrElse(Array.empty[String]) ++
additionalMainAppPythonFile.toSeq ++
additionalMainAppRFile.toSeq ++
@ifilonenko (Member, Author):

Important here: similar to the Python primary resource, the R file is distributed via --files.
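
For illustration, a minimal sketch of what this means at submission time, using Spark's SparkLauncher API; the master URL and file path below are assumptions, not values from this PR:

import org.apache.spark.launcher.SparkLauncher

// Submit an R primary resource; the same file is also shipped through
// --files (spark.files), mirroring the orchestrator change quoted above.
val sparkApp = new SparkLauncher()
  .setMaster("k8s://https://192.168.99.100:8443") // assumed minikube endpoint
  .setDeployMode("cluster")
  .setAppResource("src/test/R/dataframe.R")       // R primary resource
  .addFile("src/test/R/dataframe.R")              // distributed via --files
  .launch()
sparkApp.waitFor()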

ADD R /opt/spark/R

RUN apk add --no-cache R && \
    rm -r /root/.cache
@ifilonenko (Member, Author):

Open to any recommendations here?

@ifilonenko self-assigned this Sep 24, 2017
@ifilonenko (Member, Author):

Upon merging, this PR closes #506

@ifilonenko (Member, Author) commented Sep 24, 2017

To run the integration tests in a proper R environment, R_HOME must be defined in the testing environment, which means everyone would need to install R. Is that something that would be an issue, or something that can be assumed if someone is building out a full dev environment for Spark? @foxish @erikerlandson
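
(For context, the requirement boils down to a pre-flight check like the sketch below; the failure style and message are assumptions, not the PR's actual code.)

// Fail fast when the test environment lacks an R installation.
val rHome = sys.env.getOrElse("R_HOME",
  sys.error("R_HOME must be set (i.e. R must be installed) to run the SparkR integration tests"))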

@foxish (Member) commented Sep 24, 2017

Hmm, interesting. Does the submitting node have the R dependency, or the driver?

@ifilonenko changed the title from [WIP] Spark R Support to SparkR Support on Sep 24, 2017
@ifilonenko (Member, Author):

The R dependency exists because we need to mimic the make-distribution environment in target/docker/R, so that when we run ADD R /opt/spark/R the R directory is already packaged in the Docker build context; this is similar to how we set up PySpark.
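
(A rough sketch of that staging step, assuming a make-distribution-style layout; the paths and helper name are illustrative only.)

import java.nio.file.{Files, Path, Paths, StandardCopyOption}
import scala.collection.JavaConverters._

// Mirror the distribution's R/ directory into the Docker build context so
// that `ADD R /opt/spark/R` can find it when the image is built.
def stageRDirectory(source: Path, target: Path): Unit = {
  for (p <- Files.walk(source).iterator().asScala) {
    val dest = target.resolve(source.relativize(p))
    if (Files.isDirectory(p)) Files.createDirectories(dest)
    else Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING)
  }
}

stageRDirectory(Paths.get("R"), Paths.get("target/docker/R"))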

@ifilonenko (Member, Author):

@ssuchter @varunkatta The integration tests will pass once R is installed and R_HOME is defined in the Jenkins environment.

PR is otherwise ready for review

@ifilonenko (Member, Author):

rerun integration tests please


@ifilonenko (Member, Author):

PR is ready for review @foxish @erikerlandson

@liyinan926 (Member):

@ifilonenko you also need to update sbin/build-push-docker-images.sh to add the new images.

@ifilonenko (Member, Author):

rerun unit tests please

@ifilonenko (Member, Author):

Ready for merging to branch-2.2 unless there are any other concerns. @erikerlandson @foxish @liyinan926

@liyinan926 (Member):

LGTM. Thanks for the work!

@ifilonenko force-pushed the branch-2.2-kubernetes branch from 2e71189 to 71bbbf0 on September 28, 2017 03:23
@ifilonenko (Member, Author):

rerun unit tests please

@ifilonenko (Member, Author):

Unless there are any further comments, I think this is ready to merge

@ifilonenko (Member, Author):

All set to merge? @foxish @erikerlandson @liyinan926

@liyinan926 (Member):

I think it's all good.

test("Run SparkR Job on file locally") {
assume(testBackend.name == MINIKUBE_TEST_BACKEND)

launchStagingServer(SSLOptions(), None)
Reviewer:

Don't think we need a staging server if the file is in the image.

Reviewer:

Actually I just think the test is misnamed - looks like this test is shipping the file while the test below expects the file to be on the container.

@ifilonenko (Member, Author):

src/test/R/dataframe.R is what we are pushing up to the RSS (resource staging server), which is why it is being launched. Hmm, isn't that file stored locally? What would be the correct naming convention?

val exitCode = process.waitFor()
if (exitCode != 0) {
  logInfo(s"exitCode: $exitCode")
val exitCodePython = process.waitFor()
Reviewer:

I've always been wondering - would it be possible to do this bootstrapping from Maven and not in the test code? Seems like this should be an environment step. Docker images should theoretically be built at the Maven step as well but we know that this is harder to do.

@ifilonenko (Member, Author):

It technically is possible but, as we discussed for PySpark, I wasn't able to figure that out. Any recommendations would be helpful.
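
(For reference, the in-test bootstrapping amounts to a shell-out along these lines; the script, flag, and SPARK_HOME handling are assumptions, not the PR's actual command. Moving it into a Maven phase, e.g. via exec-maven-plugin, would make it an environment step as suggested above.)

import java.io.File
import scala.sys.process._

// Assumed location of the Spark checkout; illustrative only.
val sparkHome = new File(sys.env.getOrElse("SPARK_HOME", "."))

// Build the artifacts the tests need before they run.
val exitCode = Process(Seq("./dev/make-distribution.sh", "--r"), sparkHome).!
if (exitCode != 0) sys.error(s"Bootstrap failed with exit code $exitCode")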

@felixcheung:

Hi - what's left for this work? This #507 (comment)?

@mccheah commented Oct 13, 2017

I left some comments. Also would like to see this tested in a production environment, but maybe we can just merge it and follow up as feedback comes in.

@ifilonenko (Member, Author):

@mccheah, as per your last comment, is this okay to merge then?

@mccheah commented Oct 16, 2017

Can merge when CI passes - I just updated the branch.

@ifilonenko (Member, Author):

ready for merging: @foxish

@foxish (Member) commented Oct 18, 2017

Nvm. Please ignore last comment. That was in the merge commit.

@foxish foxish merged commit f94499b into apache-spark-on-k8s:branch-2.2-kubernetes Oct 18, 2017
puneetloya pushed a commit to puneetloya/spark that referenced this pull request Mar 11, 2019
* initial R support without integration tests

* finished sparkR integration

* case sensitive file names in unix

* revert back to previous lower case in dockerfile

* addition into the build-push-docker-images
ifilonenko pushed a commit to bloomberg/apache-spark-on-k8s that referenced this pull request Mar 19, 2019
ifilonenko pushed a commit to bloomberg/apache-spark-on-k8s that referenced this pull request Apr 4, 2019