Access the Driver Launcher Server over NodePort for app launch + submit jars (apache#30)

* Revamp ports and service setup for the driver.

- Expose the driver-submission service on NodePort and contact it directly
instead of going through the API server proxy
- Restrict the ports exposed on the service: only the driver-submission
port while content is being uploaded, and only the Spark UI port after
the job has started

* Move service creation down and add more thorough error handling

* Fix missed merge conflict

* Add braces

* Fix bad merge

* Address comments and refactor run() more.

Method nesting was getting confusing, so the inner class was pulled out and
the extra layer of method indirection removed from createDriverPod()

* Remove unused method

* Support SSL configuration for the driver application submission (apache#49)

* Support SSL when setting up the driver.

The user can provide a keyStore to load onto the driver pod and the
driver pod will use that keyStore to set up SSL on its server.

* Clean up SSL secrets after finishing submission.

We don't need to persist these once the pod has mounted them and is already
running.

* Fix compilation error

* Revert image change

* Address comments

* Programmatically generate certificates for integration tests.

* Address comments

* Resolve merge conflicts

* Fix bad merge

* Remove unnecessary braces

* Fix compiler error
mccheah authored and ash211 committed Mar 8, 2017
1 parent 3b5901a commit 2e992be
Showing 10 changed files with 637 additions and 252 deletions.
@@ -19,16 +19,16 @@ package org.apache.spark.deploy.rest

import javax.servlet.http.{HttpServlet, HttpServletRequest, HttpServletResponse}

import scala.io.Source

import com.fasterxml.jackson.core.JsonProcessingException
import org.eclipse.jetty.server.{HttpConnectionFactory, Server, ServerConnector}
import org.eclipse.jetty.http.HttpVersion
import org.eclipse.jetty.server.{HttpConfiguration, HttpConnectionFactory, Server, ServerConnector, SslConnectionFactory}
import org.eclipse.jetty.servlet.{ServletContextHandler, ServletHolder}
import org.eclipse.jetty.util.thread.{QueuedThreadPool, ScheduledExecutorScheduler}
import org.json4s._
import org.json4s.jackson.JsonMethods._
import scala.io.Source

import org.apache.spark.{SPARK_VERSION => sparkVersion, SparkConf}
import org.apache.spark.{SPARK_VERSION => sparkVersion, SparkConf, SSLOptions}
import org.apache.spark.internal.Logging
import org.apache.spark.util.Utils

@@ -50,7 +50,8 @@ import org.apache.spark.util.Utils
private[spark] abstract class RestSubmissionServer(
val host: String,
val requestedPort: Int,
val masterConf: SparkConf) extends Logging {
val masterConf: SparkConf,
val sslOptions: SSLOptions = SSLOptions()) extends Logging {
protected val submitRequestServlet: SubmitRequestServlet
protected val killRequestServlet: KillRequestServlet
protected val statusRequestServlet: StatusRequestServlet
@@ -79,19 +80,32 @@
* Return a 2-tuple of the started server and the bound port.
*/
private def doStart(startPort: Int): (Server, Int) = {
// TODO consider using JettyUtils#startServer to do this instead
val threadPool = new QueuedThreadPool
threadPool.setDaemon(true)
val server = new Server(threadPool)

val resolvedConnectionFactories = sslOptions
.createJettySslContextFactory()
.map(sslFactory => {
val sslConnectionFactory = new SslConnectionFactory(
sslFactory, HttpVersion.HTTP_1_1.asString())
val rawHttpConfiguration = new HttpConfiguration()
rawHttpConfiguration.setSecureScheme("https")
rawHttpConfiguration.setSecurePort(startPort)
val rawHttpConnectionFactory = new HttpConnectionFactory(rawHttpConfiguration)
Array(sslConnectionFactory, rawHttpConnectionFactory)
}).getOrElse(Array(new HttpConnectionFactory()))

val connector = new ServerConnector(
server,
null,
// Call this full constructor to set this, which forces daemon threads:
new ScheduledExecutorScheduler("RestSubmissionServer-JettyScheduler", true),
null,
-1,
-1,
new HttpConnectionFactory())
server,
null,
// Call this full constructor to set this, which forces daemon threads:
new ScheduledExecutorScheduler("RestSubmissionServer-JettyScheduler", true),
null,
-1,
-1,
resolvedConnectionFactories: _*)
connector.setHost(host)
connector.setPort(startPort)
server.addConnector(connector)
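The core of the change above is the optional SslConnectionFactory layered in front of the plain HttpConnectionFactory. The standalone Scala sketch below illustrates the same wiring pattern outside of Spark, assuming Jetty 9.2 is on the classpath; the keystore path, password, port, and the SslConnectorSketch object name are placeholders for illustration, not part of the patch.

import org.eclipse.jetty.http.HttpVersion
import org.eclipse.jetty.server.{ConnectionFactory, HttpConfiguration, HttpConnectionFactory, Server, ServerConnector, SslConnectionFactory}
import org.eclipse.jetty.util.ssl.SslContextFactory

object SslConnectorSketch {
  def main(args: Array[String]): Unit = {
    val port = 8443

    // Stand-in for sslOptions.createJettySslContextFactory(): present only when
    // SSL is configured. The keystore path and password are placeholders.
    val sslContextFactory: Option[SslContextFactory] = Some {
      val factory = new SslContextFactory()
      factory.setKeyStorePath("/tmp/example-keystore.jks")
      factory.setKeyStorePassword("changeit")
      factory
    }

    val server = new Server()
    // Same decision as in doStart(): layer SSL in front of HTTP when a context
    // factory exists, otherwise fall back to a plain HTTP connection factory.
    val connectionFactories: Array[ConnectionFactory] = sslContextFactory.map { factory =>
      val sslConnectionFactory =
        new SslConnectionFactory(factory, HttpVersion.HTTP_1_1.asString())
      val httpConfiguration = new HttpConfiguration()
      httpConfiguration.setSecureScheme("https")
      httpConfiguration.setSecurePort(port)
      Array[ConnectionFactory](sslConnectionFactory, new HttpConnectionFactory(httpConfiguration))
    }.getOrElse(Array[ConnectionFactory](new HttpConnectionFactory()))

    val connector = new ServerConnector(server, connectionFactories: _*)
    connector.setPort(port)
    server.addConnector(connector)
    server.start()
    server.join()
  }
}

When no SSL configuration is supplied, the getOrElse branch keeps the connector on plain HTTP, which matches the default SSLOptions() value of the new constructor parameter.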
18 changes: 18 additions & 0 deletions docs/running-on-kubernetes.md
@@ -132,6 +132,24 @@ To specify a main application resource that is in the Docker image, and if it ha
--conf spark.kubernetes.executor.docker.image=registry-host:5000/spark-executor:latest \
container:///home/applications/examples/example.jar

### Setting Up SSL For Submitting the Driver

When submitting to Kubernetes, a pod is started for the driver, and the pod starts an HTTP server. This HTTP server
receives the driver's configuration, including uploaded driver jars, from the client before starting the application.
Spark supports using SSL to encrypt the traffic in this bootstrapping process. It is recommended to configure this
whenever possible.

See the [security page](security.html) and [configuration](configuration.html) sections for more information on
configuring SSL; use the prefix `spark.ssl.kubernetes.driverlaunch` when configuring the SSL-related fields in the
context of submitting to Kubernetes. For example, to set the trustStore used when the local machine communicates with
the driver pod while starting the application, set `spark.ssl.kubernetes.driverlaunch.trustStore`.

Note that the keyStore can be specified as either a file on the client machine or a file on the container image's disk.
Thus `spark.ssl.kubernetes.driverlaunch.keyStore` can be a URI with a scheme of either `file:` or `container:`. A scheme
of `file:` means the keyStore is located on the client machine; it is mounted onto the driver container as a
[secret volume](https://kubernetes.io/docs/user-guide/secrets/). When the URI has the scheme `container:`, the file is
assumed to already be on the container's disk at the appropriate path.
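
For example, a submission with SSL enabled for the driver launch could look like the sketch below. This assumes the
standard `spark.ssl` field names (`enabled`, `keyStore`, `keyStorePassword`, `trustStore`) are honored under this
prefix; the master address, image names, passwords, and file paths are placeholders:

    bin/spark-submit \
      --deploy-mode cluster \
      --class com.example.applications.SampleApplication \
      --master k8s://https://192.168.99.100:8443 \
      --conf spark.kubernetes.driver.docker.image=registry-host:5000/spark-driver:latest \
      --conf spark.kubernetes.executor.docker.image=registry-host:5000/spark-executor:latest \
      --conf spark.ssl.kubernetes.driverlaunch.enabled=true \
      --conf spark.ssl.kubernetes.driverlaunch.keyStore=file:///opt/secrets/driver-keystore.jks \
      --conf spark.ssl.kubernetes.driverlaunch.keyStorePassword=changeit \
      --conf spark.ssl.kubernetes.driverlaunch.trustStore=/opt/secrets/driver-truststore.jks \
      container:///home/applications/examples/example.jar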

### Spark Properties

Below are some other common properties that are specific to Kubernetes. Most of the other configurations are the same
7 changes: 6 additions & 1 deletion pom.xml
@@ -137,6 +137,7 @@
<parquet.version>1.8.1</parquet.version>
<hive.parquet.version>1.6.0</hive.parquet.version>
<feign.version>8.18.0</feign.version>
<bouncycastle.version>1.52</bouncycastle.version>
<jetty.version>9.2.16.v20160414</jetty.version>
<javaxservlet.version>3.1.0</javaxservlet.version>
<chill.version>0.8.0</chill.version>
@@ -331,7 +332,11 @@
<artifactId>okhttp</artifactId>
<version>3.4.1</version>
</dependency>

<dependency>
<groupId>org.bouncycastle</groupId>
<artifactId>bcpkix-jdk15on</artifactId>
<version>${bouncycastle.version}</version>
</dependency>
<!-- This artifact is a shaded version of ASM 5.0.4. The POM that was used to produce this
is at https://github.com/apache/geronimo-xbean/tree/xbean-4.4/xbean-asm5-shaded
For context on why we shade ASM, see SPARK-782 and SPARK-6152. -->
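The new bcpkix-jdk15on dependency supports the "programmatically generate certificates for integration tests" item from the commit message. A minimal Scala sketch of what generating a self-signed test certificate with this library can look like is below; the subject name, key size, and validity window are arbitrary illustration values, not the ones used by the actual tests.

import java.math.BigInteger
import java.security.KeyPairGenerator
import java.util.Date
import javax.security.auth.x500.X500Principal

import org.bouncycastle.cert.jcajce.{JcaX509CertificateConverter, JcaX509v3CertificateBuilder}
import org.bouncycastle.operator.jcajce.JcaContentSignerBuilder

object SelfSignedCertSketch {
  def main(args: Array[String]): Unit = {
    // Generate an RSA key pair for the throwaway test certificate.
    val keyPairGenerator = KeyPairGenerator.getInstance("RSA")
    keyPairGenerator.initialize(2048)
    val keyPair = keyPairGenerator.generateKeyPair()

    // Self-signed: the issuer and the subject are the same principal.
    val subject = new X500Principal("CN=localhost")
    val now = System.currentTimeMillis()
    val certificateBuilder = new JcaX509v3CertificateBuilder(
      subject,                              // issuer
      BigInteger.valueOf(now),              // serial number
      new Date(now),                        // not valid before
      new Date(now + 24L * 60 * 60 * 1000), // valid for one day
      subject,                              // subject
      keyPair.getPublic)

    // Sign the certificate with its own private key.
    val signer = new JcaContentSignerBuilder("SHA256withRSA").build(keyPair.getPrivate)
    val certificate = new JcaX509CertificateConverter()
      .getCertificate(certificateBuilder.build(signer))
    println(s"Generated test certificate for ${certificate.getSubjectX500Principal}")
  }
}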