
Commit a8330eb

Merge pull request apache#388 from apache-spark-on-k8s/branch-2.2-kubernetes-g

Branch 2.2 kubernetes

foxish authored Jul 25, 2017
2 parents a2c7b21 + beb1361, commit a8330eb
Showing 181 changed files with 13,281 additions and 80 deletions.
21 changes: 17 additions & 4 deletions .travis.yml
@@ -25,10 +25,22 @@
sudo: required
dist: trusty

-# 2. Choose language and target JDKs for parallel builds.
+# 2. Choose language, target JDK and env's for parallel builds.
language: java
jdk:
- oraclejdk8
+env: # Used by the install section below.
+  # Configure the unit test build for spark core and kubernetes modules,
+  # while excluding some flaky unit tests using a regex pattern.
+  - PHASE=test \
+    PROFILES="-Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes" \
+    MODULES="-pl core,resource-managers/kubernetes/core -am" \
+    ARGS="-Dtest=none -Dsuffixes='^org\.apache\.spark\.(?!ExternalShuffleServiceSuite|SortShuffleSuite$|rdd\.LocalCheckpointSuite$|deploy\.SparkSubmitSuite$|deploy\.StandaloneDynamicAllocationSuite$).*'"
+  # Configure the full build.
+  - PHASE=install \
+    PROFILES="-Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver" \
+    MODULES="" \
+    ARGS="-T 4 -q -DskipTests"

# 3. Setup cache directory for SBT and Maven.
cache:
@@ -40,11 +52,12 @@ cache:
notifications:
email: false

-# 5. Run maven install before running lint-java.
+# 5. Run maven build before running lints.
install:
- export MAVEN_SKIP_RC=1
-- build/mvn -T 4 -q -DskipTests -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver install
+- build/mvn ${PHASE} ${PROFILES} ${MODULES} ${ARGS}

-# 6. Run lints-java.
+# 6. Run lints.
script:
- dev/lint-java
+- dev/lint-scala
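
To make the matrix concrete: Travis runs the `install` step once per `env` entry, substituting that entry's variables into the single `build/mvn` invocation. For the first (unit test) entry, the command expands to roughly the following, with all flags taken verbatim from the values above:

```sh
# Expansion of the first env entry (reconstructed from the values above).
# "-pl ... -am" restricts the build to core and the kubernetes module plus
# the modules they depend on; -Dsuffixes is a ScalaTest regex that skips the
# flaky suites named in the comment.
build/mvn test \
  -Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes \
  -pl core,resource-managers/kubernetes/core -am \
  -Dtest=none \
  -Dsuffixes='^org\.apache\.spark\.(?!ExternalShuffleServiceSuite|SortShuffleSuite$|rdd\.LocalCheckpointSuite$|deploy\.SparkSubmitSuite$|deploy\.StandaloneDynamicAllocationSuite$).*'
```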
39 changes: 39 additions & 0 deletions README.md
@@ -1,3 +1,42 @@
# Apache Spark On Kubernetes

This repository, located at https://github.com/apache-spark-on-k8s/spark, contains a fork of Apache Spark that enables running Spark jobs natively on a Kubernetes cluster.

## What is this?

This is a collaboratively maintained project working on [SPARK-18278](https://issues.apache.org/jira/browse/SPARK-18278). The goal is to bring native support for Spark to use Kubernetes as a cluster manager, in a fully supported way on par with the Spark Standalone, Mesos, and Apache YARN cluster managers.

## Getting Started

- [Usage guide](https://apache-spark-on-k8s.github.io/userdocs/) shows how to run the code
- [Development docs](resource-managers/kubernetes/README.md) shows how to get set up for development
- Code is primarily located in the [resource-managers/kubernetes](resource-managers/kubernetes) folder
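
For orientation, a first cluster-mode submission looks something like the sketch below. The API server address is a placeholder and the image tags are assumptions based on the kubespark images referenced elsewhere in this commit; the usage guide linked above has the authoritative flags.

```sh
# Minimal spark-submit sketch (apiserver address and image tags are
# placeholders/assumptions; consult the usage guide for exact values).
bin/spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://<apiserver-host>:<port> \
  --conf spark.app.name=spark-pi \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.2.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.2.0 \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.3.0-SNAPSHOT.jar
```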

## Why does this fork exist?

Adding native integration for a new cluster manager is a large undertaking. If poorly executed, it could introduce bugs into Spark when run on other cluster managers, cause release blockers slowing down the overall Spark project, or require hotfixes which divert attention away from development towards managing additional releases. Any work this deep inside Spark needs to be done carefully to minimize the risk of those negative externalities.

At the same time, an increasing number of people from various companies and organizations desire to work together to natively run Spark on Kubernetes. The group needs a code repository, communication forum, issue tracking, and continuous integration, all in order to work together effectively on an open source product.

We've been asked by an Apache Spark Committer to work outside of the Apache infrastructure for a short period of time to allow this feature to be hardened and improved without creating risk for Apache Spark. The aim is to rapidly bring it to the point where it can be brought into the mainline Apache Spark repository for continued development within the Apache umbrella. If all goes well, this should be a short-lived fork rather than a long-lived one.

## Who are we?

This is a collaborative effort by several folks from different companies who are interested in seeing this feature be successful. Companies active in this project include (alphabetically):

- Bloomberg
- Google
- Haiwen
- Hyperpilot
- Intel
- Palantir
- Pepperdata
- Red Hat

--------------------

(original README below)

# Apache Spark

Spark is a fast and general cluster computing system for Big Data. It provides
12 changes: 11 additions & 1 deletion assembly/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
-<version>2.2.0</version>
+<version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

@@ -148,6 +148,16 @@
</dependency>
</dependencies>
</profile>
+    <profile>
+      <id>kubernetes</id>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.spark</groupId>
+          <artifactId>spark-kubernetes_${scala.binary.version}</artifactId>
+          <version>${project.version}</version>
+        </dependency>
+      </dependencies>
+    </profile>
<profile>
<id>hive</id>
<dependencies>
2 changes: 1 addition & 1 deletion common/network-common/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
-<version>2.2.0</version>
+<version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
-<version>2.2.0</version>
+<version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

@@ -0,0 +1,79 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.network.shuffle.kubernetes;

import org.apache.spark.network.client.RpcResponseCallback;
import org.apache.spark.network.client.TransportClient;
import org.apache.spark.network.sasl.SecretKeyHolder;
import org.apache.spark.network.shuffle.ExternalShuffleClient;
import org.apache.spark.network.shuffle.protocol.RegisterDriver;
import org.apache.spark.network.util.TransportConf;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.nio.ByteBuffer;

/**
* A client for talking to the external shuffle service in Kubernetes cluster mode.
*
 * This is used by each Spark executor to register with a corresponding external
 * shuffle service on the cluster, so that shuffle files can be cleaned up
 * reliably if the application exits unexpectedly.
*/
public class KubernetesExternalShuffleClient extends ExternalShuffleClient {
private static final Logger logger = LoggerFactory
.getLogger(KubernetesExternalShuffleClient.class);

/**
 * Creates a Kubernetes external shuffle client that wraps the {@link ExternalShuffleClient}.
* Please refer to docs on {@link ExternalShuffleClient} for more information.
*/
public KubernetesExternalShuffleClient(
TransportConf conf,
SecretKeyHolder secretKeyHolder,
boolean saslEnabled) {
super(conf, secretKeyHolder, saslEnabled);
}

public void registerDriverWithShuffleService(String host, int port)
throws IOException, InterruptedException {
checkInit();
ByteBuffer registerDriver = new RegisterDriver(appId, 0).toByteBuffer();
TransportClient client = clientFactory.createClient(host, port);
client.sendRpc(registerDriver, new RegisterDriverCallback());
}

private class RegisterDriverCallback implements RpcResponseCallback {
@Override
public void onSuccess(ByteBuffer response) {
logger.info("Successfully registered app " + appId + " with external shuffle service.");
}

@Override
public void onFailure(Throwable e) {
logger.warn("Unable to register app " + appId + " with external shuffle service. " +
"Please manually remove shuffle data after driver exit. Error: " + e);
}
}

@Override
public void close() {
super.close();
}
}
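
As a usage sketch (the transport setup, app ID, and addresses below are illustrative, not part of this commit): a caller initializes the client with its app ID, then registers the driver with the shuffle service on a given host and port.

```java
import org.apache.spark.network.shuffle.kubernetes.KubernetesExternalShuffleClient;
import org.apache.spark.network.util.MapConfigProvider;
import org.apache.spark.network.util.TransportConf;

public class ShuffleRegistrationSketch {
  public static void main(String[] args) throws Exception {
    // Default transport settings for the "shuffle" module.
    TransportConf conf = new TransportConf("shuffle", MapConfigProvider.EMPTY);
    // SASL is disabled here, so no SecretKeyHolder is supplied (a
    // simplification for this sketch).
    KubernetesExternalShuffleClient client =
        new KubernetesExternalShuffleClient(conf, null, false);
    client.init("spark-application-12345");  // placeholder app ID
    // Placeholder host; 7337 is Spark's default external shuffle port.
    client.registerDriverWithShuffleService("10.0.0.17", 7337);
    client.close();
  }
}
```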
@@ -32,7 +32,7 @@
import org.apache.spark.network.client.TransportClient;
import org.apache.spark.network.sasl.SecretKeyHolder;
import org.apache.spark.network.shuffle.ExternalShuffleClient;
-import org.apache.spark.network.shuffle.protocol.mesos.RegisterDriver;
+import org.apache.spark.network.shuffle.protocol.RegisterDriver;
import org.apache.spark.network.util.TransportConf;

/**
@@ -23,7 +23,6 @@
import io.netty.buffer.Unpooled;

import org.apache.spark.network.protocol.Encodable;
-import org.apache.spark.network.shuffle.protocol.mesos.RegisterDriver;
import org.apache.spark.network.shuffle.protocol.mesos.ShuffleServiceHeartbeat;

/**
@@ -15,19 +15,18 @@
* limitations under the License.
*/

-package org.apache.spark.network.shuffle.protocol.mesos;
+package org.apache.spark.network.shuffle.protocol;

import com.google.common.base.Objects;
import io.netty.buffer.ByteBuf;

import org.apache.spark.network.protocol.Encoders;
-import org.apache.spark.network.shuffle.protocol.BlockTransferMessage;

// Needed by ScalaDoc. See SPARK-7726
import static org.apache.spark.network.shuffle.protocol.BlockTransferMessage.Type;

/**
- * A message sent from the driver to register with the MesosExternalShuffleService.
+ * A message sent from the driver to register with an ExternalShuffleService.
*/
public class RegisterDriver extends BlockTransferMessage {
private final String appId;
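
Moving `RegisterDriver` out of the `mesos` package lets the Mesos and Kubernetes shuffle clients share one registration message. A small sketch of the round trip through the existing `BlockTransferMessage` encoding (the app ID is a placeholder):

```java
import java.nio.ByteBuffer;

import org.apache.spark.network.shuffle.protocol.BlockTransferMessage;
import org.apache.spark.network.shuffle.protocol.RegisterDriver;

public class RegisterDriverSketch {
  public static void main(String[] args) {
    // Encode: BlockTransferMessage prepends a type tag, exactly as
    // registerDriverWithShuffleService does above.
    ByteBuffer buf = new RegisterDriver("spark-application-12345", 0L).toByteBuffer();
    // Decode: any external shuffle service recovers the same message.
    BlockTransferMessage decoded = BlockTransferMessage.Decoder.fromByteBuffer(buf);
    System.out.println(decoded instanceof RegisterDriver);  // true
  }
}
```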
2 changes: 1 addition & 1 deletion common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
-<version>2.2.0</version>
+<version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/sketch/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
-<version>2.2.0</version>
+<version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/tags/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
-<version>2.2.0</version>
+<version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion common/unsafe/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
-<version>2.2.0</version>
+<version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

79 changes: 79 additions & 0 deletions conf/kubernetes-resource-staging-server.yaml
@@ -0,0 +1,79 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: spark-resource-staging-server
spec:
  replicas: 1
  template:
    metadata:
      labels:
        resource-staging-server-instance: default
    spec:
      volumes:
      - name: resource-staging-server-properties
        configMap:
          name: spark-resource-staging-server-config
      containers:
      - name: spark-resource-staging-server
        image: kubespark/spark-resource-staging-server:v2.1.0-kubernetes-0.2.0
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 100m
            memory: 256Mi
        volumeMounts:
        - name: resource-staging-server-properties
          mountPath: '/etc/spark-resource-staging-server'
        args:
        - '/etc/spark-resource-staging-server/resource-staging-server.properties'
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: spark-resource-staging-server-config
data:
  resource-staging-server.properties: |
    spark.kubernetes.resourceStagingServer.port=10000
    spark.ssl.kubernetes.resourceStagingServer.enabled=false
    # Other possible properties are listed below, primarily for setting up TLS. The keyStore,
    # password, and PEM paths given here should correspond to files that are securely mounted
    # into the resource staging server container, e.g. via secret volumes.
    # spark.ssl.kubernetes.resourceStagingServer.keyStore=/mnt/secrets/resource-staging-server/keyStore.jks
    # spark.ssl.kubernetes.resourceStagingServer.keyStorePassword=changeit
    # spark.ssl.kubernetes.resourceStagingServer.keyPassword=changeit
    # spark.ssl.kubernetes.resourceStagingServer.keyStorePasswordFile=/mnt/secrets/resource-staging-server/keystore-password.txt
    # spark.ssl.kubernetes.resourceStagingServer.keyPasswordFile=/mnt/secrets/resource-staging-server/keystore-key-password.txt
    # spark.ssl.kubernetes.resourceStagingServer.keyPem=/mnt/secrets/resource-staging-server/key.pem
    # spark.ssl.kubernetes.resourceStagingServer.serverCertPem=/mnt/secrets/resource-staging-server/cert.pem
---
apiVersion: v1
kind: Service
metadata:
  name: spark-resource-staging-service
spec:
  type: NodePort
  selector:
    resource-staging-server-instance: default
  ports:
  - protocol: TCP
    port: 10000
    targetPort: 10000
    nodePort: 31000
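
To try the staging server out (the node address below is a placeholder, and the `spark.kubernetes.resourceStagingServer.uri` property is an assumption drawn from this fork's `resourceStagingServer.*` naming; check the usage guide for the exact key):

```sh
# Create the Deployment, ConfigMap, and NodePort Service defined above.
kubectl create -f conf/kubernetes-resource-staging-server.yaml

# Submissions that ship local jars/files can then point at the NodePort
# (31000 above) on any cluster node:
bin/spark-submit \
  --conf spark.kubernetes.resourceStagingServer.uri=http://<node-address>:31000 \
  ... # remaining flags as in a normal submission
```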