[Spark-483] Add Kerberos principal and secret-based keytab to the CLI, also update libmesos in Dockerfile (apache#164)

* wip, decode base64 secrets

* improved logging

* change makefile back

* makefile...

* use libmesos bundle instead of private image

* update docs and remove tgt?

* remove dead code

* fix typo

* fixed hdfs.md with tgt instructions
Arthur Rand authored Aug 9, 2017
1 parent 0751fcf commit 6ec5bb9
Showing 7 changed files with 106 additions and 12 deletions.
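Taken together, these changes let a Kerberized HDFS job be submitted straight from the CLI. A sketch of the intended invocation, adapted from the hdfs.md change below (the principal, secret path, and submit arguments are placeholders):

```bash
# Hypothetical end-to-end use of the new flags; assumes a keytab has already been
# stored base64-encoded in the DC/OS secret store under /hdfs-keytab.
dcos spark run \
    --kerberos-principal=user@REALM \
    --keytab-secret-path=/hdfs-keytab \
    --submit-args="--class MyJob https://example.com/my-job.jar"
```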
19 changes: 18 additions & 1 deletion conf/spark-env.sh
@@ -14,7 +14,7 @@ mkdir -p "${HADOOP_CONF_DIR}"

cd $MESOS_SANDBOX

-MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos.so
+MESOS_NATIVE_JAVA_LIBRARY=/opt/mesosphere/libmesos-bundle/lib/libmesos.so

# For non-CNI, tell the Spark driver to bind to LIBPROCESS_IP
#
@@ -30,6 +30,23 @@ fi
# But this fails now due to MESOS-6391, so I'm setting it to /tmp
MESOS_DIRECTORY=/tmp

echo "spark-env: Printing environment" >&2
env >&2
echo "spark-env: User: $(whoami)" >&2

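# DC/OS delivers secrets into the sandbox as files named <secret>.base64.
# Decode each one next to the original so later configuration (for example
# spark.yarn.keytab, set by the CLI) can refer to the plain filename.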
for f in $MESOS_SANDBOX/*.base64 ; do
    echo "decoding $f" >&2
    secret=$(basename ${f} .base64)
    cat ${f} | base64 -d > ${secret}
done

if [[ -n "${KRB5_CONFIG_BASE64}" ]]; then
    echo "spark-env: Copying krb config from $KRB5_CONFIG_BASE64 to /etc/" >&2
    echo "${KRB5_CONFIG_BASE64}" | base64 -d > /etc/krb5.conf
else
    echo "spark-env: No kerberos KDC config found" >&2
fi

# Options read when launching programs locally with
# ./bin/run-example or ./bin/spark-submit
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
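The KRB5_CONFIG_BASE64 value consumed above is expected to hold a base64-encoded krb5.conf, passed through spark.mesos.driverEnv.KRB5_CONFIG_BASE64 (see the spark_submit.py change below). A minimal sketch of producing it, assuming GNU coreutils base64 and a krb5.conf at /etc/krb5.conf:

```bash
# Encode without line wraps so the value survives being passed as a single
# environment variable (-w 0 disables wrapping in GNU base64).
base64 -w 0 /etc/krb5.conf
```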
2 changes: 2 additions & 0 deletions dispatcher/cli/dcos_spark/cli.py
@@ -8,6 +8,8 @@
    dcos spark run --help
    dcos spark run --submit-args=<spark-args>
                   [--dcos-space=<dcos_space>]
                   [--kerberos-principal=<kerberos_principal>]
                   [--keytab-secret-path=<keytab_secret>]
                   [--docker-image=<docker-image>]
                   [--verbose]
    dcos spark status <submissionId> [--verbose]
3 changes: 3 additions & 0 deletions dispatcher/cli/dcos_spark/constants.py
@@ -1 +1,4 @@
PATH_ENV = 'PATH'
KERBEROS_PRINCIPAL_ARG = "--kerberos-principal"
KEYTAB_SECRET_PATH_ARG = "--keytab-secret-path"
ENCODED_SUFFIX = ".base64"
69 changes: 65 additions & 4 deletions dispatcher/cli/dcos_spark/spark_submit.py
@@ -154,20 +154,80 @@ def show_help():
    return 0


def format_kerberos_args(args):
    def check_args():
        if (args[constants.KERBEROS_PRINCIPAL_ARG] is None and
                args[constants.KEYTAB_SECRET_PATH_ARG] is None):
            return False  # No Kerberos args
        if args[constants.KERBEROS_PRINCIPAL_ARG] is not None:
            if args[constants.KEYTAB_SECRET_PATH_ARG] is None:
                print("Missing {} argument for keytab "
                      "secret. E.g. /hdfs.keytab"
                      .format(constants.KEYTAB_SECRET_PATH_ARG),
                      file=sys.stderr)
                exit(1)
            return True
        if args[constants.KEYTAB_SECRET_PATH_ARG] is not None:
            if args[constants.KERBEROS_PRINCIPAL_ARG] is None:
                print("Missing {} argument for Kerberos principal, e.g. "
                      "hdfs/name-0.hdfs.autoip.dcos.thisdcos.directory@LOCAL"
                      .format(constants.KERBEROS_PRINCIPAL_ARG),
                      file=sys.stderr)
                exit(1)
            return True

    def get_secret_file_from_path(encoded):
        if args[constants.KEYTAB_SECRET_PATH_ARG] is not None:
            f = args[constants.KEYTAB_SECRET_PATH_ARG].split("/")[-1]
            return f + constants.ENCODED_SUFFIX if encoded else f
        else:
            return None

    def get_krb5_config():
        app = spark_app()
        if "SPARK_MESOS_KRB5_CONF_BASE64" in app["env"]:
            krb5 = app["env"]["SPARK_MESOS_KRB5_CONF_BASE64"]
            return ["--conf",
                    "spark.mesos.driverEnv.KRB5_CONFIG_BASE64={}"
                    .format(krb5)]
        else:
            print("WARNING: You must specify a krb5.conf that is base64 "
                  "encoded with "
                  "--conf spark.mesos.driverEnv.KRB5_CONFIG_BASE64",
                  file=sys.stderr)
            return []

    add_args = check_args()
    if add_args:
        return [
            "--principal",
            "{}".format(args[constants.KERBEROS_PRINCIPAL_ARG]),
            "--conf",
            "spark.yarn.keytab={}".format(
                get_secret_file_from_path(encoded=False)),
            "--conf",
            "spark.mesos.driver.secret.name={}".format(
                args[constants.KEYTAB_SECRET_PATH_ARG]),
            "--conf",
            "spark.mesos.driver.secret.filename={}".format(
                get_secret_file_from_path(encoded=True)),
            "--conf",
            "spark.mesos.containerizer=mesos"] + get_krb5_config()
    else:
        return []


def submit_job(dispatcher, docker_image, args):
    """
    Run spark-submit.
    :param dispatcher: Spark Dispatcher URL. Used to construct --master.
    :type dispatcher: string
-    :param args: --submit-args value from `dcos spark run`
+    :param args: command line args from `dcos spark run`
    :type args: dict
    :param docker_image: Docker image to run the driver and executors in.
    :type docker_image: string
    :param verbose: If true, prints verbose information to stdout.
    :type verbose: boolean
    """

    submit_args = args["--submit-args"]
    verbose = args["--verbose"] if args["--verbose"] is not None else False
    app = spark_app()
@@ -183,6 +243,7 @@ def submit_job(dispatcher, docker_image, args):
"spark.mesos.task.labels=DCOS_SPACE:{}".format(dcos_space),
"--conf",
"spark.mesos.role={}".format(role)] + \
format_kerberos_args(args) + \
submit_args.split()

hdfs_url = _get_spark_hdfs_url()
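In effect, when both --kerberos-principal and --keytab-secret-path are given, format_kerberos_args() expands them into ordinary spark-submit options. Roughly the hand-written equivalent for --keytab-secret-path=/hdfs-keytab (the secret name and principal are placeholders):

```bash
# Sketch of the options the CLI injects; users do not type these directly.
--principal user@REALM
--conf spark.yarn.keytab=hdfs-keytab
--conf spark.mesos.driver.secret.name=/hdfs-keytab
--conf spark.mesos.driver.secret.filename=hdfs-keytab.base64
--conf spark.mesos.containerizer=mesos
# and, only when the dispatcher app has SPARK_MESOS_KRB5_CONF_BASE64 set:
--conf spark.mesos.driverEnv.KRB5_CONFIG_BASE64=<base64-encoded krb5.conf>
```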
16 changes: 12 additions & 4 deletions docker/Dockerfile
@@ -18,8 +18,8 @@
# docker build -t spark:git-`git rev-parse --short HEAD` .

# Basing from Mesos image so the Mesos native library is present.
-FROM mesosphere/mesos-modules-private:dcos-ee-mesos-modules-1.8.5-rc2
-MAINTAINER Michael Gummelt <[email protected]>
+FROM ubuntu:14.04
+MAINTAINER Michael Gummelt <[email protected]>, Arthur Rand <[email protected]>

# Set environment variables.
ENV DEBIAN_FRONTEND "noninteractive"
@@ -37,13 +37,21 @@ RUN apt-get update && \
    apt-get install -y curl
RUN apt-get install -y r-base

-RUN cd /usr/lib/jvm && \
+RUN mkdir -p /opt/mesosphere/ && \
+    cd /opt/mesosphere && \
+    curl -L -O https://downloads.mesosphere.io/libmesos-bundle/libmesos-bundle-1.10-1.4-63e0814.tar.gz && \
+    tar zxf libmesos-bundle-1.10-1.4-63e0814.tar.gz && \
+    rm libmesos-bundle-1.10-1.4-63e0814.tar.gz
+
+RUN mkdir -p /usr/lib/jvm/ && \
+    cd /usr/lib/jvm && \
    curl -L -O https://downloads.mesosphere.com/java/jre-8u112-linux-x64-jce-unlimited.tar.gz && \
    tar zxf jre-8u112-linux-x64-jce-unlimited.tar.gz && \
    rm jre-8u112-linux-x64-jce-unlimited.tar.gz

ENV JAVA_HOME /usr/lib/jvm/jre1.8.0_112
-ENV MESOS_NATIVE_JAVA_LIBRARY /usr/lib/libmesos.so
+ENV MESOS_NATIVE_JAVA_LIBRARY /opt/mesosphere/libmesos-bundle/lib/libmesos.so
+ENV LD_LIBRARY_PATH /opt/mesosphere/libmesos-bundle/lib/
ENV HADOOP_CONF_DIR /etc/hadoop

RUN mkdir /etc/hadoop
1 change: 1 addition & 0 deletions docker/runit/init.sh
@@ -65,6 +65,7 @@ if [[ -f hdfs-site.xml && -f core-site.xml ]]; then
fi

# Move the Kerberos config file, as specified by security.kerberos.krb5conf, into place.
# This only affects the krb5.conf file for the dispatcher.
if [[ -n "${SPARK_MESOS_KRB5_CONF_BASE64}" ]]; then
    echo "${SPARK_MESOS_KRB5_CONF_BASE64}" | base64 -d > /etc/krb5.conf
fi
8 changes: 5 additions & 3 deletions docs/hdfs.md
@@ -76,13 +76,15 @@ Keytabs are valid infinitely, while tickets can expire. Especially for long-runn

Submit the job with the keytab:

-    dcos spark run --submit-args="--principal user@REALM --keytab <keytab-file-path>..."
+    dcos spark run --kerberos-principal=user@REALM --keytab-secret-path=<secret_path> \
+        --submit-args=" ... "
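The value stored at `<secret_path>` must be the keytab encoded as base64, matching the `*.base64` decoding done in `spark-env.sh`. One hedged way to prepare it (GNU base64; `hdfs.keytab` is a placeholder filename, and the resulting file is what gets uploaded to the DC/OS secret store):

```bash
base64 -w 0 hdfs.keytab > hdfs.keytab.base64
```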

### TGT Authentication

Submit the job with the ticket:

-    dcos spark run --principal user@REALM --tgt <ticket-file-path>
+```bash
+dcos spark run --kerberos-principal user@REALM --submit-args="--tgt <ticket-file-path> ..."
+```
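A ticket file suitable for `--tgt` can be created with standard Kerberos tooling; a sketch, assuming MIT Kerberos `kinit` and a writable cache path:

```bash
# Cache the TGT in a file-backed credential cache, then pass that path as --tgt.
kinit -c FILE:/tmp/tgt user@REALM
```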

**Note:** These credentials are security-critical. We highly recommend configuring SSL encryption between the Spark components when accessing Kerberos-secured HDFS clusters. See the Security section for information on how to do this.

