SPARK-1286: Make usage of spark-env.sh idempotent
Various Spark scripts load spark-env.sh, sometimes more than once in the same invocation. This causes variables that are appended to (SPARK_CLASSPATH, SPARK_REPL_OPTS) to grow with duplicate entries, and it makes the precedence order for options specified in spark-env.sh less clear.
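
For illustration (a hypothetical spark-env.sh fragment, not part of this commit), an appended variable picks up a duplicate entry every time the file is sourced:

# hypothetical conf/spark-env.sh
export SPARK_CLASSPATH="$SPARK_CLASSPATH:/opt/extra/jars/*"

# If one script sources this file and then invokes another script that sources it
# again, the second environment ends up with:
#   SPARK_CLASSPATH=":/opt/extra/jars/*:/opt/extra/jars/*"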

One use case for the latter is that we want to set options from the command line of spark-shell, but these options are currently overridden by the subsequent loading of spark-env.sh. If we load spark-env.sh first and only then apply the command-line options, we can guarantee the correct precedence order, as sketched below.
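
A minimal sketch of that ordering (USER_OPTS below is illustrative, not a variable in the actual spark-shell script):

# Load defaults from spark-env.sh exactly once...
. "$FWDIR/bin/load-spark-env.sh"
# ...then apply options given on the command line, so they take precedence.
SPARK_REPL_OPTS="$SPARK_REPL_OPTS $USER_OPTS"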

Note that we use SPARK_CONF_DIR if it is available, to support the sbin/ scripts, which always set this variable from sbin/spark-config.sh. Otherwise, we default to ../conf/ as usual.
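
The lookup order relies on standard shell parameter expansion, as in the new bin/load-spark-env.sh below:

# Use $SPARK_CONF_DIR if it is set and non-empty; otherwise fall back to ../conf/.
use_conf_dir=${SPARK_CONF_DIR:-"$parent_dir/conf"}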

Author: Aaron Davidson <[email protected]>

Closes apache#184 from aarondav/idem and squashes the following commits:

e291f91 [Aaron Davidson] Use "private" variables in load-spark-env.sh
8da8360 [Aaron Davidson] Add .sh extension to load-spark-env.sh
93a2471 [Aaron Davidson] SPARK-1286: Make usage of spark-env.sh idempotent
aarondav committed Mar 25, 2014
1 parent b637f2d commit 007a733
Showing 11 changed files with 45 additions and 34 deletions.
5 changes: 1 addition & 4 deletions bin/compute-classpath.sh
@@ -25,10 +25,7 @@ SCALA_VERSION=2.10
 # Figure out where Spark is installed
 FWDIR="$(cd `dirname $0`/..; pwd)"
 
-# Load environment variables from conf/spark-env.sh, if it exists
-if [ -e "$FWDIR/conf/spark-env.sh" ] ; then
-  . $FWDIR/conf/spark-env.sh
-fi
+. $FWDIR/bin/load-spark-env.sh
 
 # Build up classpath
 CLASSPATH="$SPARK_CLASSPATH:$FWDIR/conf"
35 changes: 35 additions & 0 deletions bin/load-spark-env.sh
@@ -0,0 +1,35 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This script loads spark-env.sh if it exists, and ensures it is only loaded once.
+# spark-env.sh is loaded from SPARK_CONF_DIR if set, or within the current directory's
+# conf/ subdirectory.
+
+if [ -z "$SPARK_ENV_LOADED" ]; then
+  export SPARK_ENV_LOADED=1
+
+  # Returns the parent of the directory this script lives in.
+  parent_dir="$(cd `dirname $0`/..; pwd)"
+
+  use_conf_dir=${SPARK_CONF_DIR:-"$parent_dir/conf"}
+
+  if [ -f "${use_conf_dir}/spark-env.sh" ]; then
+    . "${use_conf_dir}/spark-env.sh"
+  fi
+fi
5 changes: 1 addition & 4 deletions bin/pyspark
@@ -36,10 +36,7 @@ if [ ! -f "$FWDIR/RELEASE" ]; then
   fi
 fi
 
-# Load environment variables from conf/spark-env.sh, if it exists
-if [ -e "$FWDIR/conf/spark-env.sh" ] ; then
-  . $FWDIR/conf/spark-env.sh
-fi
+. $FWDIR/bin/load-spark-env.sh
 
 # Figure out which Python executable to use
 if [ -z "$PYSPARK_PYTHON" ] ; then
5 changes: 1 addition & 4 deletions bin/run-example
@@ -30,10 +30,7 @@ FWDIR="$(cd `dirname $0`/..; pwd)"
 # Export this as SPARK_HOME
 export SPARK_HOME="$FWDIR"
 
-# Load environment variables from conf/spark-env.sh, if it exists
-if [ -e "$FWDIR/conf/spark-env.sh" ] ; then
-  . $FWDIR/conf/spark-env.sh
-fi
+. $FWDIR/bin/load-spark-env.sh
 
 if [ -z "$1" ]; then
   echo "Usage: run-example <example-class> [<args>]" >&2
5 changes: 1 addition & 4 deletions bin/spark-class
@@ -30,10 +30,7 @@ FWDIR="$(cd `dirname $0`/..; pwd)"
 # Export this as SPARK_HOME
 export SPARK_HOME="$FWDIR"
 
-# Load environment variables from conf/spark-env.sh, if it exists
-if [ -e "$FWDIR/conf/spark-env.sh" ] ; then
-  . $FWDIR/conf/spark-env.sh
-fi
+. $FWDIR/bin/load-spark-env.sh
 
 if [ -z "$1" ]; then
   echo "Usage: spark-class <class> [<args>]" >&2
4 changes: 1 addition & 3 deletions bin/spark-shell
@@ -81,9 +81,7 @@ done
 # Set MASTER from spark-env if possible
 DEFAULT_SPARK_MASTER_PORT=7077
 if [ -z "$MASTER" ]; then
-  if [ -e "$FWDIR/conf/spark-env.sh" ]; then
-    . "$FWDIR/conf/spark-env.sh"
-  fi
+  . $FWDIR/bin/load-spark-env.sh
   if [ "x" != "x$SPARK_MASTER_IP" ]; then
     if [ "y" != "y$SPARK_MASTER_PORT" ]; then
       SPARK_MASTER_PORT="${SPARK_MASTER_PORT}"
4 changes: 1 addition & 3 deletions sbin/slaves.sh
@@ -63,9 +63,7 @@ then
   shift
 fi
 
-if [ -f "${SPARK_CONF_DIR}/spark-env.sh" ]; then
-  . "${SPARK_CONF_DIR}/spark-env.sh"
-fi
+. "$SPARK_PREFIX/bin/load-spark-env.sh"
 
 if [ "$HOSTLIST" = "" ]; then
   if [ "$SPARK_SLAVES" = "" ]; then
4 changes: 1 addition & 3 deletions sbin/spark-daemon.sh
@@ -86,9 +86,7 @@ spark_rotate_log ()
   fi
 }
 
-if [ -f "${SPARK_CONF_DIR}/spark-env.sh" ]; then
-  . "${SPARK_CONF_DIR}/spark-env.sh"
-fi
+. "$SPARK_PREFIX/bin/load-spark-env.sh"
 
 if [ "$SPARK_IDENT_STRING" = "" ]; then
   export SPARK_IDENT_STRING="$USER"
4 changes: 1 addition & 3 deletions sbin/start-master.sh
@@ -39,9 +39,7 @@ done
 
 . "$sbin/spark-config.sh"
 
-if [ -f "${SPARK_CONF_DIR}/spark-env.sh" ]; then
-  . "${SPARK_CONF_DIR}/spark-env.sh"
-fi
+. "$SPARK_PREFIX/bin/load-spark-env.sh"
 
 if [ "$SPARK_MASTER_PORT" = "" ]; then
   SPARK_MASTER_PORT=7077
4 changes: 1 addition & 3 deletions sbin/start-slaves.sh
@@ -38,9 +38,7 @@ done
 
 . "$sbin/spark-config.sh"
 
-if [ -f "${SPARK_CONF_DIR}/spark-env.sh" ]; then
-  . "${SPARK_CONF_DIR}/spark-env.sh"
-fi
+. "$SPARK_PREFIX/bin/load-spark-env.sh"
 
 # Find the port number for the master
 if [ "$SPARK_MASTER_PORT" = "" ]; then
4 changes: 1 addition & 3 deletions sbin/stop-slaves.sh
@@ -22,9 +22,7 @@ sbin=`cd "$sbin"; pwd`
 
 . "$sbin/spark-config.sh"
 
-if [ -f "${SPARK_CONF_DIR}/spark-env.sh" ]; then
-  . "${SPARK_CONF_DIR}/spark-env.sh"
-fi
+. "$SPARK_PREFIX/bin/load-spark-env.sh"
 
 # do before the below calls as they exec
 if [ -e "$sbin"/../tachyon/bin/tachyon ]; then
