Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20123][build]$SPARK_HOME variable might have spaces in it(e.g. $SPARK… #17452

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 10 additions & 10 deletions R/check-cran.sh
Original file line number Diff line number Diff line change
Expand Up @@ -20,18 +20,18 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null

. $FWDIR/find-r.sh
. "$FWDIR/find-r.sh"

# Install the package (this is required for code in vignettes to run when building it later)
# Build the latest docs, but not vignettes, which is built with the package next
. $FWDIR/install-dev.sh
. "$FWDIR/install-dev.sh"

# Build source package with vignettes
SPARK_HOME="$(cd "${FWDIR}"/..; pwd)"
. "${SPARK_HOME}"/bin/load-spark-env.sh
. "${SPARK_HOME}/bin/load-spark-env.sh"
if [ -f "${SPARK_HOME}/RELEASE" ]; then
SPARK_JARS_DIR="${SPARK_HOME}/jars"
else
Expand All @@ -40,16 +40,16 @@ fi

if [ -d "$SPARK_JARS_DIR" ]; then
# Build a zip file containing the source package with vignettes
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD build $FWDIR/pkg
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/R" CMD build "$FWDIR/pkg"

find pkg/vignettes/. -not -name '.' -not -name '*.Rmd' -not -name '*.md' -not -name '*.pdf' -not -name '*.html' -delete
else
echo "Error Spark JARs not found in $SPARK_HOME"
echo "Error Spark JARs not found in '$SPARK_HOME'"
exit 1
fi

# Run check as-cran.
VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`
VERSION=`grep Version "$FWDIR/pkg/DESCRIPTION" | awk '{print $NF}'`

CRAN_CHECK_OPTIONS="--as-cran"

Expand All @@ -67,10 +67,10 @@ echo "Running CRAN check with $CRAN_CHECK_OPTIONS options"

if [ -n "$NO_TESTS" ] && [ -n "$NO_MANUAL" ]
then
"$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
"$R_SCRIPT_PATH/R" CMD check $CRAN_CHECK_OPTIONS "SparkR_$VERSION.tar.gz"
else
# This will run tests and/or build vignettes, and require SPARK_HOME
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/R" CMD check $CRAN_CHECK_OPTIONS "SparkR_$VERSION.tar.gz"
fi

popd > /dev/null
10 changes: 5 additions & 5 deletions R/create-docs.sh
Original file line number Diff line number Diff line change
Expand Up @@ -33,23 +33,23 @@ export FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
export SPARK_HOME="$(cd "`dirname "${BASH_SOURCE[0]}"`"/..; pwd)"

# Required for setting SPARK_SCALA_VERSION
. "${SPARK_HOME}"/bin/load-spark-env.sh
. "${SPARK_HOME}/bin/load-spark-env.sh"

echo "Using Scala $SPARK_SCALA_VERSION"

pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

# Install the package (this will also generate the Rd files)
. $FWDIR/install-dev.sh
. "$FWDIR/install-dev.sh"

# Now create HTML files

# knit_rd puts html in current working directory
mkdir -p pkg/html
pushd pkg/html

"$R_SCRIPT_PATH/"Rscript -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knitr); knit_rd("SparkR", links = tools::findHTMLlinks(paste(libDir, "SparkR", sep="/")))'
"$R_SCRIPT_PATH/Rscript" -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knitr); knit_rd("SparkR", links = tools::findHTMLlinks(paste(libDir, "SparkR", sep="/")))'

popd

Expand Down
8 changes: 4 additions & 4 deletions R/create-rd.sh
Original file line number Diff line number Diff line change
Expand Up @@ -29,9 +29,9 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

# Generate Rd files if devtools is installed
"$R_SCRIPT_PATH/"Rscript -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'
"$R_SCRIPT_PATH/Rscript" -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'
14 changes: 7 additions & 7 deletions R/install-dev.sh
Original file line number Diff line number Diff line change
Expand Up @@ -29,21 +29,21 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
LIB_DIR="$FWDIR/lib"

mkdir -p $LIB_DIR
mkdir -p "$LIB_DIR"

pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

. $FWDIR/create-rd.sh
. "$FWDIR/create-rd.sh"

# Install SparkR to $LIB_DIR
"$R_SCRIPT_PATH/"R CMD INSTALL --library=$LIB_DIR $FWDIR/pkg/
"$R_SCRIPT_PATH/R" CMD INSTALL --library="$LIB_DIR" "$FWDIR/pkg/"

# Zip the SparkR package so that it can be distributed to worker nodes on YARN
cd $LIB_DIR
cd "$LIB_DIR"
jar cfM "$LIB_DIR/sparkr.zip" SparkR

popd > /dev/null
20 changes: 10 additions & 10 deletions R/install-source-package.sh
Original file line number Diff line number Diff line change
Expand Up @@ -29,28 +29,28 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

if [ -z "$VERSION" ]; then
VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`
VERSION=`grep Version "$FWDIR/pkg/DESCRIPTION" | awk '{print $NF}'`
fi

if [ ! -f "$FWDIR"/SparkR_"$VERSION".tar.gz ]; then
echo -e "R source package file $FWDIR/SparkR_$VERSION.tar.gz is not found."
if [ ! -f "$FWDIR/SparkR_$VERSION.tar.gz" ]; then
echo -e "R source package file '$FWDIR/SparkR_$VERSION.tar.gz' is not found."
echo -e "Please build R source package with check-cran.sh"
exit -1;
fi

echo "Removing lib path and installing from source package"
LIB_DIR="$FWDIR/lib"
rm -rf $LIB_DIR
mkdir -p $LIB_DIR
"$R_SCRIPT_PATH/"R CMD INSTALL SparkR_"$VERSION".tar.gz --library=$LIB_DIR
rm -rf "$LIB_DIR"
mkdir -p "$LIB_DIR"
"$R_SCRIPT_PATH/R" CMD INSTALL "SparkR_$VERSION.tar.gz" --library="$LIB_DIR"

# Zip the SparkR package so that it can be distributed to worker nodes on YARN
pushd $LIB_DIR > /dev/null
pushd "$LIB_DIR" > /dev/null
jar cfM "$LIB_DIR/sparkr.zip" SparkR
popd > /dev/null

Expand Down
32 changes: 16 additions & 16 deletions dev/make-distribution.sh
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ echo "Spark version is $VERSION"
if [ "$MAKE_TGZ" == "true" ]; then
echo "Making spark-$VERSION-bin-$NAME.tgz"
else
echo "Making distribution for Spark $VERSION in $DISTDIR..."
echo "Making distribution for Spark $VERSION in '$DISTDIR'..."
fi

# Build uber fat JAR
Expand Down Expand Up @@ -170,7 +170,7 @@ cp "$SPARK_HOME"/assembly/target/scala*/jars/* "$DISTDIR/jars/"

# Only create the yarn directory if the yarn artifacts were build.
if [ -f "$SPARK_HOME"/common/network-yarn/target/scala*/spark-*-yarn-shuffle.jar ]; then
mkdir "$DISTDIR"/yarn
mkdir "$DISTDIR/yarn"
cp "$SPARK_HOME"/common/network-yarn/target/scala*/spark-*-yarn-shuffle.jar "$DISTDIR/yarn"
fi

Expand All @@ -179,7 +179,7 @@ mkdir -p "$DISTDIR/examples/jars"
cp "$SPARK_HOME"/examples/target/scala*/jars/* "$DISTDIR/examples/jars"

# Deduplicate jars that have already been packaged as part of the main Spark dependencies.
for f in "$DISTDIR/examples/jars/"*; do
for f in "$DISTDIR"/examples/jars/*; do
name=$(basename "$f")
if [ -f "$DISTDIR/jars/$name" ]; then
rm "$DISTDIR/examples/jars/$name"
Expand All @@ -188,14 +188,14 @@ done

# Copy example sources (needed for python and SQL)
mkdir -p "$DISTDIR/examples/src/main"
cp -r "$SPARK_HOME"/examples/src/main "$DISTDIR/examples/src/"
cp -r "$SPARK_HOME/examples/src/main" "$DISTDIR/examples/src/"

# Copy license and ASF files
cp "$SPARK_HOME/LICENSE" "$DISTDIR"
cp -r "$SPARK_HOME/licenses" "$DISTDIR"
cp "$SPARK_HOME/NOTICE" "$DISTDIR"

if [ -e "$SPARK_HOME"/CHANGES.txt ]; then
if [ -e "$SPARK_HOME/CHANGES.txt" ]; then
cp "$SPARK_HOME/CHANGES.txt" "$DISTDIR"
fi

Expand All @@ -217,43 +217,43 @@ fi
# Make R package - this is used for both CRAN release and packing R layout into distribution
if [ "$MAKE_R" == "true" ]; then
echo "Building R source package"
R_PACKAGE_VERSION=`grep Version $SPARK_HOME/R/pkg/DESCRIPTION | awk '{print $NF}'`
R_PACKAGE_VERSION=`grep Version "$SPARK_HOME/R/pkg/DESCRIPTION" | awk '{print $NF}'`
pushd "$SPARK_HOME/R" > /dev/null
# Build source package and run full checks
# Do not source the check-cran.sh - it should be run from where it is for it to set SPARK_HOME
NO_TESTS=1 "$SPARK_HOME/"R/check-cran.sh
NO_TESTS=1 "$SPARK_HOME/R/check-cran.sh"

# Move R source package to match the Spark release version if the versions are not the same.
# NOTE(shivaram): `mv` throws an error on Linux if source and destination are same file
if [ "$R_PACKAGE_VERSION" != "$VERSION" ]; then
mv $SPARK_HOME/R/SparkR_"$R_PACKAGE_VERSION".tar.gz $SPARK_HOME/R/SparkR_"$VERSION".tar.gz
mv "$SPARK_HOME/R/SparkR_$R_PACKAGE_VERSION.tar.gz" "$SPARK_HOME/R/SparkR_$VERSION.tar.gz"
fi

# Install source package to get it to generate vignettes rds files, etc.
VERSION=$VERSION "$SPARK_HOME/"R/install-source-package.sh
VERSION=$VERSION "$SPARK_HOME/R/install-source-package.sh"
popd > /dev/null
else
echo "Skipping building R source package"
fi

# Copy other things
mkdir "$DISTDIR"/conf
cp "$SPARK_HOME"/conf/*.template "$DISTDIR"/conf
mkdir "$DISTDIR/conf"
cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the two args need quoting separately?

Copy link
Author

@zuotingbing zuotingbing Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh NO, it might be wrong if we quote the arg $SPARK_HOME/conf/*.template as a whole.
It works well already and the debug info as follows :
+ mkdir '/home/spark build/spark/dist/conf'
+ cp '/home/spark build/spark/conf/docker.properties.template' '/home/spark build/spark/conf/fairscheduler.xml.template' '/home/spark build/spark/conf/log4j.properties.template' '/home/spark build/spark/conf/metrics.properties.template' '/home/spark build/spark/conf/slaves.template' '/home/spark build/spark/conf/spark-defaults.conf.template' '/home/spark build/spark/conf/spark-env.sh.template' '/home/spark build/spark/dist/conf'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, yes that's also correct. But right now this is just one big argument to cp, not 2 or more. Doesn't it need to be more like cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf"?

Copy link
Author

@zuotingbing zuotingbing Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course i use cp "$SPARK_HOME"/conf/*.template "$DISTDIR/conf" to get the debug info. You can write a test.sh and $ sh -x test.sh for debug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't see that this works.

touch foo
cp "f* bar"

usage ... source_file target_file

What am I missing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you quote f* bar , f* bar will be treated as one argument. The error is missing the destination file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK the line above is correct. I must be crazy, I thought I read it earlier as being one big quoted argument. If it wasn't that way before then I must have misread it and you can ignore this comment.

cp "$SPARK_HOME/README.md" "$DISTDIR"
cp -r "$SPARK_HOME/bin" "$DISTDIR"
cp -r "$SPARK_HOME/python" "$DISTDIR"

# Remove the python distribution from dist/ if we built it
if [ "$MAKE_PIP" == "true" ]; then
rm -f $DISTDIR/python/dist/pyspark-*.tar.gz
rm -f "$DISTDIR"/python/dist/pyspark-*.tar.gz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm -f "$DISTDIR/python/dist/pyspark-*.tar.gz"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that fail to work because * isn't expanded?

Copy link
Author

@zuotingbing zuotingbing Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is wrong if we quote the arg $DISTDIR/python/dist/pyspark-*.tar.gz as a whole because * isn't expanded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I see that other thread...

fi

cp -r "$SPARK_HOME/sbin" "$DISTDIR"
# Copy SparkR if it exists
if [ -d "$SPARK_HOME"/R/lib/SparkR ]; then
mkdir -p "$DISTDIR"/R/lib
cp -r "$SPARK_HOME/R/lib/SparkR" "$DISTDIR"/R/lib
cp "$SPARK_HOME/R/lib/sparkr.zip" "$DISTDIR"/R/lib
if [ -d "$SPARK_HOME/R/lib/SparkR" ]; then
mkdir -p "$DISTDIR/R/lib"
cp -r "$SPARK_HOME/R/lib/SparkR" "$DISTDIR/R/lib"
cp "$SPARK_HOME/R/lib/sparkr.zip" "$DISTDIR/R/lib"
fi

if [ "$MAKE_TGZ" == "true" ]; then
Expand Down