[SPARK-22960][k8s] Make build-push-docker-images.sh more dev-friendly. #20154
```diff
@@ -19,51 +19,131 @@
 # This script builds and pushes docker images when run from a release of Spark
 # with Kubernetes support.
 
-declare -A path=( [spark-driver]=kubernetes/dockerfiles/driver/Dockerfile \
-                  [spark-executor]=kubernetes/dockerfiles/executor/Dockerfile \
-                  [spark-init]=kubernetes/dockerfiles/init-container/Dockerfile )
+function error {
+  echo "$@" 1>&2
+  exit 1
+}
+
+# Detect whether this is a git clone or a Spark distribution and adjust paths
+# accordingly.
+if [ -z "${SPARK_HOME}" ]; then
+  SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
+fi
+. "${SPARK_HOME}/bin/load-spark-env.sh"
+
+if [ -f "$SPARK_HOME/RELEASE" ]; then
+  IMG_PATH="kubernetes/dockerfiles"
+  SPARK_JARS="jars"
+else
+  IMG_PATH="resource-managers/kubernetes/docker/src/main/dockerfiles"
+  SPARK_JARS="assembly/target/scala-$SPARK_SCALA_VERSION/jars"
+fi
+
+if [ ! -d "$IMG_PATH" ]; then
+  error "Cannot find docker images. This script must be run from a runnable distribution of Apache Spark."
+fi
+
+declare -A path=( [spark-driver]="$IMG_PATH/driver/Dockerfile" \
+                  [spark-executor]="$IMG_PATH/executor/Dockerfile" \
+                  [spark-init]="$IMG_PATH/init-container/Dockerfile" )
```
```diff
+function image_ref {
+  local image="$1"
+  local add_repo="${2:-1}"
+  if [ $add_repo = 1 ] && [ -n "$REPO" ]; then
+    image="$REPO/$image"
+  fi
+  if [ -n "$TAG" ]; then
+    image="$image:$TAG"
+  fi
+  echo "$image"
+}
+
 function build {
-  docker build -t spark-base -f kubernetes/dockerfiles/spark-base/Dockerfile .
+  local base_image="$(image_ref spark-base 0)"
+  docker build --build-arg "spark_jars=$SPARK_JARS" \
+    --build-arg "img_path=$IMG_PATH" \
+    -t "$base_image" \
+    -f "$IMG_PATH/spark-base/Dockerfile" .
   for image in "${!path[@]}"; do
-    docker build -t ${REPO}/$image:${TAG} -f ${path[$image]} .
+    docker build --build-arg "base_image=$base_image" -t "$(image_ref $image)" -f ${path[$image]} .
   done
 }
 
 function push {
   for image in "${!path[@]}"; do
-    docker push ${REPO}/$image:${TAG}
+    docker push "$(image_ref $image)"
   done
 }
```
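In effect, `image_ref` makes both the repository and the tag optional: it prepends `$REPO/` only when a repository was given (and when the second argument does not suppress it, as for the locally built `spark-base` image), and appends `:$TAG` only when a tag was given. A small sketch of the names it produces, assuming the function above is in scope (the repo and tag values are illustrative):

```sh
REPO=docker.io/myrepo TAG=v2.3.0
image_ref spark-driver    # -> docker.io/myrepo/spark-driver:v2.3.0
image_ref spark-base 0    # -> spark-base:v2.3.0 (repo suppressed for the base image)

REPO= TAG=                # neither -r nor -t passed, e.g. a local minikube build
image_ref spark-executor  # -> spark-executor
```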
```diff
 function usage {
-  echo "This script must be run from a runnable distribution of Apache Spark."
-  echo "Usage: ./sbin/build-push-docker-images.sh -r <repo> -t <tag> build"
-  echo "       ./sbin/build-push-docker-images.sh -r <repo> -t <tag> push"
-  echo "for example: ./sbin/build-push-docker-images.sh -r docker.io/myrepo -t v2.3.0 push"
+  cat <<EOF
+Usage: $0 [options] [command]
+Builds or pushes the built-in Spark Docker images.
+
+Commands:
+  build       Build images.
+  push        Push images to a registry. Requires a repository address to be provided, both
+              when building and when pushing the images.
+
+Options:
+  -r repo     Repository address.
+  -t tag      Tag to apply to built images, or to identify images to be pushed.
+  -m          Use minikube's Docker daemon.
+
+Using minikube when building images will do so directly into minikube's Docker daemon.
+There is no need to push the images into minikube in that case, they'll be automatically
+available when running applications inside the minikube cluster.
+
+Check the following documentation for more information on using the minikube Docker daemon:
+
+  https://kubernetes.io/docs/getting-started-guides/minikube/#reusing-the-docker-daemon
+
+Examples:
+  - Build images in minikube with tag "testing"
+    $0 -m -t testing build
+
+  - Build and push images with tag "v2.3.0" to docker.io/myrepo
+    $0 -r docker.io/myrepo -t v2.3.0 build
+    $0 -r docker.io/myrepo -t v2.3.0 push
+EOF
 }
 
 if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
   usage
   exit 0
 fi
 
-while getopts r:t: option
+REPO=
+TAG=
+while getopts mr:t: option
 do
  case "${option}"
  in
  r) REPO=${OPTARG};;
  t) TAG=${OPTARG};;
+ m)
+   if ! which minikube 1>/dev/null; then
+     error "Cannot find minikube."
+   fi
+   eval $(minikube docker-env)
```
Reviewer (on the `eval $(minikube docker-env)` line): I think building docker images right into the minikube VM's docker daemon is uncommon and not something we'd want to recommend. Users on minikube should also use a proper registry (for example, there is a registry addon that could be used). While this might be good to document as a local developer workflow, I'm apprehensive about adding a new flag just for this particular mode. Also, one could invoke `eval $(minikube docker-env)` separately before running the script.

Author: I started calling that command separately, but it's really annoying. This option is useful not just for Spark devs, but for people who want to try their own apps on minikube before trying them on a larger cluster, for example. What's the alternative? Deploying your own registry? I struggled with that for hours, and it's nearly impossible to get docker to talk to an insecure registry (or one with a self-signed cert like minikube's). This approach just worked (tm).

Reviewer: I see your point; this is considerably easier. I spoke with a minikube maintainer, and it seems this is not as uncommon as I initially thought. So this change looks good, but I'd prefer that we add some more explanation to the usage section that this will build an image within the minikube environment, and also link to https://kubernetes.io/docs/getting-started-guides/minikube/#reusing-the-docker-daemon. cc/ @aaron-prindle
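For readers unfamiliar with the command under discussion: `minikube docker-env` prints shell exports that point the local docker CLI at the Docker daemon inside the minikube VM, so images built afterwards are immediately visible to the cluster without a push. The output looks roughly like this (addresses and paths vary by machine and minikube version; shown only as an illustration):

```sh
$ minikube docker-env
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.100:2376"
export DOCKER_CERT_PATH="$HOME/.minikube/certs"
# Run this command to configure your shell:
# eval $(minikube docker-env)
```

The registry addon the reviewer mentions can be turned on with `minikube addons enable registry`, but getting the host's docker daemon to push to it takes extra setup, which is the friction the author describes.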
```diff
+   ;;
  esac
 done
 
-if [ -z "$REPO" ] || [ -z "$TAG" ]; then
-  usage
-else
-  case "${@: -1}" in
-    build) build;;
-    push) push;;
-    *) usage;;
-  esac
-fi
+case "${@: -1}" in
+  build)
+    build
+    ;;
+  push)
+    if [ -z "$REPO" ]; then
+      usage
+      exit 1
+    fi
+    push
+    ;;
+  *)
+    usage
+    exit 1
+    ;;
+esac
```
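The net effect of the reworked dispatch is that `-r` and `-t` become optional for local builds, while `push` still requires a repository. A few illustrative invocations (image names follow the `image_ref` rules shown earlier):

```sh
# Build images with no repo and no tag, straight into minikube's daemon:
./sbin/build-push-docker-images.sh -m build            # -> spark-base, spark-driver, spark-executor, spark-init

# Build with a tag only:
./sbin/build-push-docker-images.sh -t testing build    # -> spark-driver:testing, ...

# Pushing without a repository now prints the usage text and exits with status 1:
./sbin/build-push-docker-images.sh -t testing push
```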
Reviewer: Update this comment? I presume now it should say runnable distribution, or from source.

Author: The source directory is sort of a "runnable distribution" if Spark is built. I'd rather keep the message simple, since it's mostly targeted at end users (not devs).

Reviewer: SGTM