From 67c50b7713b97bc155186a155cbdc538404714b6 Mon Sep 17 00:00:00 2001
From: Rob Vesse
Date: Tue, 22 Jan 2019 10:31:17 -0800
Subject: [PATCH] [SPARK-26685][K8S] Correct placement of ARG declaration

Latest Docker releases are stricter in their enforcement of build
argument scope. The location of the `ARG spark_uid` declaration in the
Python and R Dockerfiles means the variable is out of scope by the time
it is used in a `USER` declaration, resulting in a container running as
root rather than the default/configured UID.

Also, with some of the refactoring of the script that has happened since
my PR that introduced the configurable UID, it turns out the `-u`
argument is not being properly passed to the Python and R image builds
when those are opted into.

## What changes were proposed in this pull request?

This commit moves the `ARG` declaration to just before the argument is
used, so that it is in scope. It also ensures that the Python and R
image builds receive the build arguments, including the `spark_uid`
argument, where relevant.

## How was this patch tested?

Prior to the patch, images are produced where the Python and R images
ignore the default/configured UID:

```
> docker run -it --entrypoint /bin/bash rvesse/spark-py:uid456
bash-4.4# whoami
root
bash-4.4# id -u
0
bash-4.4# exit
> docker run -it --entrypoint /bin/bash rvesse/spark:uid456
bash-4.4$ id -u
456
bash-4.4$ exit
```

Note that the Python image is still running as `root`, having ignored
the configured UID of 456, while the base image has the correct UID
because the relevant `ARG` declaration is correctly in scope.

After the patch the correct UID is observed:

```
> docker run -it --entrypoint /bin/bash rvesse/spark-r:uid456
bash-4.4$ id -u
456
bash-4.4$ exit
exit
> docker run -it --entrypoint /bin/bash rvesse/spark-py:uid456
bash-4.4$ id -u
456
bash-4.4$ exit
exit
> docker run -it --entrypoint /bin/bash rvesse/spark:uid456
bash-4.4$ id -u
456
bash-4.4$ exit
```

Closes #23611 from rvesse/SPARK-26685.
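For context on the scoping rule this patch works around: Docker treats an `ARG` declared before the first `FROM` as usable only by `FROM` lines, so any instruction after `FROM` sees an empty value unless the argument is re-declared inside the build stage. A minimal sketch of the corrected pattern (the image reference and values are illustrative):

```dockerfile
# A pre-FROM ARG is in scope only for FROM lines; it does not
# survive into the build stage that follows
ARG base_img
FROM $base_img

# Re-declared inside the stage so that USER sees a value; without this
# line ${spark_uid} expands to nothing and, as reported above, the
# container ends up running as root
ARG spark_uid=185
USER ${spark_uid}
```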
Authored-by: Rob Vesse
Signed-off-by: Marcelo Vanzin
---
 bin/docker-image-tool.sh                                    | 3 ++-
 .../docker/src/main/dockerfiles/spark/bindings/R/Dockerfile | 2 +-
 .../src/main/dockerfiles/spark/bindings/python/Dockerfile   | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh
index 4f66137eb1c7a..efaf09e74c405 100755
--- a/bin/docker-image-tool.sh
+++ b/bin/docker-image-tool.sh
@@ -154,10 +154,11 @@ function build {
   fi
 
   local BINDING_BUILD_ARGS=(
-    ${BUILD_PARAMS}
+    ${BUILD_ARGS[@]}
     --build-arg
     base_img=$(image_ref spark)
   )
+
   local BASEDOCKERFILE=${BASEDOCKERFILE:-"kubernetes/dockerfiles/spark/Dockerfile"}
   local PYDOCKERFILE=${PYDOCKERFILE:-false}
   local RDOCKERFILE=${RDOCKERFILE:-false}
diff --git a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile
index 9ded57c655104..34d449c9f08b9 100644
--- a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile
+++ b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile
@@ -16,7 +16,6 @@
 #
 ARG base_img
-ARG spark_uid=185
 
 FROM $base_img
 
 WORKDIR /
@@ -35,4 +34,5 @@ WORKDIR /opt/spark/work-dir
 ENTRYPOINT [ "/opt/entrypoint.sh" ]
 
 # Specify the User that the actual main process will run as
+ARG spark_uid=185
 USER ${spark_uid}
diff --git a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/python/Dockerfile b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/python/Dockerfile
index 36b91eb9a3aac..5044900d1d8a6 100644
--- a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/python/Dockerfile
+++ b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/python/Dockerfile
@@ -16,7 +16,6 @@
 #
 ARG base_img
-ARG spark_uid=185
 
 FROM $base_img
 
 WORKDIR /
@@ -46,4 +45,5 @@ WORKDIR /opt/spark/work-dir
 ENTRYPOINT [ "/opt/entrypoint.sh" ]
 
 # Specify the User that the actual main process will run as
+ARG spark_uid=185
 USER ${spark_uid}
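As an aside on the script-side fix: in bash, expanding an array as `"${array[@]}"` passes every parsed option through as its own element, which is why switching the binding builds from the raw `${BUILD_PARAMS}` string to `${BUILD_ARGS[@]}` lets the `spark_uid` build argument reach the Python and R images. A minimal standalone sketch of the idiom (variable names mirror the script; the values are illustrative):

```shell
#!/usr/bin/env bash
# BUILD_ARGS stands in for the array docker-image-tool.sh assembles
# from its command-line options (e.g. -u 456); values are illustrative
BUILD_ARGS=(--build-arg spark_uid=456)

# Expanding the base array element-by-element into the binding-specific
# array keeps each --build-arg pair intact, so the Python/R image
# builds still receive spark_uid
BINDING_BUILD_ARGS=(
  "${BUILD_ARGS[@]}"
  --build-arg base_img=spark:latest
)

printf '%s\n' "${BINDING_BUILD_ARGS[@]}"
```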