[SPARK-37875][K8S] Support ARM64 in Java 17 docker image
### What changes were proposed in this pull request?

This PR aims to support the `ARM64` architecture in the Java 17 Docker image, for both Linux and Apple Silicon.

### Why are the changes needed?

Currently, `JAVA_HOME` is hard-coded to the `amd64` JDK path in `Dockerfile.java17`:
https://github.com/apache/spark/blob/371ab5a07c18cc456cc7ee5b8fa051d46e11b363/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17#L54

After this PR, `JAVA_HOME` is auto-detected in Java 17.
```
$ k logs spark-test-app-bea3e02b0481470b8b13a26cf21f1913
++ id -u
+ myuid=185
++ id -g
+ mygid=0
+ set +e
++ getent passwd 185
+ uidentry=
+ set -e
+ '[' -z '' ']'
+ '[' -w /etc/passwd ']'
+ echo '185:x:185:0:anonymous uid:/opt/spark:/bin/false'
+ '[' -z '' ']'
++ java -XshowSettings:properties -version
++ grep java.home
++ awk '{print $3}'
+ JAVA_HOME=/usr/lib/jvm/java-17-openjdk-arm64
```
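The last three trace lines above show the detection pipeline. As a minimal sketch of just the parsing step, the following runs the same `grep`/`awk` filter against sample text instead of a real JVM. The function name `extract_java_home` and the surrounding sample property lines are illustrative assumptions; the path is the one from the log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): pull the java.home value out of
# text shaped like the output of `java -XshowSettings:properties -version`.
extract_java_home() {
  grep 'java.home' | awk '{print $3}'
}

# Sample property lines; each has the form "    key = value".
sample_output='    java.class.version = 61.0
    java.home = /usr/lib/jvm/java-17-openjdk-arm64
    java.io.tmpdir = /tmp'

# awk splits on whitespace, so $1 is the key, $2 is "=", and $3 is the path.
detected=$(printf '%s\n' "$sample_output" | extract_java_home)
echo "$detected"
```

Because the filter takes the third whitespace-separated field, a `java.home` path containing spaces would be truncated; that is unlikely for the JDK paths installed in this image.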

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Ran the integration tests on an Apple Silicon Mac. Note that not all tests pass at the moment. For example, `TestUtils.withHttpServer` doesn't work in the M1 environment because `Minikube` is running on `Docker Desktop`.
```
$ build/sbt -Pkubernetes -Pkubernetes-integration-tests "kubernetes-integration-tests/test" -Dspark.kubernetes.test.dockerFile=resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17
...
[info] KubernetesSuite:
[info] - Run SparkPi with no resources (10 seconds, 601 milliseconds)
[info] - Run SparkPi with no resources & statefulset allocation (11 seconds, 565 milliseconds)
[info] - Run SparkPi with a very long application name. (10 seconds, 625 milliseconds)
[info] - Use SparkLauncher.NO_RESOURCE (11 seconds, 383 milliseconds)
[info] - Run SparkPi with a master URL without a scheme. (11 seconds, 457 milliseconds)
[info] - Run SparkPi with an argument. (11 seconds, 367 milliseconds)
[info] - Run SparkPi with custom labels, annotations, and environment variables. (10 seconds, 451 milliseconds)
[info] - All pods have the same service account by default (10 seconds, 415 milliseconds)
[info] - Run extraJVMOptions check on driver (5 seconds, 241 milliseconds)
[info] - Run SparkRemoteFileTest using a remote data file *** FAILED *** (3 minutes, 3 seconds)
...
```

Closes apache#35176 from dongjoon-hyun/SPARK-37875.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun authored and dchvn committed Jan 19, 2022
1 parent 93e587d commit cbd9b30
Showing 2 changed files with 4 additions and 1 deletion.
@@ -51,7 +51,6 @@ COPY kubernetes/tests /opt/spark/tests
COPY data /opt/spark/data

ENV SPARK_HOME /opt/spark
ENV JAVA_HOME /usr/lib/jvm/java-17-openjdk-amd64/

WORKDIR /opt/spark/work-dir
RUN chmod g+w /opt/spark/work-dir

@@ -36,6 +36,10 @@ if [ -z "$uidentry" ] ; then
fi
fi

if [ -z "$JAVA_HOME" ]; then
  JAVA_HOME=$(java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home' | awk '{print $3}')
fi

SPARK_CLASSPATH="$SPARK_CLASSPATH:${SPARK_HOME}/jars/*"
env | grep SPARK_JAVA_OPT_ | sort -t_ -k4 -n | sed 's/[^=]*=\(.*\)/\1/g' > /tmp/java_opts.txt
readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt
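
The added `JAVA_HOME` line relies on the order of its redirections: `2>&1 > /dev/null` sends stderr into the pipe and discards only stdout, which matters because the JVM prints the property settings on stderr. A small sketch of that behavior, using a hypothetical `emit_both` function as a stand-in for the `java` command:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `java -XshowSettings:properties -version`:
# writes one line to stdout and one line to stderr.
emit_both() {
  echo "on stdout"
  echo "on stderr" >&2
}

# Same order as the patch: stderr is pointed at the pipe first,
# then stdout alone is redirected to /dev/null.
captured=$(emit_both 2>&1 > /dev/null | cat)
echo "captured: $captured"

# Reversed order: stdout goes to /dev/null first, then stderr follows it,
# so nothing reaches the pipe.
reversed=$(emit_both > /dev/null 2>&1 | cat)
echo "reversed: [$reversed]"
```

With the reversed order, `grep` would receive no input and `JAVA_HOME` would end up empty. The `-z "$JAVA_HOME"` guard in the patch also means a value that is already set in the environment is left untouched.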