[SPARK-40256][BUILD][K8S] Switch base image from openjdk to eclipse-temurin #37705
cc @dongjoon-hyun @holdenk @HyukjinKwon @gengliangwang This might be a big change, but we have to do it (replace openjdk with another base image).
On the alternate solution, here is a search for openjdk packages in debian:bullseye:
$ docker run -ti debian:bullseye-slim bash
root@c9531d160a2f:/# apt update > /dev/null
root@c9531d160a2f:/# apt search openjdk-8-jre
Sorting... Done
Full Text Search... Done
root@c9531d160a2f:/# apt search openjdk-11-jre
Sorting... Done
Full Text Search... Done
openjdk-11-jre/stable-security 11.0.16+8-1~deb11u1 arm64
OpenJDK Java runtime, using Hotspot JIT
openjdk-11-jre-headless/stable-security 11.0.16+8-1~deb11u1 arm64
OpenJDK Java runtime, using Hotspot JIT (headless)
openjdk-11-jre-zero/stable-security 11.0.16+8-1~deb11u1 arm64
Alternative JVM for OpenJDK, using Zero
root@c9531d160a2f:/# apt search openjdk-17-jre
Sorting... Done
Full Text Search... Done
openjdk-17-jre/stable-security 17.0.4+8-1~deb11u1 arm64
OpenJDK Java runtime, using Hotspot JIT
openjdk-17-jre-headless/stable-security 17.0.4+8-1~deb11u1 arm64
OpenJDK Java runtime, using Hotspot JIT (headless)
openjdk-17-jre-zero/stable-security 17.0.4+8-1~deb11u1 arm64
Alternative JVM for OpenJDK, using Zero
Given the situation, this looks like a reasonable approach.
+1
@dongjoon-hyun @gengliangwang @HyukjinKwon Thanks for your comments! I will test more today and then make it ready for review.
I did a complete e2e test on the docker images:
# eclipse-temurin (apply Dockerfile patch) on v3.3.0
./bin/docker-image-tool.sh -r yikunkero -p ./kubernetes/dockerfiles/spark/bindings/python/Dockerfile -R ./kubernetes/dockerfiles/spark/bindings/R/Dockerfile -t v3.3.0-temurin -X -b java_image_tag=11-jre-focal push
# openjdk base
./bin/docker-image-tool.sh -r yikunkero -p ./kubernetes/dockerfiles/spark/bindings/python/Dockerfile -R ./kubernetes/dockerfiles/spark/bindings/R/Dockerfile -t v3.3.0 -X -b java_image_tag=11-jre-slim push
All images work well, and there is no image size increase (you can click each link to see details). So I think it's ready for review now.
The changes are straightforward and reasonable. Merging to master.
Thank you, @Yikun, @HyukjinKwon, @gengliangwang.
### _Why are the changes needed?_

eclipse-temurin is the successor for openjdk, see apache/spark#37705:

"The core change is: the OS of base image changes debian-bullseye to ubuntu-focal (based on debian bullseye)."

### _How was this patch tested?_

- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4288 from pan3793/image.

b7b6190 [Cheng Pan] Use eclipse-temurin:8-jdk-focal as default base image

Authored-by: Cheng Pan
Signed-off-by: Cheng Pan
What changes were proposed in this pull request?
This PR switches the base image from `openjdk` to `eclipse-temurin` (original openjdk).
The core change is: the OS of the base image changes from `debian-bullseye` to `ubuntu-focal` (based on debian bullseye).
Why are the changes needed?
According to "Retiring OpenJDK Project Builds for JDK 11 and JDK 8" (docker-library/openjdk#505) and "Adjust OpenJDK image deprecation notice to fully deprecate" (docker-library/docs#2162), the openjdk:8/11 images are EOL and Eclipse Temurin replaces them. The original openjdk image will remove the 11 and 8 tags (in October 2022, perhaps). We are using it in Spark, so we have to switch before that happens.
The `openjdk` image is not updated anymore (the last releases were 8u342 and 11.0.16; Eclipse Temurin is the replacement recommended by adoptopenjdk). That means that even if the 8/11 tags are not removed, we still need to switch away from `openjdk`.
Many Docker official images have already switched from openjdk to eclipse-temurin.
According to the JVM ecosystem report cited in "Adjust OpenJDK image deprecation notice to fully deprecate" (docker-library/docs#2162), AdoptOpenJDK (now donated to the Eclipse Foundation and renamed Eclipse Temurin) builds of OpenJDK are the most popular in production.
An ideal long-term solution is that we only choose the JDK version and leave the OS adaptation to the corresponding official openjdk image (for example, eclipse-temurin supports ubuntu, alpine, and centos).
The alternate solution is to just switch the `openjdk` image to `debian-bullseye` with an openjdk 11 installation, like "Switch base image from openjdk to debian" (Yikun/spark#163). But that makes the Spark image depend more on the Debian OS, which means it will be difficult to support Java versions that Debian doesn't support (for example, openjdk-8-jre is no longer supported in current Debian).
For the above reasons, I think `eclipse-temurin` is a good choice.
Does this PR introduce any user-facing change?
Yes, the base image of the docker images changes.
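As a rough illustration of what this means for someone building the images, the change amounts to a different `FROM` line and a different default tag. The fragment below is a minimal sketch, not the actual diff: the `java_image_tag` build arg and the `11-jre-slim` / `11-jre-focal` tags are taken from the `docker-image-tool.sh` commands in this thread, while treating those tags as the Dockerfile defaults is an assumption.

```dockerfile
# Sketch only: the rest of the Spark Dockerfile is omitted.
# Assumption: the tags below are the defaults; they are the values passed
# via `-b java_image_tag=...` in the e2e test commands in this thread.

# Before (Debian bullseye based):
# ARG java_image_tag=11-jre-slim
# FROM openjdk:${java_image_tag}

# After (Ubuntu focal based):
ARG java_image_tag=11-jre-focal
FROM eclipse-temurin:${java_image_tag}
```

Anyone who overrides `java_image_tag` with an openjdk-only tag such as `11-jre-slim` would presumably need to pick an equivalent `eclipse-temurin` tag after this change.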
How was this patch tested?
CI passed. I also ran a local test in Yikun#162.