A collection of Apache Spark Docker images for the OKDP Platform.
Currently, the images are built from the Apache Spark project distribution. This requirement may evolve so that they are built from the source code instead.
The relationship between the images is described by the following diagram:
| Image | Description |
|---|---|
| JRE | The JRE LTS base image supported by Apache Spark depending on the version. This includes Java 11/17/21. Please, check the reference versions or Apache Spark website for more information. |
| spark-base | The Apache Spark base image with official spark binaries (scala/java) and without OKDP extensions. |
| spark | The Apache Spark image with official spark binaries (scala/java) and OKDP extensions. |
| spark-py | The Apache Spark image with official spark binaries (scala/java), OKDP extensions and python support. |
| spark-r | The Apache Spark image with official spark binaries (scala/java), OKDP extensions and R support. |
The project builds the images with long-format tags. Each tag combines a set of mutually compatible versions.
There are several tag levels, and the one to use depends on the stability and reproducibility you need.
The images are pushed to the quay.io/okdp repository with the following tags:
| Images | Tags |
|---|---|
| spark-base, spark | `spark-<SPARK_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>`<br>`spark-<SPARK_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>`<br>`spark-<SPARK_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<RELEASE_VERSION>`<br>`spark-<SPARK_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>-<RELEASE_VERSION>` |
| spark-py | `spark-<SPARK_VERSION>-python-<PYTHON_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>`<br>`spark-<SPARK_VERSION>-python-<PYTHON_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>`<br>`spark-<SPARK_VERSION>-python-<PYTHON_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<RELEASE_VERSION>`<br>`spark-<SPARK_VERSION>-python-<PYTHON_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>-<RELEASE_VERSION>` |
| spark-r | `spark-<SPARK_VERSION>-r-<R_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>`<br>`spark-<SPARK_VERSION>-r-<R_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>`<br>`spark-<SPARK_VERSION>-r-<R_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<RELEASE_VERSION>`<br>`spark-<SPARK_VERSION>-r-<R_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>-<RELEASE_VERSION>` |
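
The tag patterns in the table above can be sketched as a small helper. This is only an illustration: `build_tag` and its parameter names are hypothetical and not part of the project, and the sketch simply assumes the segment order shown in the table.

```python
from typing import Optional


def build_tag(
    spark: str,
    scala: str,
    java: str,
    python: Optional[str] = None,
    r: Optional[str] = None,
    build_date: Optional[str] = None,
    release: Optional[str] = None,
) -> str:
    """Compose an image tag following the patterns listed in the table.

    Hypothetical helper for illustration; not shipped with the images.
    """
    if python and r:
        raise ValueError("a tag carries either a Python or an R version, not both")

    parts = ["spark", spark]
    if python:
        parts += ["python", python]  # spark-py images
    if r:
        parts += ["r", r]  # spark-r images
    parts += ["scala", scala, "java", java]
    if build_date:
        parts.append(build_date)  # expected in YYYY-MM-DD format
    if release:
        parts.append(release)  # release version without the leading "v"
    return "-".join(parts)


# Short tag for a spark-py image, without build date or release version.
print(build_tag("3.5.1", "2.13", "17", python="3.11"))

# Long tag with both a build date and a release version.
print(build_tag("3.5.1", "2.13", "17", python="3.11",
                build_date="2024-04-04", release="1.0.0"))
```

The resulting string is the tag only; the full image reference is the repository and image name followed by `:` and the tag.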
Note

- `<RELEASE_VERSION>` corresponds to the GitHub release version or git tag without the leading `v`. Ex.: 1.0.0
- `<BUILD_DATE>` corresponds to the image build date in the `YYYY-MM-DD` format. The latest release tag is rebuilt every week to ensure the OS image is up to date with the latest security updates. You may need to switch to the latest release version if you are using the long-form image tag with a `<RELEASE_VERSION>`. Please check the changelog to see the notable impacts.

  An example of a `spark-py` image with a long-form tag including compatible `spark/java/scala/python` versions, a `<BUILD_DATE>` and a `<RELEASE_VERSION>` is: `quay.io/okdp/spark-py:spark-3.5.1-python-3.11-scala-2.13-java-17-2024-04-04-1.0.0`. The corresponding changelog is releases/tag/v1.0.0.
- You can also use the latest tag without `<BUILD_DATE>` and `<RELEASE_VERSION>`, which is always up to date with the latest security updates.

  An example of a `spark-py` image with the latest tag is: `quay.io/okdp/spark-py:spark-3.5.1-python-3.11-scala-2.13-java-17`
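
To check whether an image reference is pinned to a build date or a release, the tag can be split back into its components. The following is a minimal sketch, not a tool provided by the project: `TAG_PATTERN` and `parse_image_reference` are hypothetical names, and the pattern assumes purely numeric dotted versions (such as `1.0.0`) and a numeric Java major version.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern mirroring the documented tag layout.
# Assumption: versions are dotted numbers; pre-release suffixes are not handled.
TAG_PATTERN = re.compile(
    r"^spark-(?P<spark>\d+(?:\.\d+)*)"
    r"(?:-(?:python-(?P<python>\d+(?:\.\d+)*)|r-(?P<r>\d+(?:\.\d+)*)))?"
    r"-scala-(?P<scala>\d+(?:\.\d+)*)"
    r"-java-(?P<java>\d+)"
    r"(?:-(?P<build_date>\d{4}-\d{2}-\d{2}))?"
    r"(?:-(?P<release>\d+(?:\.\d+)*))?$"
)


def parse_image_reference(reference: str) -> Dict[str, Optional[str]]:
    """Split '<repository>/<image>:<tag>' into the image name and tag components."""
    image, separator, tag = reference.rpartition(":")
    if not separator:
        raise ValueError(f"no tag found in {reference!r}")
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"tag {tag!r} does not follow the expected layout")
    # Components absent from the tag are reported as None.
    return {"image": image, **match.groupdict()}


pinned = parse_image_reference(
    "quay.io/okdp/spark-py:spark-3.5.1-python-3.11-scala-2.13-java-17-2024-04-04-1.0.0"
)
rolling = parse_image_reference(
    "quay.io/okdp/spark-py:spark-3.5.1-python-3.11-scala-2.13-java-17"
)

print(pinned["build_date"], pinned["release"])
print(rolling["build_date"], rolling["release"])
```

A reference whose `build_date` and `release` are both `None` is the weekly rebuilt tag; one that carries both is the most reproducible form.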