# [SPARK-44113][BUILD][INFRA][DOCS] Drop support for Scala 2.12
### What changes were proposed in this pull request?
The main purpose of this PR is to remove support for Scala 2.12 in Apache Spark 4.0. The specific work includes:

1. Run `dev/change-scala-version.sh 2.13` to change the Scala version in all `pom.xml` files to 2.13 (a command-line sketch follows this list)
2. Clean up the Scala 2.12 configuration in the parent `pom.xml` and make the Scala 2.13 configuration the default
3. Clean up the Scala 2.12 configuration in `SparkBuild.scala`
4. Clean up the support for Scala 2.12 in `python/run-tests.py` and `dev/run-tests.py`
5. Update the Scala 2.12 sections of `docs/building-spark.md`, `docs/index.md`, `docs/spark-connect-overview.md`, `docs/storage-openstack-swift.md`, and `docs/_config.yml` to Scala 2.13
6. Update the Scala 2.12 parts of `dev/test-dependencies.sh`, `dev/scalafmt`, `dev/mima`, and `dev/lint-scala` to Scala 2.13
7. Remove the support for Scala 2.12 from `dev/change-scala-version.sh` and clean up its invocations in the Spark code
8. Update `dev/deps/spark-deps-hadoop-3-hive-2.3` and `LICENSE-binary`
9. Remove the `scala-213` job from `build_and_test.yml`, since the daily tests of the other branches do not run it and the master branch already uses Scala 2.13 by default
10. Replace Scala 2.12 with Scala 2.13 in the `name` of the following workflow files: `build_ansi.yml`, `build_coverage.yml`, `build_java11.yml`, `build_java17.yml`, `build_java21.yml`, `build_maven.yml`, and `build_rockdb_as_ui_backend.yml`
11. Remove the support for Scala 2.12 from `benchmark.yml`
12. Move the files under the `src/scala-2.13/` directories to `src/main/scala` and delete everything under the `src/scala-2.12/` directories
13. Comment out the code that handles multiple Scala versions in `load-spark-env.cmd` and `load-spark-env.sh`
14. Clean up the redundant `build-helper-maven-plugin` configuration from `core/pom.xml`, `repl/pom.xml`, `sql/api/pom.xml`, `sql/catalyst/pom.xml`, `sql/core/pom.xml`, and `connector/connect/common/pom.xml`
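
Taken together, items 1–3 make the Scala 2.13 profile the default build. As a point of reference, a minimal sketch of the manual equivalent, assuming a checkout of the Spark source tree and using a trimmed-down version of the profile list that appears in the CI snippets below:

```sh
# Rewrite the Scala version / artifact suffix in every pom.xml
# (after this PR, 2.13 is the only accepted argument and the default)
./dev/change-scala-version.sh 2.13

# Compile main and test sources against Scala 2.13, as the retired scala-213 CI job did
# (trimmed-down profile list; the full list is visible in the build_and_test.yml diff below)
./build/sbt -Pscala-2.13 -Pyarn -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud compile Test/compile
```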

### Why are the changes needed?
The minimum supported Scala version for Apache Spark 4.0 is Scala 2.13.

### Does this PR introduce _any_ user-facing change?
Yes. Apache Spark will no longer support Scala 2.12; downstream projects must build against Scala 2.13 and depend on the `_2.13` artifacts (for example `spark-core_2.13` instead of `spark-core_2.12`).
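
A quick, hypothetical way for a downstream project to spot leftover 2.12 coordinates in its own build files (the file names here are placeholders; point it at whatever build definitions the project actually uses):

```sh
# List lines that still reference the retired _2.12 Spark artifacts
grep -Hn "spark-[a-z-]*_2\.12" pom.xml build.sbt build.gradle 2>/dev/null
```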

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43008 from LuciferYang/SPARK-44113.

Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
LuciferYang authored and dongjoon-hyun committed Sep 22, 2023
1 parent db02469 commit 3429202
Showing 96 changed files with 326 additions and 1,849 deletions.
7 changes: 2 additions & 5 deletions .github/workflows/benchmark.yml
@@ -31,9 +31,9 @@ on:
required: true
default: '8'
scala:
description: 'Scala version: 2.12 or 2.13'
description: 'Scala version: 2.13'
required: true
default: '2.12'
default: '2.13'
failfast:
description: 'Failfast: true or false'
required: true
@@ -170,7 +170,6 @@ jobs:
key: tpcds-${{ hashFiles('.github/workflows/benchmark.yml', 'sql/core/src/test/scala/org/apache/spark/sql/TPCDSSchema.scala') }}
- name: Run benchmarks
run: |
dev/change-scala-version.sh ${{ github.event.inputs.scala }}
./build/sbt -Pscala-${{ github.event.inputs.scala }} -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pspark-ganglia-lgpl Test/package
# Make less noisy
cp conf/log4j2.properties.template conf/log4j2.properties
@@ -181,8 +180,6 @@ jobs:
--jars "`find . -name '*-SNAPSHOT-tests.jar' -o -name '*avro*-SNAPSHOT.jar' | paste -sd ',' -`" \
"`find . -name 'spark-core*-SNAPSHOT-tests.jar'`" \
"${{ github.event.inputs.class }}"
# Revert to default Scala version to clean up unnecessary git diff
dev/change-scala-version.sh 2.12
# To keep the directory structure and file permissions, tar them
# See also https://github.com/actions/upload-artifact#maintaining-file-permissions-and-case-sensitive-files
echo "Preparing the benchmark results:"
50 changes: 1 addition & 49 deletions .github/workflows/build_and_test.yml
@@ -86,7 +86,7 @@ jobs:
sparkr=`./dev/is-changed.py -m sparkr`
tpcds=`./dev/is-changed.py -m sql`
docker=`./dev/is-changed.py -m docker-integration-tests`
# 'build', 'scala-213', and 'java-other-versions' are always true for now.
# 'build' and 'java-other-versions' are always true for now.
# It does not save significant time and most of PRs trigger the build.
precondition="
{
@@ -95,7 +95,6 @@
\"sparkr\": \"$sparkr\",
\"tpcds-1g\": \"$tpcds\",
\"docker-integration-tests\": \"$docker\",
\"scala-213\": \"true\",
\"java-other-versions\": \"true\",
\"lint\" : \"true\",
\"k8s-integration-tests\" : \"true\",
@@ -828,53 +827,6 @@ jobs:
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Djava.version=${JAVA_VERSION/-ea} install
rm -rf ~/.m2/repository/org/apache/spark
scala-213:
needs: precondition
if: fromJson(needs.precondition.outputs.required).scala-213 == 'true'
name: Scala 2.13 build with SBT
runs-on: ubuntu-22.04
timeout-minutes: 300
steps:
- name: Checkout Spark repository
uses: actions/checkout@v3
with:
fetch-depth: 0
repository: apache/spark
ref: ${{ inputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
git fetch https://github.com/$GITHUB_REPOSITORY.git ${GITHUB_REF#refs/heads/}
git -c user.name='Apache Spark Test Account' -c user.email='[email protected]' merge --no-commit --progress --squash FETCH_HEAD
git -c user.name='Apache Spark Test Account' -c user.email='[email protected]' commit -m "Merged commit" --allow-empty
- name: Cache Scala, SBT and Maven
uses: actions/cache@v3
with:
path: |
build/apache-maven-*
build/scala-*
build/*.jar
~/.sbt
key: build-${{ hashFiles('**/pom.xml', 'project/build.properties', 'build/mvn', 'build/sbt', 'build/sbt-launch-lib.bash', 'build/spark-build-info') }}
restore-keys: |
build-
- name: Cache Coursier local repository
uses: actions/cache@v3
with:
path: ~/.cache/coursier
key: scala-213-coursier-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
restore-keys: |
scala-213-coursier-
- name: Install Java 8
uses: actions/setup-java@v3
with:
distribution: zulu
java-version: 8
- name: Build with SBT
run: |
./dev/change-scala-version.sh 2.13
./build/sbt -Pyarn -Pmesos -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pdocker-integration-tests -Pkubernetes-integration-tests -Pspark-ganglia-lgpl -Pscala-2.13 compile Test/compile
# Any TPC-DS related updates on this job need to be applied to tpcds-1g-gen job of benchmark.yml as well
tpcds-1g:
needs: precondition
2 changes: 1 addition & 1 deletion .github/workflows/build_ansi.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build / ANSI (master, Hadoop 3, JDK 8, Scala 2.12)"
name: "Build / ANSI (master, Hadoop 3, JDK 8, Scala 2.13)"

on:
schedule:
2 changes: 1 addition & 1 deletion .github/workflows/build_coverage.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build / Coverage (master, Scala 2.12, Hadoop 3, JDK 8)"
name: "Build / Coverage (master, Scala 2.13, Hadoop 3, JDK 8)"

on:
schedule:
2 changes: 1 addition & 1 deletion .github/workflows/build_java11.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build (master, Scala 2.12, Hadoop 3, JDK 11)"
name: "Build (master, Scala 2.13, Hadoop 3, JDK 11)"

on:
schedule:
2 changes: 1 addition & 1 deletion .github/workflows/build_java17.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build (master, Scala 2.12, Hadoop 3, JDK 17)"
name: "Build (master, Scala 2.13, Hadoop 3, JDK 17)"

on:
schedule:
2 changes: 1 addition & 1 deletion .github/workflows/build_java21.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build (master, Scala 2.12, Hadoop 3, JDK 21)"
name: "Build (master, Scala 2.13, Hadoop 3, JDK 21)"

on:
schedule:
2 changes: 1 addition & 1 deletion .github/workflows/build_maven.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build using Maven (master, Scala 2.12, Hadoop 3, JDK 8)"
name: "Build using Maven (master, Scala 2.13, Hadoop 3, JDK 8)"

on:
schedule:
2 changes: 1 addition & 1 deletion .github/workflows/build_rockdb_as_ui_backend.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build / RocksDB as UI Backend (master, Hadoop 3, JDK 8, Scala 2.12)"
name: "Build / RocksDB as UI Backend (master, Hadoop 3, JDK 8, Scala 2.13)"

on:
schedule:
58 changes: 29 additions & 29 deletions LICENSE-binary
@@ -209,34 +209,34 @@ org.apache.zookeeper:zookeeper
oro:oro
commons-configuration:commons-configuration
commons-digester:commons-digester
com.chuusai:shapeless_2.12
com.chuusai:shapeless_2.13
com.googlecode.javaewah:JavaEWAH
com.twitter:chill-java
com.twitter:chill_2.12
com.twitter:chill_2.13
com.univocity:univocity-parsers
javax.jdo:jdo-api
joda-time:joda-time
net.sf.opencsv:opencsv
org.apache.derby:derby
org.objenesis:objenesis
org.roaringbitmap:RoaringBitmap
org.scalanlp:breeze-macros_2.12
org.scalanlp:breeze_2.12
org.typelevel:macro-compat_2.12
org.scalanlp:breeze-macros_2.13
org.scalanlp:breeze_2.13
org.typelevel:macro-compat_2.13
org.yaml:snakeyaml
org.apache.xbean:xbean-asm7-shaded
com.squareup.okhttp3:logging-interceptor
com.squareup.okhttp3:okhttp
com.squareup.okio:okio
org.apache.spark:spark-catalyst_2.12
org.apache.spark:spark-kvstore_2.12
org.apache.spark:spark-launcher_2.12
org.apache.spark:spark-mllib-local_2.12
org.apache.spark:spark-network-common_2.12
org.apache.spark:spark-network-shuffle_2.12
org.apache.spark:spark-sketch_2.12
org.apache.spark:spark-tags_2.12
org.apache.spark:spark-unsafe_2.12
org.apache.spark:spark-catalyst_2.13
org.apache.spark:spark-kvstore_2.13
org.apache.spark:spark-launcher_2.13
org.apache.spark:spark-mllib-local_2.13
org.apache.spark:spark-network-common_2.13
org.apache.spark:spark-network-shuffle_2.13
org.apache.spark:spark-sketch_2.13
org.apache.spark:spark-tags_2.13
org.apache.spark:spark-unsafe_2.13
commons-httpclient:commons-httpclient
com.vlkan:flatbuffers
com.ning:compress-lzf
@@ -299,10 +299,10 @@ org.apache.orc:orc-mapreduce
org.mortbay.jetty:jetty
org.mortbay.jetty:jetty-util
com.jolbox:bonecp
org.json4s:json4s-ast_2.12
org.json4s:json4s-core_2.12
org.json4s:json4s-jackson_2.12
org.json4s:json4s-scalap_2.12
org.json4s:json4s-ast_2.13
org.json4s:json4s-core_2.13
org.json4s:json4s-jackson_2.13
org.json4s:json4s-scalap_2.13
com.carrotsearch:hppc
com.fasterxml.jackson.core:jackson-annotations
com.fasterxml.jackson.core:jackson-core
@@ -312,7 +312,7 @@ com.fasterxml.jackson.jaxrs:jackson-jaxrs-base
com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider
com.fasterxml.jackson.module:jackson-module-jaxb-annotations
com.fasterxml.jackson.module:jackson-module-paranamer
com.fasterxml.jackson.module:jackson-module-scala_2.12
com.fasterxml.jackson.module:jackson-module-scala_2.13
com.github.mifmif:generex
com.google.code.findbugs:jsr305
com.google.code.gson:gson
@@ -385,8 +385,8 @@ org.eclipse.jetty:jetty-xml
org.scala-lang:scala-compiler
org.scala-lang:scala-library
org.scala-lang:scala-reflect
org.scala-lang.modules:scala-parser-combinators_2.12
org.scala-lang.modules:scala-xml_2.12
org.scala-lang.modules:scala-parser-combinators_2.13
org.scala-lang.modules:scala-xml_2.13
com.github.joshelser:dropwizard-metrics-hadoop-metrics2-reporter
com.zaxxer.HikariCP
org.apache.hive:hive-beeline
@@ -471,19 +471,19 @@ MIT License
-----------

com.microsoft.sqlserver:mssql-jdbc
org.typelevel:spire_2.12
org.typelevel:spire-macros_2.12
org.typelevel:spire-platform_2.12
org.typelevel:spire-util_2.12
org.typelevel:algebra_2.12:jar
org.typelevel:cats-kernel_2.12
org.typelevel:machinist_2.12
org.typelevel:spire_2.13
org.typelevel:spire-macros_2.13
org.typelevel:spire-platform_2.13
org.typelevel:spire-util_2.13
org.typelevel:algebra_2.13:jar
org.typelevel:cats-kernel_2.13
org.typelevel:machinist_2.13
net.razorvine:pickle
org.slf4j:jcl-over-slf4j
org.slf4j:jul-to-slf4j
org.slf4j:slf4j-api
org.slf4j:slf4j-log4j12
com.github.scopt:scopt_2.12
com.github.scopt:scopt_2.13
dev.ludovic.netlib:blas
dev.ludovic.netlib:arpack
dev.ludovic.netlib:lapack
4 changes: 2 additions & 2 deletions assembly/pom.xml
@@ -20,12 +20,12 @@
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.12</artifactId>
<artifactId>spark-parent_2.13</artifactId>
<version>4.0.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

<artifactId>spark-assembly_2.12</artifactId>
<artifactId>spark-assembly_2.13</artifactId>
<name>Spark Project Assembly</name>
<url>https://spark.apache.org/</url>
<packaging>pom</packaging>
30 changes: 15 additions & 15 deletions bin/load-spark-env.cmd
@@ -39,21 +39,21 @@ set SCALA_VERSION_2=2.12
set ASSEMBLY_DIR1="%SPARK_HOME%\assembly\target\scala-%SCALA_VERSION_1%"
set ASSEMBLY_DIR2="%SPARK_HOME%\assembly\target\scala-%SCALA_VERSION_2%"
set ENV_VARIABLE_DOC=https://spark.apache.org/docs/latest/configuration.html#environment-variables

if not defined SPARK_SCALA_VERSION (
if exist %ASSEMBLY_DIR2% if exist %ASSEMBLY_DIR1% (
echo Presence of build for multiple Scala versions detected ^(%ASSEMBLY_DIR1% and %ASSEMBLY_DIR2%^).
echo Remove one of them or, set SPARK_SCALA_VERSION=%SCALA_VERSION_1% in spark-env.cmd.
echo Visit %ENV_VARIABLE_DOC% for more details about setting environment variables in spark-env.cmd.
echo Either clean one of them or, set SPARK_SCALA_VERSION in spark-env.cmd.
exit 1
)
if exist %ASSEMBLY_DIR1% (
set SPARK_SCALA_VERSION=%SCALA_VERSION_1%
) else (
set SPARK_SCALA_VERSION=%SCALA_VERSION_2%
)
)
set SPARK_SCALA_VERSION=2.13
rem if not defined SPARK_SCALA_VERSION (
rem if exist %ASSEMBLY_DIR2% if exist %ASSEMBLY_DIR1% (
rem echo Presence of build for multiple Scala versions detected ^(%ASSEMBLY_DIR1% and %ASSEMBLY_DIR2%^).
rem echo Remove one of them or, set SPARK_SCALA_VERSION=%SCALA_VERSION_1% in spark-env.cmd.
rem echo Visit %ENV_VARIABLE_DOC% for more details about setting environment variables in spark-env.cmd.
rem echo Either clean one of them or, set SPARK_SCALA_VERSION in spark-env.cmd.
rem exit 1
rem )
rem if exist %ASSEMBLY_DIR1% (
rem set SPARK_SCALA_VERSION=%SCALA_VERSION_1%
rem ) else (
rem set SPARK_SCALA_VERSION=%SCALA_VERSION_2%
rem )
rem )
exit /b 0

:LoadSparkEnv
42 changes: 21 additions & 21 deletions bin/load-spark-env.sh
@@ -42,27 +42,27 @@ if [ -z "$SPARK_ENV_LOADED" ]; then
fi

# Setting SPARK_SCALA_VERSION if not already set.

if [ -z "$SPARK_SCALA_VERSION" ]; then
SCALA_VERSION_1=2.13
SCALA_VERSION_2=2.12

ASSEMBLY_DIR_1="${SPARK_HOME}/assembly/target/scala-${SCALA_VERSION_1}"
ASSEMBLY_DIR_2="${SPARK_HOME}/assembly/target/scala-${SCALA_VERSION_2}"
ENV_VARIABLE_DOC="https://spark.apache.org/docs/latest/configuration.html#environment-variables"
if [[ -d "$ASSEMBLY_DIR_1" && -d "$ASSEMBLY_DIR_2" ]]; then
echo "Presence of build for multiple Scala versions detected ($ASSEMBLY_DIR_1 and $ASSEMBLY_DIR_2)." 1>&2
echo "Remove one of them or, export SPARK_SCALA_VERSION=$SCALA_VERSION_1 in ${SPARK_ENV_SH}." 1>&2
echo "Visit ${ENV_VARIABLE_DOC} for more details about setting environment variables in spark-env.sh." 1>&2
exit 1
fi

if [[ -d "$ASSEMBLY_DIR_1" ]]; then
export SPARK_SCALA_VERSION=${SCALA_VERSION_1}
else
export SPARK_SCALA_VERSION=${SCALA_VERSION_2}
fi
fi
export SPARK_SCALA_VERSION=2.13
#if [ -z "$SPARK_SCALA_VERSION" ]; then
# SCALA_VERSION_1=2.13
# SCALA_VERSION_2=2.12
#
# ASSEMBLY_DIR_1="${SPARK_HOME}/assembly/target/scala-${SCALA_VERSION_1}"
# ASSEMBLY_DIR_2="${SPARK_HOME}/assembly/target/scala-${SCALA_VERSION_2}"
# ENV_VARIABLE_DOC="https://spark.apache.org/docs/latest/configuration.html#environment-variables"
# if [[ -d "$ASSEMBLY_DIR_1" && -d "$ASSEMBLY_DIR_2" ]]; then
# echo "Presence of build for multiple Scala versions detected ($ASSEMBLY_DIR_1 and $ASSEMBLY_DIR_2)." 1>&2
# echo "Remove one of them or, export SPARK_SCALA_VERSION=$SCALA_VERSION_1 in ${SPARK_ENV_SH}." 1>&2
# echo "Visit ${ENV_VARIABLE_DOC} for more details about setting environment variables in spark-env.sh." 1>&2
# exit 1
# fi
#
# if [[ -d "$ASSEMBLY_DIR_1" ]]; then
# export SPARK_SCALA_VERSION=${SCALA_VERSION_1}
# else
# export SPARK_SCALA_VERSION=${SCALA_VERSION_2}
# fi
#fi

# Append jline option to enable the Beeline process to run in background.
if [[ ( ! $(ps -o stat= -p $$) =~ "+" ) && ! ( -p /dev/stdin ) ]]; then
4 changes: 2 additions & 2 deletions common/kvstore/pom.xml
@@ -21,12 +21,12 @@
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.12</artifactId>
<artifactId>spark-parent_2.13</artifactId>
<version>4.0.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>spark-kvstore_2.12</artifactId>
<artifactId>spark-kvstore_2.13</artifactId>
<packaging>jar</packaging>
<name>Spark Project Local DB</name>
<url>https://spark.apache.org/</url>
(Diffs for the remaining changed files are not shown.)
