[FIX] make coalesce test deterministic in RDDSuite #387

mengxr · 2014-04-11T03:45:22Z

Make coalesce test deterministic by setting pre-defined seeds. (Saw random failures in other PRs.)

pwendell · 2014-04-11T03:46:50Z

Awesome - thanks for looking at this. /cc @alig who wrote this test.

AmplabJenkins · 2014-04-11T03:48:11Z

Merged build triggered.

AmplabJenkins · 2014-04-11T03:48:21Z

Merged build started.

AmplabJenkins · 2014-04-11T04:27:43Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-04-11T04:27:44Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14046/

pwendell · 2014-04-12T02:42:36Z

Thanks merged!

Make coalesce test deterministic by setting pre-defined seeds. (Saw random failures in other PRs.) Author: Xiangrui Meng <[email protected]> Closes #387 from mengxr/fix-random and squashes the following commits: 59bc16f [Xiangrui Meng] make coalesce test deterministic in RDDSuite

Make coalesce test deterministic by setting pre-defined seeds. (Saw random failures in other PRs.) Author: Xiangrui Meng <[email protected]> Closes apache#387 from mengxr/fix-random and squashes the following commits: 59bc16f [Xiangrui Meng] make coalesce test deterministic in RDDSuite

* [SPARK-24683][K8S] Add main class if no resource is provided for k8s. * Add integration test.

[WIP]Remove domain mapping hosts workaround

### What changes were proposed in this pull request? The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. ### Why are the changes needed? Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity #452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call #387. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]>

The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd)

The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7)

The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7) # Conflicts: # pom.xml

* ODP-1304 [SPARK-44914][BUILD] Upgrade Apache Ivy to 2.5.2 This PR aims to upgrade Apache Ivy to 2.5.2 and protect old Ivy-based systems like old Spark from Apache Ivy 2.5.2's incompatibility by introducing a new `.ivy2.5.2` directory. - Apache Spark 4.0.0 will create this once and reuse this directory while all the other systems like old Sparks uses the old one, `.ivy2`. So, the behavior is the same with the case where Apache Spark 4.0.0 is installed and used in a new machine. - For the environments with `User-provided Ivy-path`es, the user might hit the incompatibility still. However, the users can mitigate them because they already have full control on `Ivy-path`es. This was tried once and reverted logically due to Java 11 and Java 17 failures in Daily CIs. - apache#42613 - apache#42668 Currently, PR Builder also fails as of now. If the PR passes CIes, we can achieve the following. - [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr) - FIX: CVE-2022-46751: Apache Ivy Is Vulnerable to XML External Entity Injections No. Pass the CIs including `HiveExternalCatalogVersionsSuite`. No. Closes apache#45075 from dongjoon-hyun/SPARK-44914. Authored-by: Dongjoon Hyun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 3baa60a) [SPARK-44968][BUILD] Downgrade ivy from 2.5.2 to 2.5.1 ### What changes were proposed in this pull request? After upgrading Ivy from 2.5.1 to 2.5.2 in SPARK-44914, daily tests for Java 11 and Java 17 began to experience ABORTED in the `HiveExternalCatalogVersionsSuite` test. Java 11 - https://github.com/apache/spark/actions/runs/5953716283/job/16148657660 - https://github.com/apache/spark/actions/runs/5966131923/job/16185159550 Java 17 - https://github.com/apache/spark/actions/runs/5956925790/job/16158714165 - https://github.com/apache/spark/actions/runs/5969348559/job/16195073478 ``` 2023-08-23T23:00:49.6547573Z [info] 2023-08-23 16:00:48.209 - stdout> : java.lang.RuntimeException: problem during retrieve of org.apache.spark#spark-submit-parent-4c061f04-b951-4d06-8909-cde5452988d9: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error. 2023-08-23T23:00:49.6548745Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:238) 2023-08-23T23:00:49.6549572Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:89) 2023-08-23T23:00:49.6550334Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.Ivy.retrieve(Ivy.java:551) 2023-08-23T23:00:49.6551079Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1464) 2023-08-23T23:00:49.6552024Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.$anonfun$downloadVersion$2(IsolatedClientLoader.scala:138) 2023-08-23T23:00:49.6552884Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.util.package$.quietly(package.scala:42) 2023-08-23T23:00:49.6553755Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.downloadVersion(IsolatedClientLoader.scala:138) 2023-08-23T23:00:49.6554705Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.liftedTree1$1(IsolatedClientLoader.scala:65) 2023-08-23T23:00:49.6555637Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.forVersion(IsolatedClientLoader.scala:64) 2023-08-23T23:00:49.6556554Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:443) 2023-08-23T23:00:49.6557340Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:356) 2023-08-23T23:00:49.6558187Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:71) 2023-08-23T23:00:49.6559061Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:70) 2023-08-23T23:00:49.6559962Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$databaseExists$1(HiveExternalCatalog.scala:224) 2023-08-23T23:00:49.6560766Z [info] 2023-08-23 16:00:48.209 - stdout> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23) 2023-08-23T23:00:49.6561584Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102) 2023-08-23T23:00:49.6562510Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:224) 2023-08-23T23:00:49.6563435Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:150) 2023-08-23T23:00:49.6564323Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:140) 2023-08-23T23:00:49.6565340Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:45) 2023-08-23T23:00:49.6566321Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveSessionStateBuilder.$anonfun$catalog$1(HiveSessionStateBuilder.scala:60) 2023-08-23T23:00:49.6567363Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog$lzycompute(SessionCatalog.scala:118) 2023-08-23T23:00:49.6568372Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog(SessionCatalog.scala:118) 2023-08-23T23:00:49.6569393Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:490) 2023-08-23T23:00:49.6570685Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:155) 2023-08-23T23:00:49.6571842Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:113) 2023-08-23T23:00:49.6572932Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:111) 2023-08-23T23:00:49.6573996Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:125) 2023-08-23T23:00:49.6575045Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97) 2023-08-23T23:00:49.6576066Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) 2023-08-23T23:00:49.6576937Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) 2023-08-23T23:00:49.6577807Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90) 2023-08-23T23:00:49.6578620Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6579432Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) 2023-08-23T23:00:49.6580357Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97) 2023-08-23T23:00:49.6581331Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93) 2023-08-23T23:00:49.6582239Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481) 2023-08-23T23:00:49.6583101Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) 2023-08-23T23:00:49.6584088Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481) 2023-08-23T23:00:49.6585236Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6586519Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) 2023-08-23T23:00:49.6587686Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) 2023-08-23T23:00:49.6588898Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6590014Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6590993Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457) 2023-08-23T23:00:49.6591930Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:93) 2023-08-23T23:00:49.6592914Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:80) 2023-08-23T23:00:49.6593856Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:78) 2023-08-23T23:00:49.6594687Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:219) 2023-08-23T23:00:49.6595379Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99) 2023-08-23T23:00:49.6596103Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6596807Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96) 2023-08-23T23:00:49.6597520Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618) 2023-08-23T23:00:49.6598276Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6599022Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613) 2023-08-23T23:00:49.6599819Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 2023-08-23T23:00:49.6600723Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) 2023-08-23T23:00:49.6601707Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 2023-08-23T23:00:49.6602513Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/java.lang.reflect.Method.invoke(Method.java:568) 2023-08-23T23:00:49.6603272Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) 2023-08-23T23:00:49.6604007Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 2023-08-23T23:00:49.6604724Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.Gateway.invoke(Gateway.java:282) 2023-08-23T23:00:49.6605416Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) 2023-08-23T23:00:49.6606209Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.commands.CallCommand.execute(CallCommand.java:79) 2023-08-23T23:00:49.6606969Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) 2023-08-23T23:00:49.6607743Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.ClientServerConnection.run(ClientServerConnection.java:106) 2023-08-23T23:00:49.6608415Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/java.lang.Thread.run(Thread.java:833) 2023-08-23T23:00:49.6609288Z [info] 2023-08-23 16:00:48.209 - stdout> Caused by: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error. 2023-08-23T23:00:49.6610288Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.determineArtifactsToCopy(RetrieveEngine.java:426) 2023-08-23T23:00:49.6611332Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:122) 2023-08-23T23:00:49.6612046Z [info] 2023-08-23 16:00:48.209 - stdout> ... 66 more 2023-08-23T23:00:49.6612498Z [info] 2023-08-23 16:00:48.209 - stdout> ``` So this pr downgrade ivy from 2.5.2 to 2.5.1 to restore Java 11/17 daily tests. ### Why are the changes needed? To restore Java 11/17 daily tests. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By changing the default Java version in `build_and_test.yml` to 17 for verification, the tests succeed after downgrading the Ivy to 2.5.1. - https://github.com/LuciferYang/spark/actions/runs/5972232677/job/16209970934 <img width="1116" alt="image" src="https://github.com/apache/spark/assets/1475305/cd4002d8-893d-4845-8b2e-c01ff3106f7f"> ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#42668 from LuciferYang/test-java17. Authored-by: yangjie01 <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 4f8a199) [SPARK-44914][BUILD] Upgrade `Apache ivy` from 2.5.1 to 2.5.2 Upgrade Apache ivy from 2.5.1 to 2.5.2 [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr) [CVE-2022-46751](https://www.cve.org/CVERecord?id=CVE-2022-46751) The fix apache/ant-ivy@2be17bc No. Pass GA No. Closes apache#42613 from bjornjorgensen/ivy-2.5.2. Authored-by: Bjørn Jørgensen <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 611e17e) [SPARK-41030][BUILD] Upgrade `Apache Ivy` to 2.5.1 Upgrade `Apache Ivy` from 2.5.0 to 2.5.1 [Release notes](https://ant.apache.org/ivy/history/2.5.1/release-notes.html) [CVE-2022-37865](https://www.cve.org/CVERecord?id=CVE-2022-37865) and [CVE-2022-37866](https://nvd.nist.gov/vuln/detail/CVE-2022-37866) No. Pass GA Closes apache#38539 from bjornjorgensen/ivy-2.5.1. Authored-by: Bjørn <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 4bbdca6) (cherry picked from commit 0e5fa79) # Conflicts: # dev/deps/spark-deps-hadoop-2-hive-2.3 # dev/deps/spark-deps-hadoop-3-hive-2.3 # docs/core-migration-guide.md # pom.xml * ODP-1303 [SPARK-45732][BUILD] Upgrade commons-text to 1.11.0 The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7) # Conflicts: # pom.xml * ODP-1302 [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - Remove `jackson-core-asl` from maven dependency. - Change the scope of `jackson-mapper-asl` from compile to test. - Replace all `Hive.get(conf)` with `Hive.getWithoutRegisterFns(conf)`. To fix CVE issue: https://github.com/apache/spark/security/dependabot/50. No. manual test. Closes apache#40893 from wangyum/SPARK-43225. Lead-authored-by: Yuming Wang <[email protected]> Co-authored-by: Yuming Wang <[email protected]> Signed-off-by: Sean Owen <[email protected]> (cherry picked from commit 9c237d7) [SPARK-43868][SQL][TESTS] Remove `originalUDFs` from `TestHive` to ensure `ObjectHashAggregateExecBenchmark` can run successfully on Github Action This pr remove `originalUDFs` from `TestHive` to ensure `ObjectHashAggregateExecBenchmark` can run successfully on Github Action. After SPARK-43225, `org.codehaus.jackson:jackson-mapper-asl` becomes a test scope dependency, so when using GA to run benchmark, it is not in the classpath because GA uses https://github.com/apache/spark/blob/d61c77cac17029ee27319e6b766b48d314a4dd31/.github/workflows/benchmark.yml#L179-L183 iunstead of the sbt `Test/runMain`. `ObjectHashAggregateExecBenchmark` used `TestHive`, and `TestHive` will always call `org.apache.hadoop.hive.ql.exec.FunctionRegistry#getFunctionNames` to init `originalUDFs` before this pr, so when we run `ObjectHashAggregateExecBenchmark` on GitHub Actions, there will be the following exceptions: (cherry picked from commit 1c10e28) # Conflicts: # pom.xml --------- Co-authored-by: Dongjoon Hyun <[email protected]> Co-authored-by: Yuming Wang <[email protected]>

The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7) # Conflicts: # pom.xml (cherry picked from commit 7ba99ec)

* ODP-1304 [SPARK-44914][BUILD] Upgrade Apache Ivy to 2.5.2 This PR aims to upgrade Apache Ivy to 2.5.2 and protect old Ivy-based systems like old Spark from Apache Ivy 2.5.2's incompatibility by introducing a new `.ivy2.5.2` directory. - Apache Spark 4.0.0 will create this once and reuse this directory while all the other systems like old Sparks uses the old one, `.ivy2`. So, the behavior is the same with the case where Apache Spark 4.0.0 is installed and used in a new machine. - For the environments with `User-provided Ivy-path`es, the user might hit the incompatibility still. However, the users can mitigate them because they already have full control on `Ivy-path`es. This was tried once and reverted logically due to Java 11 and Java 17 failures in Daily CIs. - apache#42613 - apache#42668 Currently, PR Builder also fails as of now. If the PR passes CIes, we can achieve the following. - [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr) - FIX: CVE-2022-46751: Apache Ivy Is Vulnerable to XML External Entity Injections No. Pass the CIs including `HiveExternalCatalogVersionsSuite`. No. Closes apache#45075 from dongjoon-hyun/SPARK-44914. Authored-by: Dongjoon Hyun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 3baa60a) [SPARK-44968][BUILD] Downgrade ivy from 2.5.2 to 2.5.1 ### What changes were proposed in this pull request? After upgrading Ivy from 2.5.1 to 2.5.2 in SPARK-44914, daily tests for Java 11 and Java 17 began to experience ABORTED in the `HiveExternalCatalogVersionsSuite` test. Java 11 - https://github.com/apache/spark/actions/runs/5953716283/job/16148657660 - https://github.com/apache/spark/actions/runs/5966131923/job/16185159550 Java 17 - https://github.com/apache/spark/actions/runs/5956925790/job/16158714165 - https://github.com/apache/spark/actions/runs/5969348559/job/16195073478 ``` 2023-08-23T23:00:49.6547573Z [info] 2023-08-23 16:00:48.209 - stdout> : java.lang.RuntimeException: problem during retrieve of org.apache.spark#spark-submit-parent-4c061f04-b951-4d06-8909-cde5452988d9: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error. 2023-08-23T23:00:49.6548745Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:238) 2023-08-23T23:00:49.6549572Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:89) 2023-08-23T23:00:49.6550334Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.Ivy.retrieve(Ivy.java:551) 2023-08-23T23:00:49.6551079Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1464) 2023-08-23T23:00:49.6552024Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.$anonfun$downloadVersion$2(IsolatedClientLoader.scala:138) 2023-08-23T23:00:49.6552884Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.util.package$.quietly(package.scala:42) 2023-08-23T23:00:49.6553755Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.downloadVersion(IsolatedClientLoader.scala:138) 2023-08-23T23:00:49.6554705Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.liftedTree1$1(IsolatedClientLoader.scala:65) 2023-08-23T23:00:49.6555637Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.forVersion(IsolatedClientLoader.scala:64) 2023-08-23T23:00:49.6556554Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:443) 2023-08-23T23:00:49.6557340Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:356) 2023-08-23T23:00:49.6558187Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:71) 2023-08-23T23:00:49.6559061Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:70) 2023-08-23T23:00:49.6559962Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$databaseExists$1(HiveExternalCatalog.scala:224) 2023-08-23T23:00:49.6560766Z [info] 2023-08-23 16:00:48.209 - stdout> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23) 2023-08-23T23:00:49.6561584Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102) 2023-08-23T23:00:49.6562510Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:224) 2023-08-23T23:00:49.6563435Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:150) 2023-08-23T23:00:49.6564323Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:140) 2023-08-23T23:00:49.6565340Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:45) 2023-08-23T23:00:49.6566321Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveSessionStateBuilder.$anonfun$catalog$1(HiveSessionStateBuilder.scala:60) 2023-08-23T23:00:49.6567363Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog$lzycompute(SessionCatalog.scala:118) 2023-08-23T23:00:49.6568372Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog(SessionCatalog.scala:118) 2023-08-23T23:00:49.6569393Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:490) 2023-08-23T23:00:49.6570685Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:155) 2023-08-23T23:00:49.6571842Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:113) 2023-08-23T23:00:49.6572932Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:111) 2023-08-23T23:00:49.6573996Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:125) 2023-08-23T23:00:49.6575045Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97) 2023-08-23T23:00:49.6576066Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) 2023-08-23T23:00:49.6576937Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) 2023-08-23T23:00:49.6577807Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90) 2023-08-23T23:00:49.6578620Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6579432Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) 2023-08-23T23:00:49.6580357Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97) 2023-08-23T23:00:49.6581331Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93) 2023-08-23T23:00:49.6582239Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481) 2023-08-23T23:00:49.6583101Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) 2023-08-23T23:00:49.6584088Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481) 2023-08-23T23:00:49.6585236Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6586519Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) 2023-08-23T23:00:49.6587686Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) 2023-08-23T23:00:49.6588898Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6590014Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6590993Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457) 2023-08-23T23:00:49.6591930Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:93) 2023-08-23T23:00:49.6592914Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:80) 2023-08-23T23:00:49.6593856Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:78) 2023-08-23T23:00:49.6594687Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:219) 2023-08-23T23:00:49.6595379Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99) 2023-08-23T23:00:49.6596103Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6596807Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96) 2023-08-23T23:00:49.6597520Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618) 2023-08-23T23:00:49.6598276Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6599022Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613) 2023-08-23T23:00:49.6599819Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 2023-08-23T23:00:49.6600723Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) 2023-08-23T23:00:49.6601707Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 2023-08-23T23:00:49.6602513Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/java.lang.reflect.Method.invoke(Method.java:568) 2023-08-23T23:00:49.6603272Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) 2023-08-23T23:00:49.6604007Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 2023-08-23T23:00:49.6604724Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.Gateway.invoke(Gateway.java:282) 2023-08-23T23:00:49.6605416Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) 2023-08-23T23:00:49.6606209Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.commands.CallCommand.execute(CallCommand.java:79) 2023-08-23T23:00:49.6606969Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) 2023-08-23T23:00:49.6607743Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.ClientServerConnection.run(ClientServerConnection.java:106) 2023-08-23T23:00:49.6608415Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/java.lang.Thread.run(Thread.java:833) 2023-08-23T23:00:49.6609288Z [info] 2023-08-23 16:00:48.209 - stdout> Caused by: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error. 2023-08-23T23:00:49.6610288Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.determineArtifactsToCopy(RetrieveEngine.java:426) 2023-08-23T23:00:49.6611332Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:122) 2023-08-23T23:00:49.6612046Z [info] 2023-08-23 16:00:48.209 - stdout> ... 66 more 2023-08-23T23:00:49.6612498Z [info] 2023-08-23 16:00:48.209 - stdout> ``` So this pr downgrade ivy from 2.5.2 to 2.5.1 to restore Java 11/17 daily tests. ### Why are the changes needed? To restore Java 11/17 daily tests. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By changing the default Java version in `build_and_test.yml` to 17 for verification, the tests succeed after downgrading the Ivy to 2.5.1. - https://github.com/LuciferYang/spark/actions/runs/5972232677/job/16209970934 <img width="1116" alt="image" src="https://github.com/apache/spark/assets/1475305/cd4002d8-893d-4845-8b2e-c01ff3106f7f"> ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#42668 from LuciferYang/test-java17. Authored-by: yangjie01 <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 4f8a199) [SPARK-44914][BUILD] Upgrade `Apache ivy` from 2.5.1 to 2.5.2 Upgrade Apache ivy from 2.5.1 to 2.5.2 [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr) [CVE-2022-46751](https://www.cve.org/CVERecord?id=CVE-2022-46751) The fix apache/ant-ivy@2be17bc No. Pass GA No. Closes apache#42613 from bjornjorgensen/ivy-2.5.2. Authored-by: Bjørn Jørgensen <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 611e17e) [SPARK-41030][BUILD] Upgrade `Apache Ivy` to 2.5.1 Upgrade `Apache Ivy` from 2.5.0 to 2.5.1 [Release notes](https://ant.apache.org/ivy/history/2.5.1/release-notes.html) [CVE-2022-37865](https://www.cve.org/CVERecord?id=CVE-2022-37865) and [CVE-2022-37866](https://nvd.nist.gov/vuln/detail/CVE-2022-37866) No. Pass GA Closes apache#38539 from bjornjorgensen/ivy-2.5.1. Authored-by: Bjørn <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 4bbdca6) (cherry picked from commit 0e5fa79) # Conflicts: # dev/deps/spark-deps-hadoop-2-hive-2.3 # dev/deps/spark-deps-hadoop-3-hive-2.3 # docs/core-migration-guide.md # pom.xml * ODP-1303 [SPARK-45732][BUILD] Upgrade commons-text to 1.11.0 The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7) # Conflicts: # pom.xml * ODP-1302 [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - Remove `jackson-core-asl` from maven dependency. - Change the scope of `jackson-mapper-asl` from compile to test. - Replace all `Hive.get(conf)` with `Hive.getWithoutRegisterFns(conf)`. To fix CVE issue: https://github.com/apache/spark/security/dependabot/50. No. manual test. Closes apache#40893 from wangyum/SPARK-43225. Lead-authored-by: Yuming Wang <[email protected]> Co-authored-by: Yuming Wang <[email protected]> Signed-off-by: Sean Owen <[email protected]> (cherry picked from commit 9c237d7) [SPARK-43868][SQL][TESTS] Remove `originalUDFs` from `TestHive` to ensure `ObjectHashAggregateExecBenchmark` can run successfully on Github Action This pr remove `originalUDFs` from `TestHive` to ensure `ObjectHashAggregateExecBenchmark` can run successfully on Github Action. After SPARK-43225, `org.codehaus.jackson:jackson-mapper-asl` becomes a test scope dependency, so when using GA to run benchmark, it is not in the classpath because GA uses https://github.com/apache/spark/blob/d61c77cac17029ee27319e6b766b48d314a4dd31/.github/workflows/benchmark.yml#L179-L183 iunstead of the sbt `Test/runMain`. `ObjectHashAggregateExecBenchmark` used `TestHive`, and `TestHive` will always call `org.apache.hadoop.hive.ql.exec.FunctionRegistry#getFunctionNames` to init `originalUDFs` before this pr, so when we run `ObjectHashAggregateExecBenchmark` on GitHub Actions, there will be the following exceptions: (cherry picked from commit 1c10e28) # Conflicts: # pom.xml --------- Co-authored-by: Dongjoon Hyun <[email protected]> Co-authored-by: Yuming Wang <[email protected]>

The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7) # Conflicts: # pom.xml (cherry picked from commit 7ba99ec)

* ODP-1304 [SPARK-44914][BUILD] Upgrade Apache Ivy to 2.5.2 This PR aims to upgrade Apache Ivy to 2.5.2 and protect old Ivy-based systems like old Spark from Apache Ivy 2.5.2's incompatibility by introducing a new `.ivy2.5.2` directory. - Apache Spark 4.0.0 will create this once and reuse this directory while all the other systems like old Sparks uses the old one, `.ivy2`. So, the behavior is the same with the case where Apache Spark 4.0.0 is installed and used in a new machine. - For the environments with `User-provided Ivy-path`es, the user might hit the incompatibility still. However, the users can mitigate them because they already have full control on `Ivy-path`es. This was tried once and reverted logically due to Java 11 and Java 17 failures in Daily CIs. - apache#42613 - apache#42668 Currently, PR Builder also fails as of now. If the PR passes CIes, we can achieve the following. - [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr) - FIX: CVE-2022-46751: Apache Ivy Is Vulnerable to XML External Entity Injections No. Pass the CIs including `HiveExternalCatalogVersionsSuite`. No. Closes apache#45075 from dongjoon-hyun/SPARK-44914. Authored-by: Dongjoon Hyun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 3baa60a) [SPARK-44968][BUILD] Downgrade ivy from 2.5.2 to 2.5.1 ### What changes were proposed in this pull request? After upgrading Ivy from 2.5.1 to 2.5.2 in SPARK-44914, daily tests for Java 11 and Java 17 began to experience ABORTED in the `HiveExternalCatalogVersionsSuite` test. Java 11 - https://github.com/apache/spark/actions/runs/5953716283/job/16148657660 - https://github.com/apache/spark/actions/runs/5966131923/job/16185159550 Java 17 - https://github.com/apache/spark/actions/runs/5956925790/job/16158714165 - https://github.com/apache/spark/actions/runs/5969348559/job/16195073478 ``` 2023-08-23T23:00:49.6547573Z [info] 2023-08-23 16:00:48.209 - stdout> : java.lang.RuntimeException: problem during retrieve of org.apache.spark#spark-submit-parent-4c061f04-b951-4d06-8909-cde5452988d9: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error. 2023-08-23T23:00:49.6548745Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:238) 2023-08-23T23:00:49.6549572Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:89) 2023-08-23T23:00:49.6550334Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.Ivy.retrieve(Ivy.java:551) 2023-08-23T23:00:49.6551079Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1464) 2023-08-23T23:00:49.6552024Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.$anonfun$downloadVersion$2(IsolatedClientLoader.scala:138) 2023-08-23T23:00:49.6552884Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.util.package$.quietly(package.scala:42) 2023-08-23T23:00:49.6553755Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.downloadVersion(IsolatedClientLoader.scala:138) 2023-08-23T23:00:49.6554705Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.liftedTree1$1(IsolatedClientLoader.scala:65) 2023-08-23T23:00:49.6555637Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.forVersion(IsolatedClientLoader.scala:64) 2023-08-23T23:00:49.6556554Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:443) 2023-08-23T23:00:49.6557340Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:356) 2023-08-23T23:00:49.6558187Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:71) 2023-08-23T23:00:49.6559061Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:70) 2023-08-23T23:00:49.6559962Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$databaseExists$1(HiveExternalCatalog.scala:224) 2023-08-23T23:00:49.6560766Z [info] 2023-08-23 16:00:48.209 - stdout> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23) 2023-08-23T23:00:49.6561584Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102) 2023-08-23T23:00:49.6562510Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:224) 2023-08-23T23:00:49.6563435Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:150) 2023-08-23T23:00:49.6564323Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:140) 2023-08-23T23:00:49.6565340Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:45) 2023-08-23T23:00:49.6566321Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveSessionStateBuilder.$anonfun$catalog$1(HiveSessionStateBuilder.scala:60) 2023-08-23T23:00:49.6567363Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog$lzycompute(SessionCatalog.scala:118) 2023-08-23T23:00:49.6568372Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog(SessionCatalog.scala:118) 2023-08-23T23:00:49.6569393Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:490) 2023-08-23T23:00:49.6570685Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:155) 2023-08-23T23:00:49.6571842Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:113) 2023-08-23T23:00:49.6572932Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:111) 2023-08-23T23:00:49.6573996Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:125) 2023-08-23T23:00:49.6575045Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97) 2023-08-23T23:00:49.6576066Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) 2023-08-23T23:00:49.6576937Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) 2023-08-23T23:00:49.6577807Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90) 2023-08-23T23:00:49.6578620Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6579432Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) 2023-08-23T23:00:49.6580357Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97) 2023-08-23T23:00:49.6581331Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93) 2023-08-23T23:00:49.6582239Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481) 2023-08-23T23:00:49.6583101Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) 2023-08-23T23:00:49.6584088Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481) 2023-08-23T23:00:49.6585236Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6586519Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) 2023-08-23T23:00:49.6587686Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) 2023-08-23T23:00:49.6588898Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6590014Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6590993Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457) 2023-08-23T23:00:49.6591930Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:93) 2023-08-23T23:00:49.6592914Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:80) 2023-08-23T23:00:49.6593856Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:78) 2023-08-23T23:00:49.6594687Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:219) 2023-08-23T23:00:49.6595379Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99) 2023-08-23T23:00:49.6596103Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6596807Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96) 2023-08-23T23:00:49.6597520Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618) 2023-08-23T23:00:49.6598276Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6599022Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613) 2023-08-23T23:00:49.6599819Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 2023-08-23T23:00:49.6600723Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) 2023-08-23T23:00:49.6601707Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 2023-08-23T23:00:49.6602513Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/java.lang.reflect.Method.invoke(Method.java:568) 2023-08-23T23:00:49.6603272Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) 2023-08-23T23:00:49.6604007Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 2023-08-23T23:00:49.6604724Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.Gateway.invoke(Gateway.java:282) 2023-08-23T23:00:49.6605416Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) 2023-08-23T23:00:49.6606209Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.commands.CallCommand.execute(CallCommand.java:79) 2023-08-23T23:00:49.6606969Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) 2023-08-23T23:00:49.6607743Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.ClientServerConnection.run(ClientServerConnection.java:106) 2023-08-23T23:00:49.6608415Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/java.lang.Thread.run(Thread.java:833) 2023-08-23T23:00:49.6609288Z [info] 2023-08-23 16:00:48.209 - stdout> Caused by: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error. 2023-08-23T23:00:49.6610288Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.determineArtifactsToCopy(RetrieveEngine.java:426) 2023-08-23T23:00:49.6611332Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:122) 2023-08-23T23:00:49.6612046Z [info] 2023-08-23 16:00:48.209 - stdout> ... 66 more 2023-08-23T23:00:49.6612498Z [info] 2023-08-23 16:00:48.209 - stdout> ``` So this pr downgrade ivy from 2.5.2 to 2.5.1 to restore Java 11/17 daily tests. ### Why are the changes needed? To restore Java 11/17 daily tests. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By changing the default Java version in `build_and_test.yml` to 17 for verification, the tests succeed after downgrading the Ivy to 2.5.1. - https://github.com/LuciferYang/spark/actions/runs/5972232677/job/16209970934 <img width="1116" alt="image" src="https://github.com/apache/spark/assets/1475305/cd4002d8-893d-4845-8b2e-c01ff3106f7f"> ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#42668 from LuciferYang/test-java17. Authored-by: yangjie01 <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 4f8a199) [SPARK-44914][BUILD] Upgrade `Apache ivy` from 2.5.1 to 2.5.2 Upgrade Apache ivy from 2.5.1 to 2.5.2 [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr) [CVE-2022-46751](https://www.cve.org/CVERecord?id=CVE-2022-46751) The fix apache/ant-ivy@2be17bc No. Pass GA No. Closes apache#42613 from bjornjorgensen/ivy-2.5.2. Authored-by: Bjørn Jørgensen <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 611e17e) [SPARK-41030][BUILD] Upgrade `Apache Ivy` to 2.5.1 Upgrade `Apache Ivy` from 2.5.0 to 2.5.1 [Release notes](https://ant.apache.org/ivy/history/2.5.1/release-notes.html) [CVE-2022-37865](https://www.cve.org/CVERecord?id=CVE-2022-37865) and [CVE-2022-37866](https://nvd.nist.gov/vuln/detail/CVE-2022-37866) No. Pass GA Closes apache#38539 from bjornjorgensen/ivy-2.5.1. Authored-by: Bjørn <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 4bbdca6) (cherry picked from commit 0e5fa79) # Conflicts: # dev/deps/spark-deps-hadoop-2-hive-2.3 # dev/deps/spark-deps-hadoop-3-hive-2.3 # docs/core-migration-guide.md # pom.xml * ODP-1303 [SPARK-45732][BUILD] Upgrade commons-text to 1.11.0 The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7) # Conflicts: # pom.xml * ODP-1302 [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - Remove `jackson-core-asl` from maven dependency. - Change the scope of `jackson-mapper-asl` from compile to test. - Replace all `Hive.get(conf)` with `Hive.getWithoutRegisterFns(conf)`. To fix CVE issue: https://github.com/apache/spark/security/dependabot/50. No. manual test. Closes apache#40893 from wangyum/SPARK-43225. Lead-authored-by: Yuming Wang <[email protected]> Co-authored-by: Yuming Wang <[email protected]> Signed-off-by: Sean Owen <[email protected]> (cherry picked from commit 9c237d7) [SPARK-43868][SQL][TESTS] Remove `originalUDFs` from `TestHive` to ensure `ObjectHashAggregateExecBenchmark` can run successfully on Github Action This pr remove `originalUDFs` from `TestHive` to ensure `ObjectHashAggregateExecBenchmark` can run successfully on Github Action. After SPARK-43225, `org.codehaus.jackson:jackson-mapper-asl` becomes a test scope dependency, so when using GA to run benchmark, it is not in the classpath because GA uses https://github.com/apache/spark/blob/d61c77cac17029ee27319e6b766b48d314a4dd31/.github/workflows/benchmark.yml#L179-L183 iunstead of the sbt `Test/runMain`. `ObjectHashAggregateExecBenchmark` used `TestHive`, and `TestHive` will always call `org.apache.hadoop.hive.ql.exec.FunctionRegistry#getFunctionNames` to init `originalUDFs` before this pr, so when we run `ObjectHashAggregateExecBenchmark` on GitHub Actions, there will be the following exceptions: (cherry picked from commit 1c10e28) # Conflicts: # pom.xml --------- Co-authored-by: Dongjoon Hyun <[email protected]> Co-authored-by: Yuming Wang <[email protected]>

The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7) # Conflicts: # pom.xml

* ODP-1304 [SPARK-44914][BUILD] Upgrade Apache Ivy to 2.5.2 This PR aims to upgrade Apache Ivy to 2.5.2 and protect old Ivy-based systems like old Spark from Apache Ivy 2.5.2's incompatibility by introducing a new `.ivy2.5.2` directory. - Apache Spark 4.0.0 will create this once and reuse this directory while all the other systems like old Sparks uses the old one, `.ivy2`. So, the behavior is the same with the case where Apache Spark 4.0.0 is installed and used in a new machine. - For the environments with `User-provided Ivy-path`es, the user might hit the incompatibility still. However, the users can mitigate them because they already have full control on `Ivy-path`es. This was tried once and reverted logically due to Java 11 and Java 17 failures in Daily CIs. - apache#42613 - apache#42668 Currently, PR Builder also fails as of now. If the PR passes CIes, we can achieve the following. - [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr) - FIX: CVE-2022-46751: Apache Ivy Is Vulnerable to XML External Entity Injections No. Pass the CIs including `HiveExternalCatalogVersionsSuite`. No. Closes apache#45075 from dongjoon-hyun/SPARK-44914. Authored-by: Dongjoon Hyun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 3baa60a) [SPARK-44968][BUILD] Downgrade ivy from 2.5.2 to 2.5.1 ### What changes were proposed in this pull request? After upgrading Ivy from 2.5.1 to 2.5.2 in SPARK-44914, daily tests for Java 11 and Java 17 began to experience ABORTED in the `HiveExternalCatalogVersionsSuite` test. Java 11 - https://github.com/apache/spark/actions/runs/5953716283/job/16148657660 - https://github.com/apache/spark/actions/runs/5966131923/job/16185159550 Java 17 - https://github.com/apache/spark/actions/runs/5956925790/job/16158714165 - https://github.com/apache/spark/actions/runs/5969348559/job/16195073478 ``` 2023-08-23T23:00:49.6547573Z [info] 2023-08-23 16:00:48.209 - stdout> : java.lang.RuntimeException: problem during retrieve of org.apache.spark#spark-submit-parent-4c061f04-b951-4d06-8909-cde5452988d9: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error. 2023-08-23T23:00:49.6548745Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:238) 2023-08-23T23:00:49.6549572Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:89) 2023-08-23T23:00:49.6550334Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.Ivy.retrieve(Ivy.java:551) 2023-08-23T23:00:49.6551079Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1464) 2023-08-23T23:00:49.6552024Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.$anonfun$downloadVersion$2(IsolatedClientLoader.scala:138) 2023-08-23T23:00:49.6552884Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.util.package$.quietly(package.scala:42) 2023-08-23T23:00:49.6553755Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.downloadVersion(IsolatedClientLoader.scala:138) 2023-08-23T23:00:49.6554705Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.liftedTree1$1(IsolatedClientLoader.scala:65) 2023-08-23T23:00:49.6555637Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.client.IsolatedClientLoader$.forVersion(IsolatedClientLoader.scala:64) 2023-08-23T23:00:49.6556554Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:443) 2023-08-23T23:00:49.6557340Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:356) 2023-08-23T23:00:49.6558187Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:71) 2023-08-23T23:00:49.6559061Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:70) 2023-08-23T23:00:49.6559962Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$databaseExists$1(HiveExternalCatalog.scala:224) 2023-08-23T23:00:49.6560766Z [info] 2023-08-23 16:00:48.209 - stdout> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23) 2023-08-23T23:00:49.6561584Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102) 2023-08-23T23:00:49.6562510Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:224) 2023-08-23T23:00:49.6563435Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:150) 2023-08-23T23:00:49.6564323Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:140) 2023-08-23T23:00:49.6565340Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:45) 2023-08-23T23:00:49.6566321Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.hive.HiveSessionStateBuilder.$anonfun$catalog$1(HiveSessionStateBuilder.scala:60) 2023-08-23T23:00:49.6567363Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog$lzycompute(SessionCatalog.scala:118) 2023-08-23T23:00:49.6568372Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog(SessionCatalog.scala:118) 2023-08-23T23:00:49.6569393Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:490) 2023-08-23T23:00:49.6570685Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:155) 2023-08-23T23:00:49.6571842Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:113) 2023-08-23T23:00:49.6572932Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:111) 2023-08-23T23:00:49.6573996Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:125) 2023-08-23T23:00:49.6575045Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97) 2023-08-23T23:00:49.6576066Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) 2023-08-23T23:00:49.6576937Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) 2023-08-23T23:00:49.6577807Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90) 2023-08-23T23:00:49.6578620Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6579432Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) 2023-08-23T23:00:49.6580357Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97) 2023-08-23T23:00:49.6581331Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93) 2023-08-23T23:00:49.6582239Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481) 2023-08-23T23:00:49.6583101Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) 2023-08-23T23:00:49.6584088Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481) 2023-08-23T23:00:49.6585236Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6586519Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) 2023-08-23T23:00:49.6587686Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) 2023-08-23T23:00:49.6588898Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6590014Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30) 2023-08-23T23:00:49.6590993Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457) 2023-08-23T23:00:49.6591930Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:93) 2023-08-23T23:00:49.6592914Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:80) 2023-08-23T23:00:49.6593856Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:78) 2023-08-23T23:00:49.6594687Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:219) 2023-08-23T23:00:49.6595379Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99) 2023-08-23T23:00:49.6596103Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6596807Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96) 2023-08-23T23:00:49.6597520Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618) 2023-08-23T23:00:49.6598276Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) 2023-08-23T23:00:49.6599022Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613) 2023-08-23T23:00:49.6599819Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 2023-08-23T23:00:49.6600723Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) 2023-08-23T23:00:49.6601707Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 2023-08-23T23:00:49.6602513Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/java.lang.reflect.Method.invoke(Method.java:568) 2023-08-23T23:00:49.6603272Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) 2023-08-23T23:00:49.6604007Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) 2023-08-23T23:00:49.6604724Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.Gateway.invoke(Gateway.java:282) 2023-08-23T23:00:49.6605416Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) 2023-08-23T23:00:49.6606209Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.commands.CallCommand.execute(CallCommand.java:79) 2023-08-23T23:00:49.6606969Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) 2023-08-23T23:00:49.6607743Z [info] 2023-08-23 16:00:48.209 - stdout> at py4j.ClientServerConnection.run(ClientServerConnection.java:106) 2023-08-23T23:00:49.6608415Z [info] 2023-08-23 16:00:48.209 - stdout> at java.base/java.lang.Thread.run(Thread.java:833) 2023-08-23T23:00:49.6609288Z [info] 2023-08-23 16:00:48.209 - stdout> Caused by: java.lang.RuntimeException: Multiple artifacts of the module log4j#log4j;1.2.17 are retrieved to the same file! Update the retrieve pattern to fix this error. 2023-08-23T23:00:49.6610288Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.determineArtifactsToCopy(RetrieveEngine.java:426) 2023-08-23T23:00:49.6611332Z [info] 2023-08-23 16:00:48.209 - stdout> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:122) 2023-08-23T23:00:49.6612046Z [info] 2023-08-23 16:00:48.209 - stdout> ... 66 more 2023-08-23T23:00:49.6612498Z [info] 2023-08-23 16:00:48.209 - stdout> ``` So this pr downgrade ivy from 2.5.2 to 2.5.1 to restore Java 11/17 daily tests. ### Why are the changes needed? To restore Java 11/17 daily tests. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By changing the default Java version in `build_and_test.yml` to 17 for verification, the tests succeed after downgrading the Ivy to 2.5.1. - https://github.com/LuciferYang/spark/actions/runs/5972232677/job/16209970934 <img width="1116" alt="image" src="https://github.com/apache/spark/assets/1475305/cd4002d8-893d-4845-8b2e-c01ff3106f7f"> ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#42668 from LuciferYang/test-java17. Authored-by: yangjie01 <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 4f8a199) [SPARK-44914][BUILD] Upgrade `Apache ivy` from 2.5.1 to 2.5.2 Upgrade Apache ivy from 2.5.1 to 2.5.2 [Release notes](https://lists.apache.org/thread/9gcz4xrsn8c7o9gb377xfzvkb8jltffr) [CVE-2022-46751](https://www.cve.org/CVERecord?id=CVE-2022-46751) The fix apache/ant-ivy@2be17bc No. Pass GA No. Closes apache#42613 from bjornjorgensen/ivy-2.5.2. Authored-by: Bjørn Jørgensen <[email protected]> Signed-off-by: yangjie01 <[email protected]> (cherry picked from commit 611e17e) [SPARK-41030][BUILD] Upgrade `Apache Ivy` to 2.5.1 Upgrade `Apache Ivy` from 2.5.0 to 2.5.1 [Release notes](https://ant.apache.org/ivy/history/2.5.1/release-notes.html) [CVE-2022-37865](https://www.cve.org/CVERecord?id=CVE-2022-37865) and [CVE-2022-37866](https://nvd.nist.gov/vuln/detail/CVE-2022-37866) No. Pass GA Closes apache#38539 from bjornjorgensen/ivy-2.5.1. Authored-by: Bjørn <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 4bbdca6) (cherry picked from commit 0e5fa79) # Conflicts: # dev/deps/spark-deps-hadoop-2-hive-2.3 # dev/deps/spark-deps-hadoop-3-hive-2.3 # docs/core-migration-guide.md # pom.xml * ODP-1303 [SPARK-45732][BUILD] Upgrade commons-text to 1.11.0 The pr aims to upgrade `commons-text` from `1.10.0` to `1.11.0`. Release note: https://commons.apache.org/proper/commons-text/changes-report.html#a1.11.0 includes some bug fix, eg: - Fix StringTokenizer.getTokenList to return an independent modifiable list. Fixes [TEXT-219](https://issues.apache.org/jira/browse/TEXT-219). - Fix TextStringBuilder to over-allocate when ensuring capacity apache#452. Fixes [TEXT-228](https://issues.apache.org/jira/browse/TEXT-228). - TextStringBuidler#hashCode() allocates a String on each call apache#387. No. Pass GA. No. Closes apache#43590 from panbingkun/SPARK-45732. Authored-by: panbingkun <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]> (cherry picked from commit d38f074) [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10 Upgrade Apache commons-text from 1.9 to 1.10.0 [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) No. Pass github action Closes apache#38262 from bjornjorgensen/commons-text-1.10. Authored-by: Bjørn <[email protected]> Signed-off-by: Yuming Wang <[email protected]> (cherry picked from commit 99abc94) [SPARK-38231][BUILD] Upgrade commons-text to 1.9 This PR aims to upgrade commons-text to 1.9. 1.9 is the latest and popular than 1.6. - https://commons.apache.org/proper/commons-text/changes-report.html#a1.9 - https://mvnrepository.com/artifact/org.apache.commons/commons-text No Pass GA Closes apache#35542 from LuciferYang/upgrade-common-text. Authored-by: yangjie01 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 70f5bfd) (cherry picked from commit 5cb61e7) # Conflicts: # pom.xml * ODP-1302 [SPARK-43225][BUILD][SQL] Remove jackson-core-asl and jackson-mapper-asl from pre-built distribution - Remove `jackson-core-asl` from maven dependency. - Change the scope of `jackson-mapper-asl` from compile to test. - Replace all `Hive.get(conf)` with `Hive.getWithoutRegisterFns(conf)`. To fix CVE issue: https://github.com/apache/spark/security/dependabot/50. No. manual test. Closes apache#40893 from wangyum/SPARK-43225. Lead-authored-by: Yuming Wang <[email protected]> Co-authored-by: Yuming Wang <[email protected]> Signed-off-by: Sean Owen <[email protected]> (cherry picked from commit 9c237d7) [SPARK-43868][SQL][TESTS] Remove `originalUDFs` from `TestHive` to ensure `ObjectHashAggregateExecBenchmark` can run successfully on Github Action This pr remove `originalUDFs` from `TestHive` to ensure `ObjectHashAggregateExecBenchmark` can run successfully on Github Action. After SPARK-43225, `org.codehaus.jackson:jackson-mapper-asl` becomes a test scope dependency, so when using GA to run benchmark, it is not in the classpath because GA uses https://github.com/apache/spark/blob/d61c77cac17029ee27319e6b766b48d314a4dd31/.github/workflows/benchmark.yml#L179-L183 iunstead of the sbt `Test/runMain`. `ObjectHashAggregateExecBenchmark` used `TestHive`, and `TestHive` will always call `org.apache.hadoop.hive.ql.exec.FunctionRegistry#getFunctionNames` to init `originalUDFs` before this pr, so when we run `ObjectHashAggregateExecBenchmark` on GitHub Actions, there will be the following exceptions: (cherry picked from commit 1c10e28) # Conflicts: # pom.xml --------- Co-authored-by: Dongjoon Hyun <[email protected]> Co-authored-by: Yuming Wang <[email protected]>

make coalesce test deterministic in RDDSuite

59bc16f

mengxr mentioned this pull request Apr 11, 2014

[SPARK-1225, 1241] [MLLIB] Add AreaUnderCurve and BinaryClassificationMetrics #364

Closed

asfgit closed this in 7038b00 Apr 12, 2014

mengxr deleted the fix-random branch May 7, 2014 00:09

mccheah added a commit to mccheah/spark that referenced this pull request Nov 28, 2018

[SPARK-24683][K8S] Fix k8s no resource on palantir/master (apache#387)

eddf77b

* [SPARK-24683][K8S] Add main class if no resource is provided for k8s. * Add integration test.

bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019

Merge pull request apache#387 from theopenlab/remove-hosts-workaround

9530d70

[WIP]Remove domain mapping hosts workaround

panbingkun mentioned this pull request Oct 31, 2023

[SPARK-45732][BUILD] Upgrade commons-text to 1.11.0 #43590

Closed

senthh mentioned this pull request Aug 13, 2024

ODP-2032|ODP-1095 Critical CVE fixes patch acceldata-io/spark3#18

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FIX] make coalesce test deterministic in RDDSuite #387

[FIX] make coalesce test deterministic in RDDSuite #387

mengxr commented Apr 11, 2014

pwendell commented Apr 11, 2014

AmplabJenkins commented Apr 11, 2014

AmplabJenkins commented Apr 11, 2014

AmplabJenkins commented Apr 11, 2014

AmplabJenkins commented Apr 11, 2014

pwendell commented Apr 12, 2014

[FIX] make coalesce test deterministic in RDDSuite #387

[FIX] make coalesce test deterministic in RDDSuite #387

Conversation

mengxr commented Apr 11, 2014

pwendell commented Apr 11, 2014

AmplabJenkins commented Apr 11, 2014

AmplabJenkins commented Apr 11, 2014

AmplabJenkins commented Apr 11, 2014

AmplabJenkins commented Apr 11, 2014

pwendell commented Apr 12, 2014