[BUG] test_column_add_after_partition failed in databricks 10.4 runtime #8727

Closed · pxLi opened this issue Jul 17, 2023 · 1 comment · Fixed by #8733
Labels: bug (Something isn't working), test (Only impacts tests)

pxLi (Collaborator) commented Jul 17, 2023

Describe the bug
Two test cases failed, but only on the Databricks 10.4 runtime (321db); the same cases passed on the Databricks 11.3 and 12.2 runtimes.
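For context, the failing call is the `saveAsTable` at schema_evolution_test.py:76, a partitioned write performed after new columns are added. Below is a minimal sketch of that pattern, a hypothetical reconstruction rather than the exact test body; `spark`, the table name, and the column names are all assumptions:

```python
# Hypothetical sketch of the pattern under test (the real body is in
# integration_tests/src/main/python/schema_evolution_test.py).
df = spark.createDataFrame([(1, "x", 0), (2, "y", 1)], "a int, b string, part int")
df.write.format("parquet").partitionBy("part").saveAsTable("evolve_tbl")

# Add a column after the initial partitioned write, then write again; on the
# Databricks 10.4 runtime a later partitioned saveAsTable like this one fails
# with the NullPointerException shown below.
df2 = spark.createDataFrame([(3, "z", 2.5, 0)], "a int, b string, c double, part int")
df2.write.format("parquet").mode("append").partitionBy("part").saveAsTable("evolve_tbl")
```

The CI failures and the full CPU-side traceback follow: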

```
[2023-07-16T07:13:09.563Z] FAILED ../../src/main/python/schema_evolution_test.py::test_column_add_after_partition[parquet][IGNORE_ORDER({'local': True})] - py4j.protocol.Py4JJavaError: An error occurred while calling o1848911.saveA...
[2023-07-16T07:13:09.564Z] FAILED ../../src/main/python/schema_evolution_test.py::test_column_add_after_partition[orc][IGNORE_ORDER({'local': True})] - py4j.protocol.Py4JJavaError: An error occurred while calling o1849036.saveA...

[2023-07-16T07:13:09.561Z] ../../src/main/python/asserts.py:561: in assert_gpu_and_cpu_are_equal_collect
[2023-07-16T07:13:09.561Z]     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first)
[2023-07-16T07:13:09.561Z] ../../src/main/python/asserts.py:480: in _assert_gpu_and_cpu_are_equal
[2023-07-16T07:13:09.561Z]     run_on_cpu()
[2023-07-16T07:13:09.561Z] ../../src/main/python/asserts.py:466: in run_on_cpu
[2023-07-16T07:13:09.561Z]     from_cpu = with_cpu_session(bring_back, conf=conf)
[2023-07-16T07:13:09.561Z] ../../src/main/python/spark_session.py:116: in with_cpu_session
[2023-07-16T07:13:09.561Z]     return with_spark_session(func, conf=copy)
[2023-07-16T07:13:09.561Z] ../../src/main/python/spark_session.py:100: in with_spark_session
[2023-07-16T07:13:09.561Z]     ret = func(_spark)
[2023-07-16T07:13:09.561Z] ../../src/main/python/asserts.py:200: in <lambda>
[2023-07-16T07:13:09.561Z]     bring_back = lambda spark: limit_func(spark).collect()
[2023-07-16T07:13:09.561Z] ../../src/main/python/schema_evolution_test.py:76: in testf
[2023-07-16T07:13:09.561Z]     df.write\
[2023-07-16T07:13:09.561Z] /databricks/spark/python/pyspark/sql/readwriter.py:806: in saveAsTable
[2023-07-16T07:13:09.561Z]     self._jwrite.saveAsTable(name)
[2023-07-16T07:13:09.561Z] /databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/java_gateway.py:1304: in __call__
[2023-07-16T07:13:09.561Z]     return_value = get_return_value(
[2023-07-16T07:13:09.561Z] /databricks/spark/python/pyspark/sql/utils.py:117: in deco
[2023-07-16T07:13:09.561Z]     return f(*a, **kw)
[2023-07-16T07:13:09.561Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2023-07-16T07:13:09.561Z] 
[2023-07-16T07:13:09.561Z] answer = 'xro1849037'
[2023-07-16T07:13:09.561Z] gateway_client = <py4j.clientserver.JavaClient object at 0x7f3753f5e7f0>
[2023-07-16T07:13:09.561Z] target_id = 'o1849036', name = 'saveAsTable'
[2023-07-16T07:13:09.561Z] 
[2023-07-16T07:13:09.561Z]     def get_return_value(answer, gateway_client, target_id=None, name=None):
[2023-07-16T07:13:09.561Z]         """Converts an answer received from the Java gateway into a Python object.
[2023-07-16T07:13:09.561Z]     
[2023-07-16T07:13:09.561Z]         For example, string representation of integers are converted to Python
[2023-07-16T07:13:09.561Z]         integer, string representation of objects are converted to JavaObject
[2023-07-16T07:13:09.561Z]         instances, etc.
[2023-07-16T07:13:09.561Z]     
[2023-07-16T07:13:09.561Z]         :param answer: the string returned by the Java gateway
[2023-07-16T07:13:09.561Z]         :param gateway_client: the gateway client used to communicate with the Java
[2023-07-16T07:13:09.561Z]             Gateway. Only necessary if the answer is a reference (e.g., object,
[2023-07-16T07:13:09.561Z]             list, map)
[2023-07-16T07:13:09.561Z]         :param target_id: the name of the object from which the answer comes from
[2023-07-16T07:13:09.561Z]             (e.g., *object1* in `object1.hello()`). Optional.
[2023-07-16T07:13:09.561Z]         :param name: the name of the member from which the answer comes from
[2023-07-16T07:13:09.561Z]             (e.g., *hello* in `object1.hello()`). Optional.
[2023-07-16T07:13:09.561Z]         """
[2023-07-16T07:13:09.561Z]         if is_error(answer)[0]:
[2023-07-16T07:13:09.561Z]             if len(answer) > 1:
[2023-07-16T07:13:09.561Z]                 type = answer[1]
[2023-07-16T07:13:09.561Z]                 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
[2023-07-16T07:13:09.561Z]                 if answer[1] == REFERENCE_TYPE:
[2023-07-16T07:13:09.561Z] >                   raise Py4JJavaError(
[2023-07-16T07:13:09.561Z]                         "An error occurred while calling {0}{1}{2}.\n".
[2023-07-16T07:13:09.561Z]                         format(target_id, ".", name), value)
[2023-07-16T07:13:09.561Z] E                   py4j.protocol.Py4JJavaError: An error occurred while calling o1849036.saveAsTable.
[2023-07-16T07:13:09.561Z] E                   : java.lang.NullPointerException
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.execution.datasources.PartitioningUtils$.removeLeadingZerosFromNumberTypePartition(PartitioningUtils.scala:457)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.execution.datasources.PartitioningUtils$.$anonfun$getPathFragment$1(PartitioningUtils.scala:450)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.Iterator.foreach(Iterator.scala:943)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.Iterator.foreach$(Iterator.scala:943)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.types.StructType.foreach(StructType.scala:104)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.TraversableLike.map(TraversableLike.scala:286)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.types.StructType.map(StructType.scala:104)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.execution.datasources.PartitioningUtils$.getPathFragment(PartitioningUtils.scala:447)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.$anonfun$getCustomPartitionLocations$1(InsertIntoHadoopFsRelationCommand.scala:275)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.getCustomPartitionLocations(InsertIntoHadoopFsRelationCommand.scala:273)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:111)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:594)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:238)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:170)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:126)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:124)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:138)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:160)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:239)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:386)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:186)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:968)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:141)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:336)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:160)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:156)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:590)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:168)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:590)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:268)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:264)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:566)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:156)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:324)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:156)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:141)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:132)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:186)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:959)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:800)
[2023-07-16T07:13:09.563Z] E                   	at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:778)
[2023-07-16T07:13:09.563Z] E                   	at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:655)
[2023-07-16T07:13:09.563Z] E                   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2023-07-16T07:13:09.563Z] E                   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2023-07-16T07:13:09.563Z] E                   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-07-16T07:13:09.563Z] E                   	at java.lang.reflect.Method.invoke(Method.java:498)
[2023-07-16T07:13:09.563Z] E                   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
[2023-07-16T07:13:09.563Z] E                   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
[2023-07-16T07:13:09.563Z] E                   	at py4j.Gateway.invoke(Gateway.java:295)
[2023-07-16T07:13:09.563Z] E                   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
[2023-07-16T07:13:09.563Z] E                   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
[2023-07-16T07:13:09.563Z] E                   	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
[2023-07-16T07:13:09.563Z] E                   	at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
[2023-07-16T07:13:09.563Z] E                   	at java.lang.Thread.run(Thread.java:750)
[2023-07-16T07:13:09.563Z] 
[2023-07-16T07:13:09.563Z] /databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/protocol.py:326: Py4JJavaError
[2023-07-16T07:13:09.563Z] ----------------------------- Captured stdout call -----------------------------
[2023-07-16T07:13:09.563Z] ### CPU RUN ###
```
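Note that the failure happens during the CPU (baseline) session, i.e. inside stock Spark on this runtime rather than in the plugin: `removeLeadingZerosFromNumberTypePartition`, a `PartitioningUtils` helper present in this Databricks build, throws while `getPathFragment` maps over the partition schema. One speculative trigger, inferred only from those frames and not confirmed by the issue, is a null value in a number-typed partition column. A probe sketch for a 10.4 cluster (the table name is made up):

```python
# Speculative probe (an assumption, not taken from the issue): a numeric
# partition column containing a null, written via saveAsTable as in the test.
df = spark.createDataFrame([(1, None), (2, 10)], "a int, part int")
df.write.format("parquet").partitionBy("part").saveAsTable("npe_probe_tbl")
```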

Steps/Code to reproduce bug
Run the failing cases on the Databricks 10.4 runtime.
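Assuming the repo's usual integration-test layout (and setting aside the project's Databricks launch scripts, which normally wrap this), a hypothetical local driver to select just the failing cases could look like:

```python
# Hypothetical repro driver: run only the failing test via pytest's Python API.
# Path is relative to the integration_tests directory and is an assumption.
import pytest

pytest.main([
    "src/main/python/schema_evolution_test.py",
    "-k", "test_column_add_after_partition",
])
```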

Expected behavior
The tests should pass, as they do on the 11.3 and 12.2 runtimes.

pxLi added the labels bug (Something isn't working), ? - Needs Triage (Need team to review and classify), and test (Only impacts tests) on Jul 17, 2023.
NvTimLiu (Collaborator) commented Jul 17, 2023

Related PR: #8705. @jlowe, can you help take a look? Thanks!
