[BUG] test_column_add_after_partition failed in databricks 10.4 runtime #8727

Closed · pxLi opened this issue Jul 17, 2023 · 1 comment · Fixed by #8733
Labels: bug (Something isn't working), test (Only impacts tests)

pxLi (Collaborator) commented Jul 17, 2023

Describe the bug
Two test cases failed, but only on the Databricks 10.4 runtime (321db); the same cases passed on the Databricks 11.3 and 12.2 runtimes.
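For context, the failing call is the `saveAsTable` at schema_evolution_test.py:76, a partitioned write performed after new columns are added. Below is a minimal sketch of that pattern, a hypothetical reconstruction rather than the exact test body; `spark`, the table name, and the column names are all assumptions:

```python
# Hypothetical sketch of the pattern under test (the real body is in
# integration_tests/src/main/python/schema_evolution_test.py).
df = spark.createDataFrame([(1, "x", 0), (2, "y", 1)], "a int, b string, part int")
df.write.format("parquet").partitionBy("part").saveAsTable("evolve_tbl")

# Add a column after the initial partitioned write, then write again; on the
# Databricks 10.4 runtime a later partitioned saveAsTable like this one fails
# with the NullPointerException shown below.
df2 = spark.createDataFrame([(3, "z", 2.5, 0)], "a int, b string, c double, part int")
df2.write.format("parquet").mode("append").partitionBy("part").saveAsTable("evolve_tbl")
```

The CI failures and the full CPU-side traceback follow: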

```
[2023-07-16T07:13:09.563Z] FAILED ../../src/main/python/schema_evolution_test.py::test_column_add_after_partition[parquet][IGNORE_ORDER({'local': True})] - py4j.protocol.Py4JJavaError: An error occurred while calling o1848911.saveA...
[2023-07-16T07:13:09.564Z] FAILED ../../src/main/python/schema_evolution_test.py::test_column_add_after_partition[orc][IGNORE_ORDER({'local': True})] - py4j.protocol.Py4JJavaError: An error occurred while calling o1849036.saveA...

[2023-07-16T07:13:09.561Z] ../../src/main/python/asserts.py:561: in assert_gpu_and_cpu_are_equal_collect
[2023-07-16T07:13:09.561Z]     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first)
[2023-07-16T07:13:09.561Z] ../../src/main/python/asserts.py:480: in _assert_gpu_and_cpu_are_equal
[2023-07-16T07:13:09.561Z]     run_on_cpu()
[2023-07-16T07:13:09.561Z] ../../src/main/python/asserts.py:466: in run_on_cpu
[2023-07-16T07:13:09.561Z]     from_cpu = with_cpu_session(bring_back, conf=conf)
[2023-07-16T07:13:09.561Z] ../../src/main/python/spark_session.py:116: in with_cpu_session
[2023-07-16T07:13:09.561Z]     return with_spark_session(func, conf=copy)
[2023-07-16T07:13:09.561Z] ../../src/main/python/spark_session.py:100: in with_spark_session
[2023-07-16T07:13:09.561Z]     ret = func(_spark)
[2023-07-16T07:13:09.561Z] ../../src/main/python/asserts.py:200: in <lambda>
[2023-07-16T07:13:09.561Z]     bring_back = lambda spark: limit_func(spark).collect()
[2023-07-16T07:13:09.561Z] ../../src/main/python/schema_evolution_test.py:76: in testf
[2023-07-16T07:13:09.561Z]     df.write\
[2023-07-16T07:13:09.561Z] /databricks/spark/python/pyspark/sql/readwriter.py:806: in saveAsTable
[2023-07-16T07:13:09.561Z]     self._jwrite.saveAsTable(name)
[2023-07-16T07:13:09.561Z] /databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/java_gateway.py:1304: in __call__
[2023-07-16T07:13:09.561Z]     return_value = get_return_value(
[2023-07-16T07:13:09.561Z] /databricks/spark/python/pyspark/sql/utils.py:117: in deco
[2023-07-16T07:13:09.561Z]     return f(*a, **kw)
[2023-07-16T07:13:09.561Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2023-07-16T07:13:09.561Z] 
[2023-07-16T07:13:09.561Z] answer = 'xro1849037'
[2023-07-16T07:13:09.561Z] gateway_client = <py4j.clientserver.JavaClient object at 0x7f3753f5e7f0>
[2023-07-16T07:13:09.561Z] target_id = 'o1849036', name = 'saveAsTable'
[2023-07-16T07:13:09.561Z] 
[2023-07-16T07:13:09.561Z]     def get_return_value(answer, gateway_client, target_id=None, name=None):
[2023-07-16T07:13:09.561Z]         """Converts an answer received from the Java gateway into a Python object.
[2023-07-16T07:13:09.561Z]     
[2023-07-16T07:13:09.561Z]         For example, string representation of integers are converted to Python
[2023-07-16T07:13:09.561Z]         integer, string representation of objects are converted to JavaObject
[2023-07-16T07:13:09.561Z]         instances, etc.
[2023-07-16T07:13:09.561Z]     
[2023-07-16T07:13:09.561Z]         :param answer: the string returned by the Java gateway
[2023-07-16T07:13:09.561Z]         :param gateway_client: the gateway client used to communicate with the Java
[2023-07-16T07:13:09.561Z]             Gateway. Only necessary if the answer is a reference (e.g., object,
[2023-07-16T07:13:09.561Z]             list, map)
[2023-07-16T07:13:09.561Z]         :param target_id: the name of the object from which the answer comes from
[2023-07-16T07:13:09.561Z]             (e.g., *object1* in `object1.hello()`). Optional.
[2023-07-16T07:13:09.561Z]         :param name: the name of the member from which the answer comes from
[2023-07-16T07:13:09.561Z]             (e.g., *hello* in `object1.hello()`). Optional.
[2023-07-16T07:13:09.561Z]         """
[2023-07-16T07:13:09.561Z]         if is_error(answer)[0]:
[2023-07-16T07:13:09.561Z]             if len(answer) > 1:
[2023-07-16T07:13:09.561Z]                 type = answer[1]
[2023-07-16T07:13:09.561Z]                 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
[2023-07-16T07:13:09.561Z]                 if answer[1] == REFERENCE_TYPE:
[2023-07-16T07:13:09.561Z] >                   raise Py4JJavaError(
[2023-07-16T07:13:09.561Z]                         "An error occurred while calling {0}{1}{2}.\n".
[2023-07-16T07:13:09.561Z]                         format(target_id, ".", name), value)
[2023-07-16T07:13:09.561Z] E                   py4j.protocol.Py4JJavaError: An error occurred while calling o1849036.saveAsTable.
[2023-07-16T07:13:09.561Z] E                   : java.lang.NullPointerException
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.execution.datasources.PartitioningUtils$.removeLeadingZerosFromNumberTypePartition(PartitioningUtils.scala:457)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.execution.datasources.PartitioningUtils$.$anonfun$getPathFragment$1(PartitioningUtils.scala:450)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.Iterator.foreach(Iterator.scala:943)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.Iterator.foreach$(Iterator.scala:943)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.types.StructType.foreach(StructType.scala:104)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.TraversableLike.map(TraversableLike.scala:286)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.types.StructType.map(StructType.scala:104)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.execution.datasources.PartitioningUtils$.getPathFragment(PartitioningUtils.scala:447)
[2023-07-16T07:13:09.561Z] E                   	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.$anonfun$getCustomPartitionLocations$1(InsertIntoHadoopFsRelationCommand.scala:275)
[2023-07-16T07:13:09.561Z] E                   	at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
[2023-07-16T07:13:09.562Z] E                   	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.getCustomPartitionLocations(InsertIntoHadoopFsRelationCommand.scala:273)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:111)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:594)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:238)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:170)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:126)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:124)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:138)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:160)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:239)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:386)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:186)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:968)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:141)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:336)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:160)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:156)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:590)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:168)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:590)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:268)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:264)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:566)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:156)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:324)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:156)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:141)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:132)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:186)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:959)
[2023-07-16T07:13:09.562Z] E                   	at org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:800)
[2023-07-16T07:13:09.563Z] E                   	at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:778)
[2023-07-16T07:13:09.563Z] E                   	at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:655)
[2023-07-16T07:13:09.563Z] E                   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2023-07-16T07:13:09.563Z] E                   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2023-07-16T07:13:09.563Z] E                   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-07-16T07:13:09.563Z] E                   	at java.lang.reflect.Method.invoke(Method.java:498)
[2023-07-16T07:13:09.563Z] E                   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
[2023-07-16T07:13:09.563Z] E                   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
[2023-07-16T07:13:09.563Z] E                   	at py4j.Gateway.invoke(Gateway.java:295)
[2023-07-16T07:13:09.563Z] E                   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
[2023-07-16T07:13:09.563Z] E                   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
[2023-07-16T07:13:09.563Z] E                   	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
[2023-07-16T07:13:09.563Z] E                   	at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
[2023-07-16T07:13:09.563Z] E                   	at java.lang.Thread.run(Thread.java:750)
[2023-07-16T07:13:09.563Z] 
[2023-07-16T07:13:09.563Z] /databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/protocol.py:326: Py4JJavaError
[2023-07-16T07:13:09.563Z] ----------------------------- Captured stdout call -----------------------------
[2023-07-16T07:13:09.563Z] ### CPU RUN ###
```
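Note that the failure happens during the CPU (baseline) session, i.e. inside stock Spark on this runtime rather than in the plugin: `removeLeadingZerosFromNumberTypePartition`, a `PartitioningUtils` helper present in this Databricks build, throws while `getPathFragment` maps over the partition schema. One speculative trigger, inferred only from those frames and not confirmed by the issue, is a null value in a number-typed partition column. A probe sketch for a 10.4 cluster (the table name is made up):

```python
# Speculative probe (an assumption, not taken from the issue): a numeric
# partition column containing a null, written via saveAsTable as in the test.
df = spark.createDataFrame([(1, None), (2, 10)], "a int, part int")
df.write.format("parquet").partitionBy("part").saveAsTable("npe_probe_tbl")
```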

Steps/Code to reproduce bug
Run the failing cases on the Databricks 10.4 runtime.
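Assuming the repo's usual integration-test layout (and setting aside the project's Databricks launch scripts, which normally wrap this), a hypothetical local driver to select just the failing cases could look like:

```python
# Hypothetical repro driver: run only the failing test via pytest's Python API.
# Path is relative to the integration_tests directory and is an assumption.
import pytest

pytest.main([
    "src/main/python/schema_evolution_test.py",
    "-k", "test_column_add_after_partition",
])
```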

Expected behavior
The tests should pass, as they do on the 11.3 and 12.2 runtimes.

pxLi added the labels bug (Something isn't working), ? - Needs Triage (Need team to review and classify), and test (Only impacts tests) on Jul 17, 2023.
NvTimLiu (Collaborator) commented Jul 17, 2023

Related PR: #8705. @jlowe, can you help take a look? Thanks!
