[BUG] Random cast error while writing to CosmosDB with pyspark #42329
Labels: Client, Cosmos, customer-reported, needs-team-attention, question, Service Attention
Describe the bug
I have some PySpark code that loads a CSV file and appends the data into a Cosmos DB container. Sometimes I get an error; rerunning the same code can then succeed even though neither the data nor the cluster changed.
Previously I was using Databricks Runtime 10.4 LTS with an older version of the Cosmos DB library on the cluster, and I never hit this issue. The problem started only after I upgraded the Databricks runtime to 12.2 LTS and later with
azure-cosmos-spark_3-3_2-12:4.30.0
Exception or Stack Trace
Code Snippet
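The original snippet was not included in the report; a minimal sketch of the described flow (load a CSV, append into a Cosmos DB container via the `cosmos.oltp` format of the `azure-cosmos-spark` connector) might look like the following. The endpoint, key, database, and container values are placeholders, not values from the report.

```python
# Sketch of the reported flow, assuming the azure-cosmos-spark OLTP connector.
# Account endpoint, key, database, and container below are placeholders.
cosmos_write_config = {
    "spark.cosmos.accountEndpoint": "https://<account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<account-key>",
    "spark.cosmos.database": "<database>",
    "spark.cosmos.container": "<container>",
}

# The Spark calls need a live cluster, so they are shown commented out:
# df = spark.read.option("header", "true").csv("/path/to/file.csv")
# (df.write
#    .format("cosmos.oltp")
#    .options(**cosmos_write_config)
#    .mode("append")
#    .save())
```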
Expected behavior
No errors, or an automatic retry, since the write does not fail every time. The data consist of strings, doubles, and NaNs.
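Since the data mix strings, doubles, and NaNs, one plausible mitigation (an assumption, not a confirmed root cause) is to supply an explicit schema on the CSV read so that type inference cannot classify a NaN-heavy double column differently between runs. The column names below are hypothetical; `spark.read.schema(...)` accepts a DDL-formatted string.

```python
# Hypothetical DDL schema for the CSV; the real column names are not in the report.
# Passing this string to spark.read.schema(...) disables per-run type inference.
csv_schema_ddl = "id STRING, price DOUBLE, quantity DOUBLE"

# Commented out because it requires a live Spark session:
# df = (spark.read
#          .schema(csv_schema_ddl)
#          .option("header", "true")
#          .csv("/path/to/file.csv"))
```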
Setup:
openjdk 19.0.1 2022-10-18
OpenJDK Runtime Environment (build 19.0.1+10-21)
OpenJDK 64-Bit Server VM (build 19.0.1+10-21, mixed mode, sharing)
spark.sql.catalog.cosmosCatalog com.azure.cosmos.spark.CosmosCatalog
spark.jars.packages com.azure.cosmos.spark:azure-cosmos-spark_3-3_2-12:4.30.0