What happened:
When a table is partitioned by any column:
If I write a batch with a different schema using WriteMode::Default, the batch is written incompletely instead of the write returning an error
If I write a batch with a different schema using WriteMode::MergeSchema, the new columns are missing from the written data
What you expected to happen:
In the first scenario, the write should fail because schema evolution is not allowed
In the second scenario, the new columns should be written and the table schema should evolve
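To make the expected behavior concrete, here is a minimal, std-only sketch of the rule described above. It is not delta-rs code: `Column`, `Mode`, `WriteError`, `col`, and `resolve_schema` are hypothetical names introduced only for illustration, and the handling of type changes and of columns missing from the batch is an assumption on my part.

```rust
/// Hypothetical stand-in for a column definition (not a delta-rs type).
#[derive(Debug, Clone, PartialEq)]
pub struct Column {
    pub name: String,
    pub data_type: String,
}

/// Hypothetical mirror of the two write modes discussed in this issue.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Default,
    MergeSchema,
}

/// Hypothetical error type; only the mismatch case is modelled.
#[derive(Debug, PartialEq)]
pub enum WriteError {
    SchemaMismatch { column: String },
}

/// Small helper to build a column.
pub fn col(name: &str, data_type: &str) -> Column {
    Column { name: name.to_string(), data_type: data_type.to_string() }
}

/// Returns the schema the table should have after the write, or an error.
pub fn resolve_schema(
    table: &[Column],
    batch: &[Column],
    mode: Mode,
) -> Result<Vec<Column>, WriteError> {
    let mut resolved = table.to_vec();
    for incoming in batch {
        match table.iter().find(|c| c.name == incoming.name) {
            // Same name and same type: nothing to do.
            Some(existing) if existing.data_type == incoming.data_type => {}
            // Assumption: a type change is rejected in both modes.
            Some(_) => {
                return Err(WriteError::SchemaMismatch { column: incoming.name.clone() })
            }
            // A new column is only acceptable when merging schemas.
            None => match mode {
                Mode::Default => {
                    return Err(WriteError::SchemaMismatch { column: incoming.name.clone() })
                }
                Mode::MergeSchema => resolved.push(incoming.clone()),
            },
        }
    }
    // Assumption: in Default mode the batch must also carry every table column.
    if mode == Mode::Default {
        if let Some(missing) = table.iter().find(|t| !batch.iter().any(|b| b.name == t.name)) {
            return Err(WriteError::SchemaMismatch { column: missing.name.clone() });
        }
    }
    Ok(resolved)
}
```

With a table of id, value, and modified and a batch that adds name, this sketch returns a SchemaMismatch for `Mode::Default` and a four-column schema for `Mode::MergeSchema`, regardless of whether the table is partitioned.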
How to reproduce it:
Instead of providing a minimal isolated example, I've altered the existing test:
#[tokio::test]
async fn test_write_mismatched_schema() {
    let batch = get_record_batch(None, false);
    let partition_cols = vec!["id".to_owned()];
    let table = create_initialized_table(&partition_cols).await;
    let mut writer = RecordBatchWriter::for_table(&table).unwrap();

    // Write the first batch with the first schema to the table
    writer.write(batch).await.unwrap();
    let adds = writer.flush().await.unwrap();
    assert_eq!(adds.len(), 2);

    // Create a second batch with a different schema
    let second_schema = Arc::new(ArrowSchema::new(vec![
        Field::new("id", DataType::Utf8, true),
        Field::new("value", DataType::Int32, true),
        Field::new("modified", DataType::Utf8, true),
        Field::new("name", DataType::Utf8, true),
    ]));
    let second_batch = RecordBatch::try_new(
        second_schema,
        vec![
            Arc::new(StringArray::from(vec![Some("A"), Some("B")])),
            Arc::new(Int32Array::from(vec![Some(1), Some(2)])),
            Arc::new(StringArray::from(vec![
                Some("2021-02-02"),
                Some("2021-02-01"),
            ])),
            Arc::new(StringArray::from(vec![Some("will"), Some("robert")])),
        ],
    )
    .unwrap();

    let result = writer.write(second_batch).await;
    assert!(result.is_err());

    match result {
        Ok(_) => {
            assert!(false, "Should not have successfully written");
        }
        Err(e) => {
            match e {
                DeltaTableError::SchemaMismatch { .. } => {
                    // this is expected
                }
                others => {
                    assert!(false, "Got the wrong error: {others:?}");
                }
            }
        }
    };
}
More details:
The test fails because the result is not an error; the batch was written successfully
Also, if second_schema doesn't have all of the columns from the first one, a different error is returned instead of SchemaMismatch
If WriteMode::MergeSchema is used, the new name column is not written and the schema does not evolve
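When triaging the cases above, it can help to separate columns the batch adds from columns it drops, since the two currently produce different outcomes. The following is a small std-only sketch of such a comparison; `SchemaDiff` and `diff_columns` are hypothetical helpers written for this report, not part of delta-rs.

```rust
use std::collections::BTreeSet;

/// Hypothetical summary of how a batch's column names differ from the table's.
#[derive(Debug, PartialEq)]
pub struct SchemaDiff {
    /// Columns present in the batch but not in the table.
    pub added: Vec<String>,
    /// Columns present in the table but not in the batch.
    pub missing: Vec<String>,
}

/// Compares column names only; both result lists are sorted alphabetically.
pub fn diff_columns(table: &[&str], batch: &[&str]) -> SchemaDiff {
    let table_set: BTreeSet<&str> = table.iter().copied().collect();
    let batch_set: BTreeSet<&str> = batch.iter().copied().collect();
    SchemaDiff {
        added: batch_set.difference(&table_set).map(|s| s.to_string()).collect(),
        missing: table_set.difference(&batch_set).map(|s| s.to_string()).collect(),
    }
}
```

For the test above, comparing id, value, modified against id, value, modified, name yields one added column (name) and no missing columns. A batch that also omits some of the original columns would populate `missing`, which corresponds to the case where a different error is returned.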
ion-elgreco changed the title from "MergeSchema not working when the table is partitioned" to "RecordBatchWriter: MergeSchema not working when the table is partitioned" on Mar 28, 2024
Environment
Delta-rs version: 0.17.1
Binding: Rust