append is deleting records #2716

Closed
machov opened this issue Jul 29, 2024 · 2 comments
Labels: bug (Something isn't working)

machov commented Jul 29, 2024

Environment

append() is deleting records

Any idea why?

Delta-rs version: 0.18.2

Binding: Python

Environment:

  • Cloud provider: Azure
  • OS: Linux
  • Other: Python version 3.10.12

Bug

What happened:

The following write_deltalake call deletes existing records instead of only appending new ones:

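import pyarrow as pa
from deltalake import write_deltalake

# validated_df, merged_schema, abfss_table_path and storage_options are defined elsewhere.
pa_table = pa.Table.from_pandas(validated_df, schema=merged_schema, preserve_index=False)
write_deltalake(
    abfss_table_path,
    pa_table,
    mode="append",
    schema_mode="merge",
    engine="rust",
    storage_options=storage_options,
)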

What you expected to happen:

I'd expect the row count to increase. Instead, it decreased significantly.

How to reproduce it:

This is a very large table with 300k+ records.

After the append, the resulting Delta table had only 30k+ records, so hundreds of thousands of records were deleted.
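
A minimal sketch of how the row counts around the append could be checked, which may help narrow this down into a reproducible example. The helper names `unexpected_row_delta` and `append_and_verify` are hypothetical and not part of deltalake. The sketch assumes `data` is a `pyarrow.Table` and that no other writer commits to the table between the two counts.

```python
def unexpected_row_delta(rows_before: int, rows_appended: int, rows_after: int) -> int:
    """Return how far the final count is from what a pure append should give.

    0 means the append behaved as expected; a negative value means rows went missing.
    """
    return rows_after - (rows_before + rows_appended)


def append_and_verify(table_uri, data, storage_options=None, **write_kwargs):
    """Append `data` and report row counts and table versions before and after.

    Hypothetical helper: `data` is assumed to be a pyarrow.Table, and extra
    keyword arguments are passed through to write_deltalake unchanged.
    """
    # Imported lazily so the pure helper above works without deltalake installed.
    from deltalake import DeltaTable, write_deltalake

    before = DeltaTable(table_uri, storage_options=storage_options)
    rows_before = before.to_pyarrow_dataset().count_rows()
    version_before = before.version()

    write_deltalake(
        table_uri,
        data,
        mode="append",
        storage_options=storage_options,
        **write_kwargs,
    )

    # Reload the table so the new commit is visible.
    after = DeltaTable(table_uri, storage_options=storage_options)
    rows_after = after.to_pyarrow_dataset().count_rows()

    return {
        "version_before": version_before,
        "version_after": after.version(),
        "rows_before": rows_before,
        "rows_appended": data.num_rows,
        "rows_after": rows_after,
        "unexpected_delta": unexpected_row_delta(rows_before, data.num_rows, rows_after),
    }
```

Calling `append_and_verify(abfss_table_path, pa_table, storage_options=storage_options, schema_mode="merge", engine="rust")` would mirror the call above. A negative `unexpected_delta`, together with the two version numbers, would show which commit lost the rows.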

More details:

machov added the bug label Jul 29, 2024
ion-elgreco (Collaborator) commented

@machov I can look into it if you provide a reproducible example.


machov commented Jul 29, 2024

Good point, let me look into that. Closing for now.

machov closed this as not planned Jul 29, 2024