The written delta file has corrupted structure #956

Closed
Anna050689 opened this issue Nov 23, 2022 · 8 comments
Labels
bug Something isn't working

Comments

@Anna050689

Anna050689 commented Nov 23, 2022

Environment

Delta-rs version: 0.6.3

Binding: Python 3.9.13

Environment:

  • Cloud provider: Azure
  • OS: Windows
  • Other:

Bug

What happened:
The DataFrame was written to AWS S3 in an invalid format. The _delta_log directory has the following structure:
image

What you expected to happen:

How to reproduce it:

More details:

Anna050689 added the bug label Nov 23, 2022
@djouallah

Same issue with GCP!

@wjones127
Collaborator

What do you mean by corrupted? Does it cause another reader to error?

This looks like a temporary file written in a failed transaction. We create temporary files and then atomically rename them to create the final commit. It should be removed by a vacuum (but let me know if it doesn’t; that would be a bug).
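
The write-then-rename pattern described above can be illustrated with a minimal local-filesystem sketch. This is not the delta-rs implementation: the function name `commit_atomically` and the temporary file name are hypothetical, and `os.link` stands in for whatever atomic rename the object store provides.

```python
import json
import os
import tempfile
import uuid


def commit_atomically(log_dir, version, actions):
    """Write a commit to a temp file, then publish it under its final name."""
    os.makedirs(log_dir, exist_ok=True)
    final_path = os.path.join(log_dir, f"{version:020d}.json")
    # Hypothetical temp-file name; the real naming scheme may differ.
    tmp_path = os.path.join(log_dir, f"_commit_{uuid.uuid4()}.json.tmp")

    # Step 1: write the full commit body to the temporary file.
    # A crash after this step leaves the temp file behind.
    with open(tmp_path, "w", encoding="utf-8") as f:
        for action in actions:
            f.write(json.dumps(action) + "\n")

    # Step 2: publish atomically. os.link raises FileExistsError if another
    # writer already created this version, so nothing is overwritten.
    try:
        os.link(tmp_path, final_path)
    finally:
        # Drop the temporary name whether or not the publish succeeded.
        os.unlink(tmp_path)
    return final_path


if __name__ == "__main__":
    log_dir = os.path.join(tempfile.mkdtemp(), "_delta_log")
    print(commit_atomically(log_dir, 0, [{"commitInfo": {"operation": "WRITE"}}]))
```

In this sketch a leftover temporary file only appears if the process stops between the two steps, which matches the "failed transaction" case described in the comment.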

@djouallah

@wjones127 same error here, #878

@djouallah

Same error with 0.6.4. It is the same bug, reported on both AWS and GCP. Can you have a look?

@wjones127
Collaborator

The behavior described here is not a bug. Those temporary files are created intentionally and should not interfere with any reader.

If you do encounter a bug where a temporary file breaks something, please file an issue that includes the error you encounter.
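
One way to see why a leftover temporary file should not affect readers: the Delta protocol names commit files as a zero-padded 20-digit version followed by `.json`, so a reader that selects files by that pattern never picks up anything else. The sketch below is an illustration only; `visible_commits` is a hypothetical helper and the temporary file name in the usage example is made up.

```python
import re

# Delta commit files: zero-padded 20-digit version plus ".json".
COMMIT_FILE = re.compile(r"\d{20}\.json")


def visible_commits(file_names):
    """Return the sorted versions of names that look like finished commits."""
    versions = []
    for name in file_names:
        # fullmatch rejects names with any prefix or suffix, such as ".tmp".
        if COMMIT_FILE.fullmatch(name):
            versions.append(int(name[:20]))
    return sorted(versions)


if __name__ == "__main__":
    names = [
        "00000000000000000001.json",
        "_commit_example.json.tmp",  # hypothetical leftover temp file
        "00000000000000000000.json",
    ]
    print(visible_commits(names))
```

If only temporary files are present and no version-numbered `.json` file exists, this helper returns an empty list, which corresponds to the "table cannot be read" situation raised later in the thread rather than to a temp file breaking a reader.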

@djouallah

@wjones127 Yes, I already did, see #878. The JSON is not written to cloud storage and the Delta table cannot be read. It is a corrupted table!

@djouallah

@wjones127 Can you please reopen this bug report? The issue is not solved: the JSON is not written at all in the _delta_log folder.

@wjones127
Collaborator

@djouallah if there is already a ticket open that contains the specific error, then please ping me there, not in a separate ticket.
