load_with_datetime(datetime_string) throw error message when delta table is created for more than 60 days(twice the default value of logRetentionDuration) #521

Closed
zhujunyong opened this issue Dec 9, 2021 · 2 comments · Fixed by #2389
Labels
bug Something isn't working

Comments

@zhujunyong

Environment

Delta-rs version:
0.5.4
Binding:
python 3.9.7
Environment:

  • Cloud provider: IBM
  • OS: CentOS
  • Other:

Bug

What happened:
load_with_datetime(datetime_string) always raises an error when the Delta table was created more than 60 days ago.

>>> dt = DeltaTable('s3://mybucket/mypath/mydeltatable')
>>> dt.version()
469
>>> dt.load_with_datetime('2021-11-22T12:00:00Z')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/envs/delta-rs/lib/python3.9/site-packages/deltalake/table.py", line 196, in load_with_datetime
    self._table.load_with_datetime(datetime_string)
deltalake.PyDeltaTableError: Failed to read delta log object: Object not found
>>> 

What you expected to happen:
The corresponding version is loaded and no error is raised.

How to reproduce it:
Create a Delta table with a sufficiently small logRetentionDuration value so that older transaction logs get cleaned up, then call load_with_datetime().

More details:
I believe this happens because the method performs a binary search over all Delta transaction log versions. In the case above, the latest version is 469, so the first iteration of the loop tries to get the timestamp of version 234. But the transaction log for version 234 has already been cleaned up (it is older than 30 days), so an error is raised.

In https://github.com/delta-io/delta-rs/blob/main/rust/src/delta.rs#L1223-L1261, line 1240 raises the error when the above conditions are met.
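The failure described above can be reproduced without a real table. The following is only a minimal sketch in plain Python, not the delta-rs code: the names LogNotFound, commit_time and naive_version_at are hypothetical, the table is an in-memory dict, and the "one commit per hour, versions below 300 removed" layout is an assumption chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


class LogNotFound(Exception):
    """Stand-in for 'Failed to read delta log object: Object not found'."""


def commit_time(retained, version):
    """Look up a commit timestamp; fail if that log file was cleaned up."""
    if version not in retained:
        raise LogNotFound(f"no transaction log for version {version}")
    return retained[version]


def naive_version_at(retained, latest_version, target):
    """Binary search over [0, latest_version], ignoring log cleanup."""
    lo, hi = 0, latest_version
    while lo <= hi:
        mid = (lo + hi) // 2
        ts = commit_time(retained, mid)  # raises if the probed log is gone
        if ts == target:
            return mid
        if ts < target:
            lo = mid + 1
        else:
            hi = mid - 1
    if hi < 0:
        raise ValueError("target is earlier than version 0")
    return hi


# Hypothetical table: one commit per hour, versions below 300 already removed.
start = datetime(2021, 10, 1, tzinfo=timezone.utc)
retained = {v: start + timedelta(hours=v) for v in range(300, 470)}

try:
    naive_version_at(retained, 469, start + timedelta(hours=400))
except LogNotFound as err:
    print(err)  # the search fails on its first probe, version 234
```

The requested timestamp itself is well within the retained range; the search fails only because its first midpoint falls in the cleaned-up part of the log.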

@zhujunyong zhujunyong added the bug Something isn't working label Dec 9, 2021
@fvaleye
Collaborator

fvaleye commented Dec 9, 2021

Hello @zhujunyong,

Thanks for submitting your issue and the detailed report.

I will need more time to investigate the root cause, and I will keep you updated!

@houqp
Member

houqp commented Dec 10, 2021

I think @zhujunyong is likely right. We can't do a binary search starting from version 0. Instead, we should probably fetch all available versions through the object list API, then do a binary search within that list result.
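As a rough illustration of that idea (a sketch only, not the eventual delta-rs fix): the function below searches over the versions that still exist instead of over every number from 0 upward. The name version_at and both of its callable and list arguments are hypothetical, the listing is simulated with a dict, commit timestamps are assumed to be non-decreasing with version, and raising ValueError for a timestamp older than the oldest retained log is an assumed behaviour.

```python
from datetime import datetime, timedelta, timezone


def version_at(available_versions, commit_time, target):
    """Return the newest available version committed at or before target.

    available_versions: versions whose log files still exist (e.g. from a listing).
    commit_time: callable mapping a version to its commit timestamp.
    """
    versions = sorted(available_versions)
    lo, hi = 0, len(versions) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2  # index into the list, so only existing logs are probed
        if commit_time(versions[mid]) <= target:
            best = versions[mid]
            lo = mid + 1
        else:
            hi = mid - 1
    if best is None:
        raise ValueError("target is earlier than the oldest retained version")
    return best


# Hypothetical table: one commit per hour, versions below 300 already removed.
start = datetime(2021, 10, 1, tzinfo=timezone.utc)
retained = {v: start + timedelta(hours=v) for v in range(300, 470)}
probed = []


def lookup(version):
    probed.append(version)  # record which log files the search touches
    return retained[version]


found = version_at(retained.keys(), lookup, start + timedelta(hours=350, minutes=30))
print(found, min(probed))
```

Because the midpoint is an index into the listed versions, the lower boundary is the oldest log that still exists rather than 0, so no probe can land on a removed file.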

ion-elgreco added a commit that referenced this issue Apr 12, 2024
# Description
It first sets a proper lower boundary instead of always assuming 0,
since we can also have checkpointed tables which had logRetention that
caused logs to be removed before a checkpoint.


- closes #521