[Feature Request] Delta kernel cannot get file stats #3771
Comments
@nastra Could you please take a look at this?
FYI @scottsand-db
Hi @wgtmac -- can you please tell me a bit more about your use case for file stats and for getChanges? We allow you to include a filter during the ScanBuilder -- what more would you need the file stats for? Could you also please look at this internal (not public) API for getChanges in Kernel and see if that fits your use case? We can consider making it public.
Thanks for the reply from @scottsand-db and the help from @nastra! We use the delta kernel as a metadata client in our proprietary lakehouse to read from Delta Lake tables. To efficiently make splits at any snapshot and cache the file lists, we need to get the following metadata from the API, which is available in delta-standalone:
Hopefully my explanation makes sense.
Feature request
Which Delta project/connector is this regarding?
Overview
Since delta-standalone has been deprecated, we are migrating our project from delta-standalone to delta-kernel.
However, we found that delta-kernel cannot return file stats when scanning file lists.
In delta-standalone, we can get file stats in this class: . We can also get the change logs
using "Iterator<VersionLog> getChanges" in io.delta.standalone.DeltaLog, which is not available in delta-kernel either.
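For context, the file stats in question are the JSON string stored in the `stats` field of each `add` action in the Delta log. Here is a minimal sketch of how a reader might use them for split sizing and file skipping, assuming the standard protocol fields (`numRecords`, `minValues`, `maxValues`, `nullCount`). The sample values and the `may_contain` helper are hypothetical, made up for illustration, and are not part of any Delta API:

```python
import json

# Hypothetical stats string, shaped like the `stats` field of an `add` action
# in the Delta transaction log (values are made up for illustration).
stats_json = (
    '{"numRecords": 3, "minValues": {"id": 1}, '
    '"maxValues": {"id": 9}, "nullCount": {"id": 0}}'
)


def may_contain(stats, column, value):
    """Return False only if min/max stats prove `value` cannot be in the file."""
    if not stats:
        return True  # no stats recorded: the file cannot be skipped
    parsed = json.loads(stats)
    lo = parsed.get("minValues", {}).get(column)
    hi = parsed.get("maxValues", {}).get(column)
    if lo is None or hi is None:
        return True  # no range for this column: keep the file
    return lo <= value <= hi


# The row count is useful when sizing splits.
num_records = json.loads(stats_json)["numRecords"]
```

With stats exposed per file, a client can do this kind of pruning and split planning itself, and can cache the result alongside the file list.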
Motivation
Further details
Willingness to contribute