fix: ListingSchemaProvider directory paths (related: #4204) #4788
Which issue does this PR close?
Is similar to #4204 (inability to use an object store listing for a table schema). However, this PR addresses tables generated from ListingSchemaProvider.
Rationale for this change
I would like to set up an object store (S3) where each directory maps to a single table/schema, with the contents being made up of all the (Parquet) files inside the directory. By registering the schema provider like:
Then if there is a folder in the bucket, such as userdata, attempting to query against the userdata table causes the S3 client to 404, as the provider creates ListingTables with paths set to the raw directory names, e.g. userdata, and in datafusion/core/src/datasource/listing/url.rs:149, we have:
Since the paths don't end with '/', it treats the directories as files and attempts to perform head on them instead of listing them.
This PR remedies this scenario, allowing the query to succeed.
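To make the failure mode concrete, here is a minimal, self-contained sketch of the trailing-slash decision described above. It is not the code from url.rs; the `Access` enum and `plan_access` function are hypothetical names used only for illustration, and the only behavior assumed is what the description states: a path ending in '/' is listed, anything else gets a single head request.

```rust
/// Hypothetical stand-in for the two ways a table path can be resolved.
/// (Illustrative only; not a DataFusion type.)
#[derive(Debug, PartialEq)]
enum Access {
    /// List every object under the prefix.
    List(String),
    /// Issue a single `head` request for exactly one object.
    Head(String),
}

/// Decide how to resolve a table path, keyed only on the trailing '/'.
/// A directory name without the slash falls into the `Head` branch,
/// which is what produces the 404 against an object store.
fn plan_access(table_path: &str) -> Access {
    if table_path.ends_with('/') {
        Access::List(table_path.to_string())
    } else {
        Access::Head(table_path.to_string())
    }
}
```

Under this sketch, `plan_access("userdata")` yields a `Head` on a key that does not exist as an object, while `plan_access("userdata/")` yields a `List` of the directory contents.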
What changes are included in this PR?
ListingSchemaProvider is altered to track whether the table paths it has listed are directories or files. If they are directories, it creates the ListingTables with a '/' appended to the stringified table path, allowing the ListingTable to successfully list its contents.
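The idea can be sketched roughly as follows. This is a simplified illustration, not the PR's actual implementation: `table_paths` is a hypothetical helper, object keys are assumed to be plain '/'-separated strings relative to the schema root, and table naming is reduced to using the first path segment as-is.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: group object keys by their first path segment and
/// record whether that segment is a directory (it has children) or a file.
/// Returns a map of table name -> table path, where directory paths get a
/// trailing '/' so that they are listed rather than fetched as one object.
fn table_paths(object_keys: &[&str]) -> BTreeMap<String, String> {
    // First path segment -> "is a directory" flag.
    let mut is_dir: BTreeMap<String, bool> = BTreeMap::new();
    for key in object_keys {
        match key.split_once('/') {
            // Anything with a '/' after the first segment lives in a directory.
            Some((first, _rest)) => {
                is_dir.insert(first.to_string(), true);
            }
            // A bare key is a file, unless already seen as a directory.
            None => {
                is_dir.entry(key.to_string()).or_insert(false);
            }
        }
    }

    is_dir
        .into_iter()
        .map(|(name, dir)| {
            let path = if dir { format!("{name}/") } else { name.clone() };
            (name, path)
        })
        .collect()
}
```

With keys such as `userdata/part-0.parquet` and `events.parquet`, this sketch maps `userdata` to `userdata/` and leaves `events.parquet` unchanged.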
Are these changes tested?
Some light unit tests added.
Are there any user-facing changes?
No