URL Registered with ObjectStore registry is different from url in DeltaScan #1018
Comments
Thanks for the report! We'll have to look a bit deeper here. The URL passed to the object store registry is actually just an invention of delta-rs, since we require our object store to be rooted at the table root and want to avoid collisions with other stores that may be registered under the "raw" object store URL. However, somewhere the URLs get mixed up ... :). Do you by any chance have a quick repro example we can use for debugging? We do some integration testing with DataFusion and S3, so maybe we can find some differences.
This defect slips through the integration tests because the object store configuration is exported to the environment; hence, when the "incorrect" URL is obtained from the registry, the store is rebuilt from that new URL plus the environment. In my use case I configured the underlying storage by passing in a HashMap with all the configuration values, so the default S3 environment variables are not set, causing the defaults to be used. I can help by modifying the integration tests to instead maybe use a prefix of If you have any other approaches I'd like to hear them.
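The fallback described above can be sketched in plain Rust. This is only an illustration under stated assumptions, not delta-rs code: `resolve_endpoint`, `DEFAULT_ENDPOINT`, and the use of `AWS_ENDPOINT_URL` as the option key are hypothetical names chosen for the example, and the environment lookup is passed in as a closure so the behaviour can be checked without touching the real process environment.

```rust
use std::collections::HashMap;

/// Hypothetical default endpoint, used when nothing else is configured.
const DEFAULT_ENDPOINT: &str = "https://s3.amazonaws.com";

/// Resolve the endpoint for a store: explicit options first, then the
/// environment, then the built-in default.
///
/// `env` stands in for an environment lookup (for example a wrapper around
/// `std::env::var`); it is a parameter here purely for illustration.
fn resolve_endpoint<F>(options: &HashMap<String, String>, env: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    options
        .get("AWS_ENDPOINT_URL")
        .cloned()
        // Options did not contain the key: fall back to the environment.
        .or_else(|| env("AWS_ENDPOINT_URL"))
        // Neither source had a value: use the default.
        .unwrap_or_else(|| DEFAULT_ENDPOINT.to_string())
}
```

If a store is rebuilt with an empty options map, the result still looks correct whenever the environment happens to carry the same value, which is how a test suite that exports its configuration to the environment could miss the problem. With the environment empty, the same rebuild silently yields the default.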
@Blajda - sorry for the late reply. With the latest commits on main there are some updates to make more consistent use of the options map to create object stores in various places, which may already cover this bug. Would it be possible for you to confirm this? Of course, having integration tests cover this scenario would be great! While we worked on making configuration more consistent, there is one piece still missing, which is configuration for the S3 lock client. Right now this also updates the environment. However, as long as the configuration in the map is complete, we should in most places no longer take any values from the environment.
Hi @roeap, thanks for all the work.
Environment
Delta-rs version: latest
Binding: rust
Bug
What happened:
I loaded a Delta table backed by S3 storage and then registered it with DataFusion. I then performed a select, which failed because ObjectStore was unable to get data from the backend.
The error reported by ObjectStore showed that it tried to fetch from "https://amazon.com/path", whereas I had configured ObjectStore to use "http://localhost/".
What you expected to happen:
When a table is registered with DataFusion, it should use the same underlying configuration that the registered table had.
How to reproduce it:
More details:
Further investigation showed that the URL passed to the ObjectStore registry is different from the URL used to get the ObjectStore. Because the correct URL is not used for the lookup, a new DeltaTable instance is created with default settings.
See rust/src/delta_datafusion.rs:404 and rust/src/delta_datafusion.rs:309
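The mismatch can be illustrated with a small stand-in registry. This is a sketch only: `Registry`, `StoreConfig`, `get_or_default`, and the example URLs are hypothetical and are not the DataFusion or delta-rs API. It assumes a registry keyed by URL string that falls back to a default configuration when the key is not found.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an object store's configuration.
#[derive(Clone, Debug, PartialEq)]
struct StoreConfig {
    endpoint: String,
}

impl Default for StoreConfig {
    fn default() -> Self {
        // Assumed default, for illustration only.
        StoreConfig { endpoint: "https://s3.amazonaws.com".to_string() }
    }
}

/// Hypothetical registry keyed by the URL a store was registered under.
struct Registry {
    stores: HashMap<String, StoreConfig>,
}

impl Registry {
    fn new() -> Self {
        Registry { stores: HashMap::new() }
    }

    fn register(&mut self, url: &str, config: StoreConfig) {
        self.stores.insert(url.to_string(), config);
    }

    /// Look up a store by URL; an unknown URL yields the default
    /// configuration instead of an error.
    fn get_or_default(&self, url: &str) -> StoreConfig {
        self.stores.get(url).cloned().unwrap_or_default()
    }
}
```

Registering under one URL and looking up under another returns the default configuration rather than the configured one, which matches the symptom in this report: the request goes to the default endpoint instead of "http://localhost/".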