-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add http(s) support to the command line #8753
Conversation
f05534a
to
b26dfce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice addition 👍
Regarding docs, you can update them under here:
Regarding testing, I see in the file there are some unit tests, e.g.
Is it possible to create a version of this but for http/https? Failing that, maybe can comment an example usage to demonstrate this new feature will function properly.
datafusion-cli/src/exec.rs
Outdated
"http" | "https" => { | ||
Arc::new(HttpBuilder::new().with_url(url.origin().ascii_serialization()).build()?) as Arc<dyn ObjectStore> | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"http" | "https" => { | |
Arc::new(HttpBuilder::new().with_url(url.origin().ascii_serialization()).build()?) as Arc<dyn ObjectStore> | |
} | |
"http" | "https" => { | |
let builder = HttpBuilder::new().with_url(url.origin().ascii_serialization()); | |
Arc::new(builder.build()?) as Arc<dyn ObjectStore> | |
} |
small nit, just to make it easier on the eyes
Thank you @kcolford and @Jefffrey for the review. It would be great to address @Jefffrey 's comments I tried this out on the ClickBench file fetched via http and it worked like a charm ❯ create external table hits stored as parquet location 'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet';
0 rows in set. Query took 0.178 seconds.
❯ describe hits;
+-----------------------+-----------+-------------+
| column_name | data_type | is_nullable |
+-----------------------+-----------+-------------+
| WatchID | Int64 | YES |
| JavaEnable | Int16 | YES |
| Title | Binary | YES |
| GoodEvent | Int16 | YES |
| EventTime | Int64 | YES |
| EventDate | UInt16 | YES |
...
| RefererHash | Int64 | YES |
| URLHash | Int64 | YES |
| CLID | Int32 | YES |
+-----------------------+-----------+-------------+
105 rows in set. Query took 0.003 seconds. This is a really nice improvement. Let us know if you need help updating the docs or tests |
Hi @kcolford -- I was wondering if you think you might have a chance to work on this PR soon? If not perhaps I can help find someone to finish it up. Thanks again |
I think this feature is important, so if no one else gets a chance to get it over the line I will try to do so sometime this week |
Which issue does this PR close?
Closes #8752
Rationale for this change
See the linked issue.
What changes are included in this PR?
See the commit log.
Are these changes tested?
This functionality is trivial enough that I don't think it needs to be covered by unit tests and I found nowhere where we could perform full integration test for this repo.
Are there any user-facing changes?
Users will now be able to use the
http
andhttps
schemes when usingcreate external table
. Documentation hasn't been provided in this PR because it doesn't appear that this repo hosts that documentation.No changes to public APIs.