Refresh Python usage documentation #539
* Reflowed the usage documentation to start with loading, then look at log introspection, and finally querying tables.
* Added info about supported backends and data catalogs.
* Added more examples and guidance on how to query Delta tables.
Hi @wjones127, I was interested in this from a DataFusion angle as well and added an example to the Rust documentation for querying in #519. I had some issues with it, though, and had to point to specific git commits to get it to work. I haven't tried since then, so I'm not sure whether the issue still persists. I'm less familiar with the Python bindings, so take this with a grain of salt, but I believe the general structures are in place to replicate how we query in Rust with the DataFusion Python bindings.

On a separate but somewhat related note, I am working on adding S3 support to DataFusion (https://github.com/datafusion-contrib/datafusion-objectstore-s3). My loose understanding of Delta Lake is that it's often cloud based, so adding S3 support to DataFusion should make querying it easier. I had previously tried querying Delta Lake on S3, which is how I found out that DataFusion didn't support it, and that started me on this path. Hope this helps!
LGTM, thanks for the detailed write up @wjones127 !
Thanks @wjones127
LGTM! There is a minor CI error to fix before merging the PR.
python/docs/source/usage.rst
Outdated
Alternatively, if you have a data catalog you can load it by reference to a
database and table name. Currently only AWS Glue is supported.

.. TODO: auth to data catalog?
FYI #522 explains the requirements of the Data Catalog integration.
Thanks!
Description
I rewrote most of the Python usage documentation.
By rewriting this I broke any links to sections within the page, but not to the page itself.
Related Issue(s)
None
Documentation