DOC: Provide examples of using read_parquet #49739
Comments
Thanks for the report! +1 on naming the function that's called from pandas and linking to its documentation. However, if we were to document which kwargs are accepted, there's a bit more maintenance burden in keeping it in sync (e.g. what can be passed for filters just changed in 10.0.0), and I don't think it provides significant benefit to the user.
Yep, agreed; I would rather link to the function itself too.
* DOC: Provide examples of using read_parquet #49739
* DOC: Provide examples of using read_parquet #49739
* DOC: Provide examples of using read_parquet #49739 (with minor fixes)
* DOC: Provide examples of using read_parquet #49739 Fixed typos that were causing tests to fail. Oops.
* DOC: Provide examples of using read_parquet #49739 - fix formatting failed checks
* DOC: Provide examples of using read_parquet #49739 - removed read_parquet from code_checks.sh as requested by @mroeschke
---------
Co-authored-by: Vijay Vaidyanathan <[email protected]>
I'm interested in examples of:
Pandas version checks
Checked against the latest docs on main.
Location of the documentation
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_parquet.html
Documentation problem
For the pyarrow engine, there are some important features behind the kwargs that aren't described here, and it might not be obvious to users where to look in PyArrow. For example:
* With filters, users can prune which files and/or row groups are read.
* With filesystem, users can configure a filesystem such as S3.
Suggested fix for documentation
At the very least, we should document for each engine where those kwargs are passed. But it might even be worthwhile to provide examples of filters, reading partitioned datasets, and configuring remote filesystems. Does that seem reasonable?