feat(datasets): Extend preview mechanism #595
…rg/kedro-plugins into extend-preview-mechanism
@SajidAlamQB Is this still a draft, or is it waiting for something?
It's ready for review; only some coverage tests are still needed.
A few comments:
- I would try to stick with `_load` as much as possible, to avoid re-implementation and edge cases where the data is loaded differently. There may be some special cases where `load` and `preview` expect different data; we discussed this in tech design before, but that's another topic.
- Performance: `preview` should be a lightweight solution and should never read the full data.
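To illustrate the performance point, here is a minimal sketch of a preview that stops reading after a fixed number of records. It is not the code from this PR: `preview_json_lines` and its `nrows` parameter are hypothetical names, and it assumes a line-delimited JSON file read with the standard library only.

```python
import itertools
import json
from pathlib import Path


def preview_json_lines(path, nrows=5):
    """Return at most ``nrows`` parsed records from a line-delimited JSON file.

    Hypothetical helper for illustration only; it is not part of kedro-datasets.
    """
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        # Skip blank lines lazily; nothing is read beyond what islice requests.
        non_empty = (line for line in handle if line.strip())
        for line in itertools.islice(non_empty, nrows):
            records.append(json.loads(line))
    return records
```

Because the file handle is consumed lazily, the cost of the preview depends on `nrows` rather than on the size of the file.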
LGTM. Thanks @SajidAlamQB <3
Thanks for addressing all the comments, this is a great addition to the previews feature :sparkles:
* Extend preview to Parquet
* Update sql_dataset.py
* Update sql_dataset.py
* update preview method for parquetdataset
* Update sql_dataset.py
* extend preview to JSONDataset
* add json preview
* add preview for pickledataset
* Update json_dataset.py
* lint
* add tests for parquet and json
* lint
* rem pickle fix docstring
* Fix parquet test
* fix pandas.json tests
* add coverage for sqldataset
* lint
* coverage for sanitisation of sql
* changes based on review
* use pyarrow for parquet preview
* align jsondataset with spike
* Update json_dataset.py
* Update json_dataset.py
* pass lines=true and nrows
* update docstring
* Update test_json_dataset.py
* revert change
* use sqlalchemy instead of query
* fix sql tests

Signed-off-by: tgoelles <[email protected]>
Description
Related to: kedro-org/kedro-viz#1622
This PR extends the preview functionality to more kedro-datasets, adding support for additional data formats.
Datasets we aim to support in this PR:
ParquetDataset
SQLTableDataset
pandas.JSONDataset
JSONDataset
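For the SQL table case, a bounded preview might look like the sketch below. It is an assumption-laden illustration, not this PR's implementation: it uses the standard library's `sqlite3` instead of SQLAlchemy, and `preview_table`, its identifier check, and the returned dictionary layout are all hypothetical.

```python
import re
import sqlite3

# Conservative pattern for a plain table name (assumption for this sketch).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def preview_table(connection, table_name, nrows=5):
    """Return the column names and at most ``nrows`` rows of a table.

    Hypothetical helper for illustration only; it is not part of kedro-datasets.
    """
    # Table names cannot be bound as query parameters, so validate before interpolating.
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError(f"Invalid table name: {table_name!r}")
    cursor = connection.execute(
        f'SELECT * FROM "{table_name}" LIMIT ?', (int(nrows),)
    )
    columns = [description[0] for description in cursor.description]
    return {"columns": columns, "data": [list(row) for row in cursor.fetchall()]}
```

The `LIMIT` clause keeps the row cap inside the database, so the full table is never transferred to the client.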
Development notes
Checklist
`RELEASE.md` file