This repository has been archived by the owner on Apr 13, 2021. It is now read-only.

A simple wrapper to use Pandas Profiling easily in Kedro


brickfrog/kedro-pandas-profiling


Kedro-Pandas-Profiling

ARCHIVED: This plugin was built before Kedro hooks existed. I believe the same functionality can now be added more easily by installing pandas-profiling and calling it from hooks on the datasets that need profiling, which makes a separate package largely redundant.

This is a Kedro plugin that uses Pandas-Profiling to profile datasets.

Installation

It can be installed via PyPI.

pip install kedro-pandas-profiling

How is Kedro-Pandas Profiling Used?

You simply proceed with a Kedro project as normal.

Once the data catalog is set up, you can run:

kedro profile  # lists the datasets in the catalog

kedro profile -n <dataset_name>  # profiles the named dataset

Running kedro profile with no arguments lists the datasets in your catalog. From that list, pass the name of a dataset with -n to profile it. The current version supports only .csv and .xlsx files.
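To illustrate the file-type restriction, here is a minimal sketch of how catalog entries could be filtered down to the ones that can be profiled. It is not the plugin's actual implementation: the function name `profilable_datasets`, the `SUPPORTED_SUFFIXES` constant, and the example catalog entries are all hypothetical, and the catalog is assumed to be a plain mapping of dataset names to file paths.

```python
from pathlib import Path
from typing import Dict, List

# File extensions the README lists as supported.
SUPPORTED_SUFFIXES = {".csv", ".xlsx"}


def profilable_datasets(catalog_paths: Dict[str, str]) -> List[str]:
    """Return the sorted names of entries whose file type is supported.

    `catalog_paths` is an assumed, simplified stand-in for a Kedro
    catalog: a mapping of dataset name -> filepath.
    """
    return sorted(
        name
        for name, filepath in catalog_paths.items()
        # Compare case-insensitively so "DATA.CSV" is also accepted.
        if Path(filepath).suffix.lower() in SUPPORTED_SUFFIXES
    )


# Hypothetical catalog entries, for illustration only.
example_catalog = {
    "companies": "data/01_raw/companies.csv",
    "shuttles": "data/01_raw/shuttles.xlsx",
    "reviews": "data/01_raw/reviews.parquet",
}

print(profilable_datasets(example_catalog))  # ['companies', 'shuttles']
```

In this sketch the Parquet-backed entry is skipped, so only datasets stored as .csv or .xlsx would be offered for profiling.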

Sample Output:

Sample output based on the company dataset from the Kedro tutorial.

[Image: Sample Output]

What licence do you use?

Kedro-Pandas-Profiling is licensed under the Apache 2.0 License.
