Low‐code Exploratory Data Analysis Tools

Meghavarshini Krishnaswamy edited this page Feb 28, 2024 · 2 revisions

Data Exploration or Exploratory Data Analysis

[Image credit: Devopedia]

For reading, cleaning, and analyzing data, the pandas Python library is the preferred choice for everyday data science tasks. It also includes a set of essential visualization functions for exploring the properties of a dataset.
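As a point of comparison for the low-code tools, here is a minimal sketch of a manual first look at a dataset using only pandas. The toy DataFrame and its column names are hypothetical and stand in for a real dataset.

```python
import pandas as pd

# Hypothetical toy data; in practice this would come from pd.read_csv or similar.
df = pd.DataFrame(
    {
        "species": ["a", "b", "a", "c", "b", "a"],
        "length_cm": [4.1, 5.3, None, 6.0, 5.1, 4.4],
        "count": [10, 12, 9, 15, 11, 10],
    }
)

# Summary statistics (count, mean, std, min, quartiles, max) for numeric columns.
summary = df.describe()

# Number of missing values in each column.
missing = df.isna().sum()

# Frequency of each category in a text column.
species_counts = df["species"].value_counts()

print(summary)
print(missing)
print(species_counts)

# pandas plotting relies on matplotlib; with it installed, a histogram is one call:
# df["length_cm"].plot.hist()
```

Each question about the data needs its own line of code here, which is the repetition that the tools listed on this page aim to reduce.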

We present a small collection of open-source Python tools that make it possible to carry out an Exploratory Data Analysis (EDA) of a dataset with very little code.

There are many tools of this type. We review the following:

  • ydata-profiling | Documentation. ydata-profiling (formerly known as pandas-profiling) provides a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution. Like the pandas df.describe() function, ydata-profiling delivers an extended analysis of a DataFrame, and it allows the analysis to be exported in formats such as HTML and JSON. (Please read the installation notes.)

  • Sweetviz. Sweetviz is an open-source Python library that generates beautiful, high-density visualizations to kickstart EDA (Exploratory Data Analysis) with just two lines of code. Output is a fully self-contained HTML application.

  • Lux API | Documentation. Lux is a Python library that makes data science easier by automating certain aspects of the data exploration process. Lux is designed to facilitate faster experimentation with data, even when the user does not have a clear idea of what they are looking for. Lux is integrated with an interactive Jupyter widget that allows users to quickly browse through large collections of data directly within their Jupyter notebooks.

  • DataPrep | Documentation. DataPrep.EDA describes itself as the fastest and easiest EDA (Exploratory Data Analysis) tool in Python. It lets you understand a Pandas/Dask DataFrame in seconds with a few lines of code.

  • AutoViz. AutoViz automatically visualizes a dataset of any size with a single line of code. The interactive charts can also be saved automatically as HTML files with the "html" setting.
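To make the "one or two lines of code" claim concrete, here is a hedged sketch that writes an HTML report with ydata-profiling and Sweetviz when they are installed. The helper names (`installed_tools`, `write_reports`), the `TOOLS` mapping, and the output folder are illustrative assumptions, not part of either library. The library calls themselves (`ProfileReport(...).to_file(...)` and `sv.analyze(...).show_html(...)`) follow each project's documented basic usage, but versions differ, so check the documentation for the release you install.

```python
from importlib.util import find_spec
from pathlib import Path

import pandas as pd

# Display name -> import name (hypothetical mapping used only in this sketch).
TOOLS = {
    "ydata-profiling": "ydata_profiling",
    "sweetviz": "sweetviz",
}


def installed_tools():
    """Return the tools from TOOLS that can be imported in this environment."""
    return [name for name, module in TOOLS.items() if find_spec(module) is not None]


def write_reports(df: pd.DataFrame, out_dir="eda_reports", tools=None):
    """Write one HTML report per requested tool and return {tool name: path}."""
    if tools is None:
        tools = installed_tools()
    written = {}
    if not tools:
        return written

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    if "ydata-profiling" in tools:
        from ydata_profiling import ProfileReport

        path = out / "ydata_profiling.html"
        # The one-line report: build the profile and export it.
        ProfileReport(df, title="Profiling report").to_file(path)
        written["ydata-profiling"] = path

    if "sweetviz" in tools:
        import sweetviz as sv

        path = out / "sweetviz.html"
        # The two-line report: analyze, then render a self-contained HTML file.
        report = sv.analyze(df)
        report.show_html(str(path), open_browser=False)
        written["sweetviz"] = path

    return written
```

Calling `write_reports(df)` on a DataFrame returns the paths of the reports that were written, and passing `tools=["sweetviz"]` limits the run to a single tool. Lux, DataPrep, and AutoViz are left out of this sketch. Lux in particular works through a Jupyter widget rather than a saved file.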


Please see the Jupyter Notebook with examples.


Created: 03/16/2023; Updated: 02/27/2024

Carlos Lizárraga
Data Science Institute
University of Arizona

CC BY-NC-SA 4.0