---
title: "Kepler Mapper: A flexible Python implementation of the Mapper algorithm."
tags:
  - Python
  - Mapper
  - Topological Data Analysis
authors:
  - name: Hendrik Jacob van Veen
    affiliation: 1
  - name: Nathaniel Saul
    orcid: 0000-0002-8549-9810
    affiliation: 2
  - name: David Eargle
    orcid: 0000-0002-4056-8114
    affiliation: 3
  - name: Sam W. Mangham
    orcid: 0000-0001-7511-5652
    affiliation: 4
affiliations:
  - name: Nubank
    index: 1
  - name: Department of Mathematics and Statistics, Washington State University Vancouver
    index: 2
  - name: Leeds School of Business, University of Colorado Boulder
    index: 3
  - name: Department of Electronics & Computer Science, University of Southampton, Southampton, SO17 1BJ, UK
    index: 4
date: 12 February 2018
bibliography: paper.bib
---

Summary

Topological data analysis (TDA) is a field focused on understanding the shape and structure of complex data. By computing topological descriptors of data, such as connected components, loops, and voids, we are better able to find hidden relationships in noisy, high-dimensional data [@edelsbrunner2010computational; @carlsson2009topology]. However, raw TDA metrics cannot be readily visualized. To address this gap, Singh et al. [-@Singh2007] developed the Mapper algorithm to facilitate graphical exploration of topological data structures. The work of Lum et al. [-@Lumetal2013] sparked widespread interest in the Mapper technique by demonstrating its use in multiple domains, including political science, biology, and sports analytics.

This library, Kepler Mapper, is a Python implementation of the Mapper algorithm as first described in the paper "Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition" [@Singh2007]. Kepler Mapper presents an intuitive interface for the Mapper algorithm and provides multiple methods for visualizing the network graph that Mapper produces. We leverage Scikit-Learn-API-compatible clustering and scaling algorithms to construct network graphs in a flexible and user-friendly way. We also provide an extensive suite of tutorials detailing the use of Kepler Mapper for simple and complex use cases.

Example Kepler Mapper graph visualization using the Wisconsin Breast Cancer Dataset [@Dua:2019]

Library Details

Kepler Mapper provides an object-oriented API for constructing lenses and building Mapper network graphs. The module employs the strategy pattern, giving users control over the clustering algorithm, covering scheme, and nerve scheme. This keeps the module flexible across many use cases. Clustering strategies follow the Scikit-Learn clusterer interface [@scikit-learn]. We provide similar interfaces for Cover classes and Nerve classes, as well as default implementations of the cover and nerve schemes most commonly found in the literature.
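To make the strategy pattern concrete, the following is a minimal, standard-library-only sketch of a Mapper pipeline in which the cover, clusterer, and nerve are interchangeable callables. It is not Kepler Mapper's API: the names `interval_cover`, `single_linkage`, `graph_nerve`, and `mapper`, as well as their parameters, are hypothetical and chosen only for illustration, and the clusterer here is a simple distance-threshold grouping rather than a Scikit-Learn estimator.

```python
import math
from itertools import combinations


def interval_cover(values, n_intervals, overlap):
    """Cover [min, max] of the lens values with equally sized, overlapping intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    pad = width * overlap / 2  # each interval is widened by this amount on both sides
    return [(lo + i * width - pad, lo + (i + 1) * width + pad)
            for i in range(n_intervals)]


def single_linkage(points, indices, eps):
    """Cluster `indices` as connected components of the eps-neighbourhood graph."""
    unvisited, clusters = set(indices), []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            component |= near
            frontier.extend(near)
        clusters.append(sorted(component))
    return sorted(clusters)


def graph_nerve(nodes):
    """1-dimensional nerve: link two nodes when their clusters share a point."""
    return [(a, b) for a, b in combinations(sorted(nodes), 2)
            if set(nodes[a]) & set(nodes[b])]


def mapper(points, lens, cover, clusterer, nerve):
    """Run the pipeline; cover, clusterer and nerve are pluggable strategies."""
    values = [lens(p) for p in points]
    nodes = {}
    for k, (lo, hi) in enumerate(cover(values)):
        members = [i for i, v in enumerate(values) if lo <= v <= hi]
        for c, cluster in enumerate(clusterer(points, members)):
            nodes[f"cube{k}_cluster{c}"] = cluster
    return nodes, nerve(nodes)


# Toy data: 16 points on the unit circle, with the x-coordinate as the lens.
points = [(math.cos(2 * math.pi * k / 16), math.sin(2 * math.pi * k / 16))
          for k in range(16)]
nodes, edges = mapper(
    points,
    lens=lambda p: p[0],
    cover=lambda v: interval_cover(v, n_intervals=3, overlap=0.5),
    clusterer=lambda pts, idx: single_linkage(pts, idx, eps=0.5),
    nerve=graph_nerve,
)
print(len(nodes), "nodes,", len(edges), "edges")
```

On this toy input the middle interval splits into two clusters (the top and bottom arcs of the circle), so the resulting graph has four nodes joined in a single cycle, which reflects the loop in the data. Swapping any of the three callables changes the behaviour without touching `mapper` itself, which is the kind of flexibility the strategy pattern is meant to provide.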

Visual exploration is a critical aspect of Mapper network graph analysis. For this, we provide multiple methods of visualization. For interactive visualization and exploration in the browser, Kepler Mapper can create a visual HTML interface utilizing D3.js. For use with Jupyter [@Kluyver:2016aa] or other embedded purposes, we provide a visualization interface utilizing Plotly [@plotly]. For static visualizations, we provide an adapter so that visualization functionality from NetworkX [@SciPyProceedings_11] and Matplotlib [@Hunter:2007] can be used.

Acknowledgements

Nathaniel Saul was partially supported by NSF DBI-1661348 and by Washington NASA Space Grant Consortium, NASA Grant #NNX15AJ98H.

References