OpenOmics is currently under active development and we may break API compatibility in the future.
This Python package provides a series of tools to integrate and explore genomics, transcriptomics, proteomics, and clinical data (aka multi-omics data). With interfaces to popular annotation databases and scalable data-frame manipulation tools, OpenOmics facilitates the common data-wrangling tasks involved in preparing data for RNA-seq bioinformatics analysis.
OpenOmics assists in the integration of heterogeneous multi-omics bioinformatics data. The library provides a Python API as well as an interactive Dash web interface. It features support for:
- Genomics, Transcriptomics, Proteomics, and Clinical data.
- Harmonization with 20+ popular annotation, interaction, and disease-association databases.
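As a rough illustration of what harmonizing measurements with an annotation database involves, the sketch below uses plain pandas rather than the OpenOmics API. The table contents, the column names (`gene_id`, `gene_name`, `biotype`), and the `annotate` helper are all hypothetical and only stand in for data parsed from a real database.

```python
import pandas as pd

# Hypothetical expression table: one row per gene, one column per sample.
expression = pd.DataFrame(
    {"sample_A": [5.1, 0.0, 12.3], "sample_B": [4.8, 1.2, 11.9]},
    index=pd.Index(["ENSG0001", "ENSG0002", "ENSG0003"], name="gene_id"),
)

# Hypothetical annotation table, as might be parsed from a database dump.
annotation = pd.DataFrame(
    {
        "gene_id": ["ENSG0001", "ENSG0003", "ENSG0004"],
        "gene_name": ["GENE1", "GENE3", "GENE4"],
        "biotype": ["protein_coding", "lncRNA", "protein_coding"],
    }
)


def annotate(features: pd.DataFrame, annotation: pd.DataFrame, key: str = "gene_id") -> pd.DataFrame:
    """Left-join annotation attributes onto a feature table indexed by `key` (illustrative helper)."""
    # Keep one annotation row per key so the join cannot duplicate features.
    attributes = annotation.drop_duplicates(subset=key).set_index(key)
    # A left join keeps every measured gene, even if the database has no entry for it.
    return features.join(attributes, how="left")


annotated = annotate(expression, annotation)
print(annotated)
```

Genes without a matching annotation row keep their measurements and receive missing values in the annotation columns, which makes gaps in database coverage easy to spot.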
OpenOmics also has an efficient data pipeline that bridges the popular Pandas data manipulation library and Dask distributed processing to address the following use cases:
- Providing a standard pipeline for dataset indexing, table joining and querying, which are transparent and customizable for end-users.
- Providing efficient disk storage for large multi-omics datasets in the Parquet format.
- Integrating various data types including interactions and sequence data, then exporting to NetworkX graphs or data generators for down-stream machine learning.
- Being accessible to both developers and scientists through a Python API that works seamlessly with an external Galaxy tool interface or the built-in Dash web interface (WIP).
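The indexing, joining, and querying steps listed above can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the OpenOmics pipeline itself: the sample IDs, gene columns, and clinical fields are made up for the example.

```python
import pandas as pd

# Hypothetical per-sample tables from two data types, both keyed by sample ID.
expression = pd.DataFrame(
    {"sample_id": ["S1", "S2", "S3"], "GENE1": [5.1, 4.8, 0.3], "GENE2": [0.0, 1.2, 7.4]}
)
clinical = pd.DataFrame(
    {"sample_id": ["S1", "S2", "S4"], "stage": ["I", "III", "II"], "age": [54, 61, 47]}
)

# 1. Indexing: give both tables the same index so rows can be aligned.
expression = expression.set_index("sample_id")
clinical = clinical.set_index("sample_id")

# 2. Joining: an inner join keeps only samples present in both tables.
merged = expression.join(clinical, how="inner")

# 3. Querying: select late-stage samples in which GENE1 is expressed.
late_stage = merged[(merged["stage"] == "III") & (merged["GENE1"] > 1.0)]
print(late_stage)
```

An inner join is used here so that every remaining sample has both expression and clinical values; a left or outer join would be the choice when samples missing one data type should be kept.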
pip install openomics
conda install openomics -c jonnytran # Work in progress
git clone https://github.com/JonnyTran/OpenOmics/
cd OpenOmics
pip install -e .
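After any of the installation routes above, one way to check that the package is visible to the current Python environment is to look up its distribution metadata with the standard library. This is only a sketch: it assumes the installed distribution is named `openomics`, as in the pip command, and the `installed_version` helper is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "openomics") -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version()
print(f"openomics {found}" if found else "openomics is not installed in this environment")
```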
The journal paper for this scientific package was reviewed by JOSS at https://joss.theoj.org/papers/10.21105/joss.03249#, and can be cited with:
# BibTeX
@article{Tran2021,
doi = {10.21105/joss.03249},
url = {https://doi.org/10.21105/joss.03249},
year = {2021},
publisher = {The Open Journal},
volume = {6},
number = {61},
pages = {3249},
author = {Nhat C. Tran and Jean X. Gao},
title = {OpenOmics: A bioinformatics API to integrate multi-omics datasets and interface with public databases.},
journal = {Journal of Open Source Software}
}
Thank you to the pyOpenSci reviewers for their extremely helpful feedback and guidance. This package was created with the pyOpenSci/cookiecutter-pyopensci project template, based on audreyr/cookiecutter-pypackage.