Python for Scientific Audio

The aim of this repository is to create a comprehensive, curated list of Python software and tools related to, and used for, scientific research in audio/music applications.

Read-Write

audiolazy 📦 - Expressive Digital Signal Processing (DSP) package for Python.
audioread 📦 - Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
mutagen 📦 - Reads and writes all kinds of audio metadata for various formats.
pyAV - Pythonic binding for FFmpeg or Libav.
(Py)Soundfile 📦 - Library based on libsndfile, CFFI, and NumPy.
pySox 📦 - Wrapper for sox.
stempeg 📦 - Read/write of STEMS multistream audio.
tinytag 📦 - Reads music metadata from MP3, OGG, FLAC and Wave files.
audiomate 📦 - Loading different types of audio datasets.
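As a dependency-free point of comparison for the packages above, the following minimal sketch writes and reads back a mono 16-bit PCM WAV file using only Python's standard `wave` module. It does not use any of the listed libraries; the helper names `write_sine` and `read_pcm16` are hypothetical and exist only for this illustration.

```python
import io
import math
import struct
import wave


def write_sine(dest, freq_hz=440.0, duration_s=0.1, sample_rate=8000):
    """Write a mono 16-bit PCM sine tone to a path or file-like object."""
    n_frames = round(duration_s * sample_rate)
    # Half of full scale, to leave headroom.
    pcm = [
        int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(n_frames)
    ]
    with wave.open(dest, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return n_frames


def read_pcm16(src):
    """Read a mono 16-bit PCM WAV and return (sample_rate, samples)."""
    with wave.open(src, "rb") as wf:
        raw = wf.readframes(wf.getnframes())
        rate = wf.getframerate()
    return rate, list(struct.unpack("<%dh" % (len(raw) // 2), raw))


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
n_written = write_sine(buffer)
buffer.seek(0)
rate, samples = read_pcm16(buffer)
```

The standard module only handles uncompressed WAV, which is why the packages in this section are usually preferred for compressed formats, metadata, or NumPy integration.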

General DSP

pydub 📦 - Manipulate audio with a simple and easy high level interface.

Real-time processing & analysis

audiolazy 📦 - Realtime Audio Processing lib, general purpose.
SoundCard - A Pure-Python Real-Time Audio Library
PYO - Realtime audio dsp engine.
python-sounddevice 📦 - PortAudio wrapper providing realtime audio I/O with NumPy.
Realtime_PyAudio_FFT - Realtime audio analysis in Python, using PyAudio and Numpy to extract and visualize FFT features from streaming audio.
Notebooks

Machine Learning

Dataset handling

audiomate - Python library for handling audio datasets.

Feature extraction

pyAudioAnalysis² 📦 - Feature Extraction, Classification, Diarization.
pliers - Automated feature extraction in Python
aubio 📦 - Feature extractor, written in C, Python interface.
essentia - Music related low level and high level feature extractor, C++ based, includes Python bindings.
python_speech_features 📦 - Common speech features for ASR.
pyYAAFE - Python bindings for YAAFE feature extractor.
speechpy 📦 - Library for Speech Processing and Recognition, mostly feature extraction for now.

Data augmentation

audiomentations 📦 - Audio Data Augmentation.
muda 📦 - Musical Data Augmentation.
pydiogment 📦 - Audio Data Augmentation.
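One common augmentation is mixing noise into a clean signal at a chosen signal-to-noise ratio (SNR). The sketch below shows the underlying arithmetic in plain Python. It is an illustration only and does not reproduce the API of any package listed above; `mean_power` and `add_noise_at_snr` are hypothetical names.

```python
import math
import random


def mean_power(x):
    """Average power (mean of squared samples)."""
    return sum(v * v for v in x) / len(x)


def add_noise_at_snr(signal, snr_db, seed=0):
    """Return (noisy_signal, scaled_noise) with the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    target_noise_power = mean_power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    scaled = [scale * v for v in noise]
    return [s + n for s, n in zip(signal, scaled)], scaled


signal = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
noisy, noise = add_noise_at_snr(signal, snr_db=10.0)
measured_snr_db = 10 * math.log10(mean_power(signal) / mean_power(noise))
```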

Annotation tools

audino - Open source audio annotation tool for humans™

Deep Learning

Kapre 📦 - Keras Audio Preprocessors
TorchAudio - PyTorch Audio Loaders

Audio sentiment analysis

Audio Sentiment Analysis - This repository consists of work done to analyse sentiment of a customer in a conversation with a call center agent using various machine learning algorithms and audio features.
Paper - Audio Sentiment Analysis by Heterogeneous Signal Features Learned from Utterance-Based Parallel Neural Network
Paper - Sentiment Analysis on Speaker Specific Speech Data
Article - Emotions Recognition from voice data using Python and Keras
speech-emotion-recognition - Speaker independent emotion recognition

Speech

Diarization / Speaker identification

resemblyzer - A python package to analyze and compare voices with deep learning. Used by Robin Blanchard for his attempt.
Wavesplit: End-to-end Speech Separation by Speaker Clustering, Zeghidour N. et al, 2020 - Speaker stack + separation stack; assumes a fixed number of speakers.
Speaker Diarization with Region Proposal Network, Huang Z. et al, 2020 - based on Faster R-CNN
pyannote.audio 📦 - Neural building blocks for speaker diarization.
Deep_Speaker-speaker_recognition_system - Keras implementation of Deep Speaker: an End-to-End Neural Speaker Embedding System (speaker recognition)
speaker-recognition - A Speaker Recognition System with GUI.
3D-convolutional-speaker-recognition - Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

Speech recognition

SpeechRecognition 📦 - Wrapper for several ASR engines and APIs, online and offline. Could be useful to add Wrappers
deepspeech 📦 - Pretrained automatic speech recognition. (Mozilla)

Speech Processing

aeneas 📦 - Forced aligner, based on MFCC+DTW, 35+ languages.
gentle - Forced-aligner built on Kaldi.
Parselmouth 📦 - Python interface to the Praat phonetics and speech analysis, synthesis, and manipulation software.
persephone 📦 - Automatic phoneme transcription tool.
py-webrtcvad 📦 - Interface to the WebRTC Voice Activity Detector.
pypesq - Wrapper for the PESQ score calculation. (Perceptual Evaluation of Speech Quality)
pystoi 📦 - Short Term Objective Intelligibility measure (STOI).
PyWorldVocoder - Wrapper for Morise's World Vocoder.
Montreal Forced Aligner - Forced aligner, based on Kaldi (HMM), English (others can be trained).
SIDEKIT 📦 - Speaker and Language recognition.
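The aeneas entry above mentions alignment based on MFCC+DTW. As a rough illustration of the dynamic time warping (DTW) step alone, here is a minimal sketch on scalar sequences. A real aligner would compare MFCC frame vectors rather than single numbers, and this is not the implementation used by aeneas or any other listed tool; the function name `dtw` and its `dist` parameter are hypothetical.

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Return (total_cost, path) of the cheapest monotonic alignment of a to b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance a only
                cost[i][j - 1],      # advance b only
            )
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# The second sequence repeats one value, as if one sound were held longer.
total, path = dtw([0, 1, 2, 3], [0, 1, 1, 2, 3])
```

Each pair in `path` maps an index in the first sequence to an index in the second, which is what a forced aligner turns into timestamps.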

Other transformations

acoustics 📦 - Useful tools for acousticians.
AudioTK - DSP filter toolbox (lots of filters).
AudioTSM 📦 - Real-time audio time-scale modification procedures.
Gammatone - Gammatone filterbank implementation.
pyFFTW 📦 - Wrapper for FFTW(3).
NSGT 📦 - Non-stationary Gabor transform, constant-Q.
MDCT 📦 - Modified discrete cosine transform (MDCT).
pytftb - Implementation of the MATLAB Time-Frequency Toolbox.
pyroomacoustics 📦 - Room Acoustics Simulation (RIR generator)
PyRubberband 📦 - Wrapper for rubberband to do pitch-shifting and time-stretching.
PyWavelets 📦 - Discrete Wavelet Transform in Python.
Resampy 📦 - Sample rate conversion.
SFS-Python 📦 - Sound Field Synthesis Toolbox.
STFT 📦 - Standalone package for Short-Time Fourier Transform.
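For readers new to the transforms listed here, the following sketch computes a short-time Fourier transform with a Hann window and a naive DFT in plain Python. It is meant only to show the framing and windowing idea. It is far slower than the FFT-based packages above and does not mirror any of their APIs; `hann` and `stft` are hypothetical names.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]


def stft(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of frame_len // 2 + 1 complex bins."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + k] * window[k] for k in range(frame_len)]
        # Naive DFT, non-negative frequencies only (the input is real).
        spectrum = [
            sum(
                frame[k] * cmath.exp(-2j * math.pi * b * k / frame_len)
                for k in range(frame_len)
            )
            for b in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames


# A sine with exactly 8 cycles per 64-sample frame should peak in bin 8.
signal = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
frames = stft(signal)
peak_bins = [max(range(len(f)), key=lambda b: abs(f[b])) for f in frames]
```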

Environmental Sounds

sed_eval 📦 - Evaluation toolbox for Sound Event Detection

Perceptual Models - Auditory Models

cochlea 📦 - Inner ear models.
Brian2 📦 - Spiking neural networks simulator, includes cochlea model.
Loudness - Perceived loudness, includes Zwicker, Moore/Glasberg model.
pyloudnorm - Audio loudness meter and normalization, implements ITU-R BS.1770-4.
Sound Field Synthesis Toolbox 📦 - Sound Field Synthesis Toolbox.

Source Separation

commonfate 📦 - Common Fate Model and Transform.
NTFLib - Sparse Beta-Divergence Tensor Factorization.
NUSSL 📦 - Holistic source separation framework including DSP methods and deep learning methods.
NIMFA 📦 - Several flavors of non-negative-matrix factorization.

Music

Fingerprinting (audio recognition)

dejavu - Audio fingerprinting and recognition in Python
audio-fingerprint-identifying-python - A Shazam-like app that identifies songs using audio fingerprints, spectrum analysis and the fast Fourier transform.

Other Music Information Retrieval

Catchy - Corpus Analysis Tools for Computational Hook Discovery.
Madmom 📦 - MIR packages with strong focus on beat detection, onset detection and chord recognition.
mir_eval 📦 - Common scores for various MIR tasks. Also includes bss_eval implementation.
msaf 📦 - Music Structure Analysis Framework.
librosa 📦 - General audio and music analysis.
essentia - Music related low level and high level feature extractor, C++ based, includes Python bindings.

Symbolic Music - MIDI - Musicology

Music21 📦 - Toolkit for Computer-Aided Musicology.
Mido 📦 - Realtime MIDI wrapper.
mingus 📦 - Advanced music theory and notation package with MIDI file and playback support.
Pretty-MIDI 📦 - Utility functions for handling MIDI data in a nice/intuitive way.
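When moving between symbolic (MIDI) data and audio, a frequently needed conversion is between MIDI note numbers and frequencies. The sketch below assumes twelve-tone equal temperament with A4 (MIDI note 69) tuned to 440 Hz. The function names are hypothetical and are not taken from the packages above.

```python
import math

A4_MIDI = 69    # MIDI note number of A4
A4_HZ = 440.0   # assumed concert pitch


def midi_to_hz(note):
    """Frequency in Hz of a (possibly fractional) MIDI note number."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def hz_to_midi(freq_hz):
    """Fractional MIDI note number of a frequency in Hz."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


middle_c_hz = midi_to_hz(60)
octave_up = hz_to_midi(880.0)
```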

Web Audio

TimeSide (Beta) - High-level audio analysis, imaging, transcoding, streaming and labelling.

Audio related APIs and Datasets

beets 📦 - Music library manager and MusicBrainz tagger.
dsdtools 📦 - Parse and process the demixing secrets dataset.
medleydb - Parse medleydb audio + annotations.
Soundcloud API 📦 - Wrapper for Soundcloud API.
Youtube-Downloader 📦 - Download youtube videos (and the audio).

Wrappers for Audio Plugins

VamPy Host 📦 - Interface compiled vamp plugins.

Tutorials

Whirlwind Tour Of Python - Fast-paced introduction to Python essentials, aimed at researchers and developers.
Introduction to Numpy and Scipy - Highly recommended tutorial, covers large parts of the scientific Python ecosystem.
Numpy for MATLAB® Users - Short overview of equivalent Python functions for switchers.
MIR Notebooks - Collection of instructional IPython notebooks for music information retrieval (MIR).
Selected Topics in Audio Signal Processing - Exercises as IPython notebooks.
Audio analysis notebook - Some Jupyter notebooks about audio signal processing with Python - Nb viewer

Books

Python Data Science Handbook - Jake Vanderplas, Excellent Book and accompanying tutorial notebooks.
Fundamentals of Music Processing - Meinard Müller, comes with Python exercises.

Scientific Papers

Python for audio signal processing - John C. Glover, Victor Lazzarini and Joseph Timoney, Linux Audio Conference 2011.
librosa: Audio and Music Signal Analysis in Python, Video - Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, Scipy 2015.
pyannote.audio: neural building blocks for speaker diarization, Video - Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020.

Other Resources

Coursera Course - Audio Signal Processing, Python based course from UPF of Barcelona and Stanford University.
Digital Signal Processing Course - Masters Course Material (University of Rostock) with many Python examples.
Slack Channel - Music Information Retrieval Community.

Related lists

There is already PythonInMusic, but it is not up to date and includes too many special-interest packages that are mostly not relevant for scientific applications. Awesome-Python is a large curated list of Python packages; however, its audio section is very small.

Contributing

Your contributions are always welcome! Please take a look at the contribution guidelines first.

I will keep some pull requests open if I'm not sure whether those libraries are awesome; you can vote for them by adding 👍 to them.

hugovasselin/awesome-python-scientific-audio
