The aim of this repository is to create a comprehensive, curated list of Python software and tools related to and used for scientific research in audio and music applications.
- audiolazy 📦 - Expressive Digital Signal Processing (DSP) package for Python.
- audioread 📦 - Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
- mutagen 📦 - Reads and writes all kinds of audio metadata for various formats.
- pyAV - PyAV is a Pythonic binding for FFmpeg or Libav.
- (Py)Soundfile 📦 - Library based on libsndfile, CFFI, and NumPy.
- pySox 📦 - Wrapper for sox.
- stempeg 📦 - Read/write of STEMS multistream audio.
- tinytag 📦 - Reads music metadata of MP3, OGG, FLAC and Wave files.
- audiomate 📦 - Loading different types of audio datasets.
- audiolazy 📦 - Realtime Audio Processing lib, general purpose.
- SoundCard - A Pure-Python Real-Time Audio Library
- PYO - Realtime audio dsp engine.
- python-sounddevice 📦 - PortAudio wrapper providing realtime audio I/O with NumPy.
- Realtime_PyAudio_FFT - Realtime audio analysis in Python, using PyAudio and Numpy to extract and visualize FFT features from streaming audio.
- Notebooks
- audiomate - Python library for handling audio datasets.
- pyAudioAnalysis² 📦 - Feature Extraction, Classification, Diarization.
- pliers - Automated feature extraction in Python
- aubio 📦 - Feature extractor, written in C, Python interface.
- essentia - Music related low level and high level feature extractor, C++ based, includes Python bindings.
- python_speech_features 📦 - Common speech features for ASR.
- pyYAAFE - Python bindings for YAAFE feature extractor.
- speechpy 📦 - Library for Speech Processing and Recognition, mostly feature extraction for now.
- audiomentations 📦 - Audio Data Augmentation.
- muda 📦 - Musical Data Augmentation.
- pydiogment 📦 - Audio Data Augmentation.
- audino - Open source audio annotation tool for humans™
- Kapre 📦 - Keras Audio Preprocessors
- TorchAudio - PyTorch Audio Loaders
- Audio Sentiment Analysis - This repository consists of work done to analyse sentiment of a customer in a conversation with a call center agent using various machine learning algorithms and audio features.
- Paper - Audio Sentiment Analysis by Heterogeneous Signal Features Learned from Utterance-Based Parallel Neural Network
- Paper - Sentiment Analysis on Speaker Specific Speech Data
- Article - Emotions Recognition from voice data using Python and Keras
- speech-emotion-recognition - Speaker independent emotion recognition
- resemblyzer - A python package to analyze and compare voices with deep learning. Used by Robin Blanchard for his attempt.
- Wavesplit: End-to-end Speech Separation by Speaker Clustering, Zeghidour N. et al, 2020 - Speaker stack + separation stack. fixed number of speakers
- Speaker Diarization with Region Proposal Network, Huang Z. et al, 2020 - based on Faster R-CNN
- pyannote.audio 📦 - Neural building blocks for speaker diarization.
- Deep_Speaker-speaker_recognition_system - Keras implementation of Deep Speaker: an End-to-End Neural Speaker Embedding System (speaker recognition)
- speaker-recognition - A Speaker Recognition System with GUI.
- 3D-convolutional-speaker-recognition - Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
- SpeechRecognition 📦 - Wrapper for several ASR engines and APIs, online and offline. Could be useful to add Wrappers
- deepspeech 📦 - Pretrained automatic speech recognition. (Mozilla)
- aeneas 📦 - Forced aligner, based on MFCC+DTW, 35+ languages.
- gentle - Forced-aligner built on Kaldi.
- Parselmouth 📦 - Python interface to the Praat phonetics and speech analysis, synthesis, and manipulation software.
- persephone 📦 - Automatic phoneme transcription tool.
- py-webrtcvad 📦 - Interface to the WebRTC Voice Activity Detector.
- pypesq - Wrapper for the PESQ score calculation. (Perceptual Evaluation of Speech Quality)
- pystoi 📦 - Short Term Objective Intelligibility measure (STOI).
- PyWorldVocoder - Wrapper for Morise's World Vocoder.
- Montreal Forced Aligner - Forced aligner, based on Kaldi (HMM), English (others can be trained).
- SIDEKIT 📦 - Speaker and Language recognition.
- acoustics 📦 - Useful tools for acousticians.
- AudioTK - DSP filter toolbox (lots of filters).
- AudioTSM 📦 - Real-time audio time-scale modification procedures.
- Gammatone - Gammatone filterbank implementation.
- pyFFTW 📦 - Wrapper for FFTW(3).
- NSGT 📦 - Non-stationary Gabor transform, constant-Q.
- MDCT 📦 - MDCT transform.
- pytftb - Implementation of the MATLAB Time-Frequency Toolbox.
- pyroomacoustics 📦 - Room Acoustics Simulation (RIR generator)
- PyRubberband 📦 - Wrapper for rubberband to do pitch-shifting and time-stretching.
- PyWavelets 📦 - Discrete Wavelet Transform in Python.
- Resampy 📦 - Sample rate conversion.
- SFS-Python 📦 - Sound Field Synthesis Toolbox.
- STFT 📦 - Standalone package for Short-Time Fourier Transform.
- cochlea 📦 - Inner ear models.
- Brian2 📦 - Spiking neural networks simulator, includes cochlea model.
- Loudness - Perceived loudness, includes Zwicker, Moore/Glasberg model.
- pyloudnorm - Audio loudness meter and normalization, implements ITU-R BS.1770-4.
- Sound Field Synthesis Toolbox 📦 - Sound Field Synthesis Toolbox.
- commonfate 📦 - Common Fate Model and Transform.
- NTFLib - Sparse Beta-Divergence Tensor Factorization.
- NUSSL 📦 - Holistic source separation framework including DSP methods and deep learning methods.
- NIMFA 📦 - Several flavors of non-negative-matrix factorization.
- dejavu - Audio fingerprinting and recognition in Python
- audio-fingerprint-identifying-python - A Shazam-like app that identifies songs using audio fingerprints, spectrum analysis and the fast Fourier transform.
- Catchy - Corpus Analysis Tools for Computational Hook Discovery.
- Madmom 📦 - MIR packages with strong focus on beat detection, onset detection and chord recognition.
- mir_eval 📦 - Common scores for various MIR tasks. Also includes bss_eval implementation.
- msaf 📦 - Music Structure Analysis Framework.
- librosa 📦 - General audio and music analysis.
- essentia - Music related low level and high level feature extractor, C++ based, includes Python bindings.
- Music21 📦 - Toolkit for Computer-Aided Musicology.
- Mido 📦 - Realtime MIDI wrapper.
- mingus 📦 - Advanced music theory and notation package with MIDI file and playback support.
- Pretty-MIDI 📦 - Utility functions for handling MIDI data in a nice/intuitive way.
- TimeSide (Beta) - high level audio analysis, imaging, transcoding, streaming and labelling.
- beets 📦 - Music library manager and MusicBrainz tagger.
- dsdtools 📦 - Parse and process the demixing secrets dataset.
- medleydb - Parse medleydb audio + annotations.
- Soundcloud API 📦 - Wrapper for Soundcloud API.
- Youtube-Downloader 📦 - Download youtube videos (and the audio).
- VamPy Host 📦 - Interface compiled vamp plugins.
- Whirlwind Tour Of Python - fast-paced introduction to Python essentials, aimed at researchers and developers.
- Introduction to Numpy and Scipy - Highly recommended tutorial, covers large parts of the scientific Python ecosystem.
- Numpy for MATLAB® Users - Short overview of equivalent Python functions for switchers.
- MIR Notebooks - collection of instructional iPython Notebooks for music information retrieval (MIR).
- Selected Topics in Audio Signal Processing - Exercises as iPython notebooks.
- Audio analysis notebook - Some Jupyter notebooks about audio signal processing with Python - Nb viewer
- Python Data Science Handbook - Jake Vanderplas, Excellent Book and accompanying tutorial notebooks.
- Fundamentals of Music Processing - Meinard Müller, comes with Python exercises.
- Python for audio signal processing - John C. Glover, Victor Lazzarini and Joseph Timoney, Linux Audio Conference 2011.
- librosa: Audio and Music Signal Analysis in Python, Video - Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, Scipy 2015.
- pyannote.audio: neural building blocks for speaker diarization, Video - Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020.
- Coursera Course - Audio Signal Processing, Python based course from UPF of Barcelona and Stanford University.
- Digital Signal Processing Course - Masters Course Material (University of Rostock) with many Python examples.
- Slack Channel - Music Information Retrieval Community.
There is already PythonInMusic, but it is not up to date and includes too many special-interest packages that are mostly not relevant for scientific applications. Awesome-Python is a large curated list of Python packages; however, its audio section is very small.
Your contributions are always welcome! Please take a look at the contribution guidelines first.
I will keep some pull requests open if I'm not sure whether those libraries are awesome; you can vote for them by adding 👍 to them.