Version 3.3.0

@hbredin released this 14 Jun 08:41

TL;DR

pyannote.audio does speech separation: multi-speaker audio in, one audio channel per speaker out!

pip install "pyannote.audio[separation]==3.3.0"

New features

  • feat(task): add PixIT joint speaker diarization and speech separation task (with @joonaskalda)
  • feat(model): add ToTaToNet joint speaker diarization and speech separation model (with @joonaskalda)
  • feat(pipeline): add SpeechSeparation pipeline (with @joonaskalda)
  • feat(io): add option to select torchaudio backend

Fixes

  • fix(task): fix wrong train/development split when training with (some) meta-protocols (#1709)
  • fix(task): fix metadata preparation with missing validation subset (@clement-pages)

Improvements

  • improve(io): when available, default to using soundfile backend
  • improve(pipeline): do not extract embeddings when max_speakers is set to 1
  • improve(pipeline): optimize memory usage of most pipelines (#1713 by @benniekiss)
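
The max_speakers improvement rests on a simple observation: with a single allowed speaker, every segment gets the same label, so computing embeddings is wasted work. The following is a minimal sketch of that shortcut under stated assumptions; `assign_speakers`, `embed`, and `cluster` are hypothetical stand-ins, not the pipeline's real functions.

```python
from typing import Callable, List, Sequence


def assign_speakers(
    segments: Sequence[str],
    max_speakers: int,
    embed: Callable[[str], List[float]],
    cluster: Callable[[List[List[float]], int], List[int]],
) -> List[int]:
    """Return one speaker index per segment (hypothetical helper)."""
    if max_speakers == 1:
        # Only one label is possible, so skip the costly embedding step.
        return [0] * len(segments)
    # General case: embed every segment, then cluster the embeddings.
    embeddings = [embed(segment) for segment in segments]
    return cluster(embeddings, max_speakers)


if __name__ == "__main__":
    # Toy stand-ins for an embedding model and a clustering step (assumptions).
    def toy_embed(segment: str) -> List[float]:
        return [float(len(segment))]

    def toy_cluster(embeddings: List[List[float]], k: int) -> List[int]:
        return [i % k for i in range(len(embeddings))]

    print(assign_speakers(["a", "bb", "ccc"], 1, toy_embed, toy_cluster))
```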