Version 3.2.0

hbredin released this 08 May 09:51

70a8507

New features

feat(task): add option to cache task training metadata to speed up training (with @clement-pages)
feat(model): add receptive_field, num_frames and dimension to models (with @Bilal-Rahou)
feat(model): add fbank_only property to WeSpeaker models
feat(util): add Powerset.permutation_mapping to help with permutation in powerset space (with @FrenchKrab)
feat(sample): add sample file at pyannote.audio.sample.SAMPLE_FILE
feat(metric): add reduce option to diarization_error_rate metric (with @Bilal-Rahou)
feat(pipeline): add Waveform and SampleRate preprocessors
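
To illustrate what a "permutation in powerset space" means for the Powerset.permutation_mapping entry above, here is a minimal pure-Python sketch. It is not the library's implementation: the function names, the class ordering (subsets enumerated by increasing size), and the direction of the mapping are all assumptions made for illustration. Relabeling speakers in multilabel space moves each powerset class (a set of active speakers) to another class index.

```python
from itertools import combinations


def powerset_classes(num_classes: int, max_set_size: int) -> list:
    """Enumerate speaker subsets of size 0..max_set_size, smallest first.

    Hypothetical helper: the ordering is an assumption for this sketch.
    """
    return [
        subset
        for size in range(max_set_size + 1)
        for subset in combinations(range(num_classes), size)
    ]


def powerset_permutation(num_classes: int, max_set_size: int, permutation) -> list:
    """Map each powerset class index to its index after relabeling speakers.

    permutation[i] == j means speaker i is renamed to speaker j
    (assumed convention, not necessarily the one used by pyannote.audio).
    """
    classes = powerset_classes(num_classes, max_set_size)
    index = {subset: k for k, subset in enumerate(classes)}
    return [
        index[tuple(sorted(permutation[s] for s in subset))]
        for subset in classes
    ]


# 3 speakers, at most 2 active at once -> 7 powerset classes:
# 0:{} 1:{0} 2:{1} 3:{2} 4:{0,1} 5:{0,2} 6:{1,2}
mapping = powerset_permutation(3, 2, (1, 0, 2))  # swap speakers 0 and 1
print(mapping)  # [0, 2, 1, 3, 4, 6, 5]
```

Swapping speakers 0 and 1 leaves the empty set, {2} and {0,1} in place, while {0} and {1} trade places, as do {0,2} and {1,2}.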

Fixes

fix(task): fix random generators and their reproducibility (with @FrenchKrab)
fix(task): fix estimation of training set size (with @FrenchKrab)
fix(hook): fix torch.Tensor support in ArtifactHook
fix(doc): fix typo in Powerset docstring (with @lukasstorck)

Improvements

improve(metric): add support for a mismatch in the number of speakers in diarization_error_rate metric
improve(pipeline): track both Model and nn.Module attributes in Pipeline.to(device)
improve(io): switch to torchaudio >= 2.2.0
improve(doc): update tutorials (with @clement-pages)
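
As a rough illustration of the speaker-count mismatch mentioned in the metric entry above, the sketch below computes diarization error components when the prediction and the reference do not have the same number of speakers. It is a pure-Python toy, not the library's code: zero-padding the smaller side and brute-forcing the speaker permutation are assumptions chosen for clarity, and all function names are hypothetical.

```python
from itertools import permutations


def pad_speakers(frames, num_speakers):
    """Right-pad each frame's 0/1 speaker activations with zeros."""
    return [list(f) + [0] * (num_speakers - len(f)) for f in frames]


def der_components(preds, target):
    """Return (false_alarm, missed, confusion, total_speech), in frames.

    preds and target are lists of frames; each frame is a list of 0/1
    activations, one per speaker. Speaker counts may differ.
    """
    n = max(len(preds[0]), len(target[0]))
    preds, target = pad_speakers(preds, n), pad_speakers(target, n)

    best = None
    # Brute-force search over speaker permutations (fine for a few speakers).
    for perm in permutations(range(n)):
        fa = miss = conf = 0
        for p, t in zip(preds, target):
            p = [p[i] for i in perm]
            n_pred, n_ref = sum(p), sum(t)
            correct = sum(a and b for a, b in zip(p, t))
            fa += max(0, n_pred - n_ref)
            miss += max(0, n_ref - n_pred)
            conf += min(n_pred, n_ref) - correct
        if best is None or fa + miss + conf < sum(best):
            best = (fa, miss, conf)

    total = sum(sum(t) for t in target)
    return (*best, total)


# Reference has 2 speakers, prediction only has 1.
target = [[1, 0], [1, 0], [0, 1], [0, 1]]
preds = [[1], [1], [1], [0]]
fa, miss, conf, total = der_components(preds, target)
print((fa + miss + conf) / total)  # 0.5
```

Here the single predicted speaker is matched to the first reference speaker, leaving one confused frame and one missed frame out of four frames of reference speech.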

Breaking changes

BREAKING(model): get rid of Model.example_output in favor of num_frames method, receptive_field property, and dimension property
BREAKING(task): custom tasks need to be updated (see "Add your own task" tutorial)
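
To give a sense of the quantities that replace Model.example_output, the following sketch computes a frame count and a receptive field analytically for a stack of unpadded 1D convolution and pooling layers. It does not call pyannote.audio: the layer stack, the function names, and the zero-padding, unit-dilation assumptions are all hypothetical and chosen only to show the arithmetic.

```python
# Each layer is (kernel_size, stride). Padding 0 and dilation 1 are assumed.
# Hypothetical stack, not taken from the release notes.
LAYERS = [(251, 10), (3, 3), (5, 1), (3, 3), (5, 1), (3, 3)]


def num_frames(num_samples: int, layers=LAYERS) -> int:
    """Number of output frames for an input of num_samples samples."""
    n = num_samples
    for kernel_size, stride in layers:
        n = (n - kernel_size) // stride + 1
    return n


def receptive_field_size(layers=LAYERS) -> int:
    """Number of input samples that contribute to one output frame."""
    size = 1
    # Walk from the output back to the input, growing the window.
    for kernel_size, stride in reversed(layers):
        size = (size - 1) * stride + kernel_size
    return size


def receptive_field_step(layers=LAYERS) -> int:
    """Shift, in input samples, between two consecutive output frames."""
    step = 1
    for _, stride in layers:
        step *= stride
    return step


print(num_frames(160000))      # 589 frames for 10 s of 16 kHz audio
print(receptive_field_size())  # 991 samples
print(receptive_field_step())  # 270 samples
```

Computing these values from the layer configuration avoids running a forward pass on a dummy input just to inspect the output shape.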

Community contributions

community: add tutorial for offline use of pyannote/speaker-diarization-3.1 (by @simonottenhauskenbun)
