All

28 repositories

FN-SSL
Public
The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization
speech narrow-band sound-source-localization microphone-array-generalization
Python
•10•93•1•0•Updated Dec 3, 2024
FS-EEND
Public
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024]
end-to-end pytorch speaker-diarization self-attention online-inference frame-wise
Python
•
MIT License
•4•86•2•0•Updated Dec 2, 2024
NBSS
Public
The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation
speech pytorch multi-channel enhancement denoising separation dereverberation narrow-band full-band
Python
•
MIT License
•26•235•20•0•Updated Nov 4, 2024
UMA-ASR
Public
This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).
speech-recognition asr
Shell
•3•17•1•0•Updated Oct 29, 2024
ATST-SED
Public
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
semi-supervised-learning sed sound-event-detection fine-tuning self-supervised-learning atst
Jupyter Notebook
•
MIT License
•13•104•1•0•Updated Oct 15, 2024
RealMAN
Public
A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NIPS 2024]
multi-channel speech-enhancement microphone-array-processing doa-estimation audio-datasets sound-source-localization microphone-audio-capture real-world-datasets
Python
•11•98•3•0•Updated Oct 12, 2024
SAR-SSL
Public
A python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer” [TASLP 2024]
multi-channel microphone-array real-world-data room-acoustics conformer fine-tuning tdoa array-signal-processing self-supervised-learning downstream-tasks
Python
•
MIT License
•1•30•2•0•Updated Oct 11, 2024
ATST-RCT
Public
ATST-RCT model for DCASE 2022 task4.
Python
•0•2•0•0•Updated Sep 19, 2024
audiossl
Public
A library built for easier audio self-supervised training and downstream task evaluation
audio-classification audioset nsynth speech-commands audio-datasets voxceleb1 urbansound8k pytorch-lightning audio-representation audio-self-supervised-learning
Python
•
Other
•10•107•3•1•Updated Aug 27, 2024
RVAE-EM
Public
Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]
vae bayesian-inference speech-processing speech-enhancement dereverberation
Python
•
MIT License
•4•42•0•0•Updated Mar 20, 2024
Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement-
Public
Python
•
MIT License
•0•4•0•0•Updated Mar 12, 2024
pytorch_lightning_template_for_beginners
Public
A pytorch template for beginners based on pytorch_lightning
Python
•5•36•0•0•Updated Feb 1, 2024
FullSubNet
Public
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
audio reproducible-research paper speech band speech-processing noise-reduction denoising speech-separation speech-enhancement
Python
•
MIT License
•158•554•38•1•Updated Aug 19, 2023
McNet
Public
The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023
signal-processing speech-enhancement array-signal-processing multi-channel-speech-enhancement pytorch-lightning pytorch
Python
•13•108•1•0•Updated Mar 24, 2023
RCT
Public
This repo gives the code for the official implementation of RCT.
Python
•1•13•0•0•Updated Jun 28, 2022
Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement
Public
Python
•
MIT License
•9•7•0•0•Updated Sep 30, 2021
Audio-WestlakeU.github.io
Public
The Audio and Signal Information Processing Lab at Westlake University concentrates on speech processing algorithms
MIT License
•1•3•0•0•Updated Jul 8, 2021
Narrowband_DeepFiltering
Public
Python
•6•19•0•0•Updated Apr 1, 2020
RTF_InterFrameSpecSub
Public
MATLAB
•3•1•0•0•Updated Apr 1, 2020
RS_noisePSD
Public
MATLAB
•0•1•0•0•Updated Apr 1, 2020
DP_RTF_SSL
Public
MATLAB
•3•3•1•0•Updated Apr 1, 2020
bss_ctf_lasso
Public
MATLAB
•3•5•0•0•Updated Apr 1, 2020
dereverb_ctf_nonneg
Public
MATLAB
•2•1•0•0•Updated Apr 1, 2020
BSS_CTF_EM
Public
MATLAB
•1•0•0•0•Updated Apr 1, 2020
LSTM-noisePSD
Public
Python
•2•8•0•0•Updated Apr 1, 2020
ctf_mint
Public
MATLAB
•1•0•0•0•Updated Apr 1, 2020
OnlineSSL_DPRTF_EG
Public
MATLAB
•5•8•0•0•Updated Apr 1, 2020
SMIF_online_dereverb
Public
MATLAB
•3•3•0•0•Updated Apr 1, 2020
