
Audio-WestlakeU/Narrowband_DeepFiltering


This repository provides code for the TASLP submission:

and for

  • Xiaofei Li and Radu Horaud. Multichannel Speech Enhancement Based on Time-frequency Masking Using Subband Long Short-Term Memory. WASPAA 2019.

For more details, please refer to: https://team.inria.fr/perception/research/mse-lstm/

Preparation:

The code was developed with MATLAB R2017a and Python 3. Prepare the following dataset and tools first:

1. CHiME3 dataset. Only the multichannel real recordings are used.

2. TensorFlow, Keras, SciPy

3. To evaluate the speech enhancement performance, download and set up the MATLAB toolkits for PESQ, STOI, SDR and (normalized) SRMR.
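
Before running the Python scripts, it may help to confirm that the packages listed above are importable. The snippet below is a minimal sketch and not part of this repository; the import names `tensorflow`, `keras` and `scipy` are assumed to be the usual ones, and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the Python packages listed in the preparation steps.
REQUIRED = ("tensorflow", "keras", "scipy")


def missing_packages(names=REQUIRED):
    """Return the package names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing packages:", ", ".join(missing) if missing else "none")
```

This only checks that the packages can be located; it does not check versions, which the README does not specify.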

Usage:

1. Mixed data generation for train/validation/test

move mix_bthplusbackground.m into CHiME3/tools/simulation/, and run it.

2. Extract subband sequences for train/validation

run sequence_generation.py

3. Train

run train.py

4. Speech enhancement on test data using trained models

run prediction.py

5. Speech enhancement performance evaluation

run evaluation.m
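
To illustrate what step 2 refers to, the sketch below shows one possible way to turn a multichannel signal into per-frequency (subband) sequences. It is not the code in sequence_generation.py: the function name `subband_sequences`, the STFT window length, the sequence length and the real/imaginary feature layout are all assumptions made for illustration.

```python
import numpy as np
from scipy.signal import stft


def subband_sequences(x, fs=16000, nperseg=512, seq_len=50):
    """Split a multichannel signal into fixed-length subband sequences.

    x: array of shape (channels, samples).
    Returns an array of shape (n_sequences, seq_len, 2 * channels), where each
    sequence follows a single frequency bin over seq_len consecutive frames.
    All parameter defaults are illustrative assumptions.
    """
    # STFT along the last axis: Z has shape (channels, freqs, frames).
    _, _, Z = stft(x, fs=fs, nperseg=nperseg)
    # Reorder to (freqs, frames, channels) so each frequency bin is one sequence.
    Z = np.transpose(Z, (1, 2, 0))
    # Stack real and imaginary parts of all channels as the per-frame feature.
    feats = np.concatenate([Z.real, Z.imag], axis=-1)
    n_freq, n_frames, n_feat = feats.shape
    # Drop trailing frames that do not fill a whole sequence.
    n_chunks = n_frames // seq_len
    feats = feats[:, : n_chunks * seq_len]
    return feats.reshape(n_freq * n_chunks, seq_len, n_feat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((6, 16000))  # 6 channels, 1 s of noise at 16 kHz
    print(subband_sequences(x, seq_len=10).shape)
```

Each sequence covers a single frequency bin, so sequences from every bin can be pooled into one training set.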

Note:

1. For each of the above steps, set dataPath in the corresponding script. All data are stored under this dataPath, which requires about 100 GB of disk space.

2. In ref_models/, model weights for four networks trained by the author are provided.

3. In ref_se_result/, performance scores obtained by the author are provided. Please refer to evaluation.m for the data structure.
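
Since roughly 100 GB is needed under dataPath, a quick free-space check before generating data can save a failed run. This is a minimal sketch using the standard library; `has_enough_space` is a hypothetical helper and the path passed to it is whatever you set as dataPath.

```python
import shutil

REQUIRED_GB = 100  # approximate figure taken from note 1


def has_enough_space(data_path, required_gb=REQUIRED_GB):
    """Return True if the filesystem holding data_path has enough free space."""
    free_gb = shutil.disk_usage(data_path).free / 1e9
    return free_gb >= required_gb


if __name__ == "__main__":
    # "." stands in for your own dataPath.
    print("enough space:", has_enough_space("."))
```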

Author: Xiaofei Li, Westlake University, China and INRIA Grenoble Rhone-Alpes, France
