
BERP: A Blind Estimator of Room acoustic and physical Parameters for Single-Channel Noisy Speech Signals

The official implementation of BERP


Description

BERP (Blind Estimator of Room acoustic and Physical Parameters) is a PyTorch-based framework for predicting room acoustic and physical parameters all in one. The project is built on PyTorch Lightning and Hydra. This implementation includes the data preprocessing pipelines, model architectures, training and inference strategies, and experimental configurations.

Installation

Pre-requisites

# Install miniconda
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh

# add conda to PATH
echo "export PATH=~/miniconda3/bin:$PATH" >> ~/.zshrc
source ~/.zshrc

# initialize conda
conda init zsh

# create conda environment
conda create -n acoustic-toolkit python=3.11.8
conda activate acoustic-toolkit

pdm installation

For better dependency management, we use pdm as the package manager instead of pip. You can install pdm with the following command:

pip install pdm
# clone project
git clone https://github.com/Alizeded/BERP
cd BERP

# create conda environment and install dependencies
pdm config venv.backend conda # use conda as the virtualenv backend
pdm sync # install dependencies pinned by the lock file
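To verify that the environment resolved correctly, you can run a quick import check inside the pdm-managed environment (a minimal sanity check; the exact version printed depends on the lock file):

pdm run python -c "import torch; print(torch.__version__)"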

Data download and preprocessing

The data is also available; you can download it from cloud storage:

# noisy reverberant speech data
https://jstorage.app.box.com/v/berp-datasets/file/1496315212687

# crowded reverberant speech data
https://jstorage.app.box.com/v/berp-datasets/file/1496441527052

Then, unzip the data and put it in the data directory.

# unzip the data
unzip noiseReverbSpeech.zip -d data
unzip mixed_speech.zip -d data

The Jupyter notebook data_preprocessing.ipynb in the notebook folder details the data preprocessing pipeline.
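Before running the notebook, you can sanity-check the unzipped audio with torchaudio (a minimal sketch; the file name below is a hypothetical placeholder, so substitute a real file from the unzipped folders):

import torchaudio

# Hypothetical example path inside the unzipped dataset; adjust to an actual file.
waveform, sample_rate = torchaudio.load("data/noiseReverbSpeech/example.wav")
print(waveform.shape, sample_rate)  # (channels, num_samples) and the sampling rate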

How to run

Train model with the default configurations

# train on single GPU
# for unified module
python src/train_jointRegressor.py trainer=gpu data=ReverbSpeechJointEst logger=wandb_jointRegressor callbacks=default_jointRegressor

# for occupancy module
python src/train_numEstimator.py trainer=gpu logger=wandb_numEstimator callbacks=default_numEstimator

# train on dual GPUs
# for unified module
python src/train_jointRegressor.py trainer=ddp data=ReverbSpeechJointEst logger=wandb_jointRegressor callbacks=default_jointRegressor

# for occupancy module
python src/train_numEstimator.py trainer=ddp logger=wandb_numEstimator callbacks=default_numEstimator

# train on quad GPUs
# for unified module
python src/train_jointRegressor.py trainer=ddp trainer.devices=4 data=ReverbSpeechJointEst logger=wandb_jointRegressor callbacks=default_jointRegressor

# for occupancy module
python src/train_numEstimator.py trainer=ddp trainer.devices=4 logger=wandb_numEstimator callbacks=default_numEstimator

# train on multiple GPUs with multiple nodes (2 nodes, 4 GPUs as an example)
# for unified module
python src/train_jointRegressor.py trainer=ddp trainer.nodes=2 trainer.devices=4 data=ReverbSpeechJointEst logger=wandb_jointRegressor callbacks=default_jointRegressor

# for occupancy module
python src/train_numEstimator.py trainer=ddp trainer.nodes=2 trainer.devices=4 logger=wandb_numEstimator callbacks=default_numEstimator

Configuration of training

Please refer to the model, callbacks and logger folders and train.yaml in the configs directory for more details.
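Because the project is configured with Hydra, any value in those YAML files can also be overridden directly on the command line. The example below is illustrative; the exact key names (e.g. trainer.max_epochs, model.optimizer.lr) are assumptions that should be checked against train.yaml:

# override individual config values at launch time (key names are assumptions; see configs/)
python src/train_jointRegressor.py trainer=gpu data=ReverbSpeechJointEst trainer.max_epochs=100 model.optimizer.lr=1e-4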

Inference with the trained model

python src/inference_jointRegressor.py data=ReverbSpeechJointEst
python src/inference_numEstimator.py

More details about the inference can be found in inference.yaml in the configs directory.

After running inference with the trained model, you can use the following command to estimate the room acoustic parameters using the SSIR model.

python src/inference_rap_joint.py

More details about the inference of room acoustic parameters can be found in inference_rap.yaml in the configs directory.
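For intuition, room acoustic parameters such as reverberation time are conventionally derived from an impulse response via Schroeder backward integration. The sketch below illustrates that generic textbook procedure in Python; it is not the repository's SSIR implementation, and the function name is hypothetical:

import numpy as np

def rt60_from_rir(rir: np.ndarray, fs: int) -> float:
    """Estimate RT60 from an impulse response via Schroeder backward integration.

    Generic textbook method (T30 extrapolated to 60 dB), not BERP's SSIR model.
    """
    # Schroeder energy decay curve, in dB relative to total energy
    energy = rir.astype(np.float64) ** 2
    edc = np.cumsum(energy[::-1])[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0] + 1e-12)

    # Fit a line to the -5 dB .. -35 dB portion of the decay (T30 range)
    idx = np.where((edc_db <= -5.0) & (edc_db >= -35.0))[0]
    t = idx / fs
    slope, intercept = np.polyfit(t, edc_db[idx], 1)

    # Time for 60 dB of decay, extrapolated from the fitted slope
    return -60.0 / slope

In BERP itself, these parameters are produced by inference_rap_joint.py using the SSIR model described in the paper.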

Configuration of inference output from the trained model

Please refer to inference.yaml in the configs directory for more details.

Weights are also available; please check the weights directory for more information.

In the weights directory, you can download the corresponding weights of each module of the BERP framework, including the unified module and the occupancy module with three featurization methods, and the separate modules with MFCC featurization.

You can download the weights from the following links:

# download the weights for the unified module
sh weights/unified_module/unified_module_Gammatone.sh
sh weights/unified_module/unified_module_MFCC.sh
sh weights/unified_module/unified_module_Mel.sh
# or you can copy & paste the following cloud storage link to your browser
https://jstorage.box.com/v/BERP-unified-module-gammatone
https://jstorage.box.com/v/BERP-unified-module-mfcc
https://jstorage.box.com/v/BERP-unified-module-mel
# download the weights for the occupancy module
sh weights/occupancy_module/occupancy_module_Gammatone.sh
sh weights/occupancy_module/occupancy_module_MFCC.sh
sh weights/occupancy_module/occupancy_module_Mel.sh
# or you can copy & paste the following cloud storage link to your browser
https://jstorage.box.com/v/BERP-occupancy-module-gamma
https://jstorage.box.com/v/BERP-occupancy-module-mfcc
https://jstorage.box.com/v/BERP-occupancy-module-mel
# download the weights for the separate module
sh weights/rir_module/rir_module_MFCC.sh
sh weights/volume_module/volume_module_MFCC.sh
sh weights/distance_module/distance_module_MFCC.sh
sh weights/orientation_module/orientation_module_MFCC.sh
# or you can copy & paste the following cloud storage link to your browser
https://jstorage.box.com/v/BERP-rir-module-mfcc
https://jstorage.box.com/v/BERP-volume-module-mfcc
https://jstorage.box.com/v/BERP-distance-module-mfcc
https://jstorage.box.com/v/BERP-ori-module-mfcc

After obtaining the weights, please check eval.yaml or inference.yaml in the configs directory and set the weight paths accordingly for evaluation or inference.
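If you prefer to load a downloaded checkpoint manually rather than through the configs, PyTorch Lightning modules can be restored with load_from_checkpoint. The sketch below is an assumption-laden illustration: the module class, its import path, and the checkpoint filename are hypothetical placeholders, so substitute the actual classes from src/:

import torch
# Hypothetical import path and class name; use the actual LightningModule from src/.
from src.models.jointRegressor_module import JointRegressorModule

model = JointRegressorModule.load_from_checkpoint("weights/unified_module/unified_module_MFCC.ckpt")
model.eval()
with torch.no_grad():
    # Dummy mono waveform batch (batch, samples); the real input format may differ.
    prediction = model(torch.randn(1, 16000))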

PS1: These shell scripts will open your default browser, from which you can download the weights from the cloud storage.

PS2: We have checked the download links and they should work. We are working on migrating the datasets to the Hugging Face dataset hub. Please stay tuned.

License

This project is licensed under the GPL-3.0 License - see the LICENSE file for details.

Acknowledgments

We thank our good friend Jianan Chen for his generous help with this project.

Citation

If you find this repository useful in your research, or if you want to refer to the methodology and code, please cite the following paper:

@misc{wang2024berp,
      title={BERP: A Blind Estimator of Room Acoustic and Physical Parameters for Single-Channel Noisy Speech Signals}, 
      author={Lijun Wang and Yixian Lu and Ziyan Gao and Kai Li and Jianqiang Huang and Yuntao Kong and Shogo Okada},
      year={2024},
      eprint={2405.04476},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}
