The Blind Estimator of Room Acoustic and Physical Parameters (BERP) is a PyTorch-based framework for predicting room acoustic and physical parameters all in one. This implementation is built on PyTorch Lightning and Hydra, and it includes the data preprocessing pipelines, model architectures, training and inference strategies, and experimental configurations.
# Install miniconda
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh
# add conda to PATH
echo "export PATH=~/miniconda3/bin:$PATH" >> ~/.zshrc
source ~/.zshrc
# initialize conda
conda init zsh
# create conda environment
conda create -n acoustic-toolkit python=3.11.8
conda activate acoustic-toolkit
For better dependency management, we use `pdm` as the package manager instead of `pip`. You can install `pdm` with the following command:
pip install pdm
# clone project
git clone https://github.com/Alizeded/BERP
cd BERP
# create conda environment and install dependencies
pdm config venv.backend conda # choose the backend as conda
pdm sync # install dependencies with locking dependencies versions
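To run project commands inside the environment that `pdm` manages, you can either prefix them with `pdm run` or activate the environment in your shell. Both subcommands below are standard `pdm` features; this is just a convenience sketch.
# run a command inside the pdm-managed environment
pdm run python --version
# or activate the environment in the current shell
eval "$(pdm venv activate)"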
The datasets are also available; you can download them from the cloud storage:
# noisy reverberant speech data
https://jstorage.app.box.com/v/berp-datasets/file/1496315212687
# crowded reverberant speech data
https://jstorage.app.box.com/v/berp-datasets/file/1496441527052
Then, unzip the data and put it in the `data` directory.
# unzip the data
unzip noiseReverbSpeech.zip -d data
unzip mixed_speech.zip -d data
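After unzipping, the extracted datasets should sit under the `data` directory. A quick sanity check might look like the following (the extracted folder names are assumed from the archive names):
# verify the extracted datasets (folder names assumed from the archive names)
ls data
# expected to list something like: noiseReverbSpeech  mixed_speech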
The Jupyter notebook `data_preprocessing.ipynb` in the `notebook` folder details the data preprocessing pipeline.
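To step through the pipeline interactively, you can launch the notebook from the project root (assuming Jupyter is installed in your environment):
# launch the preprocessing notebook (assumes jupyter is available in the environment)
pdm run jupyter notebook notebook/data_preprocessing.ipynb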
Train the model with the default configurations:
# train on single GPU
# for unified module
python src/train_jointRegressor.py trainer=gpu data=ReverbSpeechJointEst logger=wandb_jointRegressor callbacks=default_jointRegressor
# for occupancy module
python src/train_numEstimator.py trainer=gpu logger=wandb_numEstimator callbacks=default_numEstimator
# train on dual GPUs
# for unified module
python src/train_jointRegressor.py trainer=ddp data=ReverbSpeechJointEst logger=wandb_jointRegressor callbacks=default_jointRegressor
# for occupancy module
python src/train_numEstimator.py trainer=ddp logger=wandb_numEstimator callbacks=default_numEstimator
# train on quad GPUs
# for unified module
python src/train_jointRegressor.py trainer=ddp trainer.devices=4 data=ReverbSpeechJointEst logger=wandb_jointRegressor callbacks=default_jointRegressor
# for occupancy module
python src/train_numEstimator.py trainer=ddp trainer.devices=4 logger=wandb_numEstimator callbacks=default_numEstimator
# train on multiple GPUs with multiple nodes (2 nodes, 4 GPUs as an example)
# for unified module
python src/train_jointRegressor.py trainer=ddp trainer.nodes=2 trainer.devices=4 data=ReverbSpeechJointEst logger=wandb_jointRegressor callbacks=default_jointRegressor
# for occupancy module
python src/train_numEstimator.py trainer=ddp trainer.nodes=2 trainer.devices=4 logger=wandb_numEstimator callbacks=default_numEstimator
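Since the project is configured with Hydra, any field in the config files can be overridden from the command line with dot notation. The sketch below overrides two common Lightning trainer options; the exact fields available depend on the files in the `configs` directory:
# example Hydra overrides (trainer.max_epochs and trainer.precision are standard Lightning Trainer options)
python src/train_jointRegressor.py trainer=gpu data=ReverbSpeechJointEst \
    trainer.max_epochs=50 trainer.precision=16 \
    logger=wandb_jointRegressor callbacks=default_jointRegressor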
Please refer to the `model`, `callback`, and `logger` folders, and to `train.yaml` in the `configs` directory, for more details.
python src/inference_jointRegressor.py data=ReverbSpeechJointEst
python src/inference_numEstimator.py
More details about inference can be found in `inference.yaml` in the `configs` directory.
After running inference with the trained model, you can use the following command to estimate the room acoustic parameters with the SSIR model.
python src/inference_rap_joint.py
More details about the inference of room acoustic parameters can be found in `inference_rap.yaml` in the `configs` directory.
In the `weights` directory, you can find download scripts for the weights of each module of the BERP framework, including the unified module and the occupancy module with three featurization methods, as well as the separate modules with MFCC featurization. You can download the weights with the following commands:
# download the weights for the unified module
sh weights/unified_module/unified_module_Gammatone.sh
sh weights/unified_module/unified_module_MFCC.sh
sh weights/unified_module/unified_module_Mel.sh
# or you can copy & paste the following cloud storage link to your browser
https://jstorage.box.com/v/BERP-unified-module-gammatone
https://jstorage.box.com/v/BERP-unified-module-mfcc
https://jstorage.box.com/v/BERP-unified-module-mel
# download the weights for the occupancy module
sh weights/occupancy_module/occupancy_module_Gammatone.sh
sh weights/occupancy_module/occupancy_module_MFCC.sh
sh weights/occupancy_module/occupancy_module_Mel.sh
# or you can copy & paste the following cloud storage link to your browser
https://jstorage.box.com/v/BERP-occupancy-module-gamma
https://jstorage.box.com/v/BERP-occupancy-module-mfcc
https://jstorage.box.com/v/BERP-occupancy-module-mel
# download the weights for the separate module
sh weights/rir_module/rir_module_MFCC.sh
sh weights/volume_module/volume_module_MFCC.sh
sh weights/distance_module/distance_module_MFCC.sh
sh weights/orientation_module/orientation_module_MFCC.sh
# or you can copy & paste the following cloud storage link to your browser
https://jstorage.box.com/v/BERP-rir-module-mfcc
https://jstorage.box.com/v/BERP-volume-module-mfcc
https://jstorage.box.com/v/BERP-distance-module-mfcc
https://jstorage.box.com/v/BERP-ori-module-mfcc
After obtaining the weights, please check `eval.yaml` or `inference.yaml` in the `configs` directory and set the weight paths correctly for evaluation or inference.
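Alternatively, because the configs are managed by Hydra, the checkpoint path can typically be overridden on the command line instead of edited in the YAML file. The `ckpt_path` key and the weight filename below are assumptions based on common Lightning-Hydra setups; verify the actual field name and filename in `configs/inference.yaml` and the downloaded weights:
# override the checkpoint path at the command line
# (ckpt_path key and weight filename are assumed; check configs/inference.yaml)
python src/inference_jointRegressor.py data=ReverbSpeechJointEst \
    ckpt_path=weights/unified_module/unified_module_MFCC.ckpt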
PS1: These shell scripts will open your default browser, from which you can download the weights from the cloud storage.
PS2: We have verified the download links and they should work. We are working on migrating the datasets to the Hugging Face dataset hub; please stay tuned.
This project is licensed under the GPL-3.0 License - see the LICENSE file for details.
We thank our good friend Jianan Chen for his great help with this project.
If you find this repository useful in your research, or if you want to refer to the methodology and code, please cite the following paper:
@misc{wang2024berp,
      title={BERP: A Blind Estimator of Room Acoustic and Physical Parameters for Single-Channel Noisy Speech Signals},
      author={Lijun Wang and Yixian Lu and Ziyan Gao and Kai Li and Jianqiang Huang and Yuntao Kong and Shogo Okada},
      year={2024},
      eprint={2405.04476},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}