# DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction

Official PyTorch implementation.
🎧 Listen to audio examples on the demo page.
🔥 Update: SoloAudio is now available! This advanced diffusion-transformer-based model extracts target sounds from free-text queries.
# Setup
- Download the checkpoints and dataset from this 🤗 link.
- Install the dependencies listed in requirement.txt.
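A minimal environment setup might look like the following (the environment name and Python version are illustrative; only requirement.txt is specified by the repo):

```shell
# Create and activate an isolated environment (name is an assumption)
conda create -n dpm-tse python=3.10 -y
conda activate dpm-tse

# Install the pinned dependencies shipped with the repo
pip install -r requirement.txt
```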
# Training
```shell
python src/train_ddim_cls.py \
  --data-path 'data/fsd2018/' \
  --autoencoder-path 'ckpts/first_stage.pt' \
  --autoencoder-config 'ckpts/vae.yaml' \
  --diffusion-config 'src/config/DiffTSE_cls_v_b_1000.yaml'
```
# Inference
```shell
python src/tse.py \
  --device 'cuda' \
  --mixture 'example.wav' \
  --target_sound 'Applause' \
  --autoencoder-path 'ckpts/first_stage.pt' \
  --autoencoder-config 'ckpts/vae.yaml' \
  --diffusion-config 'src/config/DiffTSE_cls_v_b_1000.yaml' \
  --diffusion-ckpt 'ckpts/base_v_1000.pt'
```
# Citation
If you find the code useful for your research, please consider citing:
```bibtex
@inproceedings{hai2024dpm,
  title={DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction},
  author={Hai, Jiarui and Wang, Helin and Yang, Dongchao and Thakkar, Karan and Dehak, Najim and Elhilali, Mounya},
  booktitle={ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={1196--1200},
  year={2024},
  organization={IEEE}
}
```
# Acknowledgments
We borrow code from the following repos: