AIRCADE: an Anechoic and IR Convolution-based Auralization Data-compilation Ensemble
AIRCADE is a data-compilation ensemble, primarily intended to serve as a resource for researchers in the field of dereverberation, particularly for data-driven approaches. It comprises speech and song samples, together with acoustic guitar sounds, with original annotations pertinent to emotion recognition and Music Information Retrieval (MIR). It also includes a selection of Impulse Response (IR) samples with varying Reverberation Time (RT) values, providing a wide range of conditions for evaluation. The data-compilation can be used together with the provided Python scripts to generate auralized data ensembles in four sizes: tiny, small, medium, and large. The provided metadata annotations also allow further analysis of the performance of dereverberation algorithms under different conditions. All data is licensed under the Creative Commons Attribution 4.0 International License.
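For intuition, auralization here means convolving a dry (anechoic) sample with a room IR. The sketch below is a minimal illustration of that operation, not the repository's `dataset_generator.py` pipeline; the file names are placeholders following the renaming scheme described below, and mono WAV files at a shared sample rate are assumed.

```python
# Minimal sketch of convolution-based auralization (illustrative only).
# Assumes mono WAV files at the same sample rate; file names are placeholders.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

dry, fs = sf.read("speech_0000.wav")     # anechoic (dry) signal
ir, fs_ir = sf.read("rir_0000.wav")      # room impulse response
assert fs == fs_ir, "signal and IR must share a sample rate"

wet = fftconvolve(dry, ir, mode="full")  # auralized (reverberant) signal
wet /= np.max(np.abs(wet))               # normalize to avoid clipping
sf.write("speech_0000_auralized.wav", wet, fs)
```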
The data-compilation is hosted on Zenodo, with an approximate total file size of 1.3 GB. For simplicity, all samples were renamed, e.g., guitar_0000, rir_0000, song_0000, speech_0000, and so on. The ensemble is available in different sizes, from a tiny version with limited data to a large version with nearly 290,000 auralized samples, so users can choose the version best suited to their research needs. The following table details the number of song, speech, guitar, IR, and auralized samples in each version, together with the total duration and required file size.
| Version | Tiny | Small | Medium | Large |
|---|---|---|---|---|
| Song samples | 100 | 500 | 1,012 | 1,012 |
| Speech samples | 100 | 500 | 1,012 | 1,440 |
| Guitar samples | 100 | 500 | 1,012 | 2,004 |
| IR samples | 5 | 9 | 33 | 65 |
| Auralized samples | 1,500 | 13,500 | 100,188 | 289,640 |
| Total duration | 3.20 h | 30.41 h | 221.77 h | 658.08 h |
| Total file size (required) | 1.1 GB | 10.5 GB | 76.6 GB | 227.5 GB |
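The IR samples span varying RT values, as noted above. As an illustration (this is not part of the repository's scripts), a common way to estimate RT60 from an IR is Schroeder backward integration with a T20 fit; the file name below is a placeholder, and a mono IR is assumed.

```python
# Sketch of RT60 estimation via Schroeder backward integration (T20 fit:
# linear regression between -5 dB and -25 dB, extrapolated to -60 dB).
import numpy as np
import soundfile as sf

ir, fs = sf.read("rir_0000.wav")               # assumes a mono IR
energy = ir.astype(np.float64) ** 2
edc = np.cumsum(energy[::-1])[::-1]            # Schroeder energy decay curve
edc_db = 10 * np.log10(np.maximum(edc / edc[0], 1e-12))  # guard against log(0)

t = np.arange(len(ir)) / fs
mask = (edc_db <= -5) & (edc_db >= -25)        # T20 fitting region
slope, intercept = np.polyfit(t[mask], edc_db[mask], 1)
rt60 = -60 / slope                             # extrapolate decay to -60 dB
print(f"Estimated RT60: {rt60:.2f} s")
```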
For more information, please refer to our data paper on arXiv.
Create the conda environment by running the following command:
```bash
conda create --name AIRCADE python=3.9
```
Activate the new environment by running the following command:
```bash
conda activate AIRCADE
```
Clone this repository by running the following command:
```bash
git clone git@github.com:TulioChiodi/AIRCADE.git
```
Note: use the HTTPS clone URL if your SSH key is not configured.
cd into the project by running the following command:
```bash
cd AIRCADE
```
Install the requirements from the requirements.txt file by running the following command:
```bash
pip install -r requirements.txt
```
Prepare the base dataset by running the following command in the terminal:
```bash
python src/prepare_base_dataset.py
```
Generate an auralized ensemble by running the script using the following structure:
```bash
python src/dataset_generator.py [-i <input-dir>] [-o <output-dir>] [-rc <rir-configs-path>] [-ac <anec-configs-path>] [-ps <preset-size>] [-p <processes>] [-s]
```
- `-i`, `--input-dir`: Directory path for the base dataset. Default is `data/dataset_base`.
- `-o`, `--output-dir`: Directory path for the processed dataset. Default is `data/dataset_processed`.
- `-rc`, `--rir-configs-path`: Path to the JSON file containing the RIR configurations. Default is `configs/rir_configs.json`.
- `-ac`, `--anec-configs-path`: Path to the JSON file containing the anechoic configurations. Default is `configs/anec_configs.json`.
- `-ps`, `--preset-size`: Size of the dataset. Choose between tiny, small, medium, and large. Default is `tiny`.
- `-p`, `--processes`: Number of parallel processes to use. Default is the number of CPUs available on the system.
- `-s`, `--sequential`: Use sequential processing. Default is multiprocessing.

Example:
```bash
python src/dataset_generator.py -i /path/to/input/directory -o /path/to/output/directory -ps small -p 4
```
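After generation, a quick sanity check can confirm the output roughly matches the table above. The snippet below is a hypothetical helper, assuming the default `data/dataset_processed` output directory and WAV files inside it.

```python
# Hypothetical sanity check: count generated WAV files and their total duration.
from pathlib import Path
import soundfile as sf

out_dir = Path("data/dataset_processed")  # assumed default --output-dir
wavs = sorted(out_dir.rglob("*.wav"))
total_s = sum(sf.info(str(p)).duration for p in wavs)
print(f"{len(wavs)} files, {total_s / 3600:.2f} h total")
```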
If you find AIRCADE useful in your research, please cite:
```bibtex
@misc{chiodi2023aircade,
  title={AIRCADE: an Anechoic and IR Convolution-based Auralization Data-compilation Ensemble},
  author={Túlio Chiodi and Arthur dos Santos and Pedro Martins and Bruno Masiero},
  year={2023},
  eprint={2304.09318},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}
```
This work was partially supported by the São Paulo Research Foundation (FAPESP), grants #2017/08120-6, #2019/22795-1 and #2022/16168-7.