AIRCADE: an Anechoic and IR Convolution-based Auralization Data-compilation Ensemble
AIRCADE is a data-compilation ensemble, primarily intended to serve as a resource for researchers in the field of dereverberation, particularly for data-driven approaches. It comprises speech and song samples, together with acoustic guitar sounds, with original annotations pertinent to emotion recognition and Music Information Retrieval (MIR). It also includes a selection of Impulse Response (IR) samples with varying Reverberation Time (RT) values, providing a wide range of conditions for evaluation. The data-compilation can be used together with the provided Python scripts to generate auralized data ensembles in four sizes: tiny, small, medium, and large. The provided metadata annotations also allow further analysis of the performance of dereverberation algorithms under different conditions. All data is licensed under the Creative Commons Attribution 4.0 International License.
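For intuition, auralization here means convolving a dry (anechoic) sample with a room IR. The sketch below is a minimal illustration of that operation, not the repository's `dataset_generator.py` pipeline; the file names are placeholders following the renaming scheme described below, and mono WAV files at a shared sample rate are assumed.

```python
# Minimal sketch of convolution-based auralization (illustrative only).
# Assumes mono WAV files at the same sample rate; file names are placeholders.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

dry, fs = sf.read("speech_0000.wav")     # anechoic (dry) signal
ir, fs_ir = sf.read("rir_0000.wav")      # room impulse response
assert fs == fs_ir, "signal and IR must share a sample rate"

wet = fftconvolve(dry, ir, mode="full")  # auralized (reverberant) signal
wet /= np.max(np.abs(wet))               # normalize to avoid clipping
sf.write("speech_0000_auralized.wav", wet, fs)
```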
The data-compilation is hosted on Zenodo, with an approximate total file size of 1.3 GB. For simplicity, all samples were renamed, e.g., guitar_0000, rir_0000, song_0000, speech_0000, and so on. The ensemble is available in different sizes, from a tiny version with limited data to a large version with nearly 290,000 auralized samples, so users can choose the version best suited to their research needs. The following table details the number of song, speech, guitar, IR, and auralized samples in each version, together with the total duration and required file size.
| Version | Tiny | Small | Medium | Large |
|---|---|---|---|---|
| Song samples | 100 | 500 | 1,012 | 1,012 |
| Speech samples | 100 | 500 | 1,012 | 1,440 |
| Guitar samples | 100 | 500 | 1,012 | 2,004 |
| IR samples | 5 | 9 | 33 | 65 |
| Auralized samples | 1,500 | 13,500 | 100,188 | 289,640 |
| Total duration | 3.20 h | 30.41 h | 221.77 h | 658.08 h |
| Total file size (required) | 1.1 GB | 10.5 GB | 76.6 GB | 227.5 GB |
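The IR samples span varying RT values, as noted above. As an illustration (this is not part of the repository's scripts), a common way to estimate RT60 from an IR is Schroeder backward integration with a T20 fit; the file name below is a placeholder, and a mono IR is assumed.

```python
# Sketch of RT60 estimation via Schroeder backward integration (T20 fit:
# linear regression between -5 dB and -25 dB, extrapolated to -60 dB).
import numpy as np
import soundfile as sf

ir, fs = sf.read("rir_0000.wav")               # assumes a mono IR
energy = ir.astype(np.float64) ** 2
edc = np.cumsum(energy[::-1])[::-1]            # Schroeder energy decay curve
edc_db = 10 * np.log10(np.maximum(edc / edc[0], 1e-12))  # guard against log(0)

t = np.arange(len(ir)) / fs
mask = (edc_db <= -5) & (edc_db >= -25)        # T20 fitting region
slope, intercept = np.polyfit(t[mask], edc_db[mask], 1)
rt60 = -60 / slope                             # extrapolate decay to -60 dB
print(f"Estimated RT60: {rt60:.2f} s")
```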
For more information, please refer to our data paper on arXiv.
Create the conda environment by running the following command:
```bash
conda create --name AIRCADE python=3.9
```
Activate the new environment by running the following command:
```bash
conda activate AIRCADE
```
Clone this repository by running the following command:
```bash
git clone git@github.com:TulioChiodi/AIRCADE.git
```
Note: use the HTTPS clone URL if your SSH key is not configured.
cd into the project by running the following command:
```bash
cd AIRCADE
```
Install the requirements from the requirements.txt file by running the following command:
```bash
pip install -r requirements.txt
```
Prepare the base dataset by running the following command in the terminal:
```bash
python src/prepare_base_dataset.py
```
Generate an auralized ensemble by running the script using the following structure:
```bash
python src/dataset_generator.py [-i <input-dir>] [-o <output-dir>] [-rc <rir-configs-path>] [-ac <anec-configs-path>] [-ps <preset-size>] [-p <processes>] [-s]
```
- `-i`, `--input-dir`: Directory path for the base dataset. Default is `data/dataset_base`.
- `-o`, `--output-dir`: Directory path for the processed dataset. Default is `data/dataset_processed`.
- `-rc`, `--rir-configs-path`: Path to the JSON file containing the RIR configurations. Default is `configs/rir_configs.json`.
- `-ac`, `--anec-configs-path`: Path to the JSON file containing the anechoic configurations. Default is `configs/anec_configs.json`.
- `-ps`, `--preset-size`: Size of the dataset. Choose between tiny, small, medium, and large. Default is `tiny`.
- `-p`, `--processes`: Number of parallel processes to use. Default is the number of CPUs available on the system.
- `-s`, `--sequential`: Use sequential processing. Default is multiprocessing.

Example:
```bash
python src/dataset_generator.py -i /path/to/input/directory -o /path/to/output/directory -ps small -p 4
```
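After generation, a quick sanity check can confirm the output roughly matches the table above. The snippet below is a hypothetical helper, assuming the default `data/dataset_processed` output directory and WAV files inside it.

```python
# Hypothetical sanity check: count generated WAV files and their total duration.
from pathlib import Path
import soundfile as sf

out_dir = Path("data/dataset_processed")  # assumed default --output-dir
wavs = sorted(out_dir.rglob("*.wav"))
total_s = sum(sf.info(str(p)).duration for p in wavs)
print(f"{len(wavs)} files, {total_s / 3600:.2f} h total")
```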
If you find AIRCADE useful in your research, please cite:
```bibtex
@misc{chiodi2023aircade,
  title={AIRCADE: an Anechoic and IR Convolution-based Auralization Data-compilation Ensemble},
  author={Túlio Chiodi and Arthur dos Santos and Pedro Martins and Bruno Masiero},
  year={2023},
  eprint={2304.09318},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}
```
This work was partially supported by the São Paulo Research Foundation (FAPESP), grants #2017/08120-6, #2019/22795-1 and #2022/16168-7.