Adding loader for Ballroom dataset #613

Merged
Changes from 24 commits
7 changes: 5 additions & 2 deletions docs/.readthedocs.yml
@@ -4,6 +4,10 @@
 
 # Required
 version: 2
+build:
+  os: "ubuntu-22.04"
+  tools:
+    python: "3.9"
 
 # Build documentation in the docs/ directory with Sphinx
 sphinx:
@@ -18,6 +22,5 @@ formats: all
 
 # Optionally set the version of Python and requirements required to build your docs
 python:
-  version: 3.7
   install:
-    - requirements: docs/requirements.txt
+    - requirements: docs/requirements.txt
10 changes: 5 additions & 5 deletions docs/conf.py
@@ -66,11 +66,11 @@
 
 # To shorten links of licenses and add to table
 extlinks = {
-    "acousticbrainz": ("https://zenodo.org/record/2554044#.X_ivJ-n7RUI%s", "Custom"),
-    "cante": ("https://zenodo.org/record/1324183#.X_nq7-n7RUI%s", "Custom"),
-    "ikala": ("http://mac.citi.sinica.edu.tw/ikala/%s", "Custom"),
-    "rwc": ("https://staff.aist.go.jp/m.goto/RWC-MDB/%s", "Custom"),
-    "tonas": ("https://www.upf.edu/web/mtg/tonas/%s", "Custom"),
+    "acousticbrainz": ("https://zenodo.org/record/2554044#.X_ivJ-n7RUI%s", "Custom%s"),
+    "cante": ("https://zenodo.org/record/1324183#.X_nq7-n7RUI%s", "Custom%s"),
+    "ikala": ("http://mac.citi.sinica.edu.tw/ikala/%s", "Custom%s"),
+    "rwc": ("https://staff.aist.go.jp/m.goto/RWC-MDB/%s", "Custom%s"),
+    "tonas": ("https://www.upf.edu/web/mtg/tonas/%s", "Custom%s"),
 }
 
 
2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -4,7 +4,7 @@ Deprecated>=1.2.13
 jams>=0.3.4
 librosa>=0.10.1
 numpy>=1.21.6
-sphinx==4.2.0
+sphinx>=5.2.0
 sphinx-togglebutton>=0.3.2
 sphinx-rtd-theme>=1.3.0
 tqdm>=4.66.1
7 changes: 7 additions & 0 deletions docs/source/mirdata.rst
@@ -27,6 +27,13 @@ baf
 .. automodule:: mirdata.datasets.baf
     :members:
     :inherited-members:
+
+ballroom
+^^^^^^^^
+
+.. automodule:: mirdata.datasets.ballroom
+    :members:
+    :inherited-members:
 
 
 beatles
9 changes: 9 additions & 0 deletions docs/source/table.rst
@@ -25,6 +25,15 @@
      - - .. image:: https://img.shields.io/badge/license-custom-orange
             :target: https://zenodo.org/record/6868083
 
+   * - Ballroom
+     - - audio: ✅
+       - annotations: ❌
+     - - :ref:`beats`
+       - :ref:`tempo`
+     - 698
+     - .. image:: https://licensebuttons.net/l/zero/1.0/80x15.png
+          :target: http://creativecommons.org/publicdomain/zero/1.0/
+
    * - Beatles
      - - audio: ❌
        - annotations: ✅
237 changes: 237 additions & 0 deletions mirdata/datasets/ballroom.py
@@ -0,0 +1,237 @@
"""Ballroom Rhythm Dataset Loader

.. admonition:: Dataset Info
    :class: dropdown

    The Ballroom Rhythm Dataset is a comprehensive collection of rhythm annotations for ballroom dance music. This dataset is designed for tasks such as beat tracking, rhythm analysis, and tempo estimation in ballroom dance music. It includes annotations for beats and bars corresponding to different dance styles within the ballroom genre.

    **Dataset Overview:**

    The dataset offers beat and bar annotations for various ballroom dance styles, such as Waltz, Tango, Viennese Waltz, Slow Foxtrot, Quickstep, Samba, Cha-Cha-Cha, Rumba, Paso Doble, and Jive. These annotations are provided in a format that includes beat time in seconds and beat ID, facilitating precise rhythm analysis.

    **Beat and Bar Annotations:**

    The beat annotations are structured as `.beats` files, where each line represents a beat with its timestamp and beat ID. For example, a line `9.430022675 3` indicates that the third beat of a bar is located at 9.43 seconds. This format is particularly useful for identifying downbeats, as they correspond to beats with ID = 1.

    **Annotation Methodology:**

    The dataset's annotations are based on the tempo guidelines of each ballroom dance style. Initial annotations were generated using a beat tracker, and then manually adjusted for accuracy. This method ensures that the annotations reflect the characteristic rhythms of each dance style.

    **Applications:**

    The Ballroom Rhythm Dataset is ideal for developing and testing algorithms for beat tracking, tempo estimation, and rhythm analysis in ballroom dance music. It can also be used for educational purposes, offering insights into the rhythmic structures of various ballroom dance styles.

    **Acknowledgments and References:**

    This dataset was created with the collaboration of experts in ballroom dance music. We extend our gratitude to those who contributed their knowledge and expertise to this project. For detailed information on the dataset and its creation, please refer to the associated research papers and documentation.

    [1] Gouyon F., A. Klapuri, S. Dixon, M. Alonso, G. Tzanetakis, C. Uhle, and P. Cano. An experimental comparison of audio tempo induction algorithms. Transactions on Audio, Speech and Language Processing 14(5), pp.1832-1844, 2006.

    [2] Böck, S., and M. Schedl. Enhanced beat tracking with context-aware neural networks. In Proceedings of the International Conference on Digital Audio Effects (DAFX), 2010.

    [3] Dixon, S., F. Gouyon & G. Widmer. Towards Characterisation of Music via Rhythmic Patterns. In Proceedings of the 5th International Society for Music Information Retrieval Conference (ISMIR). 2004.
"""

import os
import csv
import logging
import librosa
import numpy as np
from typing import BinaryIO, Optional, TextIO, Tuple

from mirdata import annotations, core, download_utils, io, jams_utils


BIBTEX = """
@ARTICLE{1678001,
    author={Gouyon, F. and Klapuri, A. and Dixon, S. and Alonso, M. and Tzanetakis, G. and Uhle, C. and Cano, P.},
    journal={IEEE Transactions on Audio, Speech, and Language Processing},
    title={An experimental comparison of audio tempo induction algorithms},
    year={2006},
    volume={14},
    number={5},
    pages={1832-1844},
    doi={10.1109/TSA.2005.858509}}
"""

INDEXES = {
    "default": "1.0",
    "test": "1.0",
    "1.0": core.Index(filename="ballroom_full_index_1.0.json"),
}

REMOTES = {
    "audio": download_utils.RemoteFileMetadata(
        filename="data1.tar.gz",
        url="https://mtg.upf.edu/ismir2004/contest/tempoContest/data1.tar.gz",
        checksum="2872a3e52070bc342a4510a95e2fa0b8",
        destination_dir="B_1.0/audio",
        unpack_directories=["BallroomData"],
    )
}

LICENSE_INFO = (
    "Creative Commons Attribution Non Commercial Share Alike 4.0 International."
)

DOWNLOAD_INFO = """
    Unfortunately most of the Ballroom dataset is not available for download.
    If you have the Ballroom dataset, place the contents into a folder called
    ballroom with the following structure:
        > B_1.0/
            > audio/
            > annotations/beats
            > annotations/tempo
    and copy the ballroom folder to {}
"""


class Track(core.Track):
    """Ballroom Rhythm class

    Args:
        track_id (str): track id of the track
        data_home (str): Local path where the dataset is stored. default=None
            If `None`, looks for the data in the default directory, `~/mir_datasets`

    Attributes:
        audio_path (str): path to audio file
        beats_path (str): path to beats file
        tempo_path (str): path to tempo file

    """

    def __init__(
        self,
        track_id,
        data_home,
        dataset_name,
        index,
        metadata,
    ):
        super().__init__(
            track_id,
            data_home,
            dataset_name,
            index,
            metadata,
        )

        # Audio path
        self.audio_path = self.get_path("audio")

        # Annotations paths
        self.beats_path = self.get_path("beats")
        self.tempo_path = self.get_path("tempo")

    @core.cached_property
    def beats(self) -> Optional[annotations.BeatData]:
        return load_beats(self.beats_path)

    @core.cached_property
    def tempo(self) -> Optional[float]:
        return load_tempo(self.tempo_path)

    @property
    def audio(self) -> Optional[Tuple[np.ndarray, float]]:
        """The track's audio

        Returns:
            * np.ndarray - audio signal
            * float - sample rate

        """
        return load_audio(self.audio_path)

    def to_jams(self):
        """Get the track's data in jams format

        Returns:
            jams.JAMS: the track's data in jams format

        """
        return jams_utils.jams_converter(
            audio_path=self.audio_path,
            beat_data=[(self.beats, "beats")],
            tempo_data=[(self.tempo, "tempo")],
            metadata=None,
        )


def load_audio(audio_path):
    """Load an audio file.

    Args:
        audio_path (str): path to audio file

    Returns:
        * np.ndarray - the audio signal (loaded with mono=False, so channels are not downmixed)
        * float - The sample rate of the audio file

    """
    if audio_path is None:
        return None
    return librosa.load(audio_path, sr=44100, mono=False)


@io.coerce_to_string_io
def load_beats(fhandle: TextIO):
    """Load beats

    Args:
        fhandle (str or file-like): Local path where the beats annotation is stored.

    Returns:
        BeatData: beat annotations

    """
    beat_times = []
    beat_positions = []

    reader = csv.reader(fhandle, delimiter=" ")
    for line in reader:
        beat_times.append(float(line[0]))
        beat_positions.append(int(line[1]))

    if not beat_times or beat_times[0] == -1.0:
        return None

    return annotations.BeatData(
        np.array(beat_times), "s", np.array(beat_positions), "bar_index"
    )


@io.coerce_to_string_io
def load_tempo(fhandle: TextIO) -> float:
    """Load tempo

    Args:
        fhandle (str or file-like): Local path where the tempo annotation is stored.

    Returns:
        float: tempo annotation

    """
    reader = csv.reader(fhandle, delimiter=",")
    return float(next(reader)[0])


@core.docstring_inherit(core.Dataset)
class Dataset(core.Dataset):
    """
    The ballroom dataset

    """

    def __init__(self, data_home=None, version="default"):
        super().__init__(
            data_home,
            version,
            name="ballroom",
            track_class=Track,
            bibtex=BIBTEX,
            indexes=INDEXES,
            remotes=REMOTES,
            license_info=LICENSE_INFO,
            download_info=DOWNLOAD_INFO,
        )
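
For reference, the `.beats` format described in the module docstring (one `timestamp beat_id` pair per line, with downbeats carrying ID 1) can be illustrated with a small standalone sketch. This is not part of the PR and does not use mirdata; the helper names `parse_beats` and `downbeat_times` are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple


def parse_beats(text: str) -> Tuple[List[float], List[int]]:
    """Parse `.beats`-style text into beat times (seconds) and beat IDs.

    Hypothetical helper for illustration; not a mirdata function.
    """
    times: List[float] = []
    positions: List[int] = []
    for line in text.splitlines():
        fields = line.split()  # tolerates repeated spaces or tabs
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        times.append(float(fields[0]))
        positions.append(int(fields[1]))
    return times, positions


def downbeat_times(times: List[float], positions: List[int]) -> List[float]:
    """Return the timestamps of beats whose ID is 1 (the downbeats)."""
    return [t for t, p in zip(times, positions) if p == 1]


if __name__ == "__main__":
    example = "9.430022675 3\n9.93 4\n10.43 1\n10.93 2\n"
    beat_times, beat_ids = parse_beats(example)
    print(downbeat_times(beat_times, beat_ids))
```

Unlike the `csv.reader(fhandle, delimiter=" ")` call in `load_beats`, this sketch splits on any run of whitespace, which may matter if an annotation file uses tabs or double spaces; whether any file in the dataset does so is not confirmed here.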