Add spoken digit dataset and baseline model #1090
Comments
The two datasets sound great :) and the example you mention would make for a nice tutorial, e.g. pytorch/tutorials#1204, thoughts?
@vincentqb I expect a lot of overlap with a possible baseline for the SpeechCommands dataset. Is there a simple and lightweight ASR model that we can use for a tutorial?
oops, I updated the link in my previous message to point to the new audio classification tutorial. Is that what you meant?
There is now a PyTorch loader for FSDD at https://github.com/eonu/torch-fsdd
There was also a pull request to add AudioMNIST, but it was closed: #84
🚀 Feature
Given the lack of small yet comprehensive audio tasks, I propose adding a spoken-digit ("speech MNIST") dataset to torchaudio.
Motivation
In the audio domain we often lack small toy scenarios that would be a good equivalent to the ubiquitous MNIST task.
A spoken digit dataset and model would make it easy to sketch and try out audio ML ideas.
Pitch
Add either of the following to torchaudio.datasets (a sketch of a possible loader follows below):
- the Free Spoken Digit Dataset (FSDD)
- AudioMNIST
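A minimal sketch of what such a dataset class could look like, following the (waveform, sample_rate, label, ...) tuple convention of the existing torchaudio.datasets. The class name SpokenDigits and its constructor are hypothetical, not an existing API; the `{digit}_{speaker}_{index}.wav` naming convention is FSDD's.

```python
from pathlib import Path

import torchaudio
from torch.utils.data import Dataset


class SpokenDigits(Dataset):
    # Hypothetical loader over a local directory of FSDD-style recordings.
    def __init__(self, root: str):
        # e.g. root = ".../free-spoken-digit-dataset/recordings"
        self._walker = sorted(Path(root).glob("*.wav"))

    def __len__(self) -> int:
        return len(self._walker)

    def __getitem__(self, n: int):
        path = self._walker[n]
        # FSDD files are named `{digit}_{speaker}_{index}.wav`.
        digit, speaker, _ = path.stem.split("_")
        waveform, sample_rate = torchaudio.load(str(path))
        return waveform, sample_rate, int(digit), speaker
```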
Additional context
Furthermore, it might be a good idea to also add a baseline model, either based on MelSpectrogram -> Conv2d or using the existing wav2letter model; a sketch of the former follows below.
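To make the MelSpectrogram -> Conv2d idea concrete, here is a minimal sketch. The class name, the layer sizes, n_mels=40, and the 8 kHz sample rate (FSDD's rate) are illustrative assumptions, not a proposed final architecture.

```python
import torch
import torchaudio


class MelConvBaseline(torch.nn.Module):
    # Illustrative baseline: a MelSpectrogram front end feeding a small
    # Conv2d stack; all sizes are placeholders.
    def __init__(self, n_classes: int = 10, sample_rate: int = 8000):
        super().__init__()
        self.frontend = torchaudio.transforms.MelSpectrogram(
            sample_rate=sample_rate, n_mels=40
        )
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(1, 16, kernel_size=3, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(2),
            torch.nn.Conv2d(16, 32, kernel_size=3, padding=1),
            torch.nn.ReLU(),
            # Pool away the time/frequency axes so variable-length
            # clips map to a fixed-size feature vector.
            torch.nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = torch.nn.Linear(32, n_classes)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, 1, time) -> mel: (batch, 1, n_mels, frames)
        mel = self.frontend(waveform)
        features = self.conv(mel).flatten(1)
        return self.classifier(features)


# Usage: ten digit classes on one second of 8 kHz audio.
logits = MelConvBaseline()(torch.randn(8, 1, 8000))  # -> (8, 10)
```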