Add spoken digit dataset and baseline model #1090

Open
faroit opened this issue Dec 15, 2020 · 5 comments

@faroit
Contributor

faroit commented Dec 15, 2020

🚀 Feature

Given the lack of small, self-contained audio tasks, I propose adding a spoken digit ("speech MNIST") dataset to torchaudio.

Motivation

In the audio domain, we often lack small toy scenarios that would be a good equivalent to the ubiquitous MNIST task.
A spoken digit dataset and model would make it easier to sketch and try out audio ML ideas.

Pitch

add either of the two:

to torchaudio.datasets

Additional context

Furthermore, it might be a good idea to also add a baseline model, either based on MelSpectrogram -> Conv2d or using the existing wav2letter.

@vincentqb
Contributor

vincentqb commented Dec 17, 2020

The two datasets sound great :) and the example you mention would make for a nice tutorial, e.g. pytorch/tutorials#1204, thoughts?

@faroit
Contributor Author

faroit commented Dec 17, 2020

@vincentqb I expect a lot of overlap with a possible baseline for the Speech Commands dataset. Is there a simple and lightweight ASR model that we can use for a tutorial?

@vincentqb
Contributor

Oops, I updated the link in my previous message to point to our new audio classification tutorial. Is that what you meant?

@jonnor

jonnor commented May 23, 2021

There is now a PyTorch loader for FSDD at https://github.com/eonu/torch-fsdd

@jonnor

jonnor commented May 23, 2021

There was also a pull request to add AudioMNIST (#84), but it was closed.


3 participants