Changed GTZAN so that it only traverses filenames belonging to the dataset #791
* Added the GTZAN class in torchaudio.datasets using the same format as the rest of the datasets.
* Added the appropriate test function in test_datasets.py.
* Added the GTZAN class in the datasets.rst documentation file.
* Added dummy noise .wav in `test/assets/`
* Removed transforms of input and output from the dataset `__init__` function, as well as the corresponding methods.
* Replaced redundant `filtered` and `subset` methods from class initialization and also changed the corresponding assertion message.
…taset Now, instead of walking the whole directory and subdirectories of the dataset, GTZAN only looks for files in the `genre`/`genre`.`5 digit number`.wav format, where `genre` is an allowed GTZAN genre label. This allows moving or removing files from the dataset (e.g. for fixing duplication or mislabeling issues).
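The file selection described above could look roughly like the following. This is a minimal sketch, not the code from this pull request: the helper name `list_gtzan_files`, the constant `GTZAN_GENRES`, and the use of a regular expression are assumptions made for illustration.

```python
import os
import re

# The ten genre labels of the GTZAN dataset.
GTZAN_GENRES = (
    "blues", "classical", "country", "disco", "hiphop",
    "jazz", "metal", "pop", "reggae", "rock",
)

# Matches `genre.NNNNN.wav`, e.g. `blues.00042.wav` (exactly five ASCII digits).
_FILENAME = re.compile(r"(?P<genre>[a-z]+)\.(?P<num>[0-9]{5})\.wav")


def list_gtzan_files(root):
    """Hypothetical helper: return relative paths of the form genre/genre.NNNNN.wav."""
    found = []
    for genre in GTZAN_GENRES:
        genre_dir = os.path.join(root, genre)
        if not os.path.isdir(genre_dir):
            continue  # a genre folder may have been moved or removed
        # Sort so the result does not depend on file system ordering.
        for fname in sorted(os.listdir(genre_dir)):
            match = _FILENAME.fullmatch(fname)
            # The genre in the file name must agree with the folder it sits in.
            if match and match.group("genre") == genre:
                found.append(os.path.join(genre, fname))
    return found
```

Because only the known genre folders are listed, unrelated files and folders under the dataset root are never visited, and deleting a mislabeled or duplicated track simply shortens the list.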
Codecov Report
@@ Coverage Diff @@
## master #791 +/- ##
==========================================
+ Coverage 89.66% 89.71% +0.04%
==========================================
Files 34 34
Lines 2652 2664 +12
==========================================
+ Hits 2378 2390 +12
Misses 274 274
Looks good.
Thanks!
Thanks for the pull request :) Before updating other datasets, I'd want us to make these abstractions more general so that the particular implementation of each dataset remains very simple to reproduce. Indeed, one of the strengths of the dataset implementations that we currently have is how simple they are to replicate and extend to other cases.
Thanks for accepting it :) Are you referring to utility functions such as to address #794 or something like an AudioDataset parent class (or a mixture of both)? |
I meant for #794. I see advantages to staying close to the standard PyTorch dataset. :)
The simplest solution is to make |
Keeping the implementation simple is good, but making the code work correctly is important here.
Running simple grep, all the function calls to
Yes, if we could sort as
Yes, but one does not prevent the other :)
The reason |
My point here is that, together with the previous comment, this sounds like you prefer to keep the current (wrong) implementation for the sake of a simple implementation. I do not think that's your intention, but I would like to stress that the most important thing is that torchaudio provides correct and easy-to-use dataset implementations. The simplicity of the implementation should come after that.
In that case, neither |
Given that the code is open source, we should aim for "easy to use" to also mean "easy to modify". In any case, this is not about correctness versus simplicity. It's about raising the bar and having both. :)
The idea of using a generator was to investigate what we could do in the event that the list is streamed (say as for I'm sure you have seen this, but there's a fun discussion here that would be relevant :) Apparently, one can simply sort the items returned by There have been some changes recently in torchtext, though they were also using
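To illustrate how sorting and a generator can coexist, here is a minimal sketch. It assumes the traversal under discussion is Python's `os.walk`, which this capture of the comment does not name; the helper `walk_sorted` and its `suffix` parameter are hypothetical.

```python
import os


def walk_sorted(root, suffix=".wav"):
    """Hypothetical helper: lazily yield matching file paths in a deterministic order."""
    for dirpath, dirnames, filenames in os.walk(root):
        # In top-down mode, sorting `dirnames` in place fixes the order
        # in which os.walk descends into subdirectories.
        dirnames.sort()
        for fname in sorted(filenames):
            if fname.endswith(suffix):
                yield os.path.join(dirpath, fname)
```

Only one directory's entries are sorted at a time, so the paths are still produced lazily while the overall order no longer depends on the file system.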
Hi, I noticed that |
Thanks for helping! :) Just to make sure we are on the same page, there are two issues.
Both issues should be addressed for all datasets, likely by modifying the common function |
I started working on this in #814, so you can relax. Thanks.
After recommendation by @mthrok in #764
Now, instead of walking the whole directory of the dataset path, GTZAN only looks for files under a `genre`/`genre`.`5 digit number`.wav format, where `genre` is an allowed GTZAN genre label. This allows moving or removing files from the dataset (e.g. for fixing duplication or mislabeling issues) while not listing irrelevant files.