Add a generic test for the datasets #1015

Merged (2 commits) on Jun 15, 2019

Collaborator

@pmeier pmeier commented Jun 13, 2019

This adds a generic test, which should be mandatory for all datasets. It tests

  1. if the dataset has the correct length,
  2. if the dataset returns a PIL.Image and int when indexed, and
  3. if the first example of the dataset correctly belongs to a given class.
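
The three checks above could look roughly like the following. This is a minimal sketch and not the code from this PR: `FakeImage` and `FakeClassificationDataset` are hypothetical stand-ins (a real test would check against `PIL.Image.Image` rather than `FakeImage`), the `image_type` parameter is an addition for illustration, and the assertion bodies are assumed.

```python
import unittest


class FakeImage:
    """Hypothetical stand-in for PIL.Image.Image, to keep the sketch dependency-free."""


class FakeClassificationDataset:
    """Hypothetical dataset: every sample is a (FakeImage, 0) pair of class 'fakedata'."""

    def __init__(self, num_images=1):
        self.class_to_idx = {"fakedata": 0}
        self._samples = [(FakeImage(), 0) for _ in range(num_images)]

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, index):
        return self._samples[index]


def generic_dataset_test(tester, dataset, num_images=1, cls="fakedata",
                         image_type=FakeImage):
    # 1. The dataset has the expected length.
    tester.assertEqual(len(dataset), num_images)
    # 2. Indexing returns an (image, int) pair.
    img, target = dataset[0]
    tester.assertIsInstance(img, image_type)
    tester.assertIsInstance(target, int)
    # 3. The first example belongs to the given class.
    tester.assertEqual(dataset.class_to_idx[cls], target)


class Tester(unittest.TestCase):
    def test_fake_dataset(self):
        dataset = FakeClassificationDataset(num_images=3)
        generic_dataset_test(self, dataset, num_images=3)
```

A dataset whose length does not match `num_images` fails at the first assertion, so the later checks never index into a dataset of unexpected size.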

To make this generic test applicable to the *MNIST datasets, I removed the randomness in the label generation.
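
The reason randomness gets in the way: check 3 compares the label of the first example with a fixed class, which only works if that label is predictable. One way to get there, sketched below, is to derive the labels from the sample index instead of drawing them at random. The helper name `make_labels` and the round-robin scheme are assumptions for illustration; this thread does not show how the fake labels are actually generated.

```python
def make_labels(num_images, num_classes):
    """Return one label per image, cycling through the classes in order.

    Hypothetical helper: the first label is always 0, so a test can rely on it.
    """
    return [index % num_classes for index in range(num_images)]


labels = make_labels(num_images=5, num_classes=3)
print(labels)  # [0, 1, 2, 0, 1]
```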

Member

@fmassa fmassa left a comment

This looks good, thanks a lot!

I have a few comments, in particular around the fact that generic_dataset is not generic: many of our datasets do not return (Image, int), so I'd rather not give the impression that they should. Renaming the function (and maybe moving it into Tester) would be better I think.

@@ -9,6 +9,14 @@
from fakedata_generation import mnist_root, cifar_root, imagenet_root


def generic_dataset_test(tester, dataset, num_images=1, cls='fakedata'):
Member

Can you maybe rename this function to basic_classification_dataset_test or something that includes classification in it? Not all datasets have the class_to_idx attribute, so I'd rather not (at least for now) force people to add it just to make their dataset compliant with this test. Maybe in the future we could have a ClassificationDataset that has this attribute, and then we could start enforcing its presence.

Also, any particular reason why this is not a method in Tester? If it doesn't start with test, it won't be run by the Tester, so we can have helper functions inside Tester.
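
For reference, this follows from how `unittest` collects tests: by default the loader only picks up methods whose names start with `test`. A minimal sketch, in which the body of `Tester` and the helper's checks are made up for illustration:

```python
import unittest


class Tester(unittest.TestCase):
    def basic_classification_dataset_test(self, dataset, num_images=1):
        # Helper: not collected on its own, because the name lacks the "test" prefix.
        self.assertEqual(len(dataset), num_images)

    def test_fake_dataset(self):
        # Collected and run; it delegates the shared checks to the helper.
        self.basic_classification_dataset_test(["sample"], num_images=1)


names = unittest.TestLoader().getTestCaseNames(Tester)
print(list(names))  # ['test_fake_dataset']
```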

Collaborator Author

@pmeier pmeier Jun 15, 2019

> [...] many of our datasets do not return (Image, int), so I'd rather not give the impression that they should.

Shame on me, I keep forgetting that.

> Also, any particular reason why this is not a method in Tester? If it doesn't start with test, it won't be run by the Tester [...]

I did not know that. Thanks for the clarification.

- renamed generic*() to generic_classification*()
- moved function inside Tester
- test class_to_idx attribute outside of generic_classification*()

codecov-io commented Jun 15, 2019

Codecov Report

Merging #1015 into master will increase coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1015      +/-   ##
==========================================
+ Coverage   63.33%   63.37%   +0.04%     
==========================================
  Files          65       65              
  Lines        5149     5152       +3     
  Branches      772      772              
==========================================
+ Hits         3261     3265       +4     
+ Misses       1665     1664       -1     
  Partials      223      223

| Impacted Files | Coverage Δ |
|---|---|
| torchvision/transforms/functional.py | 71.21% <0%> (+0.08%) ⬆️ |
| torchvision/models/detection/roi_heads.py | 56.89% <0%> (+0.24%) ⬆️ |
| torchvision/datasets/mnist.py | 51.18% <0%> (+0.47%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7693c89...20e6922.

Member

@fmassa fmassa left a comment

Thanks a lot!

@fmassa fmassa merged commit 3c81d47 into pytorch:master Jun 15, 2019
@pmeier pmeier deleted the refactor_dataset_test branch June 17, 2019 09:36