Adding fvgc_aircraft dataset #5178

yiwen-song · 2022-01-09T01:20:57Z

Per #5108

cc @pmeier

facebook-github-bot · 2022-01-09T01:21:04Z

💊 CI failures summary and remediations

As of commit caf762b (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

pmeier

Hey @sallysyw, thanks a lot for the PR. I've one question before I actually dive into the PR: why do we want to restructure the data into an image folder hierarchy? IIUC, we could also work with the given format, which I would prefer.

NicolasHug

Thanks a lot for the PR Yiwen!

This looks great, I made a few minor comments below. As Philip noted above, we tend not to change a dataset's underlying structure, partly so that users can also just manually download the files, should they want to. Was there a specific reason to reorganise the dataset structure?

Thanks!

torchvision/datasets/fvgc_aircraft.py

test/test_datasets.py

yiwen-song · 2022-01-10T17:37:41Z

Hey @sallysyw, thanks a lot for the PR. I've one question before I actually dive into the PR: why do we want to restructure the data into an image folder hierarchy? IIUC, we could also work with the given format, which I would prefer.

Good point! I did that for no specific reason; I just followed what VISSL did. But I agree that we'd better not copy these images to another folder after downloading, to avoid the additional overhead. Let me send a patch to address this issue. Thanks!
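For illustration, reading the dataset in its given format could look roughly like the sketch below. It assumes the extracted archive keeps images under `data/images/<image_id>.jpg` and that each annotation line has the form `<image_id> <label>`; the helper name `build_samples` and the sample lines are hypothetical and not taken from this PR.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def build_samples(root: str, annotation_lines: Iterable[str]) -> List[Tuple[Path, str]]:
    """Pair each image with its label while leaving the files where they were extracted."""
    # Assumed layout: <root>/data/images/<image_id>.jpg
    image_dir = Path(root) / "data" / "images"
    samples = []
    for line in annotation_lines:
        line = line.strip()
        if not line:
            continue
        # Assumed line format: "<image_id> <label>"; the label may contain spaces.
        image_id, label = line.split(" ", 1)
        samples.append((image_dir / f"{image_id}.jpg", label))
    return samples


# Made-up annotation lines, for demonstration only.
samples = build_samples("fgvc-aircraft-2013b", ["0000001 707-320", "0000002 Boeing 737"])
```

Because the samples only reference paths inside the original directory, nothing is copied after download, and users who fetched the archive manually can point `root` at it directly.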

@pmeier @NicolasHug

pmeier

Hey @sallysyw, I have some minor comments inline.

Apart from that I'm wondering if it makes sense to add a parameter to select the different "levels" of the dataset.

Model, e.g. Boeing 737-76J. Since certain models are nearly visually indistinguishable, this level is not used in the evaluation.

Variant, e.g. Boeing 737-700. A variant collapses all the models that are visually indistinguishable into one class. The dataset comprises 102 different variants.

Family, e.g. Boeing 737. The dataset comprises 70 different families.

Manufacturer, e.g. Boeing. The dataset comprises 41 different manufacturers.

We could have a level="variant" that also allows "family" and "manufacturer". From what I see, we would only need to change the files we read from based on this parameter.
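A minimal sketch of that idea, assuming the annotation files follow a `data/images_<level>_<split>.txt` naming pattern. The function name `annotation_file`, the constants, and the split names are illustrative assumptions, not the code in this PR.

```python
from pathlib import Path

# Assumed level and split names; adjust to whatever the extracted archive provides.
LEVELS = ("variant", "family", "manufacturer")
SPLITS = ("train", "val", "trainval", "test")


def annotation_file(root: str, split: str = "trainval", level: str = "variant") -> Path:
    """Return the annotation file to read for the requested split and level."""
    if level not in LEVELS:
        raise ValueError(f"Unknown level {level!r}; expected one of {LEVELS}")
    if split not in SPLITS:
        raise ValueError(f"Unknown split {split!r}; expected one of {SPLITS}")
    # Assumed naming pattern: <root>/data/images_<level>_<split>.txt
    return Path(root) / "data" / f"images_{level}_{split}.txt"
```

With this shape, the level only changes which file is opened; the parsing of the file and the loading of the images stay the same for every level.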

torchvision/datasets/fgvc_aircraft.py

test/test_datasets.py

NicolasHug · 2022-01-12T10:04:49Z

Apart from that I'm wondering if it makes sense to add a parameter to select the different "levels" of the dataset.
...
We could have a level="variant" that also allows "family" and "manufacturer". From what I see, we would only need to change the files we read from based on this parameter.

That's a great point. Ultimately our goal with this dataset is to support our colleagues who are implementing FLAVA, and the "reference" in this case is https://github.com/facebookresearch/vissl/blob/main/extra_scripts/datasets/create_fgvc_aircraft_data_files.py. From what I understand, they would only need to use the "variant" categories. Perhaps it would be OK to just implement the "variant" category in this PR to keep it simple, and to implement the rest if users request more categories?

yiwen-song · 2022-01-12T18:31:06Z

Thanks to both of you for the review. For the categories, I think for completeness I will provide a parameter called "level" or "annotation_level" and set the default to "variant". This way it covers the existing functionality in FLAVA while still giving users the flexibility to use the dataset's other annotation levels.

@pmeier @NicolasHug

pmeier

Small nits and one question inline. Otherwise LGTM. Thanks a lot @sallysyw!

torchvision/datasets/fgvc_aircraft.py

test/test_datasets.py

torchvision/datasets/fgvc_aircraft.py

Summary:
* add fvgc_aircraft dataset
* add docstring & remove useless import
* resolve lint issue
* address comments
* adding more annotation level
* nit
* address comments
* Apply suggestions from code review
* unify format
* remove useless line

Reviewed By: NicolasHug

Differential Revision: D33618172

fbshipit-source-id: ce6471096d8527b08373061e8ec2059a25f96f1d

Co-authored-by: Philip Meier <[email protected]>

add fvgc_aircraft dataset

8fc5913

pytorch-probot bot added the ciflow/default label Jan 9, 2022

facebook-github-bot added the cla signed label Jan 9, 2022

add docstring & remove useless import

3a05703

yiwen-song mentioned this pull request Jan 9, 2022

New classification datasets support for FLAVA #5108

Closed

14 tasks

yiwen-song requested review from pmeier and NicolasHug January 9, 2022 01:35

yiwen-song and others added 2 commits January 8, 2022 23:20

Merge branch 'main' into fvgc

8737e13

resolve lint issue

2045aee

pmeier reviewed Jan 10, 2022

NicolasHug reviewed Jan 10, 2022

address comments

fedb68a

pmeier reviewed Jan 12, 2022

yiwen-song added the "new feature" and "module: datasets" labels Jan 12, 2022

yiwen-song added 2 commits January 14, 2022 06:36

adding more annotation level

3e317e3

nit

dcaa2ec

pmeier approved these changes Jan 14, 2022

yiwen-song and others added 2 commits January 14, 2022 21:09

address comments

9dc5b6d

Merge branch 'main' into fvgc

53ba38f

pmeier approved these changes Jan 14, 2022

torchvision/datasets/fgvc_aircraft.py (outdated, resolved)

torchvision/datasets/fgvc_aircraft.py (outdated, resolved)

Apply suggestions from code review

2db1bf4

pmeier requested a review from NicolasHug January 14, 2022 21:23

yiwen-song added 3 commits January 14, 2022 21:41

unify format

3098399

merge

12b74a0

remove useless line

caf762b

yiwen-song merged commit adf8466 into pytorch:main Jan 14, 2022

yiwen-song deleted the fvgc branch January 14, 2022 22:59

pmeier mentioned this pull request Jun 27, 2022

Add FGVC-Aircraft Dataset #467

Closed
