Add widerface dataset #2883
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #2883      +/-   ##
==========================================
- Coverage   73.41%   72.57%   -0.85%
==========================================
  Files          99      100       +1
  Lines        8801     8947     +146
  Branches     1389     1419      +30
==========================================
+ Hits         6461     6493      +32
- Misses       1915     2033     +118
+ Partials      425      421       -4
Continue to review full report at Codecov.
@fmassa I need some clarification. Are you suggesting to remove the
Should the If
@fmassa and @pmeier I'm just picking this PR back up; it got lost during the holidays, I think. I've removed the dataset citation for now and removed the target_type option. Do you have any other suggestions, or is this about ready to merge?
If I understand your implementation correctly, the target is always returned as None for the test split. If that is the case, I would like to hear @fmassa's thoughts about this, since it would set a precedent: as of now, all our datasets always return a valid target.
Other than that, two minor nitpicks.
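For context, a minimal sketch of the pattern under discussion, i.e. returning a None target for the test split. The class and attribute names (FaceDatasetSketch, img_info, the record keys) are illustrative assumptions, not the implementation in this PR:

from typing import Any, List, Optional, Tuple

from PIL import Image
from torchvision.datasets.vision import VisionDataset


class FaceDatasetSketch(VisionDataset):
    """Illustrative only: shows split-dependent target handling."""

    def __init__(self, root: str, split: str = "train") -> None:
        super().__init__(root)
        self.split = split
        # Hypothetical per-image metadata; a real dataset would fill this by
        # parsing the annotation files on disk.
        self.img_info: List[dict] = []

    def __getitem__(self, index: int) -> Tuple[Any, Optional[Any]]:
        record = self.img_info[index]
        img = Image.open(record["img_path"]).convert("RGB")
        # The test split ships without ground-truth boxes, so return None.
        target = None if self.split == "test" else record["annotations"]
        if self.transform is not None:
            img = self.transform(img)
        return img, target

    def __len__(self) -> int:
        return len(self.img_info)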
Ground truth annotations for the WIDERFace test split are not released publicly (as noted in the description section of the main website). In this case, there are only a couple of options we can provide here.
A comment has been added to the docstring noting this. Actual evaluation on the test split requires a submission to the dataset authors. I prefer option 2, as it would encourage users to go for SOTA on the dataset.
Hi @jgbradley1, sorry for the delay. I think I would be ok letting the dataset return a None target for the test split. If we were to return either
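Assuming the resolution is for the test split to yield (image, None), a consumer-side sketch might look like the following. The WIDERFace constructor arguments and the collate function are assumptions for illustration, not something spelled out in this thread:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms


def collate_keep_none(batch):
    # The default collate function cannot stack None targets, so keep the
    # images and targets as plain lists instead.
    images = [sample[0] for sample in batch]
    targets = [sample[1] for sample in batch]  # all None on the test split
    return images, targets


dataset = datasets.WIDERFace(
    root="data/widerface",
    split="test",
    transform=transforms.ToTensor(),
    download=False,  # assumes the archives were already fetched and extracted
)
loader = DataLoader(dataset, batch_size=4, collate_fn=collate_keep_none)

for images, targets in loader:
    # targets is [None, None, ...]; predictions would be written out and
    # submitted to the WIDER FACE evaluation server for scoring.
    pass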
Looks like there is a problem with a couple of the conda CI builds (unrelated to my changes, as far as I can tell). I don't think it's anything I can resolve from my end.
Thanks a lot @jgbradley1!
Summary:
* initial commit of widerface dataset
* comment out old code
* improve parsing of annotation files
* code cleanup and fix docstring comments
* speed up check for quota exceeded
* cleanup print statements
* reformat code and remove print statements
* minor code cleanup and reformatting
* add more comments
* reuse variable
* reverse formatting changes
* fix flake8 errors
* add type annotations
* fix mypy errors
* add a base_folder to root directory
* some formatting fixes
* GDrive threshold does not throw 403 error
* testing new download logic
* cleanup logic for download and integrity check
* use a better variable name
* format fix
* reorder list in docstring
* initial widerface unit test - fails on MD5 check
* use list of dictionaries to store dataset
* fix docstring formatting
* remove unnecessary error checking
* fix type checker error
* revert typo fix
* rename var constants, use file context manager, verify str args
* fix flake8 error
* fix checking target_type argument values
* create uncompressed dataset folders
* cleanup unit tests for widerface
* use correct os function
* add more info to docstring
* disable unittests for windows
* fix _check_integrity logic
* update docstring
* remove citation
* remove target_type option
* fix formatting issue
* remove comment and add more info to docstring
* update type annotations
* restart CI jobs

Reviewed By: datumbox
Differential Revision: D25954560
fbshipit-source-id: 213da6919cda16e03d4a6be66eaa5eaa220cd8d0
Co-authored-by: Philip Meier <[email protected]>
Co-authored-by: Joshua Bradley <[email protected]>
Co-authored-by: Philip Meier <[email protected]>
Co-authored-by: vfdev <[email protected]>
This PR addresses issue #1627 and adds a dataset that is more representative of generic face detection, and a harder benchmark for it.
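As a rough illustration of what the addition enables (a hedged sketch: the constructor defaults and the structure of the target are assumptions about the merged API, not something stated in this description):

from torchvision import datasets

# Download and parse the training split; each sample pairs an image with its
# face annotations.
ds = datasets.WIDERFace(root="data/widerface", split="train", download=True)

img, target = ds[0]
# target is expected to be a dict of per-face annotation tensors (e.g. bounding
# boxes); inspect the keys to see what is available.
print(type(img), None if target is None else sorted(target.keys()))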
Main changes in the PR include:
- Improves the method used to check and download files from Google Drive when the download quota has been exceeded. See the Google Drive API docs for more details: an exceeded quota returns a 403 response. Checking for that response code before performing a string search saves a significant amount of time in the most common case (i.e. when the quota has not been exceeded); a sketch of this check follows below.

cc: @pmeier @vfdev-5
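A hedged sketch of that check, assuming a requests-based downloader; the helper names, URL, and quota-marker string below are illustrative and do not reproduce torchvision's actual download utilities:

import requests


def _quota_exceeded(response: requests.Response) -> bool:
    # An exceeded Google Drive quota answers with HTTP 403, so compare the
    # status code first; the cheap integer check short-circuits the string
    # search through the response body in the common, non-quota case.
    return response.status_code == 403 and "Quota exceeded" in response.text


def download_file_from_google_drive(file_id: str, destination: str) -> None:
    url = "https://docs.google.com/uc?export=download"
    with requests.get(url, params={"id": file_id}, stream=True) as response:
        if _quota_exceeded(response):
            raise RuntimeError(
                f"The Google Drive quota for file {file_id} is exceeded; "
                "try again later or download the file manually."
            )
        response.raise_for_status()
        with open(destination, "wb") as fh:
            for chunk in response.iter_content(chunk_size=32768):
                if chunk:
                    fh.write(chunk)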