-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve COCO parsing error messages #684
Conversation
21b7006
to
20703c5
Compare
20703c5
to
9238ca2
Compare
tests/test_coco_format.py
Outdated
anns = { | ||
"images": [ | ||
{ | ||
"id": 5, | ||
"width": 10, | ||
"height": 5, | ||
"file_name": "a.jpg", | ||
} | ||
], | ||
"annotations": [], | ||
"categories": [], | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these tests are very similar; I think it would be more readable to factor out the common parts into a single function that takes one parameter (a function that mangles the dataset) and returns the resulting exception. Then the individual tests could just look like this:
def mangle(anns):
del anns["images"][0][field]
exc = self._test_mangled_annotations(mangle)
self.assertIsInstance(capture.exception.__cause__, MissingFieldError)
self.assertEqual(capture.exception.__cause__.name, field)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree about the common part extraction, I'll check if it's possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You extracted a shared annotation template (which is certainly a big improvement), but there's still a lot of duplication in the structure of the tests. How about factoring that out too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you see that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like in my code snippet above. In full it would look something like this:
def _load_mangled_annotations(self, mangle):
with TestDir() as test_dir:
ann_path = osp.join(test_dir, "ann.json")
anns = mangle(deepcopy(self.ANNOTATION_JSON_TEMPLATE))
dump_json_file(ann_path, anns)
with self.assertRaises(ItemImportError) as capture:
Dataset.import_from(ann_path, "coco_instances")
return capture
def test_can_report_missing_item_field(self):
for field in ["id", "file_name"]:
with self.subTest(field=field):
def mangle(anns):
anns["images"][0].pop(field)
exc = self._load_mangled_annotations(mangle)
self.assertIsInstance(capture.exception.__cause__, MissingFieldError)
self.assertEqual(capture.exception.__cause__.name, field)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but it seem to save 1 line of code per test. It also requires to catch DatasetImportError
and check the type and message in the test.
@@ -836,6 +848,205 @@ def test_can_pickle(self): | |||
compare_datasets_strict(self, source, parsed) | |||
|
|||
|
|||
class CocoExtractorTests(TestCase): | |||
ANNOTATION_JSON_TEMPLATE = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, it would be useful to have a test that loads the unmodified template, to ensure that any errors in the other tests are due to the modifications made in those tests and not due to the template itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that the common piece was possible to extract is occasional, it is not something designed or expected. There are lots of tests that cover successful loading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that the common piece was possible to extract is occasional, it is not something designed or expected.
How is that not expected? The point of the new tests is to make sure that specific errors in the dataset cause the extraction to fail. To be certain that it's those errors that cause the failure, we need to ensure that the original template contains no errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is checked by the error details checks. However, I can agree, that such a test could be useful on its own, because there is no such test for just a json parsing.
Summary
How to test
Checklist
develop
branchLicense
Feel free to contact the maintainers if that's a concern.