Enhance 'id_from_image_name' transform to ensure each identifier is unique #1635

jihyeonyi · 2024-10-08T04:32:15Z

Summary

Ticket: 153389

Enhance the 'id_from_image_name' transform to ensure each identifier is unique.
- Add a random suffix if the image name is not distinct: [image_name]__[suffix]
- Introduce related parameters: ensure_unique (default: false), suffix_length (default: 3)

Handle VideoFrame by taking its frame index into account.
- Format: [video_name]_frame-[index]
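
The naming scheme described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code merged in this PR: the helper names `derive_id` and `video_frame_id`, the `used` set, the `rng` and `max_attempts` arguments, and the lowercase-alphanumeric suffix alphabet are all hypothetical. Only the `[image_name]__[suffix]` and `[video_name]_frame-[index]` formats and the `ensure_unique` / `suffix_length` defaults come from the summary.

```python
import random
import string
from pathlib import PurePath
from typing import Optional, Set

# Assumption: the PR does not state which characters the suffix is drawn from.
_ALPHABET = string.ascii_lowercase + string.digits


def derive_id(
    name: str,
    used: Set[str],
    ensure_unique: bool = False,
    suffix_length: int = 3,
    rng: Optional[random.Random] = None,
    max_attempts: int = 1000,
) -> str:
    """Return an ID based on `name`; add a random suffix only when needed.

    With ensure_unique=False the name is returned unchanged, duplicates included.
    With ensure_unique=True a clashing name becomes `name__<suffix>`.
    """
    if not ensure_unique:
        return name

    rng = rng or random.Random()
    candidate = name
    attempts = 0
    while candidate in used:
        if attempts >= max_attempts:
            raise RuntimeError(f"could not find a unique id for {name!r}")
        suffix = "".join(rng.choices(_ALPHABET, k=suffix_length))
        candidate = f"{name}__{suffix}"
        attempts += 1

    used.add(candidate)  # remember the ID so later items cannot reuse it
    return candidate


def video_frame_id(video_path: str, index: int) -> str:
    """Build `[video_name]_frame-[index]` from a video path and a frame index."""
    return f"{PurePath(video_path).stem}_frame-{index}"
```

Calling `derive_id("path1", used, ensure_unique=True)` three times with the same `used` set would give `path1` first and then two suffixed variants. Two frames with the same video and index produce the same base name from `video_frame_id`, so they would still need the suffix step to become distinct.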

How to test

Checklist

I have added unit tests to cover my changes.
I have added integration tests to cover my changes.
I have added the description of my changes into CHANGELOG.
I have updated the documentation accordingly

License

I submit my code changes under the same MIT License that covers the project.
Feel free to contact the maintainers if that's a concern.
I have updated the license header for each file (see an example below).

# Copyright (C) 2024 Intel Corporation
#
# SPDX-License-Identifier: MIT

codecov · 2024-10-11T02:50:21Z

Codecov Report

Attention: Patch coverage is 97.22222% with 1 line in your changes missing coverage. Please review.

Project coverage is 81.17%. Comparing base (ff5fd94) to head (2532532).
Report is 15 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/datumaro/plugins/transforms.py | 97.22% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1635      +/-   ##
===========================================
+ Coverage    81.06%   81.17%   +0.10%     
===========================================
  Files          278      281       +3     
  Lines        32517    32781     +264     
  Branches      6607     5294    -1313     
===========================================
+ Hits         26360    26609     +249     
- Misses        4701     4721      +20     
+ Partials      1456     1451       -5

| Flag | Coverage Δ | |
|---|---|---|
| ubuntu-20.04_Python-3.10 | `81.15% <97.22%> (+0.10%)` | ⬆️ |
| windows-2022_Python-3.10 | `81.14% <97.22%> (+0.10%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

sooahleex

Looks good to me, but I left one comment. I may be misunderstanding this feature, so if it doesn't matter, feel free to ignore it.

sooahleex · 2024-10-11T05:46:21Z

tests/unit/test_transforms.py

+    def fxt_dataset(self, n_labels=3, n_anns=5, n_items=7) -> Dataset:
+        video = Video("video.mp4")
+        video._frame_size = MagicMock(return_value=(32, 32))
+        video.get_frame_data = MagicMock(return_value=np.ndarray((32, 32, 3), dtype=np.uint8))
+        return Dataset.from_iterable(
+            [
+                DatasetItem(id=1, media=Image.from_file(path="path1.jpg")),
+                DatasetItem(id=2, media=Image.from_file(path="path1.jpg")),
+                DatasetItem(id=3, media=Image.from_file(path="path1.jpg")),
+                DatasetItem(id=4, media=VideoFrame(video, index=30)),
+                DatasetItem(id=5, media=VideoFrame(video, index=30)),
+                DatasetItem(id=6, media=VideoFrame(video, index=60)),
+                DatasetItem(id=7),
+                DatasetItem(id=8, media=Image.from_numpy(data=np.ones([5, 5, 3]))),
+            ]
+        )


Is it okay to use one bundled dataset for all the tests, rather than splitting it into several datasets?

As you said, we could create several datasets, but I think reusing a single one is fine.

jihyeonyi added 3 commits October 8, 2024 13:22

ensure unique and handle VideoFrame too

d35ec9e

update document

dfb103c

change log

f3017c7

jihyeonyi marked this pull request as ready for review October 8, 2024 08:53

jihyeonyi requested review from a team as code owners October 8, 2024 08:53

jihyeonyi requested review from sooahleex and removed request for a team October 8, 2024 08:53

fix test

633b539

update test

2532532

sooahleex approved these changes Oct 11, 2024


jihyeonyi merged commit f716078 into openvinotoolkit:develop Oct 11, 2024
8 checks passed

jihyeonyi deleted the jihyeony/enhance_id_from_image_name branch October 11, 2024 07:24
