Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix task creation with gt job and gt job frame access #8510

Merged
merged 14 commits into from
Oct 7, 2024

Conversation

zhiltsov-max
Copy link
Contributor

@zhiltsov-max zhiltsov-max commented Oct 4, 2024

Motivation and context

Fixes problems introduced by #8272 and #8348, including discovered in #8507

  • Fixed invalid chunks for GT jobs when ?number is used and task frame step > 1
  • Fixed invalid output of frames for specific GT frame requests
  • Fixed incorrect GT jobs created on task creation, when there are start frame > 0 or frame step > 1 in the task
  • Fixed incorrect static chunk creation for GT jobs on task creation
  • Fixed incorrect handling of manual and random_per_job options for GT job creation, when there are start frame > 0 or frame step > 1 in the task
  • Improved stability of several tests

How has this been tested?

Unit tests

Checklist

  • I submit my changes into the develop branch
  • I have created a changelog fragment
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • I have linked related issues (see GitHub docs)
  • I have increased versions of npm packages if it is necessary
    (cvat-canvas,
    cvat-core,
    cvat-data and
    cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

Summary by CodeRabbit

  • New Features

    • Enhanced caching mechanism for media segments and previews, improving efficiency.
    • Improved frame selection logic during task creation, ensuring accurate frame mapping and validation.
    • Added functionality for cloud storage previews and segment task chunk management.
  • Bug Fixes

    • Resolved issues with invalid chunk generation and output for GT jobs, ensuring correct processing of requests.
  • Tests

    • Added comprehensive test cases for task visibility, creation, export/import, annotation handling, and task previews to ensure robust functionality.

Copy link
Contributor

coderabbitai bot commented Oct 4, 2024

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The changes in this pull request address multiple enhancements and fixes within the CVAT application. Key updates include fixes for handling Ground Truth (GT) jobs, improvements to caching mechanisms for media segments, and refinements in frame validation and selection during task creation. New methods have been introduced in various classes to improve functionality, and numerous tests have been added to ensure the robustness of these features.

Changes

File Path Change Summary
changelog.d/20241004_144626_mzhiltso_fix_task_creation_with_gt_job.md Fixes for handling GT jobs, including invalid chunk generation and output for frame requests.
cvat/apps/engine/cache.py Modifications to MediaCache class, including new methods for caching media segments and previews, and enhanced error handling.
cvat/apps/engine/frame_provider.py Updated validate_frame_number method to sort frame_sequence for consistent frame validation.
cvat/apps/engine/serializers.py Enhancements to JobWriteSerializer for improved frame selection logic during job creation.
cvat/apps/engine/task.py Added helper functions for converting between absolute and relative frame indices in task creation.
tests/python/rest_api/test_tasks.py Added new test cases for task visibility, creation, export/import, annotation handling, and task previews.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant API
    participant Cache
    participant TaskManager

    User->>API: Request task creation
    API->>Cache: Check for existing task data
    Cache-->>API: Return cached data (if available)
    API->>TaskManager: Create task with validated frames
    TaskManager->>Cache: Store new task data
    Cache-->>TaskManager: Confirm storage
    TaskManager-->>API: Return task creation response
    API-->>User: Confirm task creation
Loading

🐇 "In the meadow, changes bloom,
Fixes and features chase away the gloom.
Caching and tasks, all in a line,
With frames validated, everything's fine!
Hopping through code, we celebrate,
For every new feature, we can't wait!" 🐰✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (12)
changelog.d/20241004_144626_mzhiltso_fix_task_creation_with_gt_job.md (2)

3-4: Consider adding more context to the changelog entry.

The description of the fix is clear and concise. However, it might be helpful to provide more context about what "chunks" are and why this fix is important for users. This additional information can help users better understand the impact of the change.

Consider expanding the description slightly, for example:

- Invalid chunks for GT jobs when `?number` is used in the request and task frame step > 1
+ Fixed generation of invalid data chunks for GT jobs when `?number` is used in the request and task frame step > 1, ensuring correct data processing for these scenarios

5-6: Consider elaborating on the impact of the fix.

The description clearly identifies the issue and the affected API endpoint. To make it more informative for users, consider adding a brief note about the impact of this fix or the nature of the invalid output that was occurring.

You might expand the description as follows:

- Invalid output of frames for specific GT frame requests with `api/jobs/{id}/data/?type=frame`
+ Fixed invalid output of frames for specific GT frame requests with `api/jobs/{id}/data/?type=frame`, ensuring correct frame data is returned for GT jobs
cvat/apps/engine/frame_provider.py (1)

488-489: Approve change with a minor optimization suggestion.

The sorting of frame_sequence ensures consistent frame number validation, which is a good practice. This change aligns with the PR objectives by addressing issues related to GT job creation and frame handling.

However, consider the following optimization:

-frame_sequence = sorted(self._db_segment.frame_set)
+frame_sequence = sorted(self._db_segment.frame_set) if not isinstance(self._db_segment.frame_set, range) else self._db_segment.frame_set

This optimization avoids unnecessary sorting if frame_set is already a range object, which is inherently ordered.

cvat/apps/engine/cache.py (7)

Line range hint 115-119: Improper Handling of Checksum Mismatch in _get_or_set_cache_item

When a checksum mismatch is detected, the current implementation recreates the cache item without handling potential concurrency issues. Consider adding thread safety mechanisms, such as locks, to prevent race conditions where multiple threads might attempt to recreate the cache item simultaneously.


Line range hint 131-133: Broad Exception Handling in _delete_cache_item

Catching only pickle.UnpicklingError might not be sufficient if other exceptions occur during cache deletion. Consider catching a broader range of exceptions or using a generic Exception to ensure that all potential errors are logged and handled appropriately.


Line range hint 247-255: Lack of Error Handling in get_or_set_selective_job_chunk

The method get_or_set_selective_job_chunk uses a lambda function that calls prepare_masked_range_segment_chunk, which may raise exceptions. It's advisable to add try-except blocks to handle potential errors gracefully and log informative messages for easier debugging.


Line range hint 507-511: Incorrect Indentation in frame_range Calculation

The frame_range calculation in prepare_masked_range_segment_chunk appears to have improper indentation, which could lead to incorrect frame sequence generation or syntax errors.

Apply this diff to correct the indentation:

-                        (
-                            db_data.start_frame
-                            + (chunk_number * db_data.chunk_size + chunk_frame_idx) * frame_step
-                        )
-                        for chunk_frame_idx in range(db_data.chunk_size)
+                    (
+                        db_data.start_frame
+                        + (chunk_number * db_data.chunk_size + chunk_frame_idx) * frame_step
+                    )
+                    for chunk_frame_idx in range(db_data.chunk_size)

Line range hint 607-610: Hardcoded Path for 3D Preview Image

In _prepare_segment_preview, the path to the 3D preview image is hardcoded:

preview = PIL.Image.open(
    os.path.join(os.path.dirname(__file__), "assets/3d_preview.jpeg")
)

This could cause issues in different environments or if the asset path changes. Consider moving the asset path to a configuration file or using a resource management system to ensure the path is always valid.


Line range hint 686-693: Potential Empty MIME Type Return in prepare_context_images_chunk

When related_images is empty, the method returns without specifying a MIME type:

if not related_images:
    return zip_buffer, ""

Returning an empty string for the MIME type might cause issues for functions expecting a valid MIME type. Consider returning a default MIME type, such as 'application/zip', even when the buffer is empty.

Apply this diff to set a default MIME type:

-            return zip_buffer, ""
+            return zip_buffer, "application/zip"

Line range hint 695-712: Exception Not Handled in Image Encoding Loop

In the loop where images are encoded and added to the ZIP file, if cv2.imencode fails, an exception is raised but not specifically handled:

success, result = cv2.imencode(".JPEG", image)
if not success:
    raise Exception('Failed to encode image to ".jpeg" format')

Consider catching specific exceptions and providing more detailed error messages. Additionally, skipping the faulty image instead of halting the entire process might improve robustness.

Apply this diff to handle exceptions:

-            if not success:
-                raise Exception('Failed to encode image to ".jpeg" format')
+            if not success:
+                slogger.glob.error(f"Failed to encode image {name} to '.jpeg' format")
+                continue

Alternatively, you could wrap the encoding process in a try-except block to catch and log exceptions without interrupting the loop.

cvat/apps/engine/serializers.py (1)

845-845: Enhance Readability of Overlap Handling in Frame Selection

The expression overlap * (segment.start_frame != 0) may be unclear to some readers. Multiplying a boolean value can reduce code readability. Consider refactoring to make the intention more explicit.

Apply this diff to improve clarity:

-selectable_segment_frames = set(
-    sorted(segment_frames)[overlap * (segment.start_frame != 0) : ]
-).difference(selected_frames)
+skip_overlap = overlap if segment.start_frame != 0 else 0
+selectable_segment_frames = set(
+    sorted(segment_frames)[skip_overlap:]
+).difference(selected_frames)
tests/python/rest_api/test_tasks.py (1)

Line range hint 3330-3447: Consider simplifying nested conditionals for readability

The deeply nested conditionals between lines 3343-3447 may impact readability. Refactoring some of the inner conditionals into helper functions or flattening the structure can improve clarity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1285858 and a5c2bdd.

📒 Files selected for processing (6)
  • changelog.d/20241004_144626_mzhiltso_fix_task_creation_with_gt_job.md (1 hunks)
  • cvat/apps/engine/cache.py (1 hunks)
  • cvat/apps/engine/frame_provider.py (1 hunks)
  • cvat/apps/engine/serializers.py (3 hunks)
  • cvat/apps/engine/task.py (5 hunks)
  • tests/python/rest_api/test_tasks.py (21 hunks)
🔇 Additional comments (11)
cvat/apps/engine/cache.py (2)

Line range hint 174-176: Verify Updates to _make_segment_task_chunk_key Calls

The method _make_segment_task_chunk_key now includes the quality parameter. Please ensure that all invocations of this method have been updated to pass the quality argument to maintain cache key consistency.

Run the following script to find any calls missing the quality parameter:

#!/bin/bash
# Description: Verify all calls to _make_segment_task_chunk_key include the quality parameter.

# Test: Search for _make_segment_task_chunk_key calls. Expect: All calls should include quality parameter.
ast-grep --lang python --pattern $'_make_segment_task_chunk_key($_, $_, quality=$_)'

Line range hint 95-99: Ensure Consistent Use of quality Parameter in _make_chunk_key

The method _make_chunk_key now includes the quality parameter in its signature and return value. Please verify that all calls to this method throughout the codebase have been updated to include the quality argument to prevent cache key inconsistencies.

Run the following script to check for calls missing the quality parameter:

✅ Verification successful

All _make_chunk_key calls include the quality parameter. Verification successful.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all calls to _make_chunk_key include the quality parameter.

# Test: Search for _make_chunk_key calls. Expect: All calls should include quality parameter.
ast-grep --lang python --pattern $'_make_chunk_key($_, $_, quality=$_)'

Length of output: 85


Script:

#!/bin/bash
# Description: Verify all calls to _make_chunk_key include the quality parameter.

# Corrected pattern to search for _make_chunk_key calls missing the quality parameter.
ast-grep --lang python --pattern '_make_chunk_key($_, $_, quality=$_)'

Length of output: 72


Script:

#!/bin/bash
# Description: List all calls to _make_chunk_key to verify the inclusion of the quality parameter.

# Search for all _make_chunk_key calls and display their surrounding context
rg '_make_chunk_key\(' --context=2

Length of output: 2332

cvat/apps/engine/task.py (6)

1401-1403: Addition of helper function _to_rel_frame enhances frame index handling

The new helper function _to_rel_frame correctly converts absolute frame indices to relative indices based on the task's start_frame and frame_step. This improves code readability and ensures consistency in frame index calculations throughout the validation frame selection process.


1437-1437: Verify correct mapping of segment frames to relative indices

At line 1437, you map segment.frame_set to relative frame indices using _to_rel_frame. Please ensure that segment.frame_set contains absolute frame indices and that this conversion is necessary and accurately applied in this context.


1446-1446: Check overlap handling in frame slicing logic

In the slicing operation sorted(segment_frames)[overlap * (segment.start_frame != 0) :], verify that the calculation of the starting index properly accounts for overlaps, especially when segment.start_frame is zero. This ensures that no frames are inadvertently skipped or duplicated due to incorrect overlap calculations.


1463-1463: Correct mapping of frame paths to relative frame indices

The dictionary comprehension at line 1463 effectively maps frame paths to their corresponding relative frame indices using _to_rel_frame(frame.frame). This ensures consistent identification of frames during the manual selection process.


1490-1492: Addition of helper function _to_abs_frame for accurate index conversion

The new helper function _to_abs_frame correctly converts relative frame indices back to absolute indices using db_data.get_frame_step() and db_data.start_frame. This is crucial for updating the validation layout with the correct absolute frame numbers.


1497-1497: Ensure frames are correctly converted to absolute indices when creating GT segment

Using list(map(_to_abs_frame, db_data.validation_layout.frames)) at line 1497 properly converts the relative validation frames back to absolute indices when initializing db_gt_segment. This maintains consistency with other parts of the system that expect absolute frame indices.

cvat/apps/engine/serializers.py (1)

Line range hint 825-852: Code Logic for Frame Selection in RANDOM_PER_JOB Method is Correct

The updated logic for the RANDOM_PER_JOB frame selection method properly accounts for segment overlaps and ensures unique frame selection across segments. The use of TaskFrameProvider enhances frame number management, and initializing the random number generator with a specified seed ensures reproducibility.

tests/python/rest_api/test_tasks.py (2)

2644-2646: LGTM!

The calculation of resulting_task_size is correct and handles edge cases appropriately.


2990-2999: LGTM!

The _get_job_abs_frame_set function correctly retrieves the set of frame indices based on included_frames or the frame range.

Comment on lines +2386 to +2395
if frame_selection_method in ("random_uniform", "manual"):
assert gt_job_metas[0].size == validation_frames_count
elif frame_selection_method == "random_per_job":
assert gt_job_metas[0].size == (
resulting_task_size // segment_size * validation_per_job_count
+ min(resulting_task_size % segment_size, validation_per_job_count)
)
else:
assert False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor duplicated conditional logic into a helper function

The conditional blocks in lines 2386-2395 and 2507-2516 are nearly identical. Consider extracting this logic into a helper function to reduce code duplication and enhance maintainability.

Example refactoring:

def _verify_gt_job_size(self, frame_selection_method, gt_job_metas, validation_frames_count, resulting_task_size, segment_size, validation_per_job_count):
    if frame_selection_method in ("random_uniform", "manual"):
        assert gt_job_metas[0].size == validation_frames_count
    elif frame_selection_method == "random_per_job":
        assert gt_job_metas[0].size == (
            resulting_task_size // segment_size * validation_per_job_count
            + min(resulting_task_size % segment_size, validation_per_job_count)
        )
    else:
        raise NotImplementedError(f"Unsupported frame_selection_method: {frame_selection_method}")

Then replace the duplicated code with calls to _verify_gt_job_size.

Also applies to: 2507-2516


⚠️ Potential issue

Replace assert False with a more informative exception

Using assert False in the else block on line 2394 raises a generic AssertionError without context. Consider raising a more descriptive exception like NotImplementedError with an informative message to facilitate debugging when an unsupported frame_selection_method is encountered.

Apply this diff to improve error handling:

 else:
-    assert False
+    raise NotImplementedError(f"Unsupported frame_selection_method: {frame_selection_method}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if frame_selection_method in ("random_uniform", "manual"):
assert gt_job_metas[0].size == validation_frames_count
elif frame_selection_method == "random_per_job":
assert gt_job_metas[0].size == (
resulting_task_size // segment_size * validation_per_job_count
+ min(resulting_task_size % segment_size, validation_per_job_count)
)
else:
assert False
if frame_selection_method in ("random_uniform", "manual"):
assert gt_job_metas[0].size == validation_frames_count
elif frame_selection_method == "random_per_job":
assert gt_job_metas[0].size == (
resulting_task_size // segment_size * validation_per_job_count
+ min(resulting_task_size % segment_size, validation_per_job_count)
)
else:
raise NotImplementedError(f"Unsupported frame_selection_method: {frame_selection_method}")

Comment on lines +2507 to +2516
if frame_selection_method == "random_uniform":
assert gt_job_metas[0].size == validation_frames_count
elif frame_selection_method == "random_per_job":
assert gt_job_metas[0].size == (
resulting_task_size // segment_size * validation_per_job_count
+ min(resulting_task_size % segment_size, validation_per_job_count)
)
else:
assert False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace assert False with a more informative exception

Similarly, on line 2515, using assert False does not provide sufficient information about the error. Raising a specific exception with a descriptive message will help identify issues with unsupported frame_selection_method values.

Apply this diff to improve error handling:

 else:
-    assert False
+    raise NotImplementedError(f"Unsupported frame_selection_method: {frame_selection_method}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if frame_selection_method == "random_uniform":
assert gt_job_metas[0].size == validation_frames_count
elif frame_selection_method == "random_per_job":
assert gt_job_metas[0].size == (
resulting_task_size // segment_size * validation_per_job_count
+ min(resulting_task_size % segment_size, validation_per_job_count)
)
else:
assert False
if frame_selection_method == "random_uniform":
assert gt_job_metas[0].size == validation_frames_count
elif frame_selection_method == "random_per_job":
assert gt_job_metas[0].size == (
resulting_task_size // segment_size * validation_per_job_count
+ min(resulting_task_size % segment_size, validation_per_job_count)
)
else:
raise NotImplementedError(f"Unsupported frame_selection_method: {frame_selection_method}")

Comment on lines +2785 to +2859
def _uploaded_images_task_with_gt_and_segments_base(
self,
request: pytest.FixtureRequest,
*,
start_frame: Optional[int] = None,
step: Optional[int] = None,
frame_selection_method: str = "random_uniform",
) -> Generator[Tuple[_TaskSpec, int], None, None]:
used_frames_count = 16
total_frame_count = (start_frame or 0) + used_frames_count * (step or 1)
segment_size = 5
image_files = generate_image_files(total_frame_count)

validation_params_kwargs = {"frame_selection_method": frame_selection_method}

if "random" in frame_selection_method:
validation_params_kwargs["random_seed"] = 42

if frame_selection_method == "random_uniform":
validation_frames_count = 10
validation_params_kwargs["frame_count"] = validation_frames_count
elif frame_selection_method == "random_per_job":
frames_per_job_count = 3
validation_params_kwargs["frames_per_job_count"] = frames_per_job_count
validation_frames_count = used_frames_count // segment_size + min(
used_frames_count % segment_size, frames_per_job_count
)
elif frame_selection_method == "manual":
validation_frames_count = 10

valid_frame_ids = range(
(start_frame or 0), (start_frame or 0) + used_frames_count * (step or 1), step or 1
)
rng = np.random.Generator(np.random.MT19937(seed=42))
validation_params_kwargs["frames"] = rng.choice(
[f.name for i, f in enumerate(image_files) if i in valid_frame_ids],
validation_frames_count,
replace=False,
).tolist()
else:
raise NotImplementedError

validation_params = models.DataRequestValidationParams._from_openapi_data(
mode="gt",
**validation_params_kwargs,
)

yield from self._uploaded_images_task_fxt_base(
request=request,
frame_count=None,
image_files=image_files,
segment_size=segment_size,
sorting_method="natural",
start_frame=start_frame,
step=step,
validation_params=validation_params,
)

@fixture(scope="class")
@parametrize("start_frame, step", [(2, 3)])
@parametrize("frame_selection_method", ["random_uniform", "random_per_job", "manual"])
def fxt_uploaded_images_task_with_gt_and_segments_start_step(
self,
request: pytest.FixtureRequest,
start_frame: Optional[int],
step: Optional[int],
frame_selection_method: str,
) -> Generator[Tuple[_TaskSpec, int], None, None]:
yield from self._uploaded_images_task_with_gt_and_segments_base(
request,
start_frame=start_frame,
step=step,
frame_selection_method=frame_selection_method,
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure consistent exception handling for unimplemented methods

In lines 2825-2826, a NotImplementedError is raised without an accompanying message. Providing a descriptive message enhances clarity when an unsupported frame_selection_method is encountered.

Apply this diff to add an informative message:

 else:
-    raise NotImplementedError
+    raise NotImplementedError(f"Unsupported frame_selection_method: {frame_selection_method}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _uploaded_images_task_with_gt_and_segments_base(
self,
request: pytest.FixtureRequest,
*,
start_frame: Optional[int] = None,
step: Optional[int] = None,
frame_selection_method: str = "random_uniform",
) -> Generator[Tuple[_TaskSpec, int], None, None]:
used_frames_count = 16
total_frame_count = (start_frame or 0) + used_frames_count * (step or 1)
segment_size = 5
image_files = generate_image_files(total_frame_count)
validation_params_kwargs = {"frame_selection_method": frame_selection_method}
if "random" in frame_selection_method:
validation_params_kwargs["random_seed"] = 42
if frame_selection_method == "random_uniform":
validation_frames_count = 10
validation_params_kwargs["frame_count"] = validation_frames_count
elif frame_selection_method == "random_per_job":
frames_per_job_count = 3
validation_params_kwargs["frames_per_job_count"] = frames_per_job_count
validation_frames_count = used_frames_count // segment_size + min(
used_frames_count % segment_size, frames_per_job_count
)
elif frame_selection_method == "manual":
validation_frames_count = 10
valid_frame_ids = range(
(start_frame or 0), (start_frame or 0) + used_frames_count * (step or 1), step or 1
)
rng = np.random.Generator(np.random.MT19937(seed=42))
validation_params_kwargs["frames"] = rng.choice(
[f.name for i, f in enumerate(image_files) if i in valid_frame_ids],
validation_frames_count,
replace=False,
).tolist()
else:
raise NotImplementedError
validation_params = models.DataRequestValidationParams._from_openapi_data(
mode="gt",
**validation_params_kwargs,
)
yield from self._uploaded_images_task_fxt_base(
request=request,
frame_count=None,
image_files=image_files,
segment_size=segment_size,
sorting_method="natural",
start_frame=start_frame,
step=step,
validation_params=validation_params,
)
@fixture(scope="class")
@parametrize("start_frame, step", [(2, 3)])
@parametrize("frame_selection_method", ["random_uniform", "random_per_job", "manual"])
def fxt_uploaded_images_task_with_gt_and_segments_start_step(
self,
request: pytest.FixtureRequest,
start_frame: Optional[int],
step: Optional[int],
frame_selection_method: str,
) -> Generator[Tuple[_TaskSpec, int], None, None]:
yield from self._uploaded_images_task_with_gt_and_segments_base(
request,
start_frame=start_frame,
step=step,
frame_selection_method=frame_selection_method,
)
def _uploaded_images_task_with_gt_and_segments_base(
self,
request: pytest.FixtureRequest,
*,
start_frame: Optional[int] = None,
step: Optional[int] = None,
frame_selection_method: str = "random_uniform",
) -> Generator[Tuple[_TaskSpec, int], None, None]:
used_frames_count = 16
total_frame_count = (start_frame or 0) + used_frames_count * (step or 1)
segment_size = 5
image_files = generate_image_files(total_frame_count)
validation_params_kwargs = {"frame_selection_method": frame_selection_method}
if "random" in frame_selection_method:
validation_params_kwargs["random_seed"] = 42
if frame_selection_method == "random_uniform":
validation_frames_count = 10
validation_params_kwargs["frame_count"] = validation_frames_count
elif frame_selection_method == "random_per_job":
frames_per_job_count = 3
validation_params_kwargs["frames_per_job_count"] = frames_per_job_count
validation_frames_count = used_frames_count // segment_size + min(
used_frames_count % segment_size, frames_per_job_count
)
elif frame_selection_method == "manual":
validation_frames_count = 10
valid_frame_ids = range(
(start_frame or 0), (start_frame or 0) + used_frames_count * (step or 1), step or 1
)
rng = np.random.Generator(np.random.MT19937(seed=42))
validation_params_kwargs["frames"] = rng.choice(
[f.name for i, f in enumerate(image_files) if i in valid_frame_ids],
validation_frames_count,
replace=False,
).tolist()
else:
raise NotImplementedError(f"Unsupported frame_selection_method: {frame_selection_method}")
validation_params = models.DataRequestValidationParams._from_openapi_data(
mode="gt",
**validation_params_kwargs,
)
yield from self._uploaded_images_task_fxt_base(
request=request,
frame_count=None,
image_files=image_files,
segment_size=segment_size,
sorting_method="natural",
start_frame=start_frame,
step=step,
validation_params=validation_params,
)
@fixture(scope="class")
@parametrize("start_frame, step", [(2, 3)])
@parametrize("frame_selection_method", ["random_uniform", "random_per_job", "manual"])
def fxt_uploaded_images_task_with_gt_and_segments_start_step(
self,
request: pytest.FixtureRequest,
start_frame: Optional[int],
step: Optional[int],
frame_selection_method: str,
) -> Generator[Tuple[_TaskSpec, int], None, None]:
yield from self._uploaded_images_task_with_gt_and_segments_base(
request,
start_frame=start_frame,
step=step,
frame_selection_method=frame_selection_method,
)

Copy link

sonarcloud bot commented Oct 4, 2024

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 70.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 74.30%. Comparing base (1285858) to head (ee4e347).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8510      +/-   ##
===========================================
+ Coverage    74.09%   74.30%   +0.20%     
===========================================
  Files          396      396              
  Lines        42768    42774       +6     
  Branches      3897     3897              
===========================================
+ Hits         31691    31783      +92     
+ Misses       11077    10991      -86     
Components Coverage Δ
cvat-ui 78.75% <ø> (ø)
cvat-server 70.46% <70.00%> (+0.38%) ⬆️

@zhiltsov-max zhiltsov-max mentioned this pull request Oct 4, 2024
7 tasks
@bsekachev bsekachev merged commit 0a45bc7 into develop Oct 7, 2024
34 checks passed
@cvat-bot cvat-bot bot mentioned this pull request Oct 10, 2024
@bsekachev bsekachev deleted the zm/fix-task-creation-with-gt-job branch October 24, 2024 05:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants