Add auto-annotation support to SDK and CLI #6483
Conversation
6c1f16b to beb0ca6 (compare)
```python
    metavar="MODULE",
    required=True,
    dest="function_module",
)
```
Probably, it makes sense to add other parameters:

- `job_id(s)`
- `clear_existing`/`append`
- script file path
- an extra parameter for module root directory
- `strict` mode, where we fail if the script labels do not exist in the dataset ("label 'car' is not in the dataset; any annotations using it will be ignored" is not something I'd like to see as a user after the model has been running for some time)
- `skip-errors`, to skip the frame if there is an error
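For illustration only, here is a rough argparse sketch (not the PR's actual code) of how a couple of these options could be declared in the same style as the snippet above; the flag names match the `--clear-existing`/`--allow-unmatched-labels` options that were eventually added later in this thread:

```python
# Illustrative sketch; `parser` is the CLI subcommand's argparse parser.
parser.add_argument(
    "--clear-existing",
    action="store_true",
    help="remove existing annotations from the task before uploading new ones",
)
parser.add_argument(
    "--allow-unmatched-labels",
    action="store_true",
    help="ignore labels declared by the function but not configured in the task",
)
```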
Need to discuss this. From one point of view, some of these parameters can be convenient:

- `job_id(s)` - probably we should be able to annotate jobs as well; I agree. But I don't think that we need the argument here.
- `clear_existing` - better to have just a separate command to remove annotations. Otherwise we can get a very complicated interface.
- script file path - OK, instead of a module name.
- `strict` - agree, and maybe it should be the default mode. Thus we avoid many complaints that a function doesn't annotate a task. I ran into the problem today.
- `skip-errors` - I'm not sure we want to skip any errors.
My thoughts:

`job_id(s)`

I think ideally the CLI should have separate sets of task- and job-based commands (e.g. `cvat-cli task list` / `cvat-cli job list`). IIUC, job auto-annotation won't require you to use any task-based APIs, so it would make more sense to implement it as a job subcommand (`cvat-cli job auto-annotate`). However, that would be beyond the scope of this PR.

`clear_existing`/`append`

I agree with Maxim that this should be an option. Moreover, the default should probably be "append". Currently, the command clears existing annotations, but that's mainly just because it's easier. Defaulting to append will avoid accidental data loss.

@nmanovic I think we should have a separate command to clear annotations, but it's not very convenient to run a separate command every time you want to reannotate, so I think an option for this command would be useful to have.
script file path

I'm thinking of adding a `--function-file` option as an alternative to `--function` (and maybe renaming the latter to `--function-module`). Instead of importing, this will load the file as a string and compile it. It will require that the file has no relative imports, though.
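As an aside, a minimal sketch of how such a `--function-file` option could load a function from a path (this is not the PR's implementation; the helper name and the synthetic module name are arbitrary):

```python
import importlib.util
import sys


def load_function_from_file(path: str):
    # Build a module spec directly from the file location.
    spec = importlib.util.spec_from_file_location("cvat_function", path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    # Execute the file's code in the new module's namespace; relative imports
    # inside the file will fail, since it does not belong to a package.
    spec.loader.exec_module(module)
    return module
```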
an extra parameter for module root directory

I don't know what this means.

`strict`

Yeah, fair point. I also agree with Nikita that this should be the default.

`skip-errors`

I get the idea, but then what do you do with frames that were skipped? There's no easy way to find them afterwards, and even if we could find them, we can't rerun the function on just those frames. We could consider some solution for this, but IMO it goes beyond the current scope.
> Instead of importing, this will load the file as a string and compile it. It will require that the file has no relative imports, though.

> an extra parameter for module root directory
>
> I don't know what this means.

I've given an example in the test comments. Basically, it's needed both for relative imports and for fully-qualified imports, to specify where to look for the import path (like `PYTHONPATH` does). Without this, it's hard to use scripts relying on a customized framework (e.g. mmdetection) located somewhere near the CVAT adapter script, or just outside the current directory:

```
# .../some/dir/mymodel/mmdetection/...
# .../some/dir/mymodel/cvat_adapter.py
cvat-cli auto-annotate --module-root '.../some/dir/' --module 'mymodel.cvat_adapter' ...
```

`skip-errors`

> I get the idea, but then what do you do with frames that were skipped? There's no easy way to find them afterwards, and even if we could find them, we can't rerun the function on just those frames.

I thought about things like OOM errors on some frames. Not sure how often it happens in real applications.

Actually, the implementation allows execution on specific frames (except for tracks), as the function is executed on separate frames, so it's only a question of API/CLI. I also see an option to execute the model on only part of the task as a useful feature, though the CLI use may be complicated in this case. Probably, we can leave this for future updates.
> I've given an example in the test comments. Basically, it's needed both for relative imports and for fully-qualified imports, to specify where to look for the import path (like PYTHONPATH does).

So why not just use `PYTHONPATH`?

> Actually, the implementation allows execution on specific frames (except for tracks), as the function is executed on separate frames, so it's only a question of API/CLI.

It's true that we can run the function on specific frames, but how do we know which frames to use (in the context of recovery from a partially successful auto-annotation run)?

> I also see an option to execute the model on only part of the task as a useful feature, though the CLI use may be complicated in this case. Probably, we can leave this for future updates.

I agree.
> So why not just use PYTHONPATH?

Probably, it's an implementation detail, and this obliges users to learn environment variables.

> how do we know which frames to use (in the context of recovery from a partially successful auto-annotation run)?

It could be reported somehow, e.g. in the logs.
I added `--clear-existing` and `--allow-unmatched-labels`. Correspondingly, I changed the defaults to not erase old annotations, and to abort when a function declares labels not configured in the task.

> Probably, it's an implementation detail, and this obliges users to learn environment variables.

The user would have to learn something either way, and at least they might already know about `PYTHONPATH`.
"auto-annotate", | ||
str(fxt_new_task.id), | ||
f"--function={__package__}.example_function", | ||
) |
I suggest adding several other tests for different file placements, because we don't have a package when we call it from the actual CLI (in the tests we cheat a little; in practice we have to use something like `PYTHONPATH="$PWD:$PYTHONPATH" cvat-cli ...`).
Well... what else is there to test? Either way, the `auto-annotate` command just does an import on the provided module name.
- A test that we can execute a function from a file/module in the current directory
- A test that we can execute a function from a file/module outside the current directory subtree, and which is not installed in the current system or Python environment (e.g. `cvat-cli auto-annotate --function ../../my_cvat_model.py ...`)
- A test that we can execute a function from a file/module that references other modules (e.g. the function script wraps a patched framework in a nested directory; an example is mmdetection + a custom script for CVAT on top)
I don't see the point in these tests to be honest. We don't do anything unusual with the import mechanics. If Python can import a module, then so can we (and vice versa).
Related: #5324
```python
@attrs.frozen(kw_only=True)
class DetectionFunctionSpec:
```
From the user's perspective, I would not recommend using `PatchedLabelRequest` and `LabeledShapeRequest`. They are counterintuitive. I would recommend defining alternative naming, something like `Label` or `Shape`. Another approach is to define some high-level interfaces like `LabelingSpec` and `LabeledFrame` inside the SDK and use them. In either case, the interface will be clear and attractive.

Also, even though I can read the comments about `DetectionFunctionContext`, I'm not sure that we need it. How are you going to use it in the future?
I think I'll do something of a middle ground. I'll add factory functions with nice names and nice interfaces that construct the same old `*Request` objects. That way, the function implementation can look nice and clean, but we're not introducing any new concepts (and the users can still construct the low-level request objects if the factory functions are insufficient for some reason).

> Also, even though I can read the comments about DetectionFunctionContext, I'm not sure that we need it. How are you going to use it in the future?

See the thread directly above this one.
> I'll add factory functions with nice names and nice interfaces

I did it.
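For context, a sketch of what a detection function can look like once such factory helpers are in place (the labels and the returned box are placeholder values; `cvataa.rectangle` appears in the tests quoted below, while `label_spec` is assumed here as the analogous label helper):

```python
import PIL.Image
import cvat_sdk.auto_annotation as cvataa

# Declare the labels the function can produce.
spec = cvataa.DetectionFunctionSpec(
    labels=[
        cvataa.label_spec("car", 0),
        cvataa.label_spec("person", 1),
    ],
)


def detect(context, image: PIL.Image.Image):
    # Run some model on the frame and convert its output into shape
    # requests via the helpers; the box below is just a placeholder.
    return [
        cvataa.rectangle(0, [10.0, 20.0, 110.0, 220.0]),  # label ID, [xtl, ytl, xbr, ybr]
    ]
```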
```python
# Copyright (C) 2023 CVAT.ai Corporation
```
Let's try to add a real example based on https://github.com/ultralytics/ultralytics. As you suggested, we can create a Python module with a detection function that is based on ultralytics and use it to explain how to write such functions. Also, we can use it for a demonstration to our users.

Without a way to run the functionality out of the box, this will demotivate 90% of regular users.
Added.
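For reference, a very rough sketch of what an ultralytics-based function could look like. This is not what the PR ships (the predefined functions it adds are torchvision-based, per the changelog below), and the ultralytics API calls here are assumptions about that library rather than code from this PR:

```python
import PIL.Image
import cvat_sdk.auto_annotation as cvataa
from ultralytics import YOLO  # assumed third-party dependency

_model = YOLO("yolov8n.pt")  # assumed pretrained checkpoint name

# Expose the model's own class names as CVAT label specs.
spec = cvataa.DetectionFunctionSpec(
    labels=[cvataa.label_spec(name, id) for id, name in _model.names.items()],
)


def detect(context, image: PIL.Image.Image):
    results = _model(image)  # run inference on a single frame
    # Convert each predicted box (xyxy + class index) into a rectangle request.
    return [
        cvataa.rectangle(int(cls), [float(v) for v in box])
        for box, cls in zip(results[0].boxes.xyxy, results[0].boxes.cls)
    ]
```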
tests/python/cli/test_cli.py (outdated)

```python
self.run_cli(
    "auto-annotate",
    str(fxt_new_task.id),
    f"--function={__package__}.example_function",
```
Again, from a regular user's perspective, I would recommend supporting at least several interfaces. I'm not sure that we have to force users to know what a Python module is. At the same time, many users know what a directory is. Do you think we can redesign the interface to use concepts that are widely known among CLI users?
I added an alternate option that takes a file path.
```python
        mapper.validate_and_remap(frame_shapes, sample.frame_index)
        shapes.extend(frame_shapes)

    client.logger.info("Uploading annotations to task %d", task_id)
```
Here we force users to wait until all frames are annotated. In the CLI we should be able to show progress, and I also believe it will be useful to have a configuration option to upload data in batches. What do you think?
Regarding progress: yes, we should.

Regarding uploading data in batches: I think notionally it could be useful, but in practice there are problems:

- I'm not sure if the API supports partial uploads like this.
- If the annotation fails mid-way, there's no way to restart it from where it left off (and if it doesn't fail mid-way, then what's the point of uploading in batches?).
I added progress reporting.
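For illustration, a hedged sketch of driving auto-annotation from the SDK with a progress reporter. It assumes `annotate_task` is the driver entry point, that it accepts a `pbar` argument, and that `DeferredTqdmProgressReporter` (mentioned in the progress-reporting rework below) lives in `cvat_sdk.core.helpers`; treat all of these as assumptions rather than confirmed API:

```python
from cvat_sdk import make_client
from cvat_sdk.core.helpers import DeferredTqdmProgressReporter  # assumed location
import cvat_sdk.auto_annotation as cvataa

import my_function  # hypothetical user module exposing `spec` and `detect`

with make_client("http://localhost:8080", credentials=("user", "password")) as client:
    cvataa.annotate_task(
        client,
        123,  # task ID
        my_function,
        pbar=DeferredTqdmProgressReporter(),  # shows a progress bar while frames are processed
        clear_existing=False,  # append to existing annotations instead of erasing them
    )
```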
2cd36e9 to 0dab04c (compare)
There are several problems with how progress reporting is handled in the SDK, both on the interface and implementation level:

- The user is supposed to know, for a given function, which units it will report progress in. This is unnecessary coupling and prevents us from switching to different units, or from having several progress bars using different units.
- To create a `TqdmProgressReporter`, you have to create a tqdm instance, which immediately draws a progress bar. This works poorly if the function prints any log messages before the progress actually starts.
- There's no easy way to automatically call `finish` on a progress bar, so some functions don't (for example, `Downloader.download_file`). This can cause unexpected output, since tqdm will refresh the progress bar in a background thread (possibly after we've already printed something else).
- `iter` is basically broken, because it divides by `period`, which is 0 in all current implementations.
- Even ignoring that, it's hard to use correctly, because you need to manually call `finish` in case of an exception.
- `split` is not implemented and not used.
- `StreamWithProgress.seek` assumes that the progress bar is at 0 at the start, and so does `ProgressReporter.iter`. The former also works incorrectly if the second argument is not `SEEK_SET`.

Fix these problems by doing the following:

- Add a new `start2` method which accepts more parameters. The default implementation calls `start`, so that if a user has implemented the original interface, it'll keep working.
- Add a `DeferredTqdmProgressReporter` that accepts tqdm parameters instead of a tqdm instance, and only creates an instance after `start2` is called. Use it where `TqdmProgressReporter` was used before. The old `TqdmProgressReporter` is kept for compatibility, but it doesn't support any `start2` arguments other than those supported by the original `start`.
- Add a `task` context manager, which automatically calls `start2` and `finish`. Use it everywhere instead of explicit `start`/`finish` calls. Remove `start`/`finish` calls from `StreamWithProgress` and `iter`.
- Implement basic assertions to ensure that `start2` and `finish` are used correctly.
- Remove `period` and `split`.
- Rewrite `StreamWithProgress.seek` and `ProgressReporter.iter` to use relative progress reports.

These changes should be backwards compatible for users who pass predefined or custom progress reporters into SDK functions. They are not backwards compatible for users who try to use progress reporters directly (e.g. calling `start`/`finish`). I don't consider that a significant issue, since the purpose of the `ProgressReporter` interface is for the user to get progress information from the SDK, not for them to use it in their own code.

Originally developed for #6483.
352358b to acf50c0 (compare)
Codecov Report

```
@@            Coverage Diff             @@
##           develop    #6483      +/-  ##
===========================================
+ Coverage    81.17%   81.23%   +0.06%
===========================================
  Files          363      368       +5
  Lines        39606    39813     +207
  Branches      3557     3557
===========================================
+ Hits         32150    32342     +192
- Misses        7456     7471      +15
```
```python
return [
    cvataa.rectangle(0, [1, 2, 3, 4]),
]
```
How about changing the interface to be the following:

- The function does not return anything
- There is a function to add an annotation for a frame (e.g. `context.add_rectangle()`)

This way:

- the users won't need to know the return annotation types (there can be another function for the users who know)
- we can change the function API without breaking it
- we can manage annotation storage (and optimize it, potentially)
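A purely hypothetical sketch of what this proposed context-based interface might look like (none of these `add_*` methods exist in the SDK; `run_model` is a placeholder for whatever inference the user performs):

```python
def detect(context, image):
    # The context accumulates annotations itself, so the function
    # returns nothing.
    for box in run_model(image):  # run_model is a placeholder
        context.add_rectangle(box.label_id, box.points)  # hypothetical method
```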
It is a tempting proposal. 🤔 I'm just not sure if I'll have time to implement it; it's a pretty big change.
Introduce a `cvat-sdk auto-annotate` command that downloads data for a task, then runs a function on the local computer on that data, and uploads resulting annotations back to the task. To support this functionality, add a new SDK module, `cvat_sdk.auto_annotation`, that contains an interface that the functions must follow, and a driver that applies a function to a task.
…ation-related models
acf50c0 to 522582f (compare)
522582f to 0df5821 (compare)
I added a test for the built-in
## \[2.6.0\] - 2023-08-11

### Added

- \[SDK\] Introduced the `DeferredTqdmProgressReporter` class, which avoids the glitchy output seen with the `TqdmProgressReporter` under certain circumstances (<#6556>)
- \[SDK, CLI\] Added the `cvat_sdk.auto_annotation` module, providing functionality to automatically annotate tasks by executing a user-provided function on the local machine. A corresponding CLI command (`auto-annotate`) is also available. Some predefined functions using torchvision are also available. (<#6483>, <#6649>)
- Included an indication for cached frames in the interface (<#6586>)

### Changed

- Raised the default guide assets limitations to 30 assets, with a maximum size of 10MB each (<#6575>)
- \[SDK\] Custom `ProgressReporter` implementations should now override `start2` instead of `start`. The old implementation is still supported. (<#6556>)
- Improved memory optimization and code in the decoding module (<#6585>)

### Removed

- Removed the YOLOv5 serverless function (<#6618>)

### Fixed

- Corrected an issue where the prebuilt FFmpeg bundled in PyAV was being used instead of the custom build.
- Fixed the filename for labels in the CamVid format (<#6600>)
Documentation for cvat-ai#6483.
Motivation and context

Introduce a `cvat-sdk auto-annotate` command that downloads data for a task, then runs a function on the local computer on that data, and uploads resulting annotations back to the task.

To support this functionality, add a new SDK module, `cvat_sdk.auto_annotation`, that contains an interface that the functions must follow, and a driver that applies a function to a task.

This will let users easily annotate their tasks with custom DL models.
How has this been tested?
Unit tests, manual tests.
Checklist

- I submit my changes into the `develop` branch
- [ ] I have linked related issues (see GitHub docs)
- [ ] I have increased versions of npm packages if it is necessary (cvat-canvas, cvat-core, cvat-data and cvat-ui)
License
Feel free to contact the maintainers if that's a concern.