Add auto-annotation support to SDK and CLI #6483
Conversation
6c1f16b to beb0ca6 (compare)
```python
    metavar="MODULE",
    required=True,
    dest="function_module",
)
```
Probably, it makes sense to add other parameters:

- `job_id(s)`
- `clear_existing`/`append`
- script file path
- an extra parameter for module root directory
- `strict` mode, where we fail if the script labels do not exist in the dataset ("label 'car' is not in the dataset; any annotations using it will be ignored" is not something I'd like to see as a user after the model has been running for some time)
- `skip-errors`, to skip the frame if there is an error
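For illustration only, here is a rough argparse sketch (not the PR's actual code) of how a couple of these options could be declared in the same style as the snippet above; the flag names match the `--clear-existing`/`--allow-unmatched-labels` options that were eventually added later in this thread:

```python
# Illustrative sketch; `parser` is the CLI subcommand's argparse parser.
parser.add_argument(
    "--clear-existing",
    action="store_true",
    help="remove existing annotations from the task before uploading new ones",
)
parser.add_argument(
    "--allow-unmatched-labels",
    action="store_true",
    help="ignore labels declared by the function but not configured in the task",
)
```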
Need to discuss this. From one point of view, some of these parameters can be convenient:

- `job_id(s)` - probably we should be able to annotate jobs as well; I agree. But I don't think that we need the argument here.
- `clear_existing` - better to have just a separate command to remove annotations. Otherwise we can get a very complicated interface.
- script file path - OK, instead of a module name.
- `strict` - agree, and maybe it should be the default mode. Thus we avoid many complaints that a function doesn't annotate a task. I ran into the problem today.
- `skip-errors` - I'm not sure we want to skip any errors.
My thoughts:

`job_id(s)`

I think ideally the CLI should have separate sets of task- and job-based commands (e.g. `cvat-cli task list` / `cvat-cli job list`). IIUC, job auto-annotation won't require you to use any task-based APIs, so it would make more sense to implement it as a job subcommand (`cvat-cli job auto-annotate`). However, that would be beyond the scope of this PR.

`clear_existing`/`append`

I agree with Maxim that this should be an option. Moreover, the default should probably be "append". Currently, the command clears existing annotations, but that's mainly just because it's easier. Defaulting to append will avoid accidental data loss.

@nmanovic I think we should have a separate command to clear annotations, but it's not very convenient to run a separate command every time you want to reannotate, so I think an option for this command would be useful to have.
script file path

I'm thinking of adding a `--function-file` option as an alternative to `--function` (and maybe renaming the latter to `--function-module`). Instead of importing, this will load the file as a string and compile it. It will require that the file has no relative imports, though.
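As an aside, a minimal sketch of how such a `--function-file` option could load a function from a path (this is not the PR's implementation; the helper name and the synthetic module name are arbitrary):

```python
import importlib.util
import sys


def load_function_from_file(path: str):
    # Build a module spec directly from the file location.
    spec = importlib.util.spec_from_file_location("cvat_function", path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    # Execute the file's code in the new module's namespace; relative imports
    # inside the file will fail, since it does not belong to a package.
    spec.loader.exec_module(module)
    return module
```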
an extra parameter for module root directory

I don't know what this means.

`strict`

Yeah, fair point. I also agree with Nikita that this should be the default.

`skip-errors`

I get the idea, but then what do you do with frames that were skipped? There's no easy way to find them afterwards, and even if we could find them, we can't rerun the function on just those frames. We could consider some solution for this, but IMO it goes beyond the current scope.
> Instead of importing, this will load the file as a string and compile it. It will require that the file has no relative imports, though.

> an extra parameter for module root directory
>
> I don't know what this means.

I've given an example in the test comments. Basically, it's needed both for relative imports and for fully-qualified imports, to specify where to look for the import path (like `PYTHONPATH` does). Without this, it's hard to use scripts relying on a customized framework (e.g. mmdetection) located somewhere near the CVAT adapter script, or just outside the current directory:

```
# .../some/dir/mymodel/mmdetection/...
# .../some/dir/mymodel/cvat_adapter.py
cvat-cli auto-annotate --module-root '.../some/dir/' --module 'mymodel.cvat_adapter' ...
```

`skip-errors`

> I get the idea, but then what do you do with frames that were skipped? There's no easy way to find them afterwards, and even if we could find them, we can't rerun the function on just those frames.

I thought about things like OOM errors on some frames. Not sure how often it happens in real applications.

Actually, the implementation allows execution on specific frames (except for tracks), as the function is executed on separate frames, so it's only a question of API/CLI. I also see an option to execute the model on only part of the task as a useful feature, though the CLI use may be complicated in this case. Probably, we can leave this for future updates.
> I've given an example in the test comments. Basically, it's needed both for relative imports and for fully-qualified imports, to specify where to look for the import path (like PYTHONPATH does).

So why not just use `PYTHONPATH`?

> Actually, the implementation allows execution on specific frames (except for tracks), as the function is executed on separate frames, so it's only a question of API/CLI.

It's true that we can run the function on specific frames, but how do we know which frames to use (in the context of recovery from a partially successful auto-annotation run)?

> I also see an option to execute the model on only part of the task as a useful feature, though the CLI use may be complicated in this case. Probably, we can leave this for future updates.

I agree.
> So why not just use PYTHONPATH?

Probably, it's an implementation detail, and this obliges users to learn environment variables.

> how do we know which frames to use (in the context of recovery from a partially successful auto-annotation run)?

It could be reported somehow, e.g. in the logs.
I added `--clear-existing` and `--allow-unmatched-labels`. Correspondingly, I changed the defaults to not erase old annotations, and to abort when a function declares labels not configured in the task.

> Probably, it's an implementation detail, and this obliges users to learn environment variables.

The user would have to learn something either way, and at least they might already know about `PYTHONPATH`.
"auto-annotate", | ||
str(fxt_new_task.id), | ||
f"--function={__package__}.example_function", | ||
) |
I suggest adding several other tests for different file placements, because we don't have a package when we call it from the actual CLI (in the tests we cheat a little; in practice we have to use something like `PYTHONPATH="$PWD:$PYTHONPATH" cvat-cli ...`).
Well... what else is there to test? Either way, the `auto-annotate` command just does an import on the provided module name.
- A test that we can execute a function from a file/module in the current directory
- A test that we can execute a function from a file/module outside the current directory subtree, and which is not installed in the current system or Python environment (e.g. `cvat-cli auto-annotate --function ../../my_cvat_model.py ...`)
- A test that we can execute a function from a file/module that references other modules (e.g. the function script wraps a patched framework in a nested directory; an example is mmdetection + a custom script for CVAT on top)
I don't see the point in these tests to be honest. We don't do anything unusual with the import mechanics. If Python can import a module, then so can we (and vice versa).
Related: #5324
```python
@attrs.frozen(kw_only=True)
class DetectionFunctionSpec:
```
From the user's perspective, I would not recommend using `PatchedLabelRequest` and `LabeledShapeRequest`. They are counterintuitive. I would recommend defining alternative naming, something like `Label` or `Shape`. Another approach is to define some high-level interfaces like `LabelingSpec` and `LabeledFrame` inside the SDK and use them. In either case, the interface will be clear and attractive.

Also, even though I can read the comments about `DetectionFunctionContext`, I'm not sure that we need it. How are you going to use it in the future?
I think I'll do something of a middle ground. I'll add factory functions with nice names and nice interfaces that construct the same old `*Request` objects. That way, the function implementation can look nice and clean, but we're not introducing any new concepts (and the users can still construct the low-level request objects if the factory functions are insufficient for some reason).

> Also, even though I can read the comments about DetectionFunctionContext, I'm not sure that we need it. How are you going to use it in the future?

See the thread directly above this one.
> I'll add factory functions with nice names and nice interfaces

I did it.
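For context, a sketch of what a detection function can look like once such factory helpers are in place (the labels and the returned box are placeholder values; `cvataa.rectangle` appears in the tests quoted below, while `label_spec` is assumed here as the analogous label helper):

```python
import PIL.Image
import cvat_sdk.auto_annotation as cvataa

# Declare the labels the function can produce.
spec = cvataa.DetectionFunctionSpec(
    labels=[
        cvataa.label_spec("car", 0),
        cvataa.label_spec("person", 1),
    ],
)


def detect(context, image: PIL.Image.Image):
    # Run some model on the frame and convert its output into shape
    # requests via the helpers; the box below is just a placeholder.
    return [
        cvataa.rectangle(0, [10.0, 20.0, 110.0, 220.0]),  # label ID, [xtl, ytl, xbr, ybr]
    ]
```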
```python
# Copyright (C) 2023 CVAT.ai Corporation
```
Let's try to add a real example based on https://github.com/ultralytics/ultralytics. As you suggested, we can create a Python module with a detection function that is based on ultralytics and use it to explain how to write such functions. Also, we can use it for a demonstration to our users.

Without a way to run the functionality out of the box, this will demotivate 90% of regular users.
Added.
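For reference, a very rough sketch of what an ultralytics-based function could look like. This is not what the PR ships (the predefined functions it adds are torchvision-based, per the changelog below), and the ultralytics API calls here are assumptions about that library rather than code from this PR:

```python
import PIL.Image
import cvat_sdk.auto_annotation as cvataa
from ultralytics import YOLO  # assumed third-party dependency

_model = YOLO("yolov8n.pt")  # assumed pretrained checkpoint name

# Expose the model's own class names as CVAT label specs.
spec = cvataa.DetectionFunctionSpec(
    labels=[cvataa.label_spec(name, id) for id, name in _model.names.items()],
)


def detect(context, image: PIL.Image.Image):
    results = _model(image)  # run inference on a single frame
    # Convert each predicted box (xyxy + class index) into a rectangle request.
    return [
        cvataa.rectangle(int(cls), [float(v) for v in box])
        for box, cls in zip(results[0].boxes.xyxy, results[0].boxes.cls)
    ]
```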
tests/python/cli/test_cli.py (outdated)

```python
self.run_cli(
    "auto-annotate",
    str(fxt_new_task.id),
    f"--function={__package__}.example_function",
```
Again, from a regular user's perspective, I would recommend supporting at least several interfaces. I'm not sure that we have to force users to know what a Python module is. At the same time, many users know what a directory is. Do you think we can redesign the interface to use concepts that are widely known among CLI users?
I added an alternate option that takes a file path.
```python
        mapper.validate_and_remap(frame_shapes, sample.frame_index)
        shapes.extend(frame_shapes)

    client.logger.info("Uploading annotations to task %d", task_id)
```
Here we force users to wait until all frames are annotated. In the CLI we should be able to show progress, and I also believe it will be useful to have a configuration option to upload data in batches. What do you think?
Regarding progress: yes, we should.

Regarding uploading data in batches: I think notionally it could be useful, but in practice there are problems:

- I'm not sure if the API supports partial uploads like this.
- If the annotation fails mid-way, there's no way to restart it from where it left off (and if it doesn't fail mid-way, then what's the point of uploading in batches?).
I added progress reporting.
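For illustration, a hedged sketch of driving auto-annotation from the SDK with a progress reporter. It assumes `annotate_task` is the driver entry point, that it accepts a `pbar` argument, and that `DeferredTqdmProgressReporter` (mentioned in the progress-reporting rework below) lives in `cvat_sdk.core.helpers`; treat all of these as assumptions rather than confirmed API:

```python
from cvat_sdk import make_client
from cvat_sdk.core.helpers import DeferredTqdmProgressReporter  # assumed location
import cvat_sdk.auto_annotation as cvataa

import my_function  # hypothetical user module exposing `spec` and `detect`

with make_client("http://localhost:8080", credentials=("user", "password")) as client:
    cvataa.annotate_task(
        client,
        123,  # task ID
        my_function,
        pbar=DeferredTqdmProgressReporter(),  # shows a progress bar while frames are processed
        clear_existing=False,  # append to existing annotations instead of erasing them
    )
```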
2cd36e9 to 0dab04c (compare)
There are several problems with how progress reporting is handled in the SDK, both on the interface and implementation level:

- The user is supposed to know, for a given function, which units it will report progress in. This is unnecessary coupling and prevents us from switching to different units, or from having several progress bars using different units.
- To create a `TqdmProgressReporter`, you have to create a tqdm instance, which immediately draws a progress bar. This works poorly if the function prints any log messages before the progress actually starts.
- There's no easy way to automatically call `finish` on a progress bar, so some functions don't (for example, `Downloader.download_file`). This can cause unexpected output, since tqdm will refresh the progress bar in a background thread (possibly after we've already printed something else).
- `iter` is basically broken, because it divides by `period`, which is 0 in all current implementations.
- Even ignoring that, it's hard to use correctly, because you need to manually call `finish` in case of an exception.
- `split` is not implemented and not used.
- `StreamWithProgress.seek` assumes that the progress bar is at 0 at the start, and so does `ProgressReporter.iter`. The former also works incorrectly if the second argument is not `SEEK_SET`.

Fix these problems by doing the following:

- Add a new `start2` method which accepts more parameters. The default implementation calls `start`, so that if a user has implemented the original interface, it'll keep working.
- Add a `DeferredTqdmProgressReporter` that accepts tqdm parameters instead of a tqdm instance, and only creates an instance after `start2` is called. Use it where `TqdmProgressReporter` was used before. The old `TqdmProgressReporter` is kept for compatibility, but it doesn't support any `start2` arguments other than those supported by the original `start`.
- Add a `task` context manager, which automatically calls `start2` and `finish`. Use it everywhere instead of explicit `start`/`finish` calls. Remove `start`/`finish` calls from `StreamWithProgress` and `iter`.
- Implement basic assertions to ensure that `start2` and `finish` are used correctly.
- Remove `period` and `split`.
- Rewrite `StreamWithProgress.seek` and `ProgressReporter.iter` to use relative progress reports.

These changes should be backwards compatible for users who pass predefined or custom progress reporters into SDK functions. They are not backwards compatible for users who try to use progress reporters directly (e.g. calling `start`/`finish`). I don't consider that a significant issue, since the purpose of the `ProgressReporter` interface is for the user to get progress information from the SDK, not for them to use it in their own code.

Originally developed for #6483.
352358b to acf50c0 (compare)
Codecov Report

```
@@            Coverage Diff             @@
##           develop    #6483      +/-  ##
===========================================
+ Coverage    81.17%   81.23%   +0.06%
===========================================
  Files          363      368       +5
  Lines        39606    39813     +207
  Branches      3557     3557
===========================================
+ Hits         32150    32342     +192
- Misses        7456     7471      +15
```
```python
return [
    cvataa.rectangle(0, [1, 2, 3, 4]),
]
```
How about changing the interface to be the following:

- The function does not return anything
- There is a function to add an annotation for a frame (e.g. `context.add_rectangle()`)

This way:

- the users won't need to know the return annotation types (there can be another function for the users who know)
- we can change the function API without breaking it
- we can manage annotation storage (and optimize it, potentially)
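A purely hypothetical sketch of what this proposed context-based interface might look like (none of these `add_*` methods exist in the SDK; `run_model` is a placeholder for whatever inference the user performs):

```python
def detect(context, image):
    # The context accumulates annotations itself, so the function
    # returns nothing.
    for box in run_model(image):  # run_model is a placeholder
        context.add_rectangle(box.label_id, box.points)  # hypothetical method
```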
It is a tempting proposal. 🤔 I'm just not sure if I'll have time to implement it; it's a pretty big change.
Introduce a `cvat-sdk auto-annotate` command that downloads data for a task, then runs a function on the local computer on that data, and uploads resulting annotations back to the task. To support this functionality, add a new SDK module, `cvat_sdk.auto_annotation`, that contains an interface that the functions must follow, and a driver that applies a function to a task.
…ation-related models
acf50c0 to 522582f (compare)
522582f to 0df5821 (compare)
I added a test for the built-in
## \[2.6.0\] - 2023-08-11

### Added

- \[SDK\] Introduced the `DeferredTqdmProgressReporter` class, which avoids the glitchy output seen with the `TqdmProgressReporter` under certain circumstances (<#6556>)
- \[SDK, CLI\] Added the `cvat_sdk.auto_annotation` module, providing functionality to automatically annotate tasks by executing a user-provided function on the local machine. A corresponding CLI command (`auto-annotate`) is also available. Some predefined functions using torchvision are also available. (<#6483>, <#6649>)
- Included an indication for cached frames in the interface (<#6586>)

### Changed

- Raised the default guide assets limitations to 30 assets, with a maximum size of 10MB each (<#6575>)
- \[SDK\] Custom `ProgressReporter` implementations should now override `start2` instead of `start`. The old implementation is still supported. (<#6556>)
- Improved memory optimization and code in the decoding module (<#6585>)

### Removed

- Removed the YOLOv5 serverless function (<#6618>)

### Fixed

- Corrected an issue where the prebuilt FFmpeg bundled in PyAV was being used instead of the custom build.
- Fixed the filename for labels in the CamVid format (<#6600>)
Documentation for cvat-ai#6483.
Motivation and context

Introduce a `cvat-sdk auto-annotate` command that downloads data for a task, then runs a function on the local computer on that data, and uploads resulting annotations back to the task.

To support this functionality, add a new SDK module, `cvat_sdk.auto_annotation`, that contains an interface that the functions must follow, and a driver that applies a function to a task.

This will let users easily annotate their tasks with custom DL models.
How has this been tested?
Unit tests, manual tests.
Checklist

- I submit my changes into the `develop` branch
- [ ] I have linked related issues (see GitHub docs)
- [ ] I have increased versions of npm packages if it is necessary (cvat-canvas, cvat-core, cvat-data and cvat-ui)
License
Feel free to contact the maintainers if that's a concern.