
auto-annotation fails for large videos #1224

Closed
gitunit opened this issue Mar 2, 2020 · 10 comments
Labels: enhancement (New feature or request)

Comments

gitunit commented Mar 2, 2020

I attempted to auto-annotate a large video (2 hours, roughly 140,000 frames) with a YOLOv3 model (OpenVINO). The first attempt simply got stuck: no progress was visible in the UI even after several days. I later ran docker-compose down and up again and retried several times; it always got stuck, and only recently did popups appear saying the task had failed.
On another attempt I watched the process with htop to see how much CPU and RAM it used. As soon as RAM was fully occupied, the auto-annotation stopped making progress (although the process itself kept running).
Long story short: long videos do not seem to work, while shorter videos show no issue.
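For anyone trying to reproduce this, one quick way to confirm the RAM growth from inside the worker (rather than eyeballing htop) is to log the process's peak resident set size. A minimal stdlib sketch, assuming a Linux worker (where ru_maxrss is reported in KiB):

```python
import resource

def peak_rss_mib():
    """Peak resident set size of the current process, in MiB.

    Note: on Linux ru_maxrss is in KiB; on macOS it is in bytes,
    so this helper assumes a Linux worker (as in the CVAT containers).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

# Log this before and after each chunk of frames to see where memory grows.
print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```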

gitunit (Author) commented Mar 2, 2020

I suspect some kind of memory leak. I'm currently testing with another model (SSD), which does not show the same rapid increase in RAM usage.

gitunit (Author) commented Mar 2, 2020

I've just tried YOLOv3 auto-annotation on my local machine with 32 GB of RAM, and it consumed all of it. Even the 30 GB of swap was fully occupied, so roughly 62 GB was used in total. There is clearly a major memory leak.

Edit: at this point I'm considering dropping OpenVINO altogether.

gitunit (Author) commented Mar 2, 2020

I've tried the C++ sample from the Model Zoo (native OpenVINO) called "object_detection_demo_yolov3_async", and no memory leak was observable there. Can I assume interp.py is to blame?

gitunit (Author) commented Mar 2, 2020

After testing with an empty interp.py script, I think we can rule that out, so there must be a major memory leak somewhere inside CVAT.
I have also tested OpenVINO "natively" (C++ and Python) [see here] with this model, and there I couldn't observe any memory leaks. So the problem really must be inside CVAT.

benhoff (Contributor) commented Mar 3, 2020

I'm wondering whether the use of exec means that the interp.py code, or the result object (the Result class from the auto-annotation module), is never reference-counted down to 0. Technically the exec'd code holds a copy/reference to the result, in addition to the main code.

I thought I'd also seen reports of the rq workers not cleaning up memory properly with respect to auto-annotation, but a quick search didn't turn anything up.

This is the line where the interp code gets exec'd:
https://github.com/opencv/cvat/blob/b3f7f5b8bcc40a10871a3a0aefe3d4757d78b4e3/cvat/apps/engine/utils.py#L45

The method gets called from here:

https://github.com/opencv/cvat/blob/b3f7f5b8bcc40a10871a3a0aefe3d4757d78b4e3/cvat/apps/auto_annotation/inference.py#L30

I'd probably check how the compiled interp code gets cleaned up, and also whether the Result object is released.

One could put a print statement in the destructor of the Result class to see if it's being collected correctly.

But those are just idle thoughts.
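The reference-counting hypothesis above is easy to check in isolation. A minimal sketch (this Result is a stand-in class, not CVAT's actual one) showing that an object passed into an exec'd namespace stays alive until every name in that namespace is dropped:

```python
class Result:
    """Stand-in for the auto-annotation Result class (hypothetical)."""
    collected = False

    def __del__(self):
        # In the real code this could be a print statement, as suggested above.
        Result.collected = True

# Simulate running interp.py via exec() with the result in its globals.
globs = {"result": Result()}
exec(compile("leaked = result", "interp.py", "exec"), globs)

# Dropping the original name is not enough: the exec'd globals dict
# still holds the object under the name the interp code created.
del globs["result"]
assert not Result.collected

# Only when every name in the exec'd namespace is gone does CPython
# reference-count the object down to zero and call __del__.
del globs["leaked"]
assert Result.collected
```

If the real worker keeps such a globals dict (or the compiled code object) alive between frames, the results can pile up exactly as observed.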

gitunit (Author) commented Mar 4, 2020

@benhoff I agree, it's probably the Result object: I ran a test with an empty interp.py file and the memory leak was still there.
Interestingly, the leak is bigger for YOLOv3 than for SSD, which makes sense because YOLOv3 generates more results.
@bsekachev probably has more insight.

gitunit (Author) commented Mar 6, 2020

I have identified the actual problem. It is on this line:
https://github.com/opencv/cvat/blob/b3f7f5b8bcc40a10871a3a0aefe3d4757d78b4e3/cvat/apps/auto_annotation/inference.py#L133

This object grows until all frames have been processed, so it lives in RAM for the whole run. One option would be to write it to disk and, once inference is finished, process it in chunks. Any other ideas? @bsekachev
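The write-to-disk idea could look something like this minimal sketch: stream one pickled record per frame to a temp file during inference, then read the records back sequentially once inference is done (`infer_frames`, `infer_one`, and the record layout are placeholders, not CVAT's actual API):

```python
import os
import pickle
import tempfile

def infer_frames(frames, infer_one):
    """Run inference frame by frame, spilling each result to disk so
    the detections never have to sit in RAM all at once (a sketch,
    assuming results pickle cheaply and order must be preserved)."""
    fd, path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:
        # Phase 1: append one small record per frame; RAM stays bounded.
        with open(path, "wb") as out:
            for frame_number, frame in enumerate(frames):
                pickle.dump((frame_number, infer_one(frame)), out)
        # Phase 2: stream the records back in order for post-processing.
        with open(path, "rb") as inp:
            while True:
                try:
                    yield pickle.load(inp)
                except EOFError:
                    break
    finally:
        os.remove(path)
```

The trade-off is one extra disk round-trip per frame, which should be negligible next to the inference itself.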

benhoff (Contributor) commented Mar 7, 2020

> this object grows until all frames have been processed and thus always lives in RAM. maybe one way is to write it to disk and then if finished with inference, process it in chunks. any other ideas?

You're probably better off batching it in the model manager. See here:

https://github.com/opencv/cvat/blob/24130cda415f3fce28ad5b6890368f284c18746c/cvat/apps/auto_annotation/model_manager.py#L244

get_image_data is the start of the problem, because it has no idea how many frames there are.
It should probably check the number of frames available and, above some threshold, grab them in batches of 50 or 100.
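The batching idea could be sketched as a small helper that yields frame indices in fixed-size groups; the caller would then fetch and infer one batch at a time, flushing results before the next (`batched_frame_numbers` is a hypothetical name, not part of model_manager):

```python
def batched_frame_numbers(frame_count, batch_size=100):
    """Yield lists of frame indices in groups of at most batch_size,
    so only one batch of decoded frames needs to be in RAM at a time."""
    for start in range(0, frame_count, batch_size):
        yield list(range(start, min(start + batch_size, frame_count)))

# e.g. 250 frames with batch_size=100 -> batches of 100, 100, and 50 indices
```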

This file may already have modifications in a different PR. nmanovic mentioned that one of the upcoming pull requests was going to break some of my past work; see this note: #934 (comment)

It would be worth checking out what happened in #787 before you go fix the problem :)

benhoff (Contributor) commented Apr 14, 2020

@gitunit, would it be possible for you to test #1328 and see if it resolves your problem?

gitunit (Author) commented Apr 20, 2020

@benhoff I will try in the coming days.
