
Support frame selection when create from video #437

Merged: 1 commit, May 30, 2019

Conversation

@zliang7 (Contributor) commented Apr 30, 2019

@nmanovic PTAL

@nmanovic (Contributor) commented:

@zliang7 , thank you for the merge request. I hope we will merge the updated documentation today. After that I will create a new release of CVAT, and then review and merge your PR. Please expect delays because we have a lot of public holidays in Russia; I will try to review the patch ASAP in any case.

@nmanovic nmanovic added this to the 0.5.0 - Alpha milestone Apr 30, 2019
@nmanovic nmanovic requested review from bsekachev and nmanovic April 30, 2019 12:36
@nmanovic (Contributor) commented May 5, 2019

@zliang7 ,

Could you please look at the comments from Boris Sekachev? My comments are below:

  • First of all, I don't like that we have start_frame, stop_frame, and step in several places (the Video and Task tables). I would prefer to keep them only in the Task table.

  • Instead of a step, I would prefer a more general concept: frame_filter. For now you would implement only the "step" filter (e.g. step=1). In the future we can add a filter by name, a filter which selects only very different frames, or a visual search like "frames which contain a car" (if we have a good classifier).

  • To me, a video file is a collection of images, just like a directory with images. That is why I don't understand why you implemented the concept only for video files. Imagine you have a big directory on "share": instead of choosing every 15th frame yourself, you could just say you want to annotate the images with step 15.
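The "step" idea in the bullets above can be sketched in a few lines of Python. This is a hypothetical illustration, not CVAT code; the helper names are invented, but it shows how the same filter applies uniformly to a video's frame range and to a directory of image files:

```python
# Hypothetical helpers (not CVAT's actual code) showing a 'step' filter
# applied both to a video frame range and to a listing of image files.
def select_frames(start, stop, step):
    """Return the original frame numbers kept by a 'step' filter."""
    return list(range(start, stop + 1, step))

def select_images(paths, step):
    """The same idea applied to a directory listing of image files."""
    return paths[::step]

select_frames(0, 60, 15)   # -> [0, 15, 30, 45, 60]
select_images(["a.jpg", "b.jpg", "c.jpg", "d.jpg"], 2)  # -> ["a.jpg", "c.jpg"]
```

This is the unification point made in the last bullet: whether the frames come from a video or from "share", selecting every Nth one is the same operation.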

@zliang7 (Contributor, Author) commented May 6, 2019

> First of all, I don't like that we have start_frame, stop_frame, and step in several places (the Video and Task tables). I would prefer to keep them only in the Task table.

I tried to reuse the fields in the Video table, but it seems every parameter in the POST request needs a field in the Task table. If I didn't add these fields, the request failed.

> Instead of a step, I would prefer a more general concept: frame_filter. For now you would implement only the "step" filter (e.g. step=1).

This PR implements the 'step' feature; it only affects the generation of the ffmpeg command line, nothing else. I think supporting sophisticated filters is not an easy task, especially "frames which contain a car". How would that be expressed in the UI and in the internal data structures? I think it needs a serious design.

> To me, a video file is a collection of images, just like a directory with images. Instead of choosing every 15th frame yourself, you could just say you want to annotate the images with step 15.

I don't quite understand. A video is a single file that contains many continuous images, so it makes sense to pick out the necessary ones. But image files are already individual files, so why put unnecessary image files into "share" in the first place? Many file utilities can do that task. That said, I'm OK with it if you do think it makes sense.
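The "generation of the command line of ffmpeg" mentioned above can be illustrated with a small sketch. The `select` filter and `-vsync vfr` are real ffmpeg options, but the helper function and output pattern are hypothetical and this is not necessarily the exact command the patch generates:

```python
# Hypothetical sketch of building an ffmpeg command line for a 'step'
# filter; not necessarily the command CVAT actually generates.
def ffmpeg_step_command(video_path, step, out_pattern="frame_%05d.jpg"):
    """Build (but do not run) an ffmpeg command that keeps every Nth frame."""
    # select='not(mod(n\,step))' keeps frames 0, step, 2*step, ...;
    # the comma is backslash-escaped inside the filter expression.
    # -vsync vfr makes the output numbering skip the discarded frames.
    return [
        "ffmpeg", "-i", video_path,
        "-vf", "select=not(mod(n\\,%d))" % step,
        "-vsync", "vfr",
        out_pattern,
    ]
```

The returned list can be passed straight to `subprocess.run`, avoiding shell quoting issues.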

@nmanovic (Contributor) commented May 6, 2019

@zliang7 ,

> I tried to reuse the fields in the Video table, but it seems every parameter in the POST request needs a field in the Task table. If I didn't add these fields, the request failed.

I don't understand that statement. We don't need duplicated fields: please keep them in the Task table only and remove them from the Video table.

> This PR implements the 'step' feature; it only affects the generation of the ffmpeg command line, nothing else.

In the UI it should just be a string for now; you only need to implement the step=N filter, which is not a big redesign of your patch. Internally in the DB it should also be a string for now. When a task is created, the string will be used to filter the provided frames or images. In the future we can provide other filter values like regex=.*\.jpg, glob=*.jpg, similarity=0.1, detector=car,person, or a combination of these filters. I hope you see that the feature has very good potential: one of the fundamental problems in deep learning is avoiding the annotation of redundant data (which only increases your dataset without adding any value to your model). I don't ask you to implement the whole feature, only to take a small step forward. I agree that in its final version your code will change, but at least the next couple of steps (like the mentioned filters) will be easy to implement.

> A video is a single file that contains many continuous images, so it makes sense to pick out the necessary ones. But image files are already individual files, so why put unnecessary image files into "share" in the first place? Many file utilities can do that task.

For a video file there are also a lot of utilities that can convert it to a set of frames with a specific step (like ffmpeg), but we want to implement the functionality because our customers don't want to do that manually. The same applies to images: of course you can filter them manually, create another directory, and put all matched files into it, but again that is a lot of manual work. I believe in unification. If we have a feature A that is applicable to a video file and it makes sense for images as well (especially if you think about filters), I don't understand why we should implement the feature only for video files.
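The frame_filter-as-string idea described above can be sketched as a small parser. This is a hypothetical illustration, not CVAT's actual code; the key=value form is what leaves room for future values like regex=... or detector=... without schema changes:

```python
# Hypothetical sketch (not CVAT's actual code) of parsing a frame_filter
# string stored as plain text in the Task table.
def parse_frame_filter(frame_filter):
    """Parse a filter string such as 'step=200' (or 'step = 200') into a dict."""
    params = {}
    for part in frame_filter.split(","):
        key, _, value = part.partition("=")
        if key.strip():
            # Stripping whitespace tolerates both 'step=200' and 'step = 200'.
            params[key.strip()] = value.strip()
    return params

def get_step(frame_filter):
    """Return the step value, defaulting to 1 when no filter is set."""
    return int(parse_frame_filter(frame_filter).get("step", 1))
```

Comma separation would later allow combinations such as "step=2,glob=*.jpg" to parse into multiple filter parameters.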

@zliang7 (Contributor, Author) commented May 9, 2019

@nmanovic I removed the 'start_frame', 'stop_frame', and 'step' fields from the Video table, and changed 'step' in the Task table to 'frame_filter', which accepts a 'step=N' parameter.
Regarding applying it to image files, I can't work that out in a short time, because I will travel abroad next Monday. Since it doesn't affect the functionality, can we merge this PR first if there are no other problems? I will submit another PR for the third point.

3 review comments on cvat/apps/engine/task.py (outdated, resolved)
@nmanovic (Contributor) commented:

@zliang7 ,

Again, thanks for your patience. The patch looks much better. I found a couple of problems which should be fixed before the merge. That is normal: CVAT is quite complex software, and it is very difficult for a new developer to find the right way to implement a feature. That is why I'm here to advise.

I don't really think we need to hurry. I'm OK with committing the functionality in two steps, but I don't see any reason for that. I believe it is very easy to extend your patch to the image case; it should be a couple of lines of code.

@zliang7 zliang7 force-pushed the develop branch 2 times, most recently from 1885922 to 73cb93f Compare May 21, 2019 13:44
@zliang7 (Contributor, Author) commented May 21, 2019

@nmanovic I'm back and I updated the PR as per your comments.

@nmanovic (Contributor) commented:

> @nmanovic I'm back and I updated the PR as per your comments.

@zliang7 , I cannot see modifications in the dump function. Could you please attach an example of a CVAT XML file for a video file with filter="step=2"?

@nmanovic (Contributor) commented:

@zliang7 , we need to add information about the feature to CHANGELOG.md as well.

@zliang7 (Contributor, Author) commented May 27, 2019

> @zliang7 , I cannot see modifications in the dump function. Could you please attach an example of a CVAT XML file for a video file with filter="step=2"?

The change is in apps/engine/annotation.py. Here is an example of "Dump Annotation":

<?xml version="1.0" encoding="utf-8"?>
<annotations>
  <version>1.1</version>
  <meta>
    <task>
      <id>53</id>
      <name>video0</name>
      <size>7</size>
      <mode>interpolation</mode>
      <overlap>3</overlap>
      <bugtracker></bugtracker>
      <flipped>False</flipped>
      <created>2019-05-22 16:38:45.545441+03:00</created>
      <updated>2019-05-22 16:38:45.545482+03:00</updated>
      <start_frame>0</start_frame>
      <stop_frame>1379</stop_frame>
      <frame_filter>step = 200</frame_filter>
      <labels>
        <label>
          <name>label0</name>
          <attributes>
          </attributes>
        </label>
      </labels>
      <segments>
        <segment>
          <id>19</id>
          <start>0</start>
          <stop>6</stop>
          <url>http://127.0.0.1:8080/?id=19</url>
        </segment>
      </segments>
      <owner>
        <username>django</username>
        <email></email>
      </owner>
      <original_size>
        <width>720</width>
        <height>720</height>
      </original_size>
    </task>
    <dumped>2019-05-28 11:21:07.095873+03:00</dumped>
  </meta>
</annotations>

@zliang7 (Contributor, Author) commented May 27, 2019

> @zliang7 , we need to add information about the feature to CHANGELOG.md as well.

Updated.

@nmanovic (Contributor) commented:

@zliang7 , thanks! It is a great patch. I hope we have only one final step left. I asked you to provide an example of a dump file so I can be sure you dumped the following frames: 0, 200, 400, .... Could you please confirm that? Could you please annotate a couple of frames and provide the dump file again? I just cannot see that code in your patch. As we discussed, we need to dump the real frame numbers instead of 0, 1, 2, ... in this case.
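The frame-number requirement stated above can be sketched as a one-line mapping. This is a hypothetical illustration, not the patch's actual code: a task stores its frames internally as 0, 1, 2, ..., and the dump must translate each index back to the original video frame number:

```python
# Hypothetical mapping (not the patch's actual code) from a task's internal
# frame index to the original video frame number, given start_frame and step.
def internal_to_original(idx, start_frame, step):
    """Internal frame idx -> original frame number in the source video."""
    return start_frame + idx * step

[internal_to_original(i, 0, 200) for i in range(3)]    # -> [0, 200, 400]
[internal_to_original(i, 300, 200) for i in range(6)]  # -> [300, 500, ..., 1300]
```

With start_frame=0 and step=200 this yields exactly the 0, 200, 400, ... sequence requested here.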

@zliang7 (Contributor, Author) commented May 30, 2019

> @zliang7 , thanks! It is a great patch. I asked you to provide an example of a dump file so I can be sure you dumped the following frames: 0, 200, 400, .... Could you please confirm that? Could you please annotate a couple of frames and provide the dump file again? As we discussed, we need to dump the real frame numbers instead of 0, 1, 2, ... in this case.

Updated the PR. Here is a dump file:

<?xml version="1.0" encoding="utf-8"?>
<annotations>
  <version>1.1</version>
  <meta>
    <task>
      <id>55</id>
      <name>video1</name>
      <size>6</size>
      <mode>interpolation</mode>
      <overlap>3</overlap>
      <bugtracker></bugtracker>
      <flipped>False</flipped>
      <created>2019-05-28 17:13:46.883186+03:00</created>
      <updated>2019-05-31 15:58:04.568319+03:00</updated>
      <start_frame>300</start_frame>
      <stop_frame>1300</stop_frame>
      <frame_filter>step=200</frame_filter>
      <labels>
        <label>
          <name>label1</name>
          <attributes>
          </attributes>
        </label>
      </labels>
      <segments>
        <segment>
          <id>21</id>
          <start>0</start>
          <stop>5</stop>
          <url>http://127.0.0.1:8080/?id=21</url>
        </segment>
      </segments>
      <owner>
        <username>django</username>
        <email></email>
      </owner>
      <original_size>
        <width>720</width>
        <height>720</height>
      </original_size>
    </task>
    <dumped>2019-05-31 16:38:54.818640+03:00</dumped>
  </meta>
  <track id="0" label="label1">
    <box frame="300" outside="0" occluded="0" keyframe="1" xtl="186.50" ytl="19.30" xbr="342.64" ybr="258.78">
    </box>
    <box frame="500" outside="0" occluded="0" keyframe="1" xtl="81.20" ytl="53.51" xbr="260.20" ybr="290.40">
    </box>
    <box frame="700" outside="0" occluded="0" keyframe="1" xtl="149.62" ytl="132.46" xbr="328.62" ybr="381.60">
    </box>
    <box frame="900" outside="0" occluded="0" keyframe="1" xtl="180.32" ytl="165.80" xbr="356.70" ybr="401.80">
    </box>
    <box frame="1100" outside="0" occluded="0" keyframe="0" xtl="180.32" ytl="165.80" xbr="356.70" ybr="401.80">
    </box>
    <box frame="1300" outside="0" occluded="0" keyframe="0" xtl="180.32" ytl="165.80" xbr="356.70" ybr="401.80">
    </box>
  </track>
</annotations>

@nmanovic (Contributor) left a review comment:

@zliang7 , it looks good to me. Thanks for the contribution. It was your first and it was difficult, but the final result is awesome. I was able to create a task with step 10, annotate it, and load the annotations into a task with step 1. It works as expected (at least I was not able to see any artefacts).
