cvat-ai · nmanovic · Mar 24, 2021 · Feb 2, 2021
@@ -5,6 +5,7 @@ branch = true
 source =
     cvat/apps/
     utils/cli/
+    utils/dataset_manifest
 
 omit =
     cvat/settings/*

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -26,6 +26,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [Backup/Restore guide](cvat/apps/documentation/backup_guide.md) (<https://github.com/openvinotoolkit/cvat/pull/2964>)
 - Label deletion from tasks and projects (<https://github.com/openvinotoolkit/cvat/pull/2881>)
 - [Market-1501](https://www.aitribune.com/dataset/2018051063) format support (<https://github.com/openvinotoolkit/cvat/pull/2869>)
+- Ability to upload a manifest for a dataset with images (<https://github.com/openvinotoolkit/cvat/pull/2763>)
 - Annotations filters UI using react-awesome-query-builder (https://github.com/openvinotoolkit/cvat/issues/1418)
 
 ### Changed
@@ -40,6 +41,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Image visualizations settings on canvas for faster access (<https://github.com/openvinotoolkit/cvat/pull/2872>)
 - Better scale management of left panel when screen is too small (<https://github.com/openvinotoolkit/cvat/pull/2880>)
 - Improved error messages for annotation import (<https://github.com/openvinotoolkit/cvat/pull/2935>)
+- Use manifests instead of video meta information and dummy chunks (<https://github.com/openvinotoolkit/cvat/pull/2763>)
 
 ### Deprecated
 

@@ -2,40 +2,25 @@
 
 ## Description
 
-Data on the fly processing is a way of working with data, the main idea of which is as follows:
-Minimum necessary meta information is collected, when task is created.
-This meta information allows in the future to create a necessary chunks when receiving a request from a client.
+On-the-fly data processing is a way of working with data whose main idea is as follows: when a task is created,
+only the minimum necessary meta information is collected. This meta information later makes it possible to create
+the necessary chunks when a request is received from a client.
 
-Generated chunks are stored in a cache of limited size with a policy of evicting less popular items.
+Generated chunks are stored in a cache of limited size with a policy of evicting less popular items.
 
-When a request received from a client, the required chunk is searched for in the cache.
-If the chunk does not exist yet, it is created using a prepared meta information and then put into the cache.
+When a request is received from a client, the required chunk is searched for in the cache. If the chunk does not exist
+yet, it is created using prepared meta information and then put into the cache.
 
 This method of working with data allows:
 
 - reduce the task creation time.
-- store data in a cache of limited size with a policy of evicting less popular items.
+- store data in a cache of limited size with a policy of evicting less popular items.
 
-## Prepare meta information
+Unfortunately, this method does not work for every video, even one with a valid manifest file. If the video does not
+have enough keyframes for smooth decoding, the task is created another way: all chunks are prepared
+during task creation, which may take some time.
 
-Different meta information is collected for different types of uploaded data.
+#### Uploading a manifest with data
 
-### Video
-
-For video, this is a valid mapping of key frame numbers and their timestamps. This information is saved to `meta_info.txt`.
-
-Unfortunately, this method will not work for all videos with valid meta information.
-If there are not enough keyframes in the video for smooth video decoding, the task will be created in the old way.
-
-#### Uploading meta information along with data
-
-When creating a task, you can upload a file with meta information along with the video,
-which will further reduce the time for creating a task.
-You can see how to prepare meta information [here](/utils/prepare_meta_information/README.md).
-
-It is worth noting that the generated file also contains information about the number of frames in the video at the end.
-
-### Images
-
-Mapping of chunk number and paths to images that should enter the chunk
-is saved at the time of creating a task in a files `dummy_{chunk_number}.txt`
+When creating a task, you can upload a `manifest.jsonl` file along with the video or dataset with images.
+You can see how to prepare it [here](/utils/dataset_manifest/README.md).
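
The lookup described in the documentation change above (search the cache, build the chunk on a miss, evict less popular items when the cache is full) can be sketched as follows. This is only an illustration under stated assumptions: `BoundedChunkCache`, `get_or_create`, and the `build` callback are hypothetical names, the store is a plain in-memory dict, and "less popular" is read here as "requested the fewest times". The cache backend CVAT actually uses is not shown in this diff.

```python
from typing import Callable, Dict


class BoundedChunkCache:
    """Hypothetical in-memory cache that evicts the least-requested chunk when full."""

    def __init__(self, max_items: int) -> None:
        self._max_items = max_items
        self._chunks: Dict[int, bytes] = {}
        self._hits: Dict[int, int] = {}

    def get_or_create(self, chunk_number: int, build: Callable[[int], bytes]) -> bytes:
        if chunk_number not in self._chunks:
            if len(self._chunks) >= self._max_items:
                # Evict the chunk that has been requested the fewest times so far.
                victim = min(self._hits, key=self._hits.get)
                del self._chunks[victim]
                del self._hits[victim]
            # Build the chunk lazily, only when a client first asks for it.
            self._chunks[chunk_number] = build(chunk_number)
            self._hits[chunk_number] = 0
        self._hits[chunk_number] += 1
        return self._chunks[chunk_number]

    def __contains__(self, chunk_number: int) -> bool:
        return chunk_number in self._chunks
```

With `max_items=2`, requesting chunk 0 twice, then chunk 1, then chunk 2 would evict chunk 1, because it has the fewest requests when the cache fills up.
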
@@ -15,7 +15,6 @@
 - [How to create a task with multiple jobs](#how-to-create-a-task-with-multiple-jobs)
 - [How to transfer CVAT to another machine](#how-to-transfer-cvat-to-another-machine)
 
-
 ## How to update CVAT
 
 Before upgrading, please follow the [backup guide](backup_guide.md) and backup all CVAT volumes.
@@ -151,4 +150,5 @@ Set the segment size when you create a new task, this option is available in the
 [Advanced configuration](user_guide.md#advanced-configuration) section.
 
 ## How to transfer CVAT to another machine
+
 Follow the [backup/restore guide](backup_guide.md#how-to-backup-all-cvat-data).
@@ -153,8 +153,8 @@ Go to the [Django administration panel](http://localhost:8080/admin). There you
     **Select files**. Press tab `My computer` to choose some files for annotation from your PC.
     If you select tab `Connected file share` you can choose files for annotation from your network.
     If you select ` Remote source` , you'll see a field where you can enter a list of URLs (one URL per line).
-    If you upload a video data and select `Use cache` option, you can along with the video file attach a file with meta information.
-    You can find how to prepare it [here](/utils/prepare_meta_information/README.md).
+    If you upload a video or dataset with images and select the `Use cache` option, you can attach a `manifest.jsonl` file.
+    You can find how to prepare it [here](/utils/dataset_manifest/README.md).
 
     ![](static/documentation/images/image127.jpg)
 
@@ -1157,8 +1157,6 @@ Intelligent scissors is an CV method of creating a polygon by placing points wit
 The distance between the adjacent points is limited by the threshold of action,
 displayed as a red square which is tied to the cursor.
 
-
-
 - First, select the label and then click on the `intelligent scissors` button.
 
   ![](static/documentation/images/image199.jpg)

@@ -1,4 +1,4 @@
-# Copyright (C) 2020 Intel Corporation
+# Copyright (C) 2020-2021 Intel Corporation
 #
 # SPDX-License-Identifier: MIT
 
@@ -9,9 +9,9 @@
 from django.conf import settings
 
 from cvat.apps.engine.media_extractors import (Mpeg4ChunkWriter,
-    Mpeg4CompressedChunkWriter, ZipChunkWriter, ZipCompressedChunkWriter)
+    Mpeg4CompressedChunkWriter, ZipChunkWriter, ZipCompressedChunkWriter,
+    ImageDatasetManifestReader, VideoDatasetManifestReader)
 from cvat.apps.engine.models import DataChoice, StorageChoice
-from cvat.apps.engine.prepare import PrepareInfo
 from cvat.apps.engine.models import DimensionType
 
 class CacheInteraction:
@@ -51,17 +51,24 @@ def prepare_chunk_buff(self, db_data, quality, chunk_number):
                 StorageChoice.LOCAL: db_data.get_upload_dirname(),
                 StorageChoice.SHARE: settings.SHARE_ROOT
             }[db_data.storage]
-        if os.path.exists(db_data.get_meta_path()):
+        if hasattr(db_data, 'video'):
             source_path = os.path.join(upload_dir, db_data.video.path)
-            meta = PrepareInfo(source_path=source_path, meta_path=db_data.get_meta_path())
-            for frame in meta.decode_needed_frames(chunk_number, db_data):
-                images.append(frame)
-            writer.save_as_chunk([(image, source_path, None) for image in images], buff)
+            reader = VideoDatasetManifestReader(manifest_path=db_data.get_manifest_path(),
+                source_path=source_path, chunk_number=chunk_number,
+                chunk_size=db_data.chunk_size, start=db_data.start_frame,
+                stop=db_data.stop_frame, step=db_data.get_frame_step())
+            for frame in reader:
+                images.append((frame, source_path, None))
         else:
-            with open(db_data.get_dummy_chunk_path(chunk_number), 'r') as dummy_file:
-                images = [os.path.join(upload_dir, line.strip()) for line in dummy_file]
-            writer.save_as_chunk([(image, image, None) for image in images], buff)
+            reader = ImageDatasetManifestReader(manifest_path=db_data.get_manifest_path(),
+                chunk_number=chunk_number, chunk_size=db_data.chunk_size,
+                start=db_data.start_frame, stop=db_data.stop_frame,
+                step=db_data.get_frame_step())
+            for item in reader:
+                source_path = os.path.join(upload_dir, f"{item['name']}{item['extension']}")
+                images.append((source_path, source_path, None))
 
+        writer.save_as_chunk(images, buff)
         buff.seek(0)
         return buff, mime_type
 

@@ -11,6 +11,7 @@
 import struct
 import re
 from abc import ABC, abstractmethod
+from contextlib import closing
 
 import av
 import numpy as np
@@ -25,6 +26,7 @@
 ImageFile.LOAD_TRUNCATED_IMAGES = True
 
 from cvat.apps.engine.mime_types import mimetypes
+from utils.dataset_manifest import VideoManifestManager, ImageManifestManager
 
 def get_mime(name):
     for type_name, type_def in MEDIA_TYPES.items():
@@ -121,6 +123,10 @@ def get_image_size(self, i):
         img = Image.open(self._source_path[i])
         return img.width, img.height
 
+    @property
+    def absolute_source_paths(self):
+        return [self.get_path(idx) for idx, _ in enumerate(self._source_path)]
+
 class DirectoryReader(ImageListReader):
     def __init__(self, source_path, step=1, start=0, stop=None):
         image_paths = []
@@ -311,6 +317,103 @@ def get_image_size(self, i):
         image = (next(iter(self)))[0]
         return image.width, image.height
 
+class FragmentMediaReader:
+    def __init__(self, chunk_number, chunk_size, start, stop, step=1):
+        self._start = start
+        self._stop = stop + 1  # exclusive bound, so that `stop` itself is included
+        self._step = step
+        self._chunk_number = chunk_number
+        self._chunk_size = chunk_size
+        self._start_chunk_frame_number = \
+            self._start + self._chunk_number * self._chunk_size * self._step
+        self._end_chunk_frame_number = min(self._start_chunk_frame_number \
+            + (self._chunk_size - 1) * self._step + 1, self._stop)
+        self._frame_range = self._get_frame_range()
+
+    @property
+    def frame_range(self):
+        return self._frame_range
+
+    def _get_frame_range(self):
+        frame_range = []
+        for idx in range(self._start, self._stop, self._step):
+            if idx < self._start_chunk_frame_number:
+                continue
+            elif idx < self._end_chunk_frame_number and \
+                    not ((idx - self._start_chunk_frame_number) % self._step):
+                frame_range.append(idx)
+            elif (idx - self._start_chunk_frame_number) % self._step:
+                continue
+            else:
+                break
+        return frame_range
+
+class ImageDatasetManifestReader(FragmentMediaReader):
+    def __init__(self, manifest_path, **kwargs):
+        super().__init__(**kwargs)
+        self._manifest = ImageManifestManager(manifest_path)
+        self._manifest.init_index()
+
+    def __iter__(self):
+        for idx in self._frame_range:
+            yield self._manifest[idx]
+
+class VideoDatasetManifestReader(FragmentMediaReader):
+    def __init__(self, manifest_path, **kwargs):
+        self.source_path = kwargs.pop('source_path')
+        super().__init__(**kwargs)
+        self._manifest = VideoManifestManager(manifest_path)
+        self._manifest.init_index()
+
+    def _get_nearest_left_key_frame(self):
+        if self._start_chunk_frame_number >= \
+                self._manifest[len(self._manifest) - 1].get('number'):
+            left_border = len(self._manifest) - 1
+        else:
+            left_border = 0
+            delta = len(self._manifest)
+            while delta:
+                step = delta // 2
+                cur_position = left_border + step
+                if self._manifest[cur_position].get('number') < self._start_chunk_frame_number:
+                    cur_position += 1
+                    left_border = cur_position
+                    delta -= step + 1
+                else:
+                    delta = step
+            if self._manifest[cur_position].get('number') > self._start_chunk_frame_number:
+                left_border -= 1
+        frame_number = self._manifest[left_border].get('number')
+        timestamp = self._manifest[left_border].get('pts')
+        return frame_number, timestamp
+
+    def __iter__(self):
+        start_decode_frame_number, start_decode_timestamp = self._get_nearest_left_key_frame()
+        with closing(av.open(self.source_path, mode='r')) as container:
+            video_stream = next(stream for stream in container.streams if stream.type == 'video')
+            video_stream.thread_type = 'AUTO'
+
+            container.seek(offset=start_decode_timestamp, stream=video_stream)
+
+            frame_number = start_decode_frame_number - 1
+            for packet in container.demux(video_stream):
+                for frame in packet.decode():
+                    frame_number += 1
+                    if frame_number in self._frame_range:
+                        if video_stream.metadata.get('rotate'):
+                            frame = av.VideoFrame.from_ndarray(
+                                rotate_image(
+                                    frame.to_ndarray(format='bgr24'),
+                                    360 - int(container.streams.video[0].metadata.get('rotate'))
+                                ),
+                                format='bgr24'
+                            )
+                        yield frame
+                    elif frame_number < self._frame_range[-1]:
+                        continue
+                    else:
+                        return
+
 class IChunkWriter(ABC):
     def __init__(self, quality, dimension=DimensionType.DIM_2D):
         self._image_quality = quality
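
`VideoDatasetManifestReader._get_nearest_left_key_frame` uses a hand-written binary search to find the last key frame at or before the first frame of the chunk, so that decoding can start from it. The same lookup can be sketched with the standard `bisect` module. This is an illustration only: `nearest_left_key_frame` is a hypothetical helper, it works on an in-memory list of `(number, pts)` pairs rather than on a manifest object, and it clamps to the first key frame when the target precedes every entry.

```python
from bisect import bisect_right
from typing import List, Tuple


def nearest_left_key_frame(key_frames: List[Tuple[int, int]],
                           target: int) -> Tuple[int, int]:
    """key_frames holds (frame number, pts) pairs sorted by frame number."""
    numbers = [number for number, _ in key_frames]
    # Index of the last key frame whose number is <= target;
    # clamp to the first entry if target precedes all key frames.
    index = max(bisect_right(numbers, target) - 1, 0)
    return key_frames[index]
```

The returned `pts` is what would be passed to `container.seek`, and decoding then proceeds frame by frame until the chunk's frame range has been covered.
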

@@ -0,0 +1,83 @@
+# Generated by Django 3.1.1 on 2021-02-20 08:36
+
+import glob
+import os
+from re import search
+
+from django.conf import settings
+from django.db import migrations
+
+from cvat.apps.engine.models import (DimensionType, StorageChoice,
+                                     StorageMethodChoice)
+from utils.dataset_manifest import ImageManifestManager, VideoManifestManager
+
+def migrate_data(apps, schema_editor):
+    Data = apps.get_model("engine", "Data")
+    query_set = Data.objects.filter(storage_method=StorageMethodChoice.CACHE)
+    for db_data in query_set:
+        try:
+            upload_dir = '{}/{}/raw'.format(settings.MEDIA_DATA_ROOT, db_data.id)
+            if os.path.exists(os.path.join(upload_dir, 'meta_info.txt')):
+                os.remove(os.path.join(upload_dir, 'meta_info.txt'))
+            else:
+                for path in glob.glob(f'{upload_dir}/dummy_*.txt'):
+                    os.remove(path)
+            # needed when a long data migration is restarted: skip data that already has a manifest
+            if os.path.exists(os.path.join(upload_dir, 'manifest.jsonl')):
+                continue
+            data_dir = upload_dir if db_data.storage == StorageChoice.LOCAL else settings.SHARE_ROOT
+            if hasattr(db_data, 'video'):
+                media_file = os.path.join(data_dir, db_data.video.path)
+                manifest = VideoManifestManager(manifest_path=upload_dir)
+                meta_info = manifest.prepare_meta(media_file=media_file)
+                manifest.create(meta_info)
+                manifest.init_index()
+            else:
+                manifest = ImageManifestManager(manifest_path=upload_dir)
+                sources = []
+                if db_data.storage == StorageChoice.LOCAL:
+                    for (root, _, files) in os.walk(data_dir):
+                        sources.extend([os.path.join(root, f) for f in files])
+                    sources.sort()
+                # with a share, the entire data structure cannot be restored explicitly
+                else:
+                    sources = [os.path.join(data_dir, db_image.path) for db_image in db_data.images.all().order_by('frame')]
+                if any(task.dimension == DimensionType.DIM_3D for task in db_data.tasks.all()):
+                    content = []
+                    for source in sources:
+                        name, ext = os.path.splitext(os.path.relpath(source, upload_dir))
+                        content.append({
+                            'name': name,
+                            'extension': ext
+                        })
+                else:
+                    meta_info = manifest.prepare_meta(sources=sources, data_dir=data_dir)
+                    content = meta_info.content
+
+                if db_data.storage == StorageChoice.SHARE:
+                    def _get_frame_step(str_):
+                        match = search(r"step\s*=\s*([1-9]\d*)", str_)
+                        return int(match.group(1)) if match else 1
+                    step = _get_frame_step(db_data.frame_filter)
+                    start = db_data.start_frame
+                    stop = db_data.stop_frame + 1
+                    images_range = range(start, stop, step)
+                    result_content = []
+                    for i in range(stop):
+                        item = content.pop(0) if i in images_range else dict()
+                        result_content.append(item)
+                    content = result_content
+                manifest.create(content)
+                manifest.init_index()
+        except Exception as ex:
+            print(str(ex))
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('engine', '0037_task_subset'),
+    ]
+
+    operations = [
+        migrations.RunPython(migrate_data)
+    ]
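
For share storage, the migration above pads the manifest content so that each entry sits at the index of its frame number, with empty dicts for frames skipped by `start` and `step`. A small standalone sketch of that step follows. `get_frame_step` and `pad_content` are hypothetical helpers written for illustration, and the example assumes `content` holds exactly one entry per selected frame.

```python
import re
from typing import Dict, List


def get_frame_step(frame_filter: str) -> int:
    """Extract N from a filter such as 'step=N'; default to 1 when absent."""
    match = re.search(r"step\s*=\s*([1-9]\d*)", frame_filter)
    return int(match.group(1)) if match else 1


def pad_content(content: List[Dict], start: int,
                stop_frame: int, step: int) -> List[Dict]:
    """Place each entry at its frame position; skipped positions become {}."""
    used = range(start, stop_frame + 1, step)
    entries = iter(content)
    return [next(entries) if i in used else {} for i in range(stop_frame + 1)]
```

With `start=0`, `stop_frame=4` and `step=2`, three entries end up at positions 0, 2 and 4, and positions 1 and 3 hold empty dicts, so a reader can index the manifest directly by frame number.
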
@@ -138,11 +138,10 @@ def get_compressed_chunk_path(self, chunk_number):
     def get_preview_path(self):
         return os.path.join(self.get_data_dirname(), 'preview.jpeg')
 
-    def get_meta_path(self):
-        return os.path.join(self.get_upload_dirname(), 'meta_info.txt')
-
-    def get_dummy_chunk_path(self, chunk_number):
-        return os.path.join(self.get_upload_dirname(), 'dummy_{}.txt'.format(chunk_number))
+    def get_manifest_path(self):
+        return os.path.join(self.get_upload_dirname(), 'manifest.jsonl')
+    def get_index_path(self):
+        return os.path.join(self.get_upload_dirname(), 'index.json')
 
 class Video(models.Model):
     data = models.OneToOneField(Data, on_delete=models.CASCADE, related_name="video", null=True)