Skip to content

Commit

Permalink
Support for ICDAR dataset format (#2866)
Browse files Browse the repository at this point in the history
Co-authored-by: Maxim Zhiltsov <[email protected]>
  • Loading branch information
yasakova-anastasia and Maxim Zhiltsov authored Mar 26, 2021
1 parent 30bf11f commit efad0b0
Show file tree
Hide file tree
Showing 8 changed files with 382 additions and 14 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [Market-1501](https://www.aitribune.com/dataset/2018051063) format support (<https://github.com/openvinotoolkit/cvat/pull/2869>)
- Ability of upload manifest for dataset with images (<https://github.com/openvinotoolkit/cvat/pull/2763>)
- Annotations filters UI using react-awesome-query-builder (https://github.com/openvinotoolkit/cvat/issues/1418)
- [ICDAR](https://rrc.cvc.uab.es/?ch=2) format support (<https://github.com/openvinotoolkit/cvat/pull/2866>)

### Changed

Expand Down
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@ For more information about supported formats look at the
| [WIDER Face](http://shuoyang1213.me/WIDERFACE/) | X | X |
| [VGGFace2](https://github.com/ox-vgg/vgg_face2) | X | X |
| [Market-1501](https://www.aitribune.com/dataset/2018051063) | X | X |
| [ICDAR13/15](https://rrc.cvc.uab.es/?ch=2) | X | X |

## Deep learning serverless functions for automatic labeling

Expand Down
90 changes: 80 additions & 10 deletions cvat/apps/dataset_manager/formats/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
- [WIDER Face](#widerface)
- [VGGFace2](#vggface2)
- [Market-1501](#market1501)
- [ICDAR13/15](#icdar)

## How to add a new annotation format support<a id="how-to-add"></a>

Expand Down Expand Up @@ -817,17 +818,17 @@ Downloaded file: a zip archive of the following structure:
```bash
# if we save images:
taskname.zip/
── label1/
├── label1_image1.jpg
└── label1_image2.jpg
── label1/
| ├── label1_image1.jpg
| └── label1_image2.jpg
└── label2/
├── label2_image1.jpg
├── label2_image3.jpg
└── label2_image4.jpg
# if we keep only annotation:
taskname.zip/
── <any_subset_name>.txt
── <any_subset_name>.txt
└── synsets.txt
```
Expand All @@ -849,12 +850,12 @@ Downloaded file: a zip archive of the following structure:
```bash
taskname.zip/
├── labelmap.txt # optional, required for non-CamVid labels
── <any_subset_name>/
├── image1.png
└── image2.png
── <any_subset_name>annot/
├── image1.png
└── image2.png
── <any_subset_name>/
| ├── image1.png
| └── image2.png
── <any_subset_name>annot/
| ├── image1.png
| └── image2.png
└── <any_subset_name>.txt
# labelmap.txt
Expand Down Expand Up @@ -974,3 +975,72 @@ s1 - sequence
Uploaded file: a zip archive of the structure above

- supported annotations: Label `market-1501` with atrributes (`query`, `person_id`, `camera_id`)

### [ICDAR13/15](https://rrc.cvc.uab.es/?ch=2)<a id="icdar" />

#### ICDAR13/15 Dumper

Downloaded file: a zip archive of the following structure:

```bash
# word recognition task
taskname.zip/
└── word_recognition/
└── <any_subset_name>/
├── images
| ├── word1.png
| └── word2.png
└── gt.txt
# text localization task
taskname.zip/
└── text_localization/
└── <any_subset_name>/
├── images
| ├── img_1.png
| └── img_2.png
├── gt_img_1.txt
└── gt_img_1.txt
#text segmentation task
taskname.zip/
└── text_localization/
└── <any_subset_name>/
├── images
| ├── 1.png
| └── 2.png
├── 1_GT.bmp
├── 1_GT.txt
├── 2_GT.bmp
└── 2_GT.txt
```

**Word recognition task**:

- supported annotations: Label `icdar` with attribute `caption`

**Text localization task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attribute `text`

**Text segmentation task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attributes `index`, `text`, `color`, `center`

#### ICDAR13/15 Loader

Uploaded file: a zip archive of the structure above

**Word recognition task**:

- supported annotations: Label `icdar` with attribute `caption`

**Text localization task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attribute `text`

**Text segmentation task**:

- supported annotations: Rectangles and Polygons with label `icdar`
and attributes `index`, `text`, `color`, `center`
131 changes: 131 additions & 0 deletions cvat/apps/dataset_manager/formats/icdar.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

import zipfile
from tempfile import TemporaryDirectory

from datumaro.components.dataset import Dataset
from datumaro.components.extractor import (AnnotationType, Caption, Label,
LabelCategories, Transform)

from cvat.apps.dataset_manager.bindings import (CvatTaskDataExtractor,
import_dm_annotations)
from cvat.apps.dataset_manager.util import make_zip_archive

from .registry import dm_env, exporter, importer


class AddLabelToAnns(Transform):
def __init__(self, extractor, label):
super().__init__(extractor)

assert isinstance(label, str)
self._categories = {}
label_cat = self._extractor.categories().get(AnnotationType.label)
if not label_cat:
label_cat = LabelCategories()
self._label = label_cat.add(label)
self._categories[AnnotationType.label] = label_cat

def categories(self):
return self._categories

def transform_item(self, item):
annotations = item.annotations
for ann in annotations:
if ann.type in [AnnotationType.polygon,
AnnotationType.bbox, AnnotationType.mask]:
ann.label = self._label
return item.wrap(annotations=annotations)

class CaptionToLabel(Transform):
def __init__(self, extractor, label):
super().__init__(extractor)

assert isinstance(label, str)
self._categories = {}
label_cat = self._extractor.categories().get(AnnotationType.label)
if not label_cat:
label_cat = LabelCategories()
self._label = label_cat.add(label)
self._categories[AnnotationType.label] = label_cat

def categories(self):
return self._categories

def transform_item(self, item):
annotations = item.annotations
captions = [ann for ann in annotations
if ann.type == AnnotationType.caption]
for ann in captions:
annotations.append(Label(self._label,
attributes={'text': ann.caption}))
annotations.remove(ann)
return item.wrap(annotations=annotations)

class LabelToCaption(Transform):
def transform_item(self, item):
annotations = item.annotations
anns = [p for p in annotations
if 'text' in p.attributes]
for ann in anns:
annotations.append(Caption(ann.attributes['text']))
annotations.remove(ann)
return item.wrap(annotations=annotations)

@exporter(name='ICDAR Recognition', ext='ZIP', version='1.0')
def _export_recognition(dst_file, task_data, save_images=False):
dataset = Dataset.from_extractors(CvatTaskDataExtractor(
task_data, include_images=save_images), env=dm_env)
dataset.transform(LabelToCaption)
with TemporaryDirectory() as temp_dir:
dataset.export(temp_dir, 'icdar_word_recognition', save_images=save_images)
make_zip_archive(temp_dir, dst_file)

@importer(name='ICDAR Recognition', ext='ZIP', version='1.0')
def _import(src_file, task_data):
with TemporaryDirectory() as tmp_dir:
zipfile.ZipFile(src_file).extractall(tmp_dir)
dataset = Dataset.import_from(tmp_dir, 'icdar_word_recognition', env=dm_env)
dataset.transform(CaptionToLabel, 'icdar')
import_dm_annotations(dataset, task_data)


@exporter(name='ICDAR Localization', ext='ZIP', version='1.0')
def _export_localization(dst_file, task_data, save_images=False):
dataset = Dataset.from_extractors(CvatTaskDataExtractor(
task_data, include_images=save_images), env=dm_env)
with TemporaryDirectory() as temp_dir:
dataset.export(temp_dir, 'icdar_text_localization', save_images=save_images)
make_zip_archive(temp_dir, dst_file)

@importer(name='ICDAR Localization', ext='ZIP', version='1.0')
def _import(src_file, task_data):
with TemporaryDirectory() as tmp_dir:
zipfile.ZipFile(src_file).extractall(tmp_dir)

dataset = Dataset.import_from(tmp_dir, 'icdar_text_localization', env=dm_env)
dataset.transform(AddLabelToAnns, 'icdar')
import_dm_annotations(dataset, task_data)


@exporter(name='ICDAR Segmentation', ext='ZIP', version='1.0')
def _export_segmentation(dst_file, task_data, save_images=False):
dataset = Dataset.from_extractors(CvatTaskDataExtractor(
task_data, include_images=save_images), env=dm_env)
with TemporaryDirectory() as temp_dir:
dataset.transform('polygons_to_masks')
dataset.transform('boxes_to_masks')
dataset.transform('merge_instance_segments')
dataset.export(temp_dir, 'icdar_text_segmentation', save_images=save_images)
make_zip_archive(temp_dir, dst_file)

@importer(name='ICDAR Segmentation', ext='ZIP', version='1.0')
def _import(src_file, task_data):
with TemporaryDirectory() as tmp_dir:
zipfile.ZipFile(src_file).extractall(tmp_dir)
dataset = Dataset.import_from(tmp_dir, 'icdar_text_segmentation', env=dm_env)
dataset.transform(AddLabelToAnns, 'icdar')
dataset.transform('masks_to_polygons')
import_dm_annotations(dataset, task_data)
1 change: 1 addition & 0 deletions cvat/apps/dataset_manager/formats/registry.py
Original file line number Diff line number Diff line change
Expand Up @@ -98,3 +98,4 @@ def make_exporter(name):
import cvat.apps.dataset_manager.formats.widerface
import cvat.apps.dataset_manager.formats.vggface2
import cvat.apps.dataset_manager.formats.market1501
import cvat.apps.dataset_manager.formats.icdar
11 changes: 10 additions & 1 deletion cvat/apps/dataset_manager/tests/test_formats.py
Original file line number Diff line number Diff line change
Expand Up @@ -285,6 +285,9 @@ def test_export_formats_query(self):
'WiderFace 1.0',
'VGGFace2 1.0',
'Market-1501 1.0',
'ICDAR Recognition 1.0',
'ICDAR Localization 1.0',
'ICDAR Segmentation 1.0',
})

def test_import_formats_query(self):
Expand All @@ -306,6 +309,9 @@ def test_import_formats_query(self):
'WiderFace 1.0',
'VGGFace2 1.0',
'Market-1501 1.0',
'ICDAR Recognition 1.0',
'ICDAR Localization 1.0',
'ICDAR Segmentation 1.0',
})

def test_exports(self):
Expand All @@ -319,7 +325,7 @@ def check(file_path):

format_name = f.DISPLAY_NAME
if format_name == "VGGFace2 1.0":
self.skipTest("Format does not support multiple shapes for one item")
self.skipTest("Format is disabled")

for save_images in { True, False }:
images = self._generate_task_images(3)
Expand Down Expand Up @@ -349,6 +355,9 @@ def test_empty_images_are_exported(self):
('WiderFace 1.0', 'wider_face'),
('VGGFace2 1.0', 'vgg_face2'),
('Market-1501 1.0', 'market1501'),
('ICDAR Recognition 1.0', 'icdar_word_recognition'),
('ICDAR Localization 1.0', 'icdar_text_localization'),
('ICDAR Segmentation 1.0', 'icdar_text_segmentation'),
]:
with self.subTest(format=format_name):
if not dm.formats.registry.EXPORT_FORMATS[format_name].ENABLED:
Expand Down
Loading

0 comments on commit efad0b0

Please sign in to comment.