Data export documentation update (cvat-ai#6795)

retailnext · Oct 25, 2023 · 59ce551 · 59ce551
1 parent 55b6a95
commit 59ce551
Showing 25 changed files with 682 additions and 385 deletions.
diff --git a/site/content/en/docs/manual/advanced/formats/_index.md b/site/content/en/docs/manual/advanced/formats/_index.md
@@ -1,28 +1,103 @@
 ---
-title: 'Formats'
-linkTitle: 'Formats'
+title: 'Export annotations and data from CVAT'
+linkTitle: 'Export annotations and data from CVAT'
 weight: 20
-description: 'List of annotation formats supported by CVAT.'
+description: 'List of data export formats supported by CVAT.'
 ---
 
-#### CVAT supported the following formats:
-
-- [CVAT](format-cvat)
-- [Datumaro](format-datumaro)
-- [LabelMe](format-labelme)
-- [MOT](format-mot)
-- [MOTS](format-mots)
-- [COCO](format-coco)
-- [PASCAL VOC and mask](format-voc)
-- [YOLO](format-yolo)
-- [TF detection API](format-tfrecord)
-- [ImageNet](format-imagenet)
-- [CamVid](format-camvid)
-- [WIDER Face](format-widerface)
-- [VGGFace2](format-vggface2)
-- [Market-1501](format-market1501)
-- [ICDAR13/15](format-icdar)
-- [Open Images](format-openimages)
-- [Cityscapes](format-cityscapes)
-- [KITTI](format-kitti)
-- [LFW](format-lfw)
+In CVAT, you have the option to export data in various formats.
+The choice of export format depends on the type of annotation as
+well as the intended future use of the dataset.
+
+See:
+
+- [Data export formats](#data-export-formats)
+- [Exporting dataset in CVAT](#exporting-dataset-in-cvat)
+  - [Exporting dataset from Task](#exporting-dataset-from-task)
+  - [Exporting dataset from Job](#exporting-dataset-from-job)
+- [Data export video tutorial](#data-export-video-tutorial)
+
+## Data export formats
+
+The table below outlines the available formats for data export in CVAT.
+
+<!--lint disable maximum-line-length-->
+
+| Format                                                                                                                              | Type          | Annotation Type                                             | Models                                                                                                                                                                                  | Shapes                                                                                 | Attributes           | Video Tracks  |
+| ----------------------------------------------------------------------------------------------------------------------------------- | ------------- | ----------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | -------------------- | ------------- |
+| [CamVid 1.0](format-camvid)                                                                                                         | .txt <br>.png | Semantic <br>Segmentation                                   | U-Net, SegNet, DeepLab, <br>PSPNet, FCN, Mask R-CNN, <br>ICNet, ERFNet, HRNet, <br>V-Net, and others.                                                                                   | Polygons                                                                               | Not supported        | Not supported |
+| [Cityscapes 1.0](format-cityscapes)                                                                                                 | .txt<br>.png  | Semantic<br>Segmentation                                    | U-Net, SegNet, DeepLab, <br>PSPNet, FCN, ERFNet, <br>ICNet, Mask R-CNN, HRNet, <br>ENet, and others.                                                                                    | Polygons                                                                               | Specific attributes  | Not supported |
+| [COCO 1.0](format-coco)                                                                                                             | JSON          | Detection, Semantic <br>Segmentation                        | YOLO (You Only Look Once), <br>Faster R-CNN, Mask R-CNN, SSD (Single Shot MultiBox Detector), <br> RetinaNet, EfficientDet, UNet, <br>DeepLabv3+, CenterNet, Cascade R-CNN, and others. | Bounding Boxes, Polygons                                                               | Specific attributes  | Not supported |
+| [COCO Keypoints 1.0](coco-keypoints)                                                                                                | JSON          | Keypoints                                                   | OpenPose, PoseNet, AlphaPose, <br> SPM (Single Person Model), <br>Mask R-CNN with Keypoint Detection, and others.                                                                       | Skeletons                                                                              | Specific attributes  | Not supported |
+| [CVAT for images 1.1](/docs/manual/advanced/formats/format-cvat/#cvat-for-videos-export)                                            | .xml          | Universal format<br> for all types of <br>annotations.      | Universal format<br> for all types of <br>models.                                                                                                                                       | Bounding Boxes, Polygons, <br>Polylines, Points, Cuboids, <br>Skeletons, Tags.         | All attributes       | Not supported |
+| [CVAT for video 1.1](/docs/manual/advanced/formats/format-cvat/#cvat-for-videos-export)                                             | .xml          | Universal format<br> for all types of <br>annotations.      | Universal format<br> for all types of <br>models.                                                                                                                                       | Bounding Boxes, Polygons, <br>Polylines, Points, Cuboids, <br>Skeletons, Tags, Tracks. | All attributes       | Supported     |
+| [Datumaro 1.0](format-datumaro)                                                                                                     | JSON          | Universal format<br> for all types of <br>annotations.      | Universal format<br> for all types of <br>models.                                                                                                                                       | Bounding Boxes, Polygons, <br>Polylines, Points, Cuboids, <br>Skeletons, Tags, Tracks. | All attributes       | Supported     |
+| [ICDAR](format-icdar)<br> Includes ICDAR Recognition 1.0, <br>ICDAR Detection 1.0, <br>and ICDAR Segmentation 1.0 <br>descriptions. | .txt          | Text recognition, <br>Text detection, <br>Text segmentation | EAST: Efficient and Accurate <br>Scene Text Detector, CRNN, Mask TextSpotter, TextSnake, <br>and others.                                                                                | Tag, Bounding Boxes, Polygons                                                          | Specific attributes  | Not supported |
+| [ImageNet 1.0](format-imagenet)                                                                                                     | .jpg <br>.txt | Semantic Segmentation, <br>Classification, <br>Detection    | VGG (VGG16, VGG19), Inception, YOLO, Faster R-CNN, U-Net, and others.                                                                                                                   | Tags                                                                                   | No attributes        | Not supported |
+| [KITTI 1.0](format-kitti)                                                                                                           | .txt <br>.png | Semantic Segmentation, Detection, 3D                        | PointPillars, SECOND, AVOD, YOLO, DeepSORT, PWC-Net, ORB-SLAM, and others.                                                                                                              | Bounding Boxes, Polygons                                                               | Specific attributes  | Not supported |
+| [LabelMe 3.0](format-labelme)                                                                                                       | .xml          | Compatibility, <br>Semantic Segmentation                    | U-Net, Mask R-CNN, Fast R-CNN,<br> Faster R-CNN, DeepLab, YOLO, <br>and others.                                                                                                         | Bounding Boxes, Polygons                                                               | Supported (Polygons) | Not supported |
+| [LFW 1.0](format-lfw)                                                                                                               | .txt          | Verification, <br>Face recognition                          | OpenFace, VGGFace & VGGFace2, <br>FaceNet, ArcFace, <br>and others.                                                                                                                     | Tags, Skeletons                                                                        | Specific attributes  | Not supported |
+| [Market-1501 1.0](format-market1501)                                                                                                | .txt          | Re-identification                                           | Triplet Loss Networks, <br>Deep ReID models, and others.                                                                                                                                | Bounding Boxes                                                                         | Specific attributes  | Not supported |
+| [MOT 1.0](format-mot)                                                                                                               | .txt          | Video Tracking, <br>Detection                               | SORT, MOT-Net, IOU Tracker, <br>and others.                                                                                                                                             | Bounding Boxes, Tracks                                                                 | Specific attributes  | Supported     |
+| [MOTS PNG 1.0](format-mots)                                                                                                         | .png<br>.txt  | Video Tracking, <br>Detection                               | SORT, MOT-Net, IOU Tracker, <br>and others.                                                                                                                                             | Bounding Boxes, Tracks, Masks                                                          | Specific attributes  | Supported     |
+| [Open Images 1.0](format-openimages)                                                                                                | .csv          | Detection, <br>Classification, <br>Semantic Segmentation    | Faster R-CNN, YOLO, U-Net, <br>CornerNet, and others.                                                                                                                                   | Bounding Boxes, Tags, Polygons                                                         | Specific attributes  | Not supported |
+| [PASCAL VOC 1.0](format-voc)                                                                                                        | .xml          | Classification, Detection                                   | Faster R-CNN, SSD, YOLO, <br>AlexNet, and others.                                                                                                                                       | Bounding Boxes, Tags, Polygons                                                         | Specific attributes  | Not supported |
+| [Segmentation Mask 1.0](format-smask)                                                                                               | .txt          | Semantic Segmentation                                       | Faster R-CNN, SSD, YOLO, <br>AlexNet, and others.                                                                                                                                       | Polygons                                                                               | No attributes        | Not supported |
+| [TFRecord 1.0](format-tfrecord)                                                                                                     | .pbtxt        | Detection<br>Classification                                 | SSD, Faster R-CNN, YOLO, <br>VGG16, ResNet, Inception, MobileNet, <br>and others.                                                                                                       | Bounding Boxes, Polygons                                                               | No attributes        | Not supported |
+| [VGGFace2 1.0](format-vggface2)                                                                                                     | .csv          | Face recognition                                            | VGGFace, ResNet, Inception, <br> and others.                                                                                                                                            | Bounding Boxes, Points                                                                 | No attributes        | Not supported |
+| [WIDER Face 1.0](format-widerface)                                                                                                  | .txt          | Detection                                                   | SSD (Single Shot MultiBox Detector), Faster R-CNN, YOLO, <br>and others.                                                                                                                | Bounding Boxes, Tags                                                                   | Specific attributes  | Not supported |
+| [YOLO 1.0](format-yolo)                                                                                                             | .txt          | Detection                                                   | YOLOv1, YOLOv2 (YOLO9000), <br>YOLOv3, YOLOv4, and others.                                                                                                                              | Bounding Boxes                                                                         | No attributes        | Not supported |
+
+<!--lint enable maximum-line-length-->
+
+## Exporting dataset in CVAT
+
+### Exporting dataset from Task
+
+To export the dataset from the task, follow these steps:
+
+1. Open Task.
+2. Go to **Actions** > **Export task dataset**.
+3. Choose the desired format from the list of available options.
+
+4. (Optional) Toggle the **Save images** switch if you
+   wish to include images in the export.
+
+   > **Note**: The **Save images** option is a **paid feature**.
+
+   ![Save images option](/images/export_job_as_dataset_dialog.png)
+
+5. Input a name for the resulting `.zip` archive.
+
+6. Click **OK** to initiate the export.
+
+### Exporting dataset from Job
+
+To export a dataset from a job, follow these steps:
+
+1. Navigate to **Menu** > **Export job dataset**.
+
+   ![Export dataset](/images/export_job_as_dataset_menu.png)
+
+2. Choose the desired format from the list of available options.
+
+3. (Optional) Toggle the **Save images** switch
+   if you wish to include images in the export.
+
+   > **Note**: The **Save images** option is a **paid feature**.
+
+   ![Save images option](/images/export_job_as_dataset_dialog.png)
+
+4. Input a name for the resulting `.zip` archive.
+
+5. Click **OK** to initiate the export.
+
+## Data export video tutorial
+
+For more information on the process, see the following tutorial:
+
+<!--lint disable maximum-line-length-->
+
+<iframe width="560" height="315" src="https://www.youtube.com/embed/gzjVpVV9orE?si=2tiBIqts8nk_byTH" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
+
+<!--lint enable maximum-line-length-->
diff --git a/site/content/en/docs/manual/advanced/formats/coco-keypoints.md b/site/content/en/docs/manual/advanced/formats/coco-keypoints.md
@@ -0,0 +1,65 @@
+---
+linkTitle: 'COCO Keypoints'
+weight: 5
+---
+
+The COCO Keypoints format is designed specifically for human pose estimation tasks, where the objective
+is to identify and localize body joints (keypoints) on a human figure within an image.
+
+This specialized format is used with a variety of state-of-the-art models focused on pose estimation.
+
+For more information, see:
+
+- [COCO Keypoint site](https://cocodataset.org/#keypoints-2020)
+- [Format specification](https://openvinotoolkit.github.io/datumaro/latest/docs/data-formats/formats/coco.html)
+- [Example of the archive](https://openvinotoolkit.github.io/datumaro/latest/docs/data-formats/formats/coco.html#import-coco-dataset)
+
+## COCO Keypoints export
+
+For export of images:
+
+- Supported annotations: Skeletons
+- Attributes:
+  - `is_crowd`: This can be either a checkbox or an integer
+    (with values of 0 or 1). It indicates that the instance
+    (or group of objects) should include an RLE-encoded mask in the `segmentation` field.
+    All shapes within the group coalesce into a single, overarching mask,
+    with the largest shape setting the properties for the entire object group.
+  - `score`: This numerical field represents the annotation `score`.
+  - Arbitrary attributes: These will be stored within the `attributes`
+    section of the annotation.
+- Tracks: Not supported.
+
+The downloaded file is a .zip archive with the following structure:
+
+```
+archive.zip/
+├── images/
+│
+│   ├── <image_name1.ext>
+│   ├── <image_name2.ext>
+│   └── ...
+└── <annotations>.json
+```
+
+## COCO Keypoints import
+
+Uploaded file: a single unpacked `*.json` or a zip archive with the structure described
+[here](https://openvinotoolkit.github.io/datumaro/latest/docs/data-formats/formats/coco.html#import-coco-dataset)
+(without images).
+
+- Supported annotations: Skeletons.
+
+- Supported tasks: `person_keypoints`.
+
+Support for COCO tasks via Datumaro is described [here](https://openvinotoolkit.github.io/datumaro/latest/docs/data-formats/formats/coco.html#export-to-other-formats).
+For example, [support for COCO keypoints over Datumaro](https://github.com/openvinotoolkit/cvat/issues/2910#issuecomment-726077582):
+
+1. Install [Datumaro](https://github.com/openvinotoolkit/datumaro)
+   `pip install datumaro`
+2. Export the task in the `Datumaro` format and unzip the archive
+3. Export the Datumaro project in `coco` / `coco_person_keypoints` formats
+   `datum export -f coco -p path/to/project [-- --save-images]`
+
+This way, one can export CVAT points as single keypoints or
+keypoint lists (without the `visibility` COCO flag).
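
The Datumaro steps above can be sketched in Python. This is only a minimal sketch, assuming `datum` is already installed and on `PATH`. The helper name `build_datum_export_cmd` and the project path are hypothetical; the only flags used are the ones shown in step 3 (`-f`, `-p`, and the optional `-- --save-images`).

```python
from typing import List


def build_datum_export_cmd(project_dir: str, fmt: str = "coco",
                           save_images: bool = False) -> List[str]:
    """Build the argument list for the `datum export` call shown in step 3.

    `build_datum_export_cmd` is a hypothetical helper, not part of Datumaro.
    """
    cmd = ["datum", "export", "-f", fmt, "-p", project_dir]
    if save_images:
        # The optional trailing part of the documented command.
        cmd += ["--", "--save-images"]
    return cmd


if __name__ == "__main__":
    # "path/to/project" is the placeholder path used in the steps above.
    print(" ".join(build_datum_export_cmd("path/to/project",
                                          "coco_person_keypoints")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the unzipped Datumaro project exists at the given path.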
diff --git a/site/content/en/docs/manual/advanced/formats/format-camvid.md b/site/content/en/docs/manual/advanced/formats/format-camvid.md
@@ -3,13 +3,25 @@ linkTitle: 'CamVid'
 weight: 10
 ---
 
-# [CamVid](http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/)
+The CamVid (Cambridge-driving Labeled Video Database) format is most commonly used
+in the realm of semantic segmentation tasks. It is particularly useful for training
+and evaluating models for autonomous driving and other vision-based robotics
+applications.
 
+For more information, see:
+
+- [CamVid Specification](http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/)
 - [Dataset examples](https://github.com/cvat-ai/datumaro/tree/v0.3/tests/assets/camvid_dataset)
 
 ## CamVid export
 
-Downloaded file: a zip archive of the following structure:
+For export of images and videos:
+
+- Supported annotations: Bounding Boxes, Polygons.
+- Attributes: Not supported.
+- Tracks: Not supported.
+
+The downloaded file is a .zip archive with the following structure:
 
 ```bash
 taskname.zip/
@@ -41,14 +53,18 @@ Bicyclist
 Bridge
 ```
 
-Mask is a `png` image with 1 or 3 channels where each pixel
-has own color which corresponds to a label.
-`(0, 0, 0)` is used for background by default.
+A mask in the CamVid dataset is typically a **.png**
+image with either one or three channels.
+
+In this image, each pixel is assigned a specific color
+that corresponds to a particular label.
 
-- supported annotations: Rectangles, Polygons
+By default, the color `(0, 0, 0)` (black) is used
+to represent the background.
 
 ## CamVid import
 
-Uploaded file: a zip archive of the structure above
+For import of images:
 
+- Uploaded file: a _.zip_ archive of the structure above
 - supported annotations: Polygons
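
The mask description above can be illustrated with a small sketch. It assumes the mask has already been decoded into rows of RGB tuples (image loading is left out), and the color table is hypothetical. The only value taken from the text is `(0, 0, 0)` for the background.

```python
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]

# Default background color, as stated in the mask description.
BACKGROUND: Color = (0, 0, 0)


def mask_to_labels(mask: List[List[Color]],
                   color_to_label: Dict[Color, str]) -> List[List[str]]:
    """Map each pixel color of a decoded mask to its label name.

    `mask_to_labels` is a hypothetical helper; an unknown color raises KeyError.
    """
    # A caller-supplied entry for (0, 0, 0) overrides the default name.
    table = {BACKGROUND: "background", **color_to_label}
    return [[table[pixel] for pixel in row] for row in mask]


if __name__ == "__main__":
    # Assumed example colors; real values come from the dataset's label colormap.
    colors = {(0, 128, 192): "Bicyclist", (0, 128, 64): "Bridge"}
    mask = [[(0, 0, 0), (0, 128, 192)],
            [(0, 128, 64), (0, 0, 0)]]
    print(mask_to_labels(mask, colors))
```

Failing on an unknown color is a deliberate choice in this sketch, since a pixel that matches no label usually points to a wrong colormap rather than to valid data.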