Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Add tool for converting label studio json result to coco format #2385

Merged
merged 5 commits into from
Jun 15, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions docs/en/dataset_zoo/dataset_tools.md
Original file line number Diff line number Diff line change
Expand Up @@ -361,3 +361,38 @@ For example,
```shell
python tools/dataset/mat2json work_dirs/res50_mpii_256x256/pred.mat data/mpii/annotations/mpii_val.json pred.json
```

## Label Studio

<details>
<summary align="right"><a href="https://github.com/heartexlabs/label-studio/">Label Studio</a></summary>

```bibtex
@misc{Label Studio,
title={{Label Studio}: Data labeling software},
url={https://github.com/heartexlabs/label-studio},
note={Open source software available from https://github.com/heartexlabs/label-studio},
author={
Maxim Tkachenko and
Mikhail Malyuk and
Andrey Holmanyuk and
Nikolai Liubimov},
year={2020-2022},
}
```

</details>

For users of [Label Studio](https://github.com/heartexlabs/label-studio/), please follow the instructions in the [Label Studio to COCO document](./label_studio.md) to annotate and export the results as a Label Studio `.json` file. And save the `Code` from the `Labeling Interface` as an `.xml` file.

We provide a script to convert Label Studio `.json` annotation file to COCO `.json` format file. It can be used by running the following command:

```shell
python tools/dataset_converters/labelstudio2coco.py ${LS_JSON_FILE} ${LS_XML_FILE} ${OUTPUT_COCO_JSON_FILE}
```

For example,

```shell
python tools/dataset_converters/labelstudio2coco.py config.xml project-1-at-2023-05-13-09-22-91b53efa.json output/result.json
```
76 changes: 76 additions & 0 deletions docs/en/dataset_zoo/label_studio.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# Label Studio Annotations to COCO Script

[Label Studio](https://labelstud.io/) is a popular deep learning annotation tool that can be used for annotating various tasks. However, for keypoint annotation, Label Studio can not directly export to the COCO format required by MMPose. This article will explain how to use Label Studio to annotate keypoint data and convert it into the required COCO format using the [labelstudio2coco.py](../../../tools/dataset_converters/labelstudio2coco.py) tool.

## Label Studio Annotation Requirements

According to the COCO format requirements, each annotated instance needs to include information about keypoints, segmentation, and bounding box (bbox). However, Label Studio scatters this information across different instances during annotation. Therefore, certain rules need to be followed during annotation to ensure proper usage with the subsequent scripts.

1. Label Interface Setup

For a newly created Label Studio project, the label interface needs to be set up. There should be three types of annotations: `KeyPointLabels`, `PolygonLabels`, and `RectangleLabels`, which correspond to `keypoints`, `segmentation`, and `bbox` in the COCO format, respectively. The following is an example of a label interface. You can find the `Labeling Interface` in the project's `Settings`, click on `Code`, and paste the following example.

```xml
<View>
<KeyPointLabels name="kp-1" toName="img-1">
<Label value="person" background="#D4380D"/>
</KeyPointLabels>
<PolygonLabels name="polygonlabel" toName="img-1">
<Label value="person" background="#0DA39E"/>
</PolygonLabels>
<RectangleLabels name="label" toName="img-1">
<Label value="person" background="#DDA0EE"/>
</RectangleLabels>
<Image name="img-1" value="$img"/>
</View>
```

2. Annotation Order

Since it is necessary to combine annotations of different types into one instance, a specific order of annotation is required to determine whether the annotations belong to the same instance. Annotations should be made in the order of `KeyPointLabels` -> `PolygonLabels`/`RectangleLabels`. The order and number of `KeyPointLabels` should match the order and number of keypoints specified in the `dataset_info` in MMPose configuration file. The annotation order of `PolygonLabels` and `RectangleLabels` can be interchangeable, and only one of them needs to be annotated. The annotation should be within one instance starts with keypoints and ends with non-keypoints. The following image shows an annotation example:

*Note: The bbox and area will be calculated based on the later PolygonLabels/RectangleLabels. If you annotate PolygonLabels first, the bbox will be based on the range of the later RectangleLabels, and the area will be equal to the area of the rectangle. Conversely, they will be based on the minimum bounding rectangle of the polygon and the area of the polygon.*

![image](https://github.com/open-mmlab/mmpose/assets/15847281/b2d004d0-8361-42c5-9180-cfbac0373a94)

3. Exporting Annotations

Once the annotations are completed as described above, they need to be exported. Select the `Export` button on the project interface, choose the `JSON` format, and click `Export` to download the JSON file containing the labels.

*Note: The exported file only contains the labels and does not include the original images. Therefore, the corresponding annotated images need to be provided separately. It is not recommended to use directly uploaded files because Label Studio truncates long filenames. Instead, use the export COCO format tool available in the `Export` functionality, which includes a folder with the image files within the downloaded compressed package.*

![image](https://github.com/open-mmlab/mmpose/assets/15847281/9f54ca3d-8cdd-4d7f-8ed6-494badcfeaf2)

## Usage of the Conversion Tool Script

The conversion tool script is located at `tools/dataset_converters/labelstudio2coco.py`and can be used as follows:

```bash
python tools/dataset_converters/labelstudio2coco.py config.xml project-1-at-2023-05-13-09-22-91b53efa.json output/result.json
```

Where `config.xml` contains the code from the Labeling Interface mentioned earlier, `project-1-at-2023-05-13-09-22-91b53efa.json` is the JSON file exported from Label Studio, and `output/result.json` is the path to the resulting JSON file in COCO format. If the path does not exist, the script will create it automatically.

Afterward, place the image folder in the output directory to complete the conversion of the COCO dataset. The directory structure can be as follows:

```bash
.
├── images
│   ├── 38b480f2.jpg
│   └── aeb26f04.jpg
└── result.json

```

If you want to use this dataset in MMPose, you can make modifications like the following example:

```python
dataset=dict(
type=dataset_type,
data_root=data_root,
data_mode=data_mode,
ann_file='result.json',
data_prefix=dict(img='images/'),
pipeline=train_pipeline,
)
```
35 changes: 35 additions & 0 deletions docs/zh_cn/dataset_zoo/dataset_tools.md
Original file line number Diff line number Diff line change
Expand Up @@ -376,3 +376,38 @@ python tools/dataset_converters/mat2json ${PRED_MAT_FILE} ${GT_JSON_FILE} ${OUTP
```shell
python tools/dataset/mat2json work_dirs/res50_mpii_256x256/pred.mat data/mpii/annotations/mpii_val.json pred.json
```

## Label Studio 数据集

<details>
<summary align="right"><a href="https://github.com/heartexlabs/label-studio/">Label Studio</a></summary>

```bibtex
@misc{Label Studio,
title={{Label Studio}: Data labeling software},
url={https://github.com/heartexlabs/label-studio},
note={Open source software available from https://github.com/heartexlabs/label-studio},
author={
Maxim Tkachenko and
Mikhail Malyuk and
Andrey Holmanyuk and
Nikolai Liubimov},
year={2020-2022},
}
```

</details>

对于 [Label Studio](https://github.com/heartexlabs/label-studio/) 用户,请依照 [Label Studio 转换工具文档](./label_studio.md) 中的方法进行标注,并将结果导出为 Label Studio 标准的 `.json` 文件,将 `Labeling Interface` 中的 `Code` 保存为 `.xml` 文件。

我们提供了一个脚本来将 Label Studio 标准的 `.json` 格式标注文件转换为 COCO 标准的 `.json` 格式。这可以通过运行以下命令完成:

```shell
python tools/dataset_converters/labelstudio2coco.py ${LS_JSON_FILE} ${LS_XML_FILE} ${OUTPUT_COCO_JSON_FILE}
```

例如:

```shell
python tools/dataset_converters/labelstudio2coco.py config.xml project-1-at-2023-05-13-09-22-91b53efa.json output/result.json
```
76 changes: 76 additions & 0 deletions docs/zh_cn/dataset_zoo/label_studio.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# Label Studio 标注工具转COCO脚本

[Label Studio](https://labelstud.io/) 是一款广受欢迎的深度学习标注工具,可以对多种任务进行标注,然而对于关键点标注,Label Studio 无法直接导出成 MMPose 所需要的 COCO 格式。本文将介绍如何使用Label Studio 标注关键点数据,并利用 [labelstudio2coco.py](../../../tools/dataset_converters/labelstudio2coco.py) 工具将其转换为训练所需的格式。

## Label Studio 标注要求

根据 COCO 格式的要求,每个标注的实例中都需要包含关键点、分割和 bbox 的信息,然而 Label Studio 在标注时会将这些信息分散在不同的实例中,因此需要按一定规则进行标注,才能正常使用后续的脚本。

1. 标签接口设置

对于一个新建的 Label Studio 项目,首先要设置它的标签接口。这里需要有三种类型的标注:`KeyPointLabels`、`PolygonLabels`、`RectangleLabels`,分别对应 COCO 格式中的`keypoints`、`segmentation`、`bbox`。以下是一个标签接口的示例,可以在项目的`Settings`中找到`Labeling Interface`,点击`Code`,粘贴使用该示例。

```xml
<View>
<KeyPointLabels name="kp-1" toName="img-1">
<Label value="person" background="#D4380D"/>
</KeyPointLabels>
<PolygonLabels name="polygonlabel" toName="img-1">
<Label value="person" background="#0DA39E"/>
</PolygonLabels>
<RectangleLabels name="label" toName="img-1">
<Label value="person" background="#DDA0EE"/>
</RectangleLabels>
<Image name="img-1" value="$img"/>
</View>
```

2. 标注顺序

由于需要将多个标注实例中的不同类型标注组合到一个实例中,因此采取了按特定顺序标注的方式,以此来判断各标注是否位于同一个实例。标注时须按照 **KeyPointLabels -> PolygonLabels/RectangleLabels** 的顺序标注,其中 KeyPointLabels 的顺序和数量要与 MMPose 配置文件中的`dataset_info`的关键点顺序和数量一致, PolygonLabels 和 RectangleLabels 的标注顺序可以互换,且可以只标注其中一个,只要保证一个实例的标注中,以关键点开始,以非关键点结束即可。下图为标注的示例:

*注:bbox 和 area 会根据靠后的 PolygonLabels/RectangleLabels 来计算,如若先标 PolygonLabels,那么bbox会是靠后的 RectangleLabels 的范围,面积为矩形的面积,反之则是多边形外接矩形和多边形的面积*

![image](https://github.com/open-mmlab/mmpose/assets/15847281/b2d004d0-8361-42c5-9180-cfbac0373a94)

3. 导出标注

上述标注完成后,需要将标注进行导出。选择项目界面的`Export`按钮,选择`JSON`格式,再点击`Export`即可下载包含标签的 JSON 格式文件。

*注:上述文件中仅仅包含标签,不包含原始图片,因此需要额外提供标注对应的图片。由于 Label Studio 会对过长的文件名进行截断,因此不建议直接使用上传的文件,而是使用`Export`功能中的导出 COCO 格式工具,使用压缩包内的图片文件夹。*

![image](https://github.com/open-mmlab/mmpose/assets/15847281/9f54ca3d-8cdd-4d7f-8ed6-494badcfeaf2)

## 转换工具脚本的使用

转换工具脚本位于`tools/dataset_converters/labelstudio2coco.py`,使用方式如下:

```bash
python tools/dataset_converters/labelstudio2coco.py config.xml project-1-at-2023-05-13-09-22-91b53efa.json output/result.json
```

其中`config.xml`的内容为标签接口设置中提到的`Labeling Interface`中的`Code`,`project-1-at-2023-05-13-09-22-91b53efa.json`即为导出标注时导出的 Label Studio 格式的 JSON 文件,`output/result.json`为转换后得到的 COCO 格式的 JSON 文件路径,若路径不存在,该脚本会自动创建路径。

随后,将图片的文件夹放置在输出目录下,即可完成 COCO 数据集的转换。目录结构示例如下:

```bash
.
├── images
│   ├── 38b480f2.jpg
│   └── aeb26f04.jpg
└── result.json

```

若想在 MMPose 中使用该数据集,可以进行类似如下的修改:

```python
dataset=dict(
type=dataset_type,
data_root=data_root,
data_mode=data_mode,
ann_file='result.json',
data_prefix=dict(img='images/'),
pipeline=train_pipeline,
)
```
Loading