diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3a4432c8641c..dab0bc8565a4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,25 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [Unreleased]
+### Added
+- OpenVINO auto annotation: it is possible to upload a custom model and annotate images automatically.
+
+### Changed
+-
+
+### Deprecated
+-
+
+### Removed
+-
+
+### Fixed
+-
+
+### Security
+-
+
 ## [0.3.0] - 2018-12-29
 ### Added
 - Ability to copy Object URL and Frame URL via object context menu and player context menu respectively.
diff --git a/README.md b/README.md
index bb573e0ebd7c..962f473fe80a 100644
--- a/README.md
+++ b/README.md
@@ -126,6 +126,13 @@ volumes:
 ```
 You can change the share device path to your actual share. For user convenience we have defined the environment variable $CVAT_SHARE_URL. This variable contains a text (a URL, for example) which will be shown in the client share browser.
 
+### Additional optional components
+
+- [Support for Intel OpenVINO: auto annotation](components/openvino/README.md)
+- [Analytics: management and monitoring of data annotation team](components/analytics/README.md)
+- [TF Object Detection API: auto annotation](components/tf_annotation/README.md)
+- [Support for NVIDIA GPUs](components/cuda/README.md)
+
 ## Questions
 
 CVAT usage related questions or unclear concepts can be posted in our
 [Gitter chat](https://gitter.im/opencv-cvat)
 for **quick replies** from contributors and other users.
diff --git a/components/README.md b/components/README.md
deleted file mode 100644
index f02bee3b1c63..000000000000
--- a/components/README.md
+++ /dev/null
@@ -1,6 +0,0 @@
-### There are some additional components for CVAT
-
-* [NVIDIA CUDA](cuda/README.md)
-* [OpenVINO](openvino/README.md)
-* [Tensorflow Object Detector](tf_annotation/README.md)
-* [Analytics](analytics/README.md)
diff --git a/components/analytics/README.md b/components/analytics/README.md
index c7cb7fdc12dc..b46a8853322e 100644
--- a/components/analytics/README.md
+++ b/components/analytics/README.md
@@ -1,5 +1,7 @@
 ## Analytics for Computer Vision Annotation Tool (CVAT)
 
+![](/cvat/apps/documentation/static/documentation/images/image097.jpg)
+
 It is possible to proxy annotation logs from client to ELK. To do that run the following command below:
 
 ### Build docker image
diff --git a/cvat/apps/auto_annotation/README.md b/cvat/apps/auto_annotation/README.md
new file mode 100644
index 000000000000..df4f4b3ca9c7
--- /dev/null
+++ b/cvat/apps/auto_annotation/README.md
@@ -0,0 +1,163 @@
+## Auto annotation
+
+### Description
+
+The application will be enabled automatically if the OpenVINO™ component is
+installed. It allows using custom models for auto annotation. Only models in
+the OpenVINO™ toolkit format are supported. If you would like to annotate a
+task with a custom model, please convert it to the intermediate representation
+(IR) format via the model optimizer tool. See the [OpenVINO documentation](https://software.intel.com/en-us/articles/OpenVINO-InferEngine) for details.
+
+### Usage
+
+To annotate a task with a custom model you need to prepare 4 files:
+1. __Model config__ (*.xml) - a text file with the network configuration.
+1. __Model weights__ (*.bin) - a binary file with the trained weights.
+1. __Label map__ (*.json) - a simple JSON file with a `label_map` dictionary-like
+object that maps label numbers (as string keys) to label names. Values in
+`label_map` must exactly match the labels of the annotation task; otherwise,
+objects with mismatched labels will be ignored.
+    Example:
+    ```json
+    {
+      "label_map": {
+        "0": "background",
+        "1": "aeroplane",
+        "2": "bicycle",
+        "3": "bird",
+        "4": "boat",
+        "5": "bottle",
+        "6": "bus",
+        "7": "car",
+        "8": "cat",
+        "9": "chair",
+        "10": "cow",
+        "11": "diningtable",
+        "12": "dog",
+        "13": "horse",
+        "14": "motorbike",
+        "15": "person",
+        "16": "pottedplant",
+        "17": "sheep",
+        "18": "sofa",
+        "19": "train",
+        "20": "tvmonitor"
+      }
+    }
+    ```
+1. __Interpretation script__ (*.py) - a file used to convert the network output
+layer to a predefined structure which can be processed by CVAT. This code will
+run inside a restricted Python environment, but it is possible to use some
+built-in functions like __str, int, float, max, min, range__.
+
+    Two variables are also available in the scope:
+
+    - __detections__ - a list of dictionaries with detections for each frame:
+      * __frame_id__ - frame number
+      * __frame_height__ - frame height
+      * __frame_width__ - frame width
+      * __detections__ - output np.ndarray (see [ExecutableNetwork.infer](https://software.intel.com/en-us/articles/OpenVINO-InferEngine#inpage-nav-11-6-3) for details).
+
+    - __results__ - an instance of a Python class that accumulates the converted results.
+      The following methods should be used to add shapes:
+      ```python
+      # xtl, ytl, xbr, ybr - expected values are float or int
+      # label - expected value is int
+      # frame_number - expected value is int
+      # attributes - dictionary of attribute_name: attribute_value pairs, for example {"confidence": "0.83"}
+      add_box(self, xtl, ytl, xbr, ybr, label, frame_number, attributes=None)
+
+      # points - list of (x, y) pairs of float or int, for example [(57.3, 100), (67, 102.7)]
+      # label - expected value is int
+      # frame_number - expected value is int
+      # attributes - dictionary of attribute_name: attribute_value pairs, for example {"confidence": "0.83"}
+      add_points(self, points, label, frame_number, attributes=None)
+      add_polygon(self, points, label, frame_number, attributes=None)
+      add_polyline(self, points, label, frame_number, attributes=None)
+      ```
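+
+In practice, an interpretation script is just a loop over `detections` that
+registers shapes on `results`. The sketch below shows the minimal structure of
+such a script; the `[0, 0, i, ...]` indexing and the 0.5 confidence threshold
+assume an SSD-like output blob and must be adapted to the concrete model.
+Complete, model-specific scripts are given in the examples below.
+
+```python
+# Minimal sketch, assuming an SSD-like output blob of shape [1, 1, N, 7]
+# with rows [image_id, label, confidence, xmin, ymin, xmax, ymax] and
+# normalized coordinates (see clip() in the examples below for clamping).
+for frame_results in detections:
+    blob = frame_results["detections"]
+    for i in range(blob.shape[2]):
+        if float(blob[0, 0, i, 2]) < 0.5:  # assumed confidence threshold
+            continue
+        results.add_box(
+            xtl=float(blob[0, 0, i, 3]) * frame_results["frame_width"],
+            ytl=float(blob[0, 0, i, 4]) * frame_results["frame_height"],
+            xbr=float(blob[0, 0, i, 5]) * frame_results["frame_width"],
+            ybr=float(blob[0, 0, i, 6]) * frame_results["frame_height"],
+            label=int(blob[0, 0, i, 1]),
+            frame_number=frame_results["frame_id"],
+        )
+```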
+
+### Examples
+
+#### [Person-vehicle-bike-detection-crossroad-0078](https://github.com/opencv/open_model_zoo/blob/2018/intel_models/person-vehicle-bike-detection-crossroad-0078/description/person-vehicle-bike-detection-crossroad-0078.md) (OpenVINO toolkit)
+
+__Note__: Model configuration (*.xml) and weights (*.bin) are available in the OpenVINO redistributable package.
+
+__Task labels__: person vehicle non-vehicle
+
+__label_map.json__:
+```json
+{
+  "label_map": {
+    "1": "person",
+    "2": "vehicle",
+    "3": "non-vehicle"
+  }
+}
+```
+__Interpretation script for SSD-based networks__:
+```python
+def clip(value):
+    return max(min(1.0, value), 0.0)
+
+for frame_results in detections:
+    frame_height = frame_results["frame_height"]
+    frame_width = frame_results["frame_width"]
+    frame_number = frame_results["frame_id"]
+
+    for i in range(frame_results["detections"].shape[2]):
+        confidence = frame_results["detections"][0, 0, i, 2]
+        if confidence < 0.5:
+            continue
+
+        results.add_box(
+            xtl=clip(frame_results["detections"][0, 0, i, 3]) * frame_width,
+            ytl=clip(frame_results["detections"][0, 0, i, 4]) * frame_height,
+            xbr=clip(frame_results["detections"][0, 0, i, 5]) * frame_width,
+            ybr=clip(frame_results["detections"][0, 0, i, 6]) * frame_height,
+            label=int(frame_results["detections"][0, 0, i, 1]),
+            frame_number=frame_number,
+            attributes={
+                "confidence": "{:.2f}".format(confidence),
+            },
+        )
+```
+
+#### [Landmarks-regression-retail-0009](https://github.com/opencv/open_model_zoo/blob/2018/intel_models/landmarks-regression-retail-0009/description/landmarks-regression-retail-0009.md) (OpenVINO toolkit)
+
+__Note__: Model configuration (*.xml) and weights (*.bin) are available in the OpenVINO redistributable package.
+
+__Task labels__: left_eye right_eye tip_of_nose left_lip_corner right_lip_corner
+
+__label_map.json__:
+```json
+{
+  "label_map": {
+    "0": "left_eye",
+    "1": "right_eye",
+    "2": "tip_of_nose",
+    "3": "left_lip_corner",
+    "4": "right_lip_corner"
+  }
+}
+```
+__Interpretation script__:
+```python
+def clip(value):
+    return max(min(1.0, value), 0.0)
+
+for frame_results in detections:
+    frame_height = frame_results["frame_height"]
+    frame_width = frame_results["frame_width"]
+    frame_number = frame_results["frame_id"]
+
+    for i in range(0, frame_results["detections"].shape[1], 2):
+        x = frame_results["detections"][0, i, 0, 0]
+        y = frame_results["detections"][0, i + 1, 0, 0]
+
+        results.add_points(
+            points=[(clip(x) * frame_width, clip(y) * frame_height)],
+            label=i // 2,  # see the label map and model output specification
+            frame_number=frame_number,
+        )
+```
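+
+#### Checking a label map before uploading
+
+Since objects with mismatched labels are silently ignored, it can save a
+round-trip to sanity-check `label_map.json` against the task's labels before
+uploading it. A small standalone helper along these lines (shown for
+illustration only; it is not part of CVAT) can be run locally:
+
+```python
+import json
+
+def check_label_map(path, task_labels):
+    # Warn about label_map values that do not exactly match a task label.
+    with open(path) as f:
+        label_map = json.load(f)["label_map"]
+    unmatched = sorted(set(label_map.values()) - set(task_labels))
+    if unmatched:
+        print("Will be ignored at annotation time:", ", ".join(unmatched))
+
+# Example for the person-vehicle-bike task above:
+check_label_map("label_map.json", ["person", "vehicle", "non-vehicle"])
+```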
diff --git a/cvat/apps/auto_annotation/__init__.py b/cvat/apps/auto_annotation/__init__.py
new file mode 100644
index 000000000000..a78eca367e39
--- /dev/null
+++ b/cvat/apps/auto_annotation/__init__.py
@@ -0,0 +1,8 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+from cvat.settings.base import JS_3RDPARTY
+
+JS_3RDPARTY['dashboard'] = JS_3RDPARTY.get('dashboard', []) + ['auto_annotation/js/auto_annotation.js']
diff --git a/cvat/apps/auto_annotation/admin.py b/cvat/apps/auto_annotation/admin.py
new file mode 100644
index 000000000000..a59acdef3783
--- /dev/null
+++ b/cvat/apps/auto_annotation/admin.py
@@ -0,0 +1,4 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
diff --git a/cvat/apps/auto_annotation/apps.py b/cvat/apps/auto_annotation/apps.py
new file mode 100644
index 000000000000..f421e132c45a
--- /dev/null
+++ b/cvat/apps/auto_annotation/apps.py
@@ -0,0 +1,11 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+from django.apps import AppConfig
+
+
+class AutoAnnotationConfig(AppConfig):
+    name = "auto_annotation"
+
diff --git a/cvat/apps/auto_annotation/image_loader.py b/cvat/apps/auto_annotation/image_loader.py
new file mode 100644
index 000000000000..67ce281dafd9
--- /dev/null
+++ b/cvat/apps/auto_annotation/image_loader.py
@@ -0,0 +1,24 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+import cv2
+
+class ImageLoader():
+    def __init__(self, image_list):
+        self.image_list = image_list
+
+    def __getitem__(self, i):
+        return self.image_list[i]
+
+    def __iter__(self):
+        for imagename in self.image_list:
+            yield imagename, self._load_image(imagename)
+
+    def __len__(self):
+        return len(self.image_list)
+
+    @staticmethod
+    def _load_image(path_to_image):
+        return cv2.imread(path_to_image)
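`ImageLoader` above is a thin wrapper over a list of image paths that decodes images lazily during iteration. A usage sketch (the paths are assumptions for illustration):

```python
# Iteration yields (path, image) pairs; images are BGR numpy arrays
# decoded with cv2.imread, which returns None for unreadable files.
loader = ImageLoader(["/data/frames/000001.jpg", "/data/frames/000002.jpg"])
print(len(loader))  # 2
for image_path, image in loader:
    if image is None:
        print("Could not read", image_path)
    else:
        print(image_path, image.shape)
```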
diff --git a/cvat/apps/auto_annotation/migrations/__init__.py b/cvat/apps/auto_annotation/migrations/__init__.py
new file mode 100644
index 000000000000..d8e62e54b356
--- /dev/null
+++ b/cvat/apps/auto_annotation/migrations/__init__.py
@@ -0,0 +1,5 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
diff --git a/cvat/apps/auto_annotation/model_loader.py b/cvat/apps/auto_annotation/model_loader.py
new file mode 100644
index 000000000000..4359d997b893
--- /dev/null
+++ b/cvat/apps/auto_annotation/model_loader.py
@@ -0,0 +1,59 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+import json
+import cv2
+import os
+import subprocess
+
+from openvino.inference_engine import IENetwork, IEPlugin
+
+class ModelLoader():
+    def __init__(self, model, weights):
+        self._model = model
+        self._weights = weights
+
+        IE_PLUGINS_PATH = os.getenv("IE_PLUGINS_PATH")
+        if not IE_PLUGINS_PATH:
+            raise OSError("The IE_PLUGINS_PATH environment variable is not set.")
+
+        plugin = IEPlugin(device="CPU", plugin_dirs=[IE_PLUGINS_PATH])
+        if self._check_instruction("avx2"):
+            plugin.add_cpu_extension(os.path.join(IE_PLUGINS_PATH, "libcpu_extension_avx2.so"))
+        elif self._check_instruction("sse4"):
+            plugin.add_cpu_extension(os.path.join(IE_PLUGINS_PATH, "libcpu_extension_sse4.so"))
+        else:
+            raise Exception("The inference engine requires AVX2 or SSE4 CPU instruction support.")
+
+        network = IENetwork.from_ir(model=self._model, weights=self._weights)
+        supported_layers = plugin.get_supported_layers(network)
+        not_supported_layers = [l for l in network.layers.keys() if l not in supported_layers]
+        if len(not_supported_layers) != 0:
+            raise Exception("The following layers are not supported by the plugin for the specified device {}:\n {}".
+                format(plugin.device, ", ".join(not_supported_layers)))
+
+        self._input_blob_name = next(iter(network.inputs))
+        self._output_blob_name = next(iter(network.outputs))
+
+        self._net = plugin.load(network=network, num_requests=2)
+        input_type = network.inputs[self._input_blob_name]
+        self._input_layout = input_type if isinstance(input_type, list) else input_type.shape
+
+    def infer(self, image):
+        _, _, h, w = self._input_layout
+        in_frame = image if image.shape[:-1] == (h, w) else cv2.resize(image, (w, h))
+        in_frame = in_frame.transpose((2, 0, 1))  # Change data layout from HWC to CHW
+        return self._net.infer(inputs={self._input_blob_name: in_frame})[self._output_blob_name].copy()
+
+    @staticmethod
+    def _check_instruction(instruction):
+        return instruction == str.strip(
+            subprocess.check_output(
+                "lscpu | grep -o \"{}\" | head -1".format(instruction), shell=True
+            ).decode("utf-8"))
+
+def load_label_map(labels_path):
+    with open(labels_path, "r") as f:
+        return json.load(f)["label_map"]
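Putting the two loaders together, the sketch below shows how one might run a converted model over a few images outside of CVAT's request flow. The file paths are assumptions, OpenVINO must be installed, and `IE_PLUGINS_PATH` must point to the inference engine plugin directory:

```python
from cvat.apps.auto_annotation.image_loader import ImageLoader
from cvat.apps.auto_annotation.model_loader import ModelLoader, load_label_map

# Assumed paths to an IR model produced by the model optimizer.
model = ModelLoader(model="/models/detector.xml", weights="/models/detector.bin")
label_map = load_label_map("/models/label_map.json")

for image_path, image in ImageLoader(["/data/frames/000001.jpg"]):
    output = model.infer(image)  # np.ndarray copied from the model's output blob
    print(image_path, output.shape, label_map)
```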
diff --git a/cvat/apps/auto_annotation/models.py b/cvat/apps/auto_annotation/models.py
new file mode 100644
index 000000000000..a59acdef3783
--- /dev/null
+++ b/cvat/apps/auto_annotation/models.py
@@ -0,0 +1,4 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
diff --git a/cvat/apps/auto_annotation/static/auto_annotation/js/auto_annotation.js b/cvat/apps/auto_annotation/static/auto_annotation/js/auto_annotation.js
new file mode 100644
index 000000000000..767c9744753e
--- /dev/null
+++ b/cvat/apps/auto_annotation/static/auto_annotation/js/auto_annotation.js
@@ -0,0 +1,184 @@
+/*
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * SPDX-License-Identifier: MIT
+*/
+
+"use strict";
+
+window.cvat = window.cvat || {};
+window.cvat.dashboard = window.cvat.dashboard || {};
+window.cvat.dashboard.uiCallbacks = window.cvat.dashboard.uiCallbacks || [];
+window.cvat.dashboard.uiCallbacks.push(function(newElements) {
+    let tids = [];
+    for (let el of newElements) {
+        tids.push(el.id.split("_")[1]);
+    }
+
+    $.ajax({
+        type: "POST",
+        url: "/auto_annotation/meta/get",
+        data: JSON.stringify(tids),
+        contentType: "application/json; charset=utf-8",
+        success: (data) => {
+            newElements.each(function() {
+                let elem = $(this);
+                let tid = +elem.attr("id").split("_")[1];
+
+                const autoAnnoButton = $("<button> Run auto annotation </button>").addClass("semiBold dashboardButtonUI dashboardAutoAnno");
+                autoAnnoButton.appendTo(elem.find("div.dashboardButtonsUI")[0]);
+
+                if (tid in data && data[tid].active) {
+                    autoAnnoButton.text("Cancel auto annotation");
+                    autoAnnoButton.addClass("autoAnnotationProcess");
+                    window.cvat.autoAnnotation.checkAutoAnnotationRequest(tid, autoAnnoButton);
+                }
+
+                autoAnnoButton.on("click", () => {
+                    if (autoAnnoButton.hasClass("autoAnnotationProcess")) {
+                        $.post(`/auto_annotation/cancel/task/${tid}`).fail((data) => {
+                            let message = `Error occurred during the cancel auto annotation request. Code: ${data.status}. Message: ${data.responseText || data.statusText}`;
+                            showMessage(message);
+                            throw Error(message);
+                        });
+                    }
+                    else {
+                        let dialogWindow = $(`#${window.cvat.autoAnnotation.modalWindowId}`);
+                        dialogWindow.attr("current_tid", tid);
+                        dialogWindow.removeClass("hidden");
+                    }
+                });
+            });
+        },
+        error: (data) => {
+            let message = `Cannot get auto annotation meta info. Code: ${data.status}. Message: ${data.responseText || data.statusText}`;
+            window.cvat.autoAnnotation.badResponse(message);
+        }
+    });
+});
+
+window.cvat.autoAnnotation = {
+    modalWindowId: "autoAnnotationWindow",
+    autoAnnoFormId: "autoAnnotationForm",
+    autoAnnoModelFieldId: "autoAnnotationModelField",
+    autoAnnoWeightsFieldId: "autoAnnotationWeightsField",
+    autoAnnoConfigFieldId: "autoAnnotationConfigField",
+    autoAnnoConvertFieldId: "autoAnnotationConvertField",
+    autoAnnoCloseButtonId: "autoAnnoCloseButton",
+    autoAnnoSubmitButtonId: "autoAnnoSubmitButton",
+
+    checkAutoAnnotationRequest: (tid, autoAnnoButton) => {
+        function timeoutCallback() {
+            $.get(`/auto_annotation/check/task/${tid}`).done((data) => {
+                if (data.status === "started" || data.status === "queued") {
+                    let progress = Math.round(data.progress) || 0;
+                    autoAnnoButton.text(`Cancel auto annotation (${progress}%)`);
+                    setTimeout(timeoutCallback, 1000);
+                }
+                else {
+                    autoAnnoButton.text("Run auto annotation");
+                    autoAnnoButton.removeClass("autoAnnotationProcess");
+                }
+            }).fail((data) => {
+                let message = "An error occurred while checking the annotation status. " +
+                    `Code: ${data.status}, text: ${data.responseText || data.statusText}`;
+                window.cvat.autoAnnotation.badResponse(message);
+            });
+        }
+        setTimeout(timeoutCallback, 1000);
+    },
+
+    badResponse(message) {
+        showMessage(message);
+        throw Error(message);
+    },
+};
+
+function submitButtonOnClick() {
+    const annoWindow = $(`#${window.cvat.autoAnnotation.modalWindowId}`);
+    const tid = annoWindow.attr("current_tid");
+    const modelInput = $(`#${window.cvat.autoAnnotation.autoAnnoModelFieldId}`);
+    const weightsInput = $(`#${window.cvat.autoAnnotation.autoAnnoWeightsFieldId}`);
+    const configInput = $(`#${window.cvat.autoAnnotation.autoAnnoConfigFieldId}`);
+    const convFileInput = $(`#${window.cvat.autoAnnotation.autoAnnoConvertFieldId}`);
+
+    const modelFile = modelInput.prop("files")[0];
+    const weightsFile = weightsInput.prop("files")[0];
+    const configFile = configInput.prop("files")[0];
+    const convFile = convFileInput.prop("files")[0];
+
+    if (!modelFile || !weightsFile || !configFile || !convFile) {
+        showMessage("All files must be selected");
+        return;
+    }
+
+    let taskData = new FormData();
+    taskData.append("model", modelFile);
+    taskData.append("weights", weightsFile);
+    taskData.append("config", configFile);
+    taskData.append("conv_script", convFile);
+
+    $.ajax({
+        url: `/auto_annotation/create/task/${tid}`,
+        type: "POST",
+        data: taskData,
+        contentType: false,
+        processData: false,
+    }).done(() => {
+        annoWindow.addClass("hidden");
+        const autoAnnoButton = $(`#dashboardTask_${tid} div.dashboardButtonsUI button.dashboardAutoAnno`);
+        autoAnnoButton.addClass("autoAnnotationProcess");
+        window.cvat.autoAnnotation.checkAutoAnnotationRequest(tid, autoAnnoButton);
+    }).fail((data) => {
+        let message = "An error occurred during the run annotation request. " +
+            `Code: ${data.status}, text: ${data.responseText || data.statusText}`;
+        window.cvat.autoAnnotation.badResponse(message);
+    });
+}
+
+document.addEventListener("DOMContentLoaded", () => {
+    $(`
`).appendTo("body");
+
+    const annoWindow = $(`#${window.cvat.autoAnnotation.modalWindowId}`);
+    const closeWindowButton = $(`#${window.cvat.autoAnnotation.autoAnnoCloseButtonId}`);
+    const submitButton = $(`#${window.cvat.autoAnnotation.autoAnnoSubmitButtonId}`);
+
+    closeWindowButton.on("click", () => {
+        annoWindow.addClass("hidden");
+    });
+
+    submitButton.on("click", () => {
+        submitButtonOnClick();
+    });
+});
diff --git a/cvat/apps/auto_annotation/tests.py b/cvat/apps/auto_annotation/tests.py
new file mode 100644
index 000000000000..a59acdef3783
--- /dev/null
+++ b/cvat/apps/auto_annotation/tests.py
@@ -0,0 +1,4 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
diff --git a/cvat/apps/auto_annotation/urls.py b/cvat/apps/auto_annotation/urls.py
new file mode 100644
index 000000000000..87131757b20e
--- /dev/null
+++ b/cvat/apps/auto_annotation/urls.py
@@ -0,0 +1,14 @@
+
+# Copyright (C) 2018 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+from django.urls import path
+from . import views
+
+urlpatterns = [
+    path("create/task/