🐰 Python lib for remo - the app for annotations and images management in Computer Vision

rediscovery-io/remo-python


Welcome to remo

Remo is a web-based application to organize, annotate and visualize Computer Vision datasets.

It is designed to be your team's private, end-to-end platform for managing images.

Use Remo to:

  • access your datasets from one place, avoiding scattered files and keeping data secure locally
  • quickly annotate your images, with an annotation tool we designed from the ground up
  • build better datasets and models, by exploring your image and annotation data in depth
  • collaborate with your team, accessing the same data remotely

Remo runs on Windows, Linux, Mac or directly in Google Colab Notebooks. It can also be served on a private server for team collaboration, or embedded in Jupyter Notebooks.

It's installed via pip or via Docker.

This is the open source repository for the Remo Python library. To access the docs and try the online demo, visit https://remo.ai



🌳 Features

Integration from code

  • Easily visualize and browse images, predictions and annotations
  • Flexibility in slicing data, without moving it around: you can create virtual train/test splits, keep data in different folders or even select specific images using tags
  • Allows for a more standardized code interface across tasks
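
The idea of a virtual split can be illustrated in plain Python, without calling the Remo library. This is only a sketch under assumptions: the ImageRecord class, the file paths and the tag names are hypothetical, and Remo's own implementation may work differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical record: a file path plus the tags attached to it."""
    path: str
    tags: frozenset = field(default_factory=frozenset)


def select_by_tag(records, tag):
    """Return the records carrying `tag`; files on disk are left untouched."""
    return [r for r in records if tag in r.tags]


# Images live in different folders; the split is defined by tags only.
records = [
    ImageRecord("/data/cats/001.jpg", frozenset({"train"})),
    ImageRecord("/data/dogs/002.jpg", frozenset({"train", "hard"})),
    ImageRecord("/archive/003.jpg", frozenset({"test"})),
]

train = select_by_tag(records, "train")
test = select_by_tag(records, "test")
```

Because the split is just a selection over records, the same image can belong to several slices (for example "train" and "hard") without being copied.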


Annotation

  • Faster annotation thanks to an annotation tool we designed from the ground up
  • Manage annotation progress: organize images by status (to do, done, on hold) and track % completion
  • One-click edits on multiple objects: rename or delete all the objects of a class, duplicate sets of annotations

Supported formats: Polygons, Bounding boxes, Image labels and Tags.

Multiple import and export formats (COCO, Pascal, CSV, etc.). Convenient import and export options (skip images without annotations, append file paths, label encoding, etc.).

Read more here: https://remo.ai/docs/annotation-formats/


Dataset management

  • Centralized access to your data - link directly to your images, whichever folder they are in
  • Easily query your data, searching by filename, class or tag
  • Immediately visualize aggregated statistics on your datasets
  • Manage multiple versions of your annotations using Annotation Sets

🐰 Remo python library

You can see examples of how to use the library in our documentation or in the examples folder:

What | Where | Colab link
Documentation | Official Docs | -
Intro Notebook | Intro to Remo-Python notebook | -
Uploading annotations | Upload Annotations and Predictions Tutorial notebook | -
PyTorch Image Classification using Remo | PyTorch Image Classification notebook | im_classification_tutorial
PyTorch Object Detection using Remo | PyTorch Object Detection Notebook | obj_detection_tutorial
PyTorch Instance Segmentation with Detectron 2 and Remo | PyTorch Instance Segmentation Notebook | instance_segmentation_tutorial

😏 Quick installation

You can install Remo via pip or via Docker.

Pip installation

  1. In a Python 3.6+ environment: pip install remo

This will install both the Python library and the remo app.

  2. Initialise config: python -m remo_app init

That's it!

To launch Remo, run python -m remo_app. To call Remo from python once you have a server running, use import remo.
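
Before launching, it can be handy to confirm that both packages are visible in the active environment. The sketch below uses only the Python standard library; the is_installed helper is a hypothetical name introduced here, not part of Remo.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# `remo` is the Python library and `remo_app` the app, as named above.
for name in ("remo", "remo_app"):
    status = "found" if is_installed(name) else "missing (try: pip install remo)"
    print(f"{name}: {status}")
```

This check only locates the modules; it does not import them or start a server.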

Docker installation

Here are the main steps to install Remo via Docker. For more options and detailed instructions, you can refer to the Remo Docker installation page.

  1. Download docker-compose.yml
  2. Make sure you are using the latest tag available in Docker Hub
  3. Run the following from the same directory where the file lives:
    docker-compose up -d
  4. Access Remo by browsing to http://localhost:8123/
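
To check from a script that the container is answering, a small probe such as the following may help. It is a sketch using only the standard library: is_reachable is a hypothetical helper, and it only tests that something responds over HTTP at the URL above, not that the response comes from Remo.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=2.0):
    """Return True if an HTTP request to `url` gets any HTTP response."""
    # An empty ProxyHandler bypasses system proxies, which matters for localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    print(is_reachable("http://localhost:8123/"))
```

If this prints False right after docker-compose up -d, the app may still be starting; retrying after a few seconds is a reasonable first step.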

🎉 What's new

01-Sep-2020: Added tutorial on Remo for PyTorch Object Detection

30-Sep-2020: Added export annotations with filtering by tags

30-Oct-2020: Added tutorial using PyTorch's Detectron2 and Remo for Instance Segmentation

06-Nov-2020: Added ability to search images by filename, class or tag - you can now do dataset.search_images() or remo.search_images()


🎁 What we are working on next

  • Tighter integration with PyTorch
  • Ability to split datasets into train and test sets
  • Ability to store and inspect models' performance in remo

🐛 Get in touch

If you have any issues with the library, feel free to open an issue in the repo.

For anything else, you can write on our discuss forum.


🔖 Cite

@misc{remo2019,
  author =       {Remo.ai},
  title =        {{Remo.ai: Image Datasets management}},
  howpublished = {\url{https://github.com/rediscovery-io/remo-python}},
  year =         {2019}
}

🙋 For contributors

Contributions to the library are welcome!

Before you start working on something, we suggest opening an issue on the repo or a thread on the discuss forum to present your plan. It would be great if you could include:

  • what you plan to work on (e.g. model predictions)
  • what's the use case (e.g. I want to be able to calculate performance of my model)
  • what flow you envision (e.g. I have annotated data in Remo, I run the model and save predictions, I run some code to print performance on the prediction)
  • any mod you'd like to see in the Remo app itself (e.g. It'd be great to see an interactive chart of model performance in the app)

We are looking for help with the following, but we are also open to suggestions:

  • integration with Deep Learning frameworks

    • PyTorch / PyTorch Lightning
    • TensorFlow / Keras
    • Fast.ai
  • expand on the design of Annotation, Image, and AnnotationSet objects

    • general summary statistics
    • exporting annotations
    • statistics on comparison of two annotation sets
    • performance statistics on comparison of two annotation sets (one being predictions)

Structure of the library

The library is organized in 3 main layers:

  • api
  • sdk
  • domain objects, such as datasets

We expect the end user to mainly use the SDK layer and domain objects.

API is responsible for low level communication with the server. It mostly returns raw data.

SDK doesn't access backend endpoints directly, rather it uses the API layer for that. This layer knows about domain objects, so instead of raw data, it returns domain objects.

Domain objects keep entity information and know about the SDK layer. Most of their functions are simple short-hands for SDK methods. This layer doesn't know anything about the API.
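
The layering described above can be sketched with three small classes. This is an illustration only: the class names, method names and the in-memory backend are hypothetical stand-ins, not the library's actual code.

```python
class API:
    """Low-level layer: talks to the server and returns raw data."""

    def __init__(self, backend):
        # `backend` stands in for the HTTP server: a dict of raw records.
        self._backend = backend

    def get_dataset(self, dataset_id):
        return self._backend[dataset_id]  # raw dict


class SDK:
    """Middle layer: uses the API and returns domain objects."""

    def __init__(self, api):
        self._api = api

    def get_dataset(self, dataset_id):
        raw = self._api.get_dataset(dataset_id)
        return Dataset(sdk=self, id=dataset_id, name=raw["name"])

    def list_image_names(self, dataset_id):
        return list(self._api.get_dataset(dataset_id)["images"])


class Dataset:
    """Domain object: holds entity info and delegates to the SDK only."""

    def __init__(self, sdk, id, name):
        self.sdk, self.id, self.name = sdk, id, name

    def image_names(self):
        # Short-hand for the corresponding SDK method.
        return self.sdk.list_image_names(self.id)


sdk = SDK(API({1: {"name": "cats", "images": ["a.jpg", "b.jpg"]}}))
dataset = sdk.get_dataset(1)
print(dataset.name, dataset.image_names())
```

Note that Dataset holds a reference to the SDK but never touches the API object, which mirrors the dependency direction described above.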

Naming conventions

  • Functions that open the UI on a specific page use the view_ prefix

      view_dataset, view_annotations
    
  • Functions that always return exactly one object use the singular form of that object's name

      get_image(id) - returns one image
    
  • Functions that might return multiple objects use the plural form of that object's name

      get_images() - may return multiple images
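
A toy class following these three conventions might look like the sketch below. Everything in it is hypothetical and for illustration only: the Catalog class, the view_image method and the URL it builds are not part of the library.

```python
class Catalog:
    """Hypothetical in-memory store used only to illustrate the naming rules."""

    def __init__(self, images):
        self._images = dict(images)  # id -> filename

    def get_image(self, id):
        """Singular name: always returns exactly one image."""
        return self._images[id]

    def get_images(self):
        """Plural name: may return several images."""
        return list(self._images.values())

    def view_image(self, id):
        """view_ prefix: would open the UI; here it only builds a made-up URL."""
        return f"http://localhost:8123/hypothetical/image/{id}"


catalog = Catalog({1: "a.jpg", 2: "b.jpg"})
print(catalog.get_image(1))
print(catalog.get_images())
```

Keeping the prefix and the singular or plural form consistent lets a reader predict a function's return shape from its name alone.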