Support cloud storage #3326

Merged
nmanovic merged 36 commits into develop from nm/cloud-storage-server
Jun 16, 2021

Conversation

nmanovic
Contributor

Original PR: #2620

Motivation and context

Support working with remote cloud storage without copying the data into CVAT.
Related issue: #863

How to generate temporary credentials (S3), for example:

import boto3

# Generate temporary credentials from the long-term access keys
sts_client = boto3.client('sts', aws_access_key_id="...", aws_secret_access_key="...")
tokens = sts_client.get_session_token()
credentials = tokens['Credentials']
aws_access_key_id = credentials['AccessKeyId']
aws_secret_access_key = credentials['SecretAccessKey']
aws_session_token = credentials['SessionToken']

# Check that the temporary credentials are valid
s3 = boto3.client(
    's3',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    aws_session_token=aws_session_token,
)
s3.list_buckets()
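
Since three separate values have to be carried around, it may help to extract them in one step. The following is a minimal sketch, not part of this PR: `session_kwargs` and `is_expired` are hypothetical helper names, and the `fake_response` dictionary only imitates the shape of the `get_session_token()` response so the example runs without AWS access.

```python
from datetime import datetime, timedelta, timezone


def session_kwargs(response):
    """Map an STS get_session_token() response to boto3.client() keyword arguments."""
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def is_expired(response, now=None):
    """Return True if the temporary credentials are past their Expiration time."""
    now = now or datetime.now(timezone.utc)
    return response["Credentials"]["Expiration"] <= now


# Stand-in for sts_client.get_session_token(); the values are placeholders.
fake_response = {
    "Credentials": {
        "AccessKeyId": "EXAMPLE_KEY_ID",
        "SecretAccessKey": "EXAMPLE_SECRET",
        "SessionToken": "EXAMPLE_TOKEN",
        "Expiration": datetime.now(timezone.utc) + timedelta(hours=1),
    }
}

print(sorted(session_kwargs(fake_response)))
print(is_expired(fake_response))
```

With real credentials, the result could be passed on as `boto3.client('s3', **session_kwargs(tokens))`.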

How has this been tested?

Manually, using Swagger.

REST API

  • GET /api/v1/cloudstorages
  • POST /api/v1/cloudstorages
  • GET /api/v1/cloudstorages/{id}
  • GET /api/v1/cloudstorages/{id}/content
  • PATCH /api/v1/cloudstorages/{id}

Supported cloud providers:

  • Azure Blob container (implemented, not tested)
  • AWS S3 bucket (implemented, tested)
  • Google Drive

Iterations:

  1. Support images
  2. Support video
  3. Support archive

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

@nmanovic nmanovic requested a review from azhavoro as a code owner June 15, 2021 11:58
@nmanovic nmanovic mentioned this pull request Jun 15, 2021
10 tasks
@nmanovic
Contributor Author

@Marishka17 , I will merge the PR as is. We discussed a couple of issues:

  • It is inconvenient for users to generate the 3 tokens manually. Probably the only solution is to use the original tokens, but that could create security problems. We need to check with the appropriate security people in our org.
  • I have a bucket with a couple of images and a manifest.jsonl. When I try to get the content of the bucket, the request fails with an "out of index" error.
{"version":"1.1"}
{"type":"images"}
{"name":"000000000139","extension":".jpg","width":640,"height":426,"meta":{"related_images":[]},"checksum":"b242df990b7315d5ad71c8cd3bc8bfdb"}
{"name":"000000000285","extension":".jpg","width":586,"height":640,"meta":{"related_images":[]},"checksum":"3bc71e9139761c5308587dd47c08695f"}
{"name":"000000000632","extension":".jpg","width":640,"height":483,"meta":{"related_images":[]},"checksum":"bf1e99429d341e28fcfcfc36b2edff06"}
{"name":"000000000724","extension":".jpg","width":375,"height":500,"meta":{"related_images":[]},"checksum":"e2a379e1649fdcc25a5f3829778525f8"}
{"name":"000000000776","extension":".jpg","width":428,"height":640,"meta":{"related_images":[]},"checksum":"3b7f7967c63f61a6139b25280ca0b372"}
{"name":"000000000785","extension":".jpg","width":640,"height":425,"meta":{"related_images":[]},"checksum":"647645423232ef03691064eda55f9f36"}
{"name":"000000000802","extension":".jpg","width":424,"height":640,"meta":{"related_images":[]},"checksum":"9f244e7c14cadca4195093a06bcbaf16"}
{"name":"000000000872","extension":".jpg","width":621,"height":640,"meta":{"related_images":[]},"checksum":"5516b8d3bad2c08f2c86a2ca95d0d062"}
{"name":"000000000885","extension":".jpg","width":640,"height":427,"meta":{"related_images":[]},"checksum":"ec166d3d69fc7475d4a2a4445b4bf35a"}

@nmanovic nmanovic merged commit b18482b into develop Jun 16, 2021
@nmanovic nmanovic deleted the nm/cloud-storage-server branch June 16, 2021 11:30
@Marishka17 Marishka17 mentioned this pull request Jun 16, 2021
3 tasks