Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support of Google Cloud Storage for cloud storage #3561

Merged
merged 4 commits into from
Aug 27, 2021

Conversation

jasonkit
Copy link
Contributor

@jasonkit jasonkit commented Aug 18, 2021

Motivation and context

Add Google Cloud Storage support for cloud storage.
Also update Credentials class to include key_file_path and
add KEY_FILE_PATH as a new credential type which is required for using
google cloud storage.
It is expected to set the key_file_path to the Google cloud's service account key json file.

The specific_attributes for Google Cloud Storage includes:

  • prefix: key prefix for filtering blobs that going to fetch during initialize_content()
  • location: The location of bucket if we are going to create a bucket
  • project: The Google Cloud project id the bucket if we are going to create a bucket

All of the above attributes are optional.

How has this been tested?

Tested manually.

Tested sending HTTP POST to /api/v1/cloudstorages with following body

{
  "provider_type": "GOOGLE_CLOUD_STORAGE",
  "resource": "my_bucket",
  "display_name": "My Bucket",
  "credentials_type": "KEY_FILE_PATH",
  "key_file_path": "/credentials/gcs.json",
  "specific_attributes": "prefix=test"
}

and confirmed the cloudstorages record could be created on DB

To make this call works, we need to ensure /credentials/gcs.json exists on cvat server
and this service account key json file should has permission to access gs://my_bucket

Also tested creating new task data with using this newly created cloud storage
by sending HTTP POST to /api/v1/tasks/:task_id/data with following body

{
  "chunk_size": 1,
  "image_quality": 70,
  "server_files": [
     "test/img1.jpg",
     "test/img2.jpg",
     "test/manifest.jsonl"
  ],
  "cloud_storage_id": 1,
  "use_zip_chunks": true,
  "use_cache": true
}

and confirmed the task could be created, and the all the images could be loaded.

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

Implement GoogleCloudStorage and add KEY_FILE_PATH to
CredentialsTypeChoice, and key_file_path in Credentials
@jasonkit jasonkit requested a review from nmanovic as a code owner August 18, 2021 07:29
@nmanovic
Copy link
Contributor

@jasonkit , thanks for the contribution. Please give us some time to review the PR. We are in the process of adding UI for the feature.

@Marishka17 , please review the PR next week.

@Marishka17
Copy link
Contributor

@jasonkit , Hi. Thanks for the contribution, please give me some time to test the functionality of the PR.

@nmanovic nmanovic requested a review from Marishka17 August 25, 2021 04:36
Copy link
Contributor

@Marishka17 Marishka17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, I have only one small comment: i don`t see the migration file, could you please add it?

try:
self._bucket = self._storage_client.create_bucket(
self.bucket,
location=self._bucket_location
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it work if location == None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, location is indeed an optional argument for create_bucket, if location is None, it will default to us

ref: https://googleapis.dev/python/storage/latest/client.html#google.cloud.storage.client.Client.create_bucket

@jasonkit
Copy link
Contributor Author

@Marishka17
I have added a commit for the auto generated migration file for change in engine_cloud_provider table

@bsekachev bsekachev merged commit 6a29b34 into cvat-ai:develop Aug 27, 2021
@bsekachev
Copy link
Member

Thank you for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants