Share without copying & mount cloud storages #2377

Merged: 24 commits into develop from mk/share_without_copying_ on Dec 2, 2020

Conversation

Marishka17
Contributor

@Marishka17 Marishka17 commented Oct 30, 2020

Motivation and context

Related issue: #204

This PR makes it possible to create a task from files on the share without copying them into CVAT.
It also adds documentation on mounting cloud storage (e.g. AWS S3, an Azure container, Google Drive) as a FUSE filesystem.

How has this been tested?

Manually and with tests.

Checklist

  • fix uploaded archive/zip
  • check automount using fstab
  • check automount using systemd

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

@nmanovic
Contributor

nmanovic commented Nov 2, 2020

@Marishka17, please check the following use case:

  • Mount a local directory with photos to the CVAT server.
  • Try to classify the images. For example, if the directory contains your family photos, you can use labels such as me, mother, father, brother, etc.

Expected results:

  • The task should be created almost immediately.
  • Navigating through the task should have no significant delays.
  • You can annotate these photos at a rate of more than 1000 tags/hour.
  • Photos are not copied into the CVAT container (only one copy of the data exists).
  • The chunk cache can be zero size.

@nmanovic nmanovic closed this Nov 2, 2020
@azhavoro azhavoro reopened this Nov 2, 2020
@coveralls

coveralls commented Nov 5, 2020

Coverage Status

Coverage increased (+0.01%) to 61.21% when pulling fcc15ce on mk/share_without_copying_ into d6ac8cc on develop.

@Marishka17 Marishka17 changed the title [WIP] Share without copying & mount cloud storages Share without copying & mount cloud storages Nov 9, 2020
@@ -54,6 +54,18 @@ def choices(cls):
    def __str__(self):
        return self.value

class UploadedDataStorageLocationChoice(str, Enum):
Contributor

@nmanovic nmanovic Nov 11, 2020

I would vote for a shorter name such as StorageChoice or DataLocationChoice.

Contributor Author

Renamed
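
For readers following the naming discussion, the pattern visible in the diff hunk is a string-valued Enum with a choices() helper and a __str__ that returns the raw value. The following is only a minimal sketch of that pattern: it uses StorageChoice, one of the names suggested above, purely for illustration, and the members LOCAL and SHARE are hypothetical because the thread does not show the real ones.

```python
from enum import Enum


class StorageChoice(str, Enum):
    # Hypothetical members: the thread does not show the real ones.
    LOCAL = "local"
    SHARE = "share"

    @classmethod
    def choices(cls):
        # (value, name) pairs, the shape a Django field's `choices` expects.
        return tuple((member.value, member.name) for member in cls)

    def __str__(self):
        # Render as the plain stored value rather than "StorageChoice.SHARE".
        return self.value
```

Mixing in str keeps the members comparable with plain strings, so a value read back from storage can be compared without an explicit conversion.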

bsekachev
bsekachev previously approved these changes Nov 11, 2020
@bsekachev
Member

@Marishka17 I've just tested the PR in local development.
I have a share directory with some files. I selected one of the files and set copy_data to True in the advanced dialog.
After that I created the task, went to the share directory, and renamed the file, and everything still works well. Shouldn't this PR allow us to avoid copying?
Also, in data/data/<id>/raw I see the file, so it was copied. Am I misunderstanding something?

@Marishka17
Contributor Author

Marishka17 commented Nov 11, 2020

@bsekachev, by default files are not copied to ../data/id/raw/ and the customer works with the files from the share. If the customer selects the copy data checkbox, the files are copied to ../data/id/raw/ and the customer works with the files inside CVAT.
Is there something wrong?
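
The behavior described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the function name resolve_task_files and its parameters share_root, raw_dir and relative_paths are hypothetical, and only the copy_data flag and the raw directory come from the thread.

```python
import shutil
from pathlib import Path


def resolve_task_files(share_root, raw_dir, relative_paths, copy_data=False):
    """Return the paths a task should read its files from (hypothetical helper)."""
    share_root, raw_dir = Path(share_root), Path(raw_dir)
    if not copy_data:
        # Default: reference the files where they live on the share.
        return [share_root / rel for rel in relative_paths]

    # copy_data selected: duplicate each file into the task's raw directory.
    copied = []
    for rel in relative_paths:
        target = raw_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(share_root / rel, target)
        copied.append(target)
    return copied
```

With copy_data left at its default nothing is written under the raw directory, which is why renaming a file on the share afterwards affects the task only in that mode.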

@bsekachev
Member

@bsekachev, by default files are not copied to ../data/id/raw/ and the customer works with the files from the share. If the customer selects the copy data checkbox, the files are copied to ../data/id/raw/ and the customer works with the files inside CVAT.
Is there something wrong?

Oh, sure, got it.

@bsekachev
Member

@Marishka17
One enhancement I could suggest: when a task uses data on the share and a file is no longer available there (I just renamed a file to emulate this), the annotation view loads indefinitely. No errors occur on the client side; the client just receives an empty zip archive. I believe it would be better if the UI said that the file has disappeared.

It could be a different PR I suppose.

@nmanovic
Contributor

@Marishka17
One enhancement I could suggest: when a task uses data on the share and a file is no longer available there (I just renamed a file to emulate this), the annotation view loads indefinitely. No errors occur on the client side; the client just receives an empty zip archive. I believe it would be better if the UI said that the file has disappeared.

It could be a different PR I suppose.

I think we need to address it in this PR. It is a bug and could lead to a lot of issues that will be difficult to investigate.
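
One way to surface the problem instead of returning an empty archive is to verify that every referenced file still exists before preparing data for the client. The sketch below is an assumption about how such a check could look, not what the PR implements; MissingShareFileError and check_share_files are hypothetical names.

```python
from pathlib import Path


class MissingShareFileError(FileNotFoundError):
    """Hypothetical error: files a task refers to are no longer on the share."""


def check_share_files(paths):
    """Raise a descriptive error if any referenced file is missing."""
    missing = [str(p) for p in map(Path, paths) if not p.is_file()]
    if missing:
        raise MissingShareFileError(
            "Files no longer available on the share: " + ", ".join(missing)
        )
```

A server could catch this error and return its message to the client, so the UI has something concrete to display rather than loading indefinitely.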

@Marishka17 Marishka17 changed the title Share without copying & mount cloud storages [WIP] Share without copying & mount cloud storages Nov 20, 2020
@Marishka17 Marishka17 changed the title [WIP] Share without copying & mount cloud storages Share without copying & mount cloud storages Nov 27, 2020
@Marishka17 Marishka17 requested a review from bsekachev November 27, 2020 08:27
@nmanovic
Contributor

@azhavoro, could you please look at the PR and give your decision?

@azhavoro
Contributor

azhavoro commented Dec 1, 2020

@Marishka17 The PR looks OK to me; please fix the conflict in views.py.

azhavoro
azhavoro previously approved these changes Dec 1, 2020
bsekachev
bsekachev previously approved these changes Dec 1, 2020
@bsekachev bsekachev dismissed stale reviews from azhavoro and themself via fcc15ce December 2, 2020 06:51
@bsekachev bsekachev merged commit 004cb64 into develop Dec 2, 2020
@bsekachev bsekachev deleted the mk/share_without_copying_ branch December 2, 2020 09:05
@iraadit

iraadit commented Feb 22, 2021

Does "Share without copying" work with the CLI?

@Marishka17
Contributor Author

@iraadit, there is currently no way to use this feature from the CLI. Could you please create an issue?

@iraadit

iraadit commented Feb 24, 2021

@iraadit, there is currently no way to use this feature from the CLI. Could you please create an issue?

Hi, I created the new issue #2862

Thank you

Development

Successfully merging this pull request may close these issues.

Any way to avoid copying and compressing files when creating new task?
6 participants