[WIP] Remote repositories for versioning #26

Closed
wants to merge 133 commits into from
133 commits
068bc96
Add mots png format and tests
Sep 18, 2020
07e8fa2
update docs
Sep 18, 2020
97438ab
update changelog
Sep 18, 2020
def1c3a
Update plugins and tests
Sep 21, 2020
b69878e
Merge branch 'develop' into zm/refactor-plugins
Sep 21, 2020
bf1fadc
update changelog
Sep 21, 2020
a19711e
Add declarations
Sep 22, 2020
192399e
t
Sep 24, 2020
c756809
implement vcs interaction
Sep 25, 2020
7bee7c6
t
Sep 30, 2020
855c2b7
t
Oct 5, 2020
9d4bf38
move create and import commands
Oct 5, 2020
8363242
t
Oct 6, 2020
864812b
add source
Oct 6, 2020
061c9ba
add bunch of cli commands
Oct 8, 2020
e714570
Merge branch 'develop' into zm/versioning
Oct 8, 2020
11e6386
implement pipelines, add 'apply' draft
Oct 14, 2020
e5fbec7
implement pipelines
Oct 16, 2020
6cd0d81
complete source commands
Oct 20, 2020
c2350eb
pipelines
Oct 21, 2020
d299536
implement build
Oct 22, 2020
1aa5d80
implement tranforms
Oct 22, 2020
778b796
implement remotes
Oct 23, 2020
2cba7bd
implement status checking
Oct 26, 2020
71abd59
Implement checkout
Oct 27, 2020
24d3605
add versioning commands
Oct 27, 2020
e1c556c
add data remotes
Oct 28, 2020
6da1c0b
build as destructive op with cache, transform
Oct 30, 2020
11cb955
Change importers to return a list of sources
Nov 27, 2020
44ebfc3
dont remove source data on inplace saving
Nov 27, 2020
0c64d6b
fix positional arg parsing
Nov 27, 2020
f0b018d
Add repository remotes
Nov 30, 2020
2e0e3af
Implement destructive strategy for build
Dec 1, 2020
a7439e9
Add gitignore management
Dec 2, 2020
c4e86db
Added cleanup on error in source addition
Dec 2, 2020
fb7bc71
Fix Config class comparison
Dec 4, 2020
0c62575
Allow passing File to Config load and dump
Dec 4, 2020
2c1c935
Implement pretty status output
Dec 4, 2020
61c7b5b
Add status checks and output dir to build
Dec 4, 2020
789b387
Add source name validation and dvc aux dir maintenance
Dec 4, 2020
abc9c1c
Allow checkout for specific sources from a commit
Dec 4, 2020
918f665
Implement add and gitignore maintenance
Dec 4, 2020
7ce0c24
Add explanatory error message to diff
Dec 4, 2020
ba7b7ec
Pretty output for refs
Dec 4, 2020
55a12f0
Fix build and source update
Dec 4, 2020
813d321
fix config dump
Dec 4, 2020
d4fade8
Refactor error cleanup logic
Dec 4, 2020
86ab281
Add env.detect_dataset
Dec 7, 2020
dc54e2c
Add backward compatibility for sources, enable source addition in rea…
Dec 7, 2020
dc4bd30
Little refactoring
Dec 7, 2020
6b8f517
Fix project saving
Dec 7, 2020
2890adc
Update convert command
Dec 7, 2020
6e6f18a
Restore import command
Dec 7, 2020
a6c5676
Add extra option to forward function kwargs
Dec 8, 2020
978ebe9
Fix format tests
Dec 8, 2020
f32bc2d
Add repository access checks
Dec 9, 2020
8341128
Add default source format
Dec 9, 2020
4cb85c6
Bump format version
Dec 9, 2020
9f66273
move config and dataset tests
Dec 9, 2020
79991e7
Add auto-answering on input requests in DVC
Dec 11, 2020
500f20b
Fix voc converter args
Dec 11, 2020
41930af
Add default remote type
Dec 11, 2020
ebb6c8c
Little fixes
Dec 11, 2020
d5efa18
Add checkout comment
Dec 11, 2020
22bf155
Add source and remote - variant 1
Dec 11, 2020
01542ff
Adding urls - variant 2 (direct)
Dec 14, 2020
fc2f0d3
Fix source addition, disallow adding a source without pulling
Dec 15, 2020
f635f4d
Add VCS interaction tests
Dec 15, 2020
f8effee
update dependencies
Dec 15, 2020
76dbaa0
Add dataset importing
Dec 15, 2020
bbf3e32
Little fixes
Dec 15, 2020
5bdb68b
Add build and tag tests, update some old source tests
Dec 15, 2020
21237e4
Add format detection in dataset
Dec 16, 2020
747d43f
Add stage tests and fixes. Add convenience methods
Dec 16, 2020
a211ab7
Fix windows installation
Dec 17, 2020
2bbf6cd
Fix yolo extractor on windows
Dec 17, 2020
d382bbe
Update dataset tests
Dec 17, 2020
d18f2c7
Update model interaction and project tests
Dec 17, 2020
7d71d70
Implement pushing
Dec 18, 2020
d32e55d
Add old project example
Dec 18, 2020
d6a24a7
Merge branch 'develop' into zm/versioning
Jan 6, 2021
efed2ca
fix changelog
Jan 6, 2021
59c9ef3
remove extra config test
Jan 6, 2021
7381ff5
Merge branch 'develop' into zm/versioning
Jan 11, 2021
b78d49b
merge
Jan 11, 2021
0415561
Fix backward compatibility
Jan 11, 2021
0348119
change dataset export arg order
Jan 11, 2021
5ff23d9
Merge branch 'develop' into zm/versioning
Jan 11, 2021
304fde8
fix merge
Jan 11, 2021
b3b4be9
Merge branch 'develop' into zm/versioning
Jan 25, 2021
9da3687
Use default value in dictconfig as schema
Jan 25, 2021
8cfa2d0
Fix merge
Jan 25, 2021
0e2215c
fix merge
Jan 25, 2021
cb10db1
Merge branch 'develop' into zm/versioning
Mar 3, 2021
cd2edea
t
Mar 9, 2021
59db785
Merge branch 'develop' into zm/versioning
Mar 22, 2021
aed94e5
Merge branch 'develop' into zm/versioning
Mar 24, 2021
f33efb9
t
Mar 25, 2021
ee8fc2b
unify models and sources, add type hints in project
Apr 1, 2021
7995c6f
Merge branch 'develop' into zm/versioning
Apr 1, 2021
ccdebf1
Merge branch 'zm/vcs-core' into zm/versioning
Apr 1, 2021
eb3f3e3
Update model cli context, add versioning commands
Apr 1, 2021
91b82cc
update model cli
Apr 1, 2021
d70bda7
Add return code when undefined
Apr 6, 2021
77b222f
Stop using deprecated logging.warn
Apr 6, 2021
decd99b
Allow tuples in project config serialization
Apr 6, 2021
155064b
Launch transform and filter commands through build
Apr 6, 2021
73f06b5
Disallow dots in source names, disallow unexising paths in sources
Apr 6, 2021
dd7464c
update method call
Apr 6, 2021
32891a2
add dataset label transform test
Apr 6, 2021
497d649
Add more project build tests
Apr 6, 2021
1b08636
Add cli e2e tests
Apr 6, 2021
017897c
Add cli e2e tests
Apr 7, 2021
d998bf2
Fix model cli args
Apr 8, 2021
600f91a
Update project cli calls
Apr 8, 2021
d33f99a
add error on dirty source building
Apr 8, 2021
1107dc6
fix vcs usage test and project building
Apr 8, 2021
f05b6d2
update project source dataset
Apr 8, 2021
e2b69e6
Merge branch 'zm/versioning' of https://github.com/openvinotoolkit/da…
Apr 8, 2021
7376cbe
Merge branch 'develop' into zm/versioning
Apr 8, 2021
485648c
Make errors more typed
Apr 9, 2021
155ad45
Add tests for dvc and git sources
Apr 12, 2021
9ad32f5
Add convert test
Apr 13, 2021
0d4ef37
Remove import deprecation message
Apr 13, 2021
ec66d38
Fix importof extractors
Apr 13, 2021
82a3069
improve dataset merge errors and cli output
Apr 13, 2021
9ed795d
extend support of detached mode
Apr 14, 2021
5e70607
Remove deprecation notification
Apr 16, 2021
f96ca9b
Add more structure to cli commands definition
Apr 16, 2021
46fad3d
Remove exposition for unsupported commands, revisit vcs commands
Apr 16, 2021
ecb5268
Add tag command
Apr 19, 2021
296ab50
Add sections to cli commands description
Apr 19, 2021
276e4df
add targets, rename data dir, add tests
Apr 26, 2021
47 changes: 40 additions & 7 deletions datumaro/cli/__main__.py
@@ -56,22 +56,47 @@ def make_parser():

     parser.add_argument('--version', action='version', version=VERSION)
     _LogManager._define_loglevel_option(parser)
+    parser.add_argument('--detached', action='store_true',
+        help=argparse.SUPPRESS)
+        # help="Work in VCS-detached mode. VCS operations will be unavailable.")

     known_contexts = [
-        ('project', contexts.project, "Actions with project (deprecated)"),
+        ('project', contexts.project, "Actions with project"),
+        ('repo', contexts.repository, "Actions with VCS repositories"),
+        ('remote', contexts.remote, "Actions with data remotes"),
         ('source', contexts.source, "Actions with data sources"),
         ('model', contexts.model, "Actions with models"),
     ]
     known_commands = [
-        ('create', commands.create, "Create project"),
+        ("Project modification:", None, ''),
+        ('create', commands.create, "Create empty project"),
         ('import', commands.import_, "Create project from existing dataset"),
         ('add', commands.add, "Add data source to project"),
         ('remove', commands.remove, "Remove data source from project"),
+
+        ("", None, ''),
+        ("Project versioning:", None, ''),
+        ('check_updates', commands.check_updates, "Check remote repository for updates"),
+        ('fetch', commands.fetch, "Fetch updates from remote repository"),
+        ('pull', commands.pull, "Pull updates from remote repository"),
+        ('push', commands.push, "Push updates to remote repository"),
+        ('checkout', commands.checkout, "Switch to another branch or revision"),
+        ('commit', commands.commit, "Commit changes in tracked files"),
+        ('status', commands.status, "Show status information"),
+        ('refs', commands.refs, "List branches and revisions"),
+        ('tag', commands.tag, "Give name to revision"),
+        ('track', commands.track, "Start tracking a local file or directory"),
+        ('update', commands.update, "Change data source revision"),
+
+        ("", None, ''),
+        ("Dataset and project operations:", None, ''),
         ('export', commands.export, "Export project in some format"),
-        ('filter', commands.filter, "Filter project"),
-        ('transform', commands.transform, "Transform project"),
+        ('filter', commands.filter, "Filter project items"),
+        ('transform', commands.transform, "Modify project items"),
+        ('apply', commands.apply, "Apply a few transforms to project"),
+        ('build', commands.build, "Build project"),
         ('merge', commands.merge, "Merge projects"),
-        ('convert', commands.convert, "Convert dataset into another format"),
+        ('convert', commands.convert, "Convert dataset between formats"),
         ('diff', commands.diff, "Compare projects with intersection"),
         ('ediff', commands.ediff, "Compare projects for equality"),
         ('stats', commands.stats, "Compute project statistics"),
@@ -104,7 +129,8 @@ def make_parser():
     subcommands = parser.add_subparsers(title=subcommands_desc,
         description="", help=argparse.SUPPRESS)
     for command_name, command, _ in known_contexts + known_commands:
-        add_subparser(subcommands, command_name, command.build_parser)
+        if command is not None:
+            add_subparser(subcommands, command_name, command.build_parser)

     return parser

@@ -119,8 +145,15 @@ def main(args=None):
         parser.print_help()
         return 1

+    if args.detached:
+        from datumaro.components.project import ProjectVcs
+        ProjectVcs.G_DETACHED = True
+
     try:
-        return args.command(args)
+        retcode = args.command(args)
+        if retcode is None:
+            retcode = 0
+        return retcode
     except CliException as e:
         log.error(e)
         return 1
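The hunks above do two things in `make_parser()` and `main()`: they let the command table carry section-header rows whose command slot is `None`, and they normalize a `None` return value to exit code 0. A minimal sketch of that pattern follows. It is not Datumaro code: `build_status_parser`, `KNOWN_COMMANDS`, and the inline registration lambda are hypothetical stand-ins for the real command modules and for `add_subparser`.

```python
import argparse


def build_status_parser(parser_ctor=argparse.ArgumentParser):
    # Hypothetical command module: its handler returns None on success.
    parser = parser_ctor()
    parser.set_defaults(command=lambda args: None)
    return parser


# Rows with a None builder are section headers used only for help output.
KNOWN_COMMANDS = [
    ("Project versioning:", None, ''),
    ('status', build_status_parser, "Show status information"),
]


def make_parser():
    parser = argparse.ArgumentParser(prog="demo")
    subcommands = parser.add_subparsers()
    for name, builder, _ in KNOWN_COMMANDS:
        if builder is not None:  # skip header rows, they have no parser
            builder(lambda name=name, **kwargs:
                subcommands.add_parser(name, **kwargs))
    return parser


def main(argv):
    args = make_parser().parse_args(argv)
    if not hasattr(args, 'command'):
        return 1  # no subcommand given
    retcode = args.command(args)
    # A handler that falls off the end returns None; treat it as success.
    return 0 if retcode is None else retcode
```

Calling `main(['status'])` returns 0 even though the handler returns `None`, while `main([])` returns 1 because no command was selected.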
3 changes: 2 additions & 1 deletion datumaro/cli/commands/__init__.py
@@ -7,7 +7,8 @@
 from . import (
     create, add, remove, import_,
     explain,
-    export, merge, convert, transform, filter,
+    export, merge, convert, apply, transform, filter, build, update,
     diff, ediff, stats,
+    commit, fetch, pull, push, track, checkout, refs, status, check_updates, tag,
     info, validate
 )
7 changes: 7 additions & 0 deletions datumaro/cli/commands/apply.py
@@ -0,0 +1,7 @@
+# Copyright (C) 2020 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+# pylint: disable=unused-import
+
+from ..contexts.project import build_apply_parser as build_parser
7 changes: 7 additions & 0 deletions datumaro/cli/commands/build.py
@@ -0,0 +1,7 @@
+# Copyright (C) 2020 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+# pylint: disable=unused-import
+
+from ..contexts.project import build_build_parser as build_parser
26 changes: 26 additions & 0 deletions datumaro/cli/commands/check_updates.py
@@ -0,0 +1,26 @@
+# Copyright (C) 2020 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+import argparse
+
+from ..util.project import load_project
+
+
+def build_parser(parser_ctor=argparse.ArgumentParser):
+    parser = parser_ctor()
+
+    parser.add_argument('targets', nargs='*',
+        help="Names of sources and models")
+    parser.add_argument('-p', '--project', dest='project_dir', default='.',
+        help="Directory of the project to operate on (default: current dir)")
+    parser.set_defaults(command=check_updates_command)
+
+    return parser
+
+def check_updates_command(args):
+    project = load_project(args.project_dir)
+
+    project.vcs.check_updates(targets=args.targets)
+
+    return 0
39 changes: 39 additions & 0 deletions datumaro/cli/commands/checkout.py
@@ -0,0 +1,39 @@
+# Copyright (C) 2020 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+import argparse
+
+from ..util.project import load_project
+
+
+def build_parser(parser_ctor=argparse.ArgumentParser):
+    parser = parser_ctor()
+
+    parser.add_argument('_positionals', nargs=argparse.REMAINDER,
+        help=argparse.SUPPRESS) # args can't be resolved automatically
+    parser.add_argument('rev', nargs='?',
+        help="Commit or tag (default: current)")
+    parser.add_argument('targets', nargs='*',
+        help="Names of sources, models, tracked files and dirs (default: all)")
+    parser.add_argument('-p', '--project', dest='project_dir', default='.',
+        help="Directory of the project to operate on (default: current dir)")
+    parser.set_defaults(command=checkout_command)
+
+    return parser
+
+def checkout_command(args):
+    try:
+        pos = args._positionals.index('--')
+        has_sep = True
+    except ValueError:
+        pos = 1
+        has_sep = False
+    args.rev = args._positionals[:pos] or []
+    args.targets = args._positionals[pos + has_sep:]
+
+    project = load_project(args.project_dir)
+
+    project.vcs.checkout(rev=args.rev, targets=args.targets)
+
+    return 0
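`checkout_command` above splits the raw positionals by hand: everything before a `--` separator is the revision, everything after it is the target list, and without a separator the first item is taken as the revision. The sketch below isolates that splitting logic as a plain function so it can be tried without argparse. `split_positionals` is a hypothetical name, and collapsing the revision to a single value (or `None`) is an assumption of this sketch; the PR code passes the list slice itself as `rev`.

```python
def split_positionals(positionals):
    """Split a list like ['rev', '--', 'a', 'b'] into (rev, targets)."""
    if '--' in positionals:
        pos = positionals.index('--')
        has_sep = True
    else:
        # No separator: the first positional (if any) is the revision.
        pos = 1
        has_sep = False

    rev_part = positionals[:pos]
    if len(rev_part) > 1:
        # Assumption of this sketch: at most one revision is meaningful.
        raise ValueError("expected at most one revision before '--'")
    rev = rev_part[0] if rev_part else None

    # has_sep is a bool, so it adds 1 only when '--' must be skipped.
    targets = positionals[pos + has_sep:]
    return rev, targets
```

With this helper, `['v1', '--', 'a', 'b']` yields `('v1', ['a', 'b'])`, `['--', 'a']` yields `(None, ['a'])`, and an empty list yields `(None, [])`.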
27 changes: 27 additions & 0 deletions datumaro/cli/commands/commit.py
@@ -0,0 +1,27 @@
+# Copyright (C) 2020 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+import argparse
+
+from ..util.project import load_project
+
+
+def build_parser(parser_ctor=argparse.ArgumentParser):
+    parser = parser_ctor()
+
+    parser.add_argument('paths', nargs='*',
+        help="Files to include in the commit (default: all tracked)")
+    parser.add_argument('-m', '--message', required=True, help="Commit message")
+    parser.add_argument('-p', '--project', dest='project_dir', default='.',
+        help="Directory of the project to operate on (default: current dir)")
+    parser.set_defaults(command=commit_command)
+
+    return parser
+
+def commit_command(args):
+    project = load_project(args.project_dir)
+
+    project.vcs.commit(args.paths, args.message)
+
+    return 0
62 changes: 61 additions & 1 deletion datumaro/cli/commands/create.py
@@ -4,4 +4,64 @@

 # pylint: disable=unused-import

-from ..contexts.project import build_create_parser as build_parser
+import argparse
+import logging as log
+import os
+import os.path as osp
+import shutil
+
+from datumaro.components.project import \
+    PROJECT_DEFAULT_CONFIG as DEFAULT_CONFIG
+from datumaro.components.project import Project
+
+from ..util import CliException, MultilineFormatter
+
+
+def build_parser(parser_ctor=argparse.ArgumentParser):
+    parser = parser_ctor(help="Create empty project",
+        description="""
+        Create a new empty project.|n
+        |n
+        Examples:|n
+        - Create a project in the current directory:|n
+        |s|screate -n myproject|n
+        |n
+        - Create a project in other directory:|n
+        |s|screate -o path/I/like/
+        """,
+        formatter_class=MultilineFormatter)
+
+    parser.add_argument('-o', '--output-dir', default='.', dest='dst_dir',
+        help="Save directory for the new project (default: current dir)")
+    parser.add_argument('-n', '--name', default=None,
+        help="Name of the new project (default: same as project dir)")
+    parser.add_argument('--overwrite', action='store_true',
+        help="Overwrite existing files in the save directory")
+    parser.set_defaults(command=create_command)
+
+    return parser
+
+def create_command(args):
+    project_dir = osp.abspath(args.dst_dir)
+
+    project_env_dir = osp.join(project_dir, DEFAULT_CONFIG.env_dir)
+    if osp.isdir(project_env_dir) and os.listdir(project_env_dir):
+        if args.overwrite:
+            shutil.rmtree(project_env_dir, ignore_errors=True)
+        else:
+            raise CliException("Directory '%s' already exists "
+                "(pass --overwrite to overwrite)" % project_env_dir)
+
+    project_name = args.name
+    if project_name is None:
+        project_name = osp.basename(project_dir)
+
+    log.info("Creating project at '%s'" % project_dir)
+
+    Project.generate(project_dir, {
+        'project_name': project_name,
+    })
+
+    log.info("Project has been created at '%s'" % project_dir)
+
+    return 0
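`create_command` above refuses to touch a non-empty project environment directory unless `--overwrite` is passed, in which case the directory is removed before the project is generated. A minimal sketch of that guard in isolation, using only the standard library: `prepare_env_dir` is a hypothetical helper, `ENV_DIR` is an assumed stand-in for `DEFAULT_CONFIG.env_dir`, and `FileExistsError` replaces `CliException`.

```python
import os
import os.path as osp
import shutil

ENV_DIR = '.datumaro'  # assumed stand-in for DEFAULT_CONFIG.env_dir


def prepare_env_dir(project_dir, overwrite=False):
    """Return an empty environment dir inside project_dir, or raise."""
    env_dir = osp.join(osp.abspath(project_dir), ENV_DIR)

    # Only a non-empty existing directory counts as a conflict.
    if osp.isdir(env_dir) and os.listdir(env_dir):
        if not overwrite:
            raise FileExistsError("Directory '%s' already exists "
                "(pass overwrite=True to overwrite)" % env_dir)
        shutil.rmtree(env_dir, ignore_errors=True)

    os.makedirs(env_dir, exist_ok=True)
    return env_dir
```

An empty leftover directory is reused silently, which matches the `os.listdir` check in the PR code.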
85 changes: 83 additions & 2 deletions datumaro/cli/commands/diff.py
@@ -2,6 +2,87 @@
 #
 # SPDX-License-Identifier: MIT

-# pylint: disable=unused-import
+import argparse
+import logging as log
+import os
+import os.path as osp
+import shutil

-from ..contexts.project import build_diff_parser as build_parser
+from datumaro.components.operations import DistanceComparator
+from datumaro.util import error_rollback
+
+from ..util import CliException, MultilineFormatter
+from ..util.project import generate_next_file_name, load_project
+from ..contexts.project.diff import DatasetDiffVisualizer
+
+
+def build_parser(parser_ctor=argparse.ArgumentParser):
+    parser = parser_ctor(help="Compare projects",
+        description="""
+        Compares two projects, match annotations by distance.|n
+        |n
+        Examples:|n
+        - Compare two projects, match boxes if IoU > 0.7,|n
+        |s|s|s|sprint results to Tensorboard:|n
+        |s|sdiff path/to/other/project -o diff/ -v tensorboard --iou-thresh 0.7
+        """,
+        formatter_class=MultilineFormatter)
+
+    parser.add_argument('other_project_dir',
+        help="Directory of the second project to be compared")
+    parser.add_argument('-o', '--output-dir', dest='dst_dir', default=None,
+        help="Directory to save comparison results (default: do not save)")
+    parser.add_argument('-v', '--visualizer',
+        default=DatasetDiffVisualizer.DEFAULT_FORMAT.name,
+        choices=[f.name for f in DatasetDiffVisualizer.OutputFormat],
+        help="Output format (default: %(default)s)")
+    parser.add_argument('--iou-thresh', default=0.5, type=float,
+        help="IoU match threshold for detections (default: %(default)s)")
+    parser.add_argument('--conf-thresh', default=0.5, type=float,
+        help="Confidence threshold for detections (default: %(default)s)")
+    parser.add_argument('--overwrite', action='store_true',
+        help="Overwrite existing files in the save directory")
+    parser.add_argument('-p', '--project', dest='project_dir', default='.',
+        help="Directory of the first project to be compared (default: current dir)")
+    parser.set_defaults(command=diff_command)
+
+    return parser
+
+@error_rollback('on_error', implicit=True)
+def diff_command(args):
+    first_project = load_project(args.project_dir)
+
+    try:
+        second_project = load_project(args.other_project_dir)
+    except FileNotFoundError:
+        if first_project.vcs.is_ref(args.other_project_dir):
+            raise NotImplementedError("It seems that you're trying to compare "
+                "different revisions of the project. "
+                "Comparisons between project revisions are not implemented yet.")
+        raise
+
+    comparator = DistanceComparator(iou_threshold=args.iou_thresh)
+
+    dst_dir = args.dst_dir
+    if dst_dir:
+        if not args.overwrite and osp.isdir(dst_dir) and os.listdir(dst_dir):
+            raise CliException("Directory '%s' already exists "
+                "(pass --overwrite to overwrite)" % dst_dir)
+    else:
+        dst_dir = generate_next_file_name('%s-%s-diff' % (
+            first_project.config.project_name,
+            second_project.config.project_name)
+        )
+    dst_dir = osp.abspath(dst_dir)
+    log.info("Saving diff to '%s'" % dst_dir)
+
+    if not osp.exists(dst_dir):
+        on_error.do(shutil.rmtree, dst_dir, ignore_errors=True)
+
+    visualizer = DatasetDiffVisualizer(save_dir=dst_dir,
+        comparator=comparator, output_format=args.visualizer)
+    visualizer.save_dataset_diff(
+        first_project.make_dataset(),
+        second_project.make_dataset())
+
+    return 0
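`diff_command` above registers `shutil.rmtree` with `on_error` only when the output directory does not exist yet, so a failed run removes what it created but never deletes a directory that was already there. The sketch below shows the same idea with the standard library's `contextlib.ExitStack` standing in for Datumaro's `error_rollback` decorator, whose internals are not shown in this diff. `save_report` and its `fail` switch are hypothetical.

```python
import contextlib
import os
import shutil


def save_report(dst_dir, fail=False):
    with contextlib.ExitStack() as on_error:
        if not os.path.exists(dst_dir):
            # Schedule removal only for a directory this call creates.
            on_error.callback(shutil.rmtree, dst_dir, ignore_errors=True)
        os.makedirs(dst_dir, exist_ok=True)

        if fail:
            raise RuntimeError("simulated failure while writing results")

        with open(os.path.join(dst_dir, 'report.txt'), 'w') as f:
            f.write("ok")

        # Success: move the callbacks to a stack that is never closed,
        # so the cleanup does not run on normal exit.
        on_error.pop_all()
    return dst_dir
```

If the body raises, the stack unwinds and the new directory is removed before the exception propagates; on success `pop_all()` disarms the cleanup.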