WIP: dvc pkg cmd ref (and other misc updates) #417

Closed
wants to merge 9 commits into from
18 changes: 18 additions & 0 deletions src/Documentation/sidebar.json
@@ -104,6 +104,16 @@
],
"move.md",
["pipeline.md", "pipeline_list.md", "pipeline_show.md"],
[
"pkg.md",
"pkg_install.md",
"pkg_uninstall.md",
"pkg_add.md",
"pkg_remove.md",
"pkg_modify.md",
"pkg_list.md",
"pkg_import.md"
],
"push.md",
"pull.md",
[
@@ -157,6 +167,14 @@
"pipeline.md": "pipeline",
"pipeline_list.md": "pipeline list",
"pipeline_show.md": "pipeline show",
"pkg.md": "pkg",
"pkg_install.md": "pkg install",
"pkg_uninstall.md": "pkg uninstall",
"pkg_add.md": "pkg add",
"pkg_remove.md": "pkg remove",
"pkg_modify.md": "pkg modify",
"pkg_list.md": "pkg list",
"pkg_import.md": "pkg import",
"destroy.md": "destroy",
"unprotect.md": "unprotect",
"cache.md": "cache",
Empty file.
3 changes: 2 additions & 1 deletion static/docs/commands-reference/cache_dir.md
@@ -33,7 +33,8 @@ file, as they are expected in the config file.
specify private config options in your config, that you don't want to track
and share through Git.

- `-u`, `--unset` - remove a specified config option from a config file.
- `-u`, `--unset` - remove the `cache.dir` config option from the config file.
Don't provide a `value` when using this flag.

- `-h`, `--help` - prints the usage/help message, and exit.

8 changes: 7 additions & 1 deletion static/docs/commands-reference/config.md
@@ -15,7 +15,7 @@ positional arguments:

You can query/set/replace/unset DVC configuration options with this command. It
takes a config option `name` (a section and a key, separated by a dot) and its
`value`.
`value` (any valid alpha-numeric string generally).

This command reads and overwrites the DVC config file `.dvc/config`. If
`--local` option is specified, `.dvc/config.local` is modified instead.
@@ -168,6 +168,12 @@ more about the state file that is used for optimization.
so that when it needs to cleanup the database it could sort them by the
timestamp and remove the oldest ones. Default quota is set to 50(percent).

### pkg

These are sections in the config file that describe specific
[DVC packages](/doc/commands-reference/pkg). Each section contains the `url`
and `_cwd` keys, whose values are used internally by the `dvc pkg` commands.

## Examples: Core config options

Set the `dvc` log level to `debug`:
Empty file.
Empty file.
44 changes: 44 additions & 0 deletions static/docs/commands-reference/pkg.md
@@ -0,0 +1,44 @@
# pkg

A set of commands to manage DVC packages:
[install](/doc/commands-reference/pkg-install),
[uninstall](/doc/commands-reference/pkg-uninstall),
[add](/doc/commands-reference/pkg-add),
[remove](/doc/commands-reference/pkg-remove),
[modify](/doc/commands-reference/pkg-modify),
[list](/doc/commands-reference/pkg-list), and
[import](/doc/commands-reference/pkg-import).

## Synopsis

```usage
usage: dvc pkg [-h | -q | -v]
               {install,uninstall,add,remove,modify,list,import} ...

positional arguments:
  {install,uninstall,add,remove,modify,list,import}
                        Use dvc pkg CMD --help for command-specific help.
    install             Install package(s).
    uninstall           Uninstall package(s).
    add                 Add package.
    remove              Remove package.
    modify              Modify package.
    list                List packages.
    import              Import data from package.
```

## Description

Manage DVC packages. See `dvc pkg install`.

Any DVC project can be used as a DVC package in order to reuse its code, stages,
and related data artifacts in the current project workspace.

## Options

- `-h`, `--help` - prints the usage/help message, and exits.

- `-q`, `--quiet` - does not write anything to standard output. Exit with 0 if
no problems arise, otherwise 1.

- `-v`, `--verbose` - displays detailed tracing information.
82 changes: 82 additions & 0 deletions static/docs/commands-reference/pkg_add.md
@@ -0,0 +1,82 @@
# pkg add

Add a DVC package. The package is registered in the DVC project configuration.

See also [install](/doc/commands-reference/pkg-install),
[uninstall](/doc/commands-reference/pkg-uninstall),
[remove](/doc/commands-reference/pkg-remove),
[modify](/doc/commands-reference/pkg-modify),
[list](/doc/commands-reference/pkg-list), and
[import](/doc/commands-reference/pkg-import).

## Synopsis

```usage
usage: dvc pkg add [-h] [--global] [--system] [--local] [-q | -v] [-f]
                   [name] url

positional arguments:
  name        Package name.
  url         Package URL.
```

## Description

Any DVC project can be used as a DVC package in order to reuse its code, stages,
and related data artifacts in the current project workspace.

A valid `url` should be either an HTTP or SSH Git repository address, such as
`https://github.com/iterative/example-get-started` or
`git@github.com:iterative/example-get-started.git` respectively (both correspond
to our sample [get started](/doc/get-started) DVC project). Note that
`dvc pkg add` does NOT validate the URL at this point. However, nonexistent or
unreachable addresses will cause the package
[install](/doc/commands-reference/pkg-install) and
[import](/doc/commands-reference/pkg-import) commands to fail.

A `name` identifies the package configuration in the DVC project. It can be any
valid continuous alpha-numeric string like `my-pkg_name`. The `name` argument is
optional: when omitted, the name is extracted from the `url` path. If the name
is already registered (check with `dvc pkg list`), the package `url` is
overwritten.
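
As a rough illustration of how an implicit name could be derived from the `url`
path, here is a minimal Python sketch. The function name `implicit_pkg_name` and
the exact rules (take the last path component, drop a trailing `.git`) are
assumptions made for this example; the actual logic used by `dvc pkg add` may
differ.

```python
import posixpath
from urllib.parse import urlparse


def implicit_pkg_name(url: str) -> str:
    """Guess a package name from a Git URL (hypothetical helper, not DVC's API)."""
    if "://" not in url and ":" in url:
        # scp-like SSH address (user@host:path): the path follows the first colon.
        path = url.split(":", 1)[1]
    else:
        path = urlparse(url).path
    # Last path component, ignoring any trailing slash.
    name = posixpath.basename(path.rstrip("/"))
    # Assumption: a trailing ".git" is not part of the package name.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name


print(implicit_pkg_name("https://github.com/iterative/example-get-started"))
print(implicit_pkg_name("git@github.com:iterative/example-get-started.git"))
```

Under these assumptions, both sample addresses map to the same name, which
matches the `example-get-started` name used elsewhere in these docs.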

Adding a package registers it in the DVC config file (typically in `.dvc/config`
– see `dvc config`). Note that nothing is downloaded from the package URL. (Use
`dvc pkg install` or `dvc pkg import` to actually get files from the package).

## Options

- `-f`, `--force` - overwrite an existing package with a new `url` value.

- `--global` - modify a global config file (e.g. `~/.config/dvc/config`) instead
of the project's `.dvc/config`.

- `--system` - modify a system config file (e.g. `/etc/dvc.config`) instead of
`.dvc/config`.

- `--local` - modify a local config file instead of `.dvc/config`. It is located
in `.dvc/config.local` and is Git-ignored. This is useful when you need to
specify private config options in your config, that you don't want to track
and share through Git.

- `-h`, `--help` - prints the usage/help message, and exits.

- `-q`, `--quiet` - does not write anything to standard output. Exit with 0 if
no problems arise, otherwise 1.

- `-v`, `--verbose` - displays detailed tracing information.

## Example

```dvc
$ dvc pkg add get-started https://github.com/iterative/example-get-started
```

This appends a `pkg` section like the following to the DVC config file
(typically `.dvc/config`):

```ini
['pkg "get-started"']
url = https://github.com/iterative/example-get-started
_cwd = /Users/username/dvcproject/.dvc
```
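
To inspect such sections by hand, something like the following Python sketch
could be used. It relies only on the standard-library `configparser` module and
on the section layout shown above; `registered_packages` and `SECTION_RE` are
names made up for this example, and this is not how DVC itself reads its config.

```python
import configparser
import re

# Sample config text, copied from the example above.
CONFIG_TEXT = """\
['pkg "get-started"']
url = https://github.com/iterative/example-get-started
_cwd = /Users/username/dvcproject/.dvc
"""

# Matches section names such as 'pkg "get-started"' (outer quotes optional).
SECTION_RE = re.compile(r"""^'?pkg "(?P<name>[^"]+)"'?$""")


def registered_packages(config_text):
    """Return {package name: url} for every pkg section (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    packages = {}
    for section in parser.sections():
        match = SECTION_RE.match(section)
        if match:
            packages[match.group("name")] = parser[section]["url"]
    return packages


print(registered_packages(CONFIG_TEXT))
```

For everyday use, `dvc pkg list` is the intended way to see registered packages;
the sketch only shows how the `name` and `url` relate to the config section.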
104 changes: 104 additions & 0 deletions static/docs/commands-reference/pkg_import.md
@@ -0,0 +1,104 @@
# pkg import

Import a data artifact from a DVC package.

See also [install](/doc/commands-reference/pkg-install),
[uninstall](/doc/commands-reference/pkg-uninstall),
[add](/doc/commands-reference/pkg-add),
[remove](/doc/commands-reference/pkg-remove),
[modify](/doc/commands-reference/pkg-modify), and
[list](/doc/commands-reference/pkg-list).

## Synopsis

```usage
usage: dvc pkg import [-h] [-q | -v] pkg_name pkg_path [out]

positional arguments:
  pkg_name    Package name.
  pkg_path    Path to data in the package.
  out         Destination path to put data to.
```

## Description

Any DVC project can be used as a DVC package in order to reuse its code, stages,
and related data artifacts in the current project workspace.

When importing data from a package, the provided name (`pkg_name`) can be one
previously registered with `dvc pkg add`. The first thing this command does is
install the package in `.dvc/pkg/{pkg_name}`. (See `dvc pkg install` for more
details.)

The provided name (`pkg_name`) may also be the URL of the DVC package (same as
`url` in `dvc pkg add`). In that case, the implicit package name is extracted
from the given address and used for the package installation.

> Note that like with installing, importing data from a package with implicit
> names does NOT add the package to the config file.

`pkg_path` is the path to the data artifact in the package, relative to its
`url` root `/`, such as `scripts/innosetup/dvc.ico` (see [example](#example)
below). A data artifact is any output defined in a DVC-file in the package. Note
that since these data artifacts are controlled by DVC and not by the SCM system
(e.g. Git), they can't be found by browsing the code repository. This command
has to read the package configuration to connect to a remote of that project,
fetch the data file to the local cache, and "check it out" (see `dvc checkout`)
to the current project's workspace.

Data artifacts are placed in the current working directory with the same file
name as the original output from the package. To use a custom path and file name
instead, an optional `out` argument can be used with this command.

Finally, the data import process creates a DVC-file in the same location as the
imported data. It specifies the package dependency for the imported data,
similar to having added the imported data with `dvc add`. This way, `dvc repro`
can reproduce the import operation as a regular stage in this project's
pipeline.

## Options

- `-h`, `--help` - prints the usage/help message, and exits.

- `-q`, `--quiet` - does not write anything to standard output. Exit with 0 if
no problems arise, otherwise 1.

- `-v`, `--verbose` - displays detailed tracing information.

## Example

```dvc
$ mkdir import && cd import
$ dvc pkg import https://github.com/iterative/dvc scripts/innosetup/dvc.ico
Preparing to collect status from https://dvc.org/s3/dvc
...
```

This finds the `dvc.ico` file among the outputs of a stage file in the
https://github.com/iterative/dvc package and imports it into the current working
directory (`import/`).

```dvc
$ ls
dvc.ico dvc.ico.dvc
```

The `dvc.ico.dvc` file contents should look something like:

```yaml
md5: 7aac042f559753a470723d44b2384a61
wdir: .
deps:
  - md5: 90104d9e83cfb825cf45507e90aadd27
    path: scripts/innosetup/dvc.ico
    pkg:
      name: dvc
      url: https://github.com/iterative/dvc
outs:
  - md5: 90104d9e83cfb825cf45507e90aadd27
    path: dvc.ico
    cache: true
    metric: false
    persist: false
```
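
Note that the dependency and the output above carry the same `md5` value. As a
sketch of how the imported file could be checked against that value by hand,
the Python snippet below computes a plain MD5 digest of a file. It assumes the
`md5` field is the MD5 of the raw file contents, which is expected for binary
files such as `dvc.ico` but is an assumption here; `file_md5` and
`matches_dvc_file` are hypothetical helper names.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fobj:
        # Read fixed-size chunks until an empty bytes object signals end of file.
        for chunk in iter(lambda: fobj.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_dvc_file(path, expected_md5):
    """Compare a file's digest with an md5 value copied from a DVC-file."""
    return file_md5(path) == expected_md5.lower()
```

For instance, `matches_dvc_file("dvc.ico", "90104d9e83cfb825cf45507e90aadd27")`
would compare the imported file against the `outs` entry shown above.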
94 changes: 94 additions & 0 deletions static/docs/commands-reference/pkg_install.md
@@ -0,0 +1,94 @@
# pkg install

Install DVC package(s).

See also [uninstall](/doc/commands-reference/pkg-uninstall),
[add](/doc/commands-reference/pkg-add),
[remove](/doc/commands-reference/pkg-remove),
[modify](/doc/commands-reference/pkg-modify),
[list](/doc/commands-reference/pkg-list), and
[import](/doc/commands-reference/pkg-import).

## Synopsis

```usage
usage: dvc pkg install [-h] [-q | -v] [targets [targets ...]]

positional arguments:
  targets     Package name.
```

## Description

Any DVC project can be used as a DVC package in order to reuse its code, stages,
and related data artifacts in the current project workspace.

When installing a package, the provided name(s) (`targets`) can be names
previously registered with `dvc pkg add`. A subdirectory of `.dvc/pkg/` is
created for each name, and the corresponding package source files (code and
DVC-files) are placed there. (`.dvc/pkg/` will be added to the `.dvc/.gitignore`
file if needed.)

The provided `targets` may also be URLs of DVC packages (same as `url` in
`dvc pkg add`). In that case, the implicit package name is extracted from the
given address and used for the subdirectory of `.dvc/pkg/`, as explained in the
previous paragraph.

> Note that installing packages with implicit names does NOT add them to the
> config file.
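
The two kinds of targets described above can be sketched in Python as follows.
This is only an illustration under stated assumptions: `plan_install` is a
hypothetical helper, the `registered` mapping stands in for the `pkg` sections
of the config file, and the implicit name is simply taken as the last component
of the URL.

```python
from pathlib import PurePosixPath


def plan_install(targets, registered):
    """Map each target to (name, url, install_dir, is_registered)."""
    plan = []
    for target in targets:
        if target in registered:
            # Target is a name previously registered with `dvc pkg add`.
            name, url, known = target, registered[target], True
        else:
            # Assumption: any other target is a URL; its last path component
            # becomes the implicit name, and nothing is written to the config.
            url = target
            name = target.rstrip("/").rsplit("/", 1)[-1]
            known = False
        install_dir = str(PurePosixPath(".dvc/pkg") / name)
        plan.append((name, url, install_dir, known))
    return plan


registered = {"get-started": "https://github.com/iterative/example-get-started"}
for entry in plan_install(
    ["get-started", "https://github.com/iterative/example-get-started"],
    registered,
):
    print(entry)
```

In this sketch the same repository ends up in two different subdirectories,
`.dvc/pkg/get-started` and `.dvc/pkg/example-get-started`, depending on whether
it was installed by its registered name or by its URL.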

## Options

- `-h`, `--help` - prints the usage/help message, and exits.

- `-q`, `--quiet` - does not write anything to standard output. Exit with 0 if
no problems arise, otherwise 1.

- `-v`, `--verbose` - displays detailed tracing information.

## Examples: With an implicit package name

Given a DVC project at https://github.com/iterative/example-get-started:

```dvc
$ dvc pkg install https://github.com/iterative/example-get-started
...
```

The result is the `example-get-started` package fully installed in the
`.dvc/pkg/` directory.

```dvc
$ tree .dvc/pkg
.dvc/pkg
└── example-get-started
    ├── README.md
    ├── auc.metric
    ├── data
    │   └── data.xml.dvc
    ├── evaluate.dvc
    ├── featurize.dvc
    ├── prepare.dvc
    ├── requirements.txt
    ├── src
    │   ├── evaluate.py
    │   ├── featurization.py
    │   ├── prepare.py
    │   └── train.py
    └── train.dvc
```

## Examples: Having added the package first

Given the same DVC project at https://github.com/iterative/example-get-started
as in the previous example:

```dvc
$ dvc pkg add https://github.com/iterative/example-get-started
$ dvc pkg install example-get-started
...
```

The result is the same as in the previous example, except that the DVC config
file (typically `.dvc/config`) will also contain a
`['pkg "example-get-started"']` section due to the `dvc pkg add` command above.