From e86eb116e822c0997639268f4c49da63ff851ef8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Pawe=C5=82=20Redzy=C5=84ski?=
Date: Fri, 18 Dec 2020 13:22:57 +0100
Subject: [PATCH 1/7] params: document --targets option

---
 content/docs/command-reference/params/diff.md | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/content/docs/command-reference/params/diff.md b/content/docs/command-reference/params/diff.md
index 02a51fa522..422d6c750f 100644
--- a/content/docs/command-reference/params/diff.md
+++ b/content/docs/command-reference/params/diff.md
@@ -7,7 +7,8 @@ commits in the DVC repository, or between a commit and the
 ## Synopsis
 
 ```usage
-usage: dvc params diff [-h] [-q | -v] [--all] [--show-json] [--show-md]
+usage: dvc params diff [-h] [-q | -v] [--targets [ [ ...]]]
+                       [--all] [--show-json] [--show-md]
                        [--no-path] [a_rev] [b_rev]
 
 positional arguments:
@@ -35,6 +36,20 @@ itself does not ascribe any specific meaning for these values.
 
 ## Options
 
+- `--targets` - Limit command scope to these params files. When specifying
+  arguments for `--targets` before `revisions`, you should use `--` after this
+  option's arguments, e.g.:
+
+  ```dvc
+  $ dvc params diff --targets m1.json m2.yaml -- HEAD v1
+  ```
+
+  Alternatively, you can also run the above statement as:
+
+  ```dvc
+  $ dvc params diff HEAD v1 --targets m1.json m2.yaml
+  ```
+
 - `--all` - prints all parameters including not changed.
 
 - `--show-json` - prints the command's output in easily parsable JSON format,

From ca2cb30337ef42196147c632752944b6a070c7be Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Sun, 20 Dec 2020 13:40:07 -0600
Subject: [PATCH 2/7] cmd: fix config file paths

---
 content/docs/command-reference/config.md | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/content/docs/command-reference/config.md b/content/docs/command-reference/config.md
index 44bf7fba48..ce3fc46bc5 100644
--- a/content/docs/command-reference/config.md
+++ b/content/docs/command-reference/config.md
@@ -38,18 +38,21 @@ instead, to set (or override) secrets:
 The `--global` and `--system` flags are also available to set config options for
 multiple projects and users, respectively:
 
-| Flag       | Priority | Mac location                             | Linux location             | Windows location                                          |
-| ---------- | -------- | ---------------------------------------- | -------------------------- | --------------------------------------------------------- |
-| `--global` | 3        | `$HOME/Library/Application\ Support/dvc` | `$HOME/.config/dvc/config` | `%LocalAppData%\iterative\dvc\config`                     |
-| `--system` | 4        | `/Library/Application\ Support/dvc`      | `/etc/dvc/config`          | `%AllUsersProfile%\Application Data\iterative\dvc\config` |
-
-
+
+| Flag       | Priority | Mac location                           | Linux location (typical\*) | Windows location                                          |
+| ---------- | -------- | -------------------------------------- | -------------------------- | --------------------------------------------------------- |
+| `--global` | 3        | `$HOME/Library/Preferences/dvc/config` | `$HOME/.config/dvc/config` | `%LocalAppData%\iterative\dvc\config`                     |
+| `--system` | 4        | `/Library/Preferences/dvc/config`      | `/etc/xdg/dvc/config`      | `%AllUsersProfile%\Application Data\iterative\dvc\config` |
+
+> \* For Linux, the global `dvc/config` may be found in `$XDG_CONFIG_HOME`, and
+> the system-wide one in `$XDG_CONFIG_DIRS[0]`, if those env vars are defined.
+
 ## Command options (flags)
 
 - `-u`, `--unset` - remove a specified config option from a config file.
@@ -64,7 +67,8 @@ multiple projects and users, respectively:
 
 - `--system` - modify a system config file (e.g. `/etc/dvc/config`) instead of
   `.dvc/config`. Useful to apply config options to all the projects (all users)
-  in the machine.
+  in the machine. May require superuser access e.g.
+  `sudo dvc config --system ...` (Linux).
 
 - `-l`, `--list` - lists all defined config values.
 

From 42c61b54aebe4758ff605b1fd5eca9238bd27841 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Pawe=C5=82=20Redzy=C5=84ski?=
Date: Mon, 21 Dec 2020 13:20:14 +0100
Subject: [PATCH 3/7] Update content/docs/command-reference/params/diff.md

Co-authored-by: Jorge Orpinel

---
 content/docs/command-reference/params/diff.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/docs/command-reference/params/diff.md b/content/docs/command-reference/params/diff.md
index 422d6c750f..cf4f531ff0 100644
--- a/content/docs/command-reference/params/diff.md
+++ b/content/docs/command-reference/params/diff.md
@@ -36,7 +36,7 @@ itself does not ascribe any specific meaning for these values.
 
 ## Options
 
-- `--targets` - Limit command scope to these params files. When specifying
+- `--targets ` - limit command scope to these params files. When specifying
   arguments for `--targets` before `revisions`, you should use `--` after this
   option's arguments, e.g.:
 

From 2064cceec6b0aa2c5b50f6306b306f4b0280cb79 Mon Sep 17 00:00:00 2001
From: "restyled-io[bot]" <32688539+restyled-io[bot]@users.noreply.github.com>
Date: Mon, 21 Dec 2020 12:20:58 -0800
Subject: [PATCH 4/7] Restyled by prettier (#2042)

Co-authored-by: Restyled.io

---
 content/docs/command-reference/params/diff.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/content/docs/command-reference/params/diff.md b/content/docs/command-reference/params/diff.md
index cf4f531ff0..cc4316324e 100644
--- a/content/docs/command-reference/params/diff.md
+++ b/content/docs/command-reference/params/diff.md
@@ -36,9 +36,9 @@ itself does not ascribe any specific meaning for these values.
 
 ## Options
 
-- `--targets ` - limit command scope to these params files. When specifying
-  arguments for `--targets` before `revisions`, you should use `--` after this
-  option's arguments, e.g.:
+- `--targets ` - limit command scope to these params files. When
+  specifying arguments for `--targets` before `revisions`, you should use `--`
+  after this option's arguments, e.g.:
 
   ```dvc
   $ dvc params diff --targets m1.json m2.yaml -- HEAD v1

From 365b7680898f752b9e7d7197f40d8be287161698 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Mon, 21 Dec 2020 09:49:52 -0600
Subject: [PATCH 5/7] Revert "Update the cmd entry of the dvc.yaml file to add cmd as list option (#1980)"

This reverts commit 23247d8660b323cf760a55a7ffe1dffcf39f2a3b.
---
 content/docs/command-reference/repro.md | 14 +++++---------
 .../docs/user-guide/dvc-files-and-directories.md | 9 ++-------
 2 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md
index 8257d4cf0e..c571646d45 100644
--- a/content/docs/command-reference/repro.md
+++ b/content/docs/command-reference/repro.md
@@ -48,10 +48,6 @@ commands (`cmd` field of `dvc.yaml`). [Stage](/doc/command-reference/run)
 outputs are deleted from the workspace before executing the stage commands that
 produce them (unless `persist: true` is used in `dvc.yaml`).
 
-For stages with multiple commands (having a list in the `cmd` field), commands
-are run one after the other in the order they are defined. The failure of any
-command will halt the remaining stage execution, and raises an error.
-
 There are a few ways to restrict what will be regenerated by this command: by
 specifying specific reproduction [`targets`](#options), or by using certain
 command [options](#options), such as `--single-item` or `--all-pipelines`.
@@ -185,7 +181,7 @@ up-to-date and only execute the final stage.
 
 - `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if all
   stages are up to date or if all stages are successfully executed, otherwise
-  exit with 1. The commands defined in the stage are free to write output
+  exit with 1. The command defined in the stage is free to write output
   regardless of this flag.
 
 - `-v`, `--verbose` - displays detailed tracing information.
@@ -268,8 +264,8 @@ If we now run `dvc repro`, we should see this:
 ```dvc
 $ dvc repro
 Stage 'filter' didn't change, skipping
-Running stage 'count':
-> python process.py numbers.txt > count.txt
+Running stage 'count' with command:
+ python process.py numbers.txt > count.txt
 Updating lock file 'dvc.lock'
 ```
 
@@ -305,8 +301,8 @@ of only the target (`count`) and following stages (none in this case):
 
 ```dvc
 $ dvc repro --downstream count
-Running stage 'count':
-> python process.py numbers.txt > count.txt
+Running stage 'count' with command:
+ python process.py numbers.txt > count.txt
 Updating lock file 'dvc.lock'
 ```
 
diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md
index 91920dfb9f..5d687dafb4 100644
--- a/content/docs/user-guide/dvc-files-and-directories.md
+++ b/content/docs/user-guide/dvc-files-and-directories.md
@@ -135,9 +135,7 @@ stages:
       - performance.json
   training:
     desc: Training stage description
-    cmd:
-      - pip install -r requirements.txt
-      - python train.py
+    cmd: python train.py
     deps:
       - train.py
       - features
@@ -156,10 +154,7 @@ stages:
 by the user with the `--name` (`-n`) option of `dvc run`. Each stage can contain
 the following fields:
 
-- `cmd` (always present): One or more commands executed by the stage (may
-  contain either a single value, or a list). Commands are executed sequentially
-  until all are finished or until one of them fails (see
-  [`dvc repro`](/doc/command-reference/repro) for details).
+- `cmd` (always present): Executable command defined in this stage
 - `wdir`: Working directory for the stage command to run in (relative to the
   file's location). If this field is not present explicitly, it defaults to `.`
   (the file's location).

From 79ec70b943eb780d6aa04a57b9f6d9080c8dbbfd Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Wed, 23 Dec 2020 21:08:27 -0600
Subject: [PATCH 6/7] docs: misc copy edits

---
 content/docs/start/index.md | 14 ++++++++------
 content/docs/use-cases/data-registries.md | 2 +-
 .../docs/user-guide/dvc-files-and-directories.md | 10 ++++++----
 content/docs/user-guide/how-to/merge-conflicts.md | 5 ++---
 .../docs/user-guide/setup-google-drive-remote.md | 2 +-
 5 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/content/docs/start/index.md b/content/docs/start/index.md
index 060edd3e9a..a31786869e 100644
--- a/content/docs/start/index.md
+++ b/content/docs/start/index.md
@@ -49,13 +49,15 @@ Changes to be committed:
 $ git commit -m "Initialize DVC"
 ```
 
-DVC features can be grouped into functional components. We'll explore them one
-by one in the next few sections:
+Now you're ready to DVC!
 
-- [**Data versioning**](/doc/start/data-versioning) is the base layer of DVC for
-  large files, datasets, and machine learning models. It looks like a regular
-  Git workflow, but without storing large files in the repo (think "Git for
-  data"). Data is stored separately, which allows for efficient sharing.
+DVC's features can be grouped into functional components. We'll explore them one
+by one in the next few pages:
+
+- [**Data versioning**](/doc/start/data-versioning) (try this next) is the base
+  layer of DVC for large files, datasets, and machine learning models. Use a
+  regular Git workflow, but without storing large files in the repo (think "Git
+  for data"). Data is stored separately, which allows for efficient sharing.
 
 - [**Data access**](/doc/start/data-access) shows how to use data artifacts from
   outside of the project and how to import data artifacts from another DVC
diff --git a/content/docs/use-cases/data-registries.md b/content/docs/use-cases/data-registries.md
index 2848f75877..3cdaa4fb2d 100644
--- a/content/docs/use-cases/data-registries.md
+++ b/content/docs/use-cases/data-registries.md
@@ -30,7 +30,7 @@ Advantages of data registries:
   management and optimizes space requirements.
 - **Data as code**: leverage Git workflows such as commits, branching, pull
   requests, reviews, and even CI/CD for your data and models lifecycle. Think
-  "Git for cloud storage", but without ad-hoc conventions.
+  "Git for cloud storage".
 - **Security**: registries can be setup with read-only remote storage (e.g. an
   HTTP server).
 
diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md
index 5d687dafb4..d7760b6272 100644
--- a/content/docs/user-guide/dvc-files-and-directories.md
+++ b/content/docs/user-guide/dvc-files-and-directories.md
@@ -72,14 +72,15 @@ An _output entry_ (`outs`) can have these fields:
   HTTP, S3, or Azure [external outputs](/doc/user-guide/managing-external-data);
   and a special _checksum_ for HDFS and WebHDFS.
 - `size`: Size of the file or directory (sum of all files).
-- `nfiles`: If a directory, number of files inside.
+- `nfiles`: If this output is a directory, the number of files inside
+  (recursive).
 - `cache`: Whether or not this file or directory is cached (`true` by default,
   if not present). See the `--no-commit` option of `dvc add`.
 - `persist`: Whether the output file/dir should remain in place while
   `dvc repro` runs. By default outputs are deleted when `dvc repro` starts (if
   this value is not present).
-- `desc`: User description for this output. This doesn't affect any DVC
-  operations.
+- `desc` (optional): User description for this output. This doesn't affect any
+  DVC operations.
 
 A _dependency entry_ (`deps`) can have these fields:
 
@@ -91,7 +92,8 @@ A _dependency entry_ (`deps`) can have these fields:
   HTTP, S3, or Azure external dependencies; and a special _checksum_ for HDFS
   and WebHDFS. See `dvc import-url` for more information.
 - `size`: Size of the file or directory (sum of all files).
-- `nfiles`: If a directory, number of files inside.
+- `nfiles`: If this dependency is a directory, the number of files inside
+  (recursive).
 - `repo`: This entry is only for external dependencies created with
   `dvc import`, and can contains the following fields:
 
diff --git a/content/docs/user-guide/how-to/merge-conflicts.md b/content/docs/user-guide/how-to/merge-conflicts.md
index 89c003b9ab..dd4408ef15 100644
--- a/content/docs/user-guide/how-to/merge-conflicts.md
+++ b/content/docs/user-guide/how-to/merge-conflicts.md
@@ -36,9 +36,8 @@ stages:
 
 ## `dvc.lock`
 
-There's no need to resolve lock file conflicts manually. You can safely delete
-this file and then use `dvc repro` after merging `dvc.yaml` to regenerate this
-file.
+There's no need to resolve lock file conflicts manually. You can safely
+overwrite this file by using `dvc repro` after merging `dvc.yaml`.
 
 > `dvc commit` can also be a good option, but only for the specific case where
 > the `HEAD` version is chosen.
diff --git a/content/docs/user-guide/setup-google-drive-remote.md b/content/docs/user-guide/setup-google-drive-remote.md
index e86e7961fb..cecff6b626 100644
--- a/content/docs/user-guide/setup-google-drive-remote.md
+++ b/content/docs/user-guide/setup-google-drive-remote.md
@@ -215,7 +215,7 @@ individually.
 
 If you use multiple GDrive remotes, by default they will be sharing the same
 `.dvc/tmp/gdrive-user-credentials.json` file. It can be overridden with the
-`gdrive_user_credentials_file` setting:
+`gdrive_user_credentials_file` parameter:
 
 ```dvc
 $ dvc remote modify myremote gdrive_user_credentials_file \

From 690b55f5f381f5061431afa48c4784a9bfbee10a Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Wed, 23 Dec 2020 21:21:35 -0600
Subject: [PATCH 7/7] cmd: roll back wrong changes to repro

---
 content/docs/command-reference/repro.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md
index 5fe2dd456c..f8b34d2d8c 100644
--- a/content/docs/command-reference/repro.md
+++ b/content/docs/command-reference/repro.md
@@ -186,7 +186,7 @@ up-to-date and only execute the final stage.
 
 - `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if all
   stages are up to date or if all stages are successfully executed, otherwise
-  exit with 1. The command defined in the stage is free to write output
+  exit with 1. The commands defined in the stage are free to write output
   regardless of this flag.
 
 - `-v`, `--verbose` - displays detailed tracing information.
@@ -269,8 +269,8 @@ If we now run `dvc repro`, we should see this:
 ```dvc
 $ dvc repro
 Stage 'filter' didn't change, skipping
-Running stage 'count' with command:
- python process.py numbers.txt > count.txt
+Running stage 'count':
+> python process.py numbers.txt > count.txt
 Updating lock file 'dvc.lock'
 ```
 
@@ -306,8 +306,8 @@ of only the target (`count`) and following stages (none in this case):
 
 ```dvc
 $ dvc repro --downstream count
-Running stage 'count' with command:
- python process.py numbers.txt > count.txt
+Running stage 'count':
+> python process.py numbers.txt > count.txt
 Updating lock file 'dvc.lock'
 ```
 