From 6da4cccc538574b6202033f6a9746fd775c629ee Mon Sep 17 00:00:00 2001
From: Saugat Pachhai
Date: Tue, 24 Mar 2020 19:25:45 +0545
Subject: [PATCH 1/8] cmd-ref: document checkout summary flag

---
 content/docs/command-reference/checkout.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index 1b1d02e9cc..197f0aea0c 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -6,7 +6,7 @@ DVC-files.
 ## Synopsis
 
 ```usage
-usage: dvc checkout [-h] [-q | -v] [-d] [-R] [-f] [--relink]
+usage: dvc checkout [-h] [-q | -v] [--summary] [-d] [-R] [-f] [--relink]
                     [targets [targets ...]]
 
 positional arguments:
@@ -58,9 +58,8 @@ restoring any file size will be almost instantaneous.
 > `cache.slow_link_warning` config option to `false` with `dvc config cache`.
 
 This command will fail to checkout files that are missing from the cache. In
-such a case, `dvc checkout` prints a warning message. It also lists removed
-files. Any files that can be checked out without error will be restored without
-being reported individually.
+such a case, `dvc checkout` prints a warning message. It also displays a list of
+changes made by the `checkout`.
 
 There are two methods to restore a file missing from the cache, depending on the
 situation. In some cases a pipeline must be reproduced (using `dvc repro`) to
@@ -69,6 +68,8 @@ be pulled from remote storage using `dvc pull`.
 
 ## Options
 
+- `--summary` - displays summary of the changes.
+
 - `-d`, `--with-deps` - determines files to update by tracking dependencies to
   the target DVC-files (stages). If no `targets` are provided, this option is
   ignored.
By traversing all stage dependencies, DVC searches backward from the

From 4f050b9698ef1d2ac2ae8ae952de63d5c0b15f0c Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Mon, 6 Apr 2020 08:47:33 -0600
Subject: [PATCH 2/8] Update content/docs/command-reference/checkout.md

---
 content/docs/command-reference/checkout.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index 197f0aea0c..0ceb3d205c 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -58,8 +58,8 @@ restoring any file size will be almost instantaneous.
 > `cache.slow_link_warning` config option to `false` with `dvc config cache`.
 
 This command will fail to checkout files that are missing from the cache. In
-such a case, `dvc checkout` prints a warning message. It also displays a list of
-changes made by the `checkout`.
+such a case, `dvc checkout` prints a warning message. It also lists the
+partial progress made by the checkout.
 
 There are two methods to restore a file missing from the cache, depending on the
 situation. In some cases a pipeline must be reproduced (using `dvc repro`) to

From a1d87411a2fef2959a50020dd6a6f10310525948 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Mon, 6 Apr 2020 08:47:47 -0600
Subject: [PATCH 3/8] Update content/docs/command-reference/checkout.md

---
 content/docs/command-reference/checkout.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index 0ceb3d205c..689baca2c9 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -68,7 +68,8 @@ be pulled from remote storage using `dvc pull`.
 
 ## Options
 
-- `--summary` - displays summary of the changes.
+- `--summary` - displays summary of the changes done by this command in the
+  workspace.
 
 - `-d`, `--with-deps` - determines files to update by tracking dependencies to
   the target DVC-files (stages). If no `targets` are provided, this option is

From a131ca7d584623f1363c36a9421c828e417c74cd Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Thu, 9 Apr 2020 00:28:11 -0500
Subject: [PATCH 4/8] cmd ref: update checkout example

---
 content/docs/command-reference/checkout.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index 891f4eece1..c985e40d96 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -58,8 +58,8 @@ restoring any file size will be almost instantaneous.
 > `cache.slow_link_warning` config option to `false` with `dvc config cache`.
 
 This command will fail to checkout files that are missing from the cache. In
-such a case, `dvc checkout` prints a warning message. It also lists the
-partial progress made by the checkout.
+such a case, `dvc checkout` prints a warning message. It also lists the partial
+progress made by the checkout.
 
 There are two methods to restore a file missing from the cache, depending on the
 situation. In some cases a pipeline must be reproduced (using `dvc repro`) to
@@ -149,7 +149,7 @@ This project comes with a predefined HTTP
 [remote storage](/doc/command-reference/remote). We can now just run `dvc pull`
 that will fetch and checkout the most recent `model.pkl`, `data.xml`, and other
 files that are tracked by DVC.
The model file hash
-`3863d0e317dee0a55c4e59d2ec0eef33` will be used in the `train.dvc`
+`662eb7f64216d9c2c1088d0a5e2c6951` will be used in the `train.dvc`
 [stage file](/doc/command-reference/run):
 
 ```dvc
@@ -190,6 +190,8 @@ doesn't track those files; DVC does, so we must do this:
 ```dvc
 $ dvc fetch
 $ dvc checkout
+M model.pkl
+M data\features\
 
 $ md5 model.pkl
 MD5 (model.pkl) = 43630cce66a2432dcecddc9dd006d0a7

From b90f6136523df5d0140f8c398320d7532148621f Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Fri, 17 Apr 2020 23:26:55 -0500
Subject: [PATCH 5/8] cmd ref: improve desc. of checkout --summary option

per https://github.com/iterative/dvc.org/pull/1054#pullrequestreview-391199946
---
 content/docs/command-reference/checkout.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index c985e40d96..f2e22ea25e 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -68,8 +68,9 @@ be pulled from remote storage using `dvc pull`.
 
 ## Options
 
-- `--summary` - displays summary of the changes done by this command in the
-  workspace.
+- `--summary` - in addition to checking out DVC-tracked data, display a short
+  summary of the changes done by this command in the workspace, for example how
+  many files were added or deleted.
 
 - `-d`, `--with-deps` - determines files to update by tracking dependencies to
   the target DVC-files (stages). If no `targets` are provided, this option is

From 5d29f1d79f5ca64a1f0e69a92ca33955298b3a75 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Wed, 22 Apr 2020 16:42:55 -0500
Subject: [PATCH 6/8] cmd ref: clarify about checkout output and --summary option desc.
per https://github.com/iterative/dvc.org/pull/1054#pullrequestreview-395924744
---
 content/docs/command-reference/checkout.md | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index f2e22ea25e..1a9cea33f8 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -37,9 +37,9 @@ The execution of `dvc checkout` does the following:
   DVC-files. Scanning is limited to the given `targets` (if any). See also
   options `--with-deps` and `--recursive` below.
 
-- Missing data files or directories, or those that don't match with any
-  DVC-file, are restored from the cache. See options `--force` and
-  `--relink`.
+- Missing data files or directories are restored from the cache.
+  Those that don't match with any DVC-file are removed. See options `--force`
+  and `--relink`. A list of the changes done is printed.
 
 By default, this command tries not make copies of cached files in the workspace,
 using reflinks instead when supported by the file system (refer to
@@ -63,14 +63,13 @@ progress made by the checkout.
 
 There are two methods to restore a file missing from the cache, depending on the
 situation. In some cases a pipeline must be reproduced (using `dvc repro`) to
-regenerate its outputs. (See also `dvc pipeline`.) In other cases the cache can
+regenerate its outputs (see also `dvc pipeline`). In other cases the cache can
 be pulled from remote storage using `dvc pull`.
 
 ## Options
 
-- `--summary` - in addition to checking out DVC-tracked data, display a short
-  summary of the changes done by this command in the workspace, for example how
-  many files were added or deleted.
+- `--summary` - display a short summary of the changes done by this command in
+  the workspace, instead of a full list of change.
 
 - `-d`, `--with-deps` - determines files to update by tracking dependencies to
   the target DVC-files (stages).
If no `targets` are provided, this option is

From a8199278b1c156476fad6fd71317659cfb3001ba Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Wed, 22 Apr 2020 17:00:49 -0500
Subject: [PATCH 7/8] use cases: add checkout output

per https://github.com/iterative/dvc.org/pull/1054#issuecomment-611806881
---
 content/docs/use-cases/shared-development-server.md | 3 ++-
 content/docs/use-cases/versioning-data-and-model-files.md | 7 +++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/content/docs/use-cases/shared-development-server.md b/content/docs/use-cases/shared-development-server.md
index f4f061bcf2..4058c808b6 100644
--- a/content/docs/use-cases/shared-development-server.md
+++ b/content/docs/use-cases/shared-development-server.md
@@ -96,7 +96,7 @@ manually. After this, they could decide to continue building this
 ```dvc
 $ git pull
 $ dvc checkout
- # Data is linked from cache to workspace.
+A raw # Data is linked from cache to workspace.
 $ dvc run -d clean -o processed ./process.py clean process
 $ git add processed.dvc
 $ git commit -m "process clean data"
@@ -108,4 +108,5 @@ And now you can just as easily make their work appear in your workspace with:
 ```dvc
 $ git pull
 $ dvc checkout
+A processed
 ```
diff --git a/content/docs/use-cases/versioning-data-and-model-files.md b/content/docs/use-cases/versioning-data-and-model-files.md
index dbf45e37ce..01d160104b 100644
--- a/content/docs/use-cases/versioning-data-and-model-files.md
+++ b/content/docs/use-cases/versioning-data-and-model-files.md
@@ -92,6 +92,8 @@ file. Let's consider the full checkout first.
It's quite straightforward:
 ```dvc
 $ git checkout v1.0
 $ dvc checkout
+M images
+M model.pkl
 ```
 
 These commands will restore the workspace to the first snapshot we made - code,
@@ -105,8 +107,9 @@ the previous dataset only, we can do something like this (make sure that you
 don't have uncommitted changes in the `data.dvc`):
 
 ```dvc
-$ git checkout v1.0 data.dvc
-$ dvc checkout data.dvc
+$ git checkout v1.0 images.dvc
+$ dvc checkout images.dvc
+M images
 ```
 
 If you run `git status` you will see that `data.dvc` is modified and currently

From 5a146f173a77a79c633ae470afbbd0b29280c5d9 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel
Date: Wed, 22 Apr 2020 17:04:50 -0500
Subject: [PATCH 8/8] Update content/docs/command-reference/checkout.md

---
 content/docs/command-reference/checkout.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index 677068f9ea..2748775b9f 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -69,7 +69,7 @@ be pulled from remote storage using `dvc pull`.
 ## Options
 
 - `--summary` - display a short summary of the changes done by this command in
-  the workspace, instead of a full list of change.
+  the workspace, instead of a full list of changes.
 
 - `-R`, `--recursive` - determines the files to checkout by searching each
   target directory and its subdirectories for DVC-files to inspect. If there are