From 5e4359158ed244611070a9c794cdbf2c589c6815 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Fri, 8 Oct 2021 20:40:51 -0400 Subject: [PATCH 01/39] guide: add DVC Experiments page and links + some copy edits --- content/docs/api-reference/make_checkpoint.md | 10 +- content/docs/command-reference/exp/apply.md | 2 +- content/docs/command-reference/exp/branch.md | 6 +- content/docs/command-reference/exp/run.md | 16 +-- content/docs/sidebar.json | 3 +- .../cleaning-experiments.md | 5 +- .../experiment-management/dvc-experiments.md | 36 +++++ .../user-guide/experiment-management/index.md | 133 +++++++----------- .../running-experiments.md | 3 +- 9 files changed, 112 insertions(+), 102 deletions(-) create mode 100644 content/docs/user-guide/experiment-management/dvc-experiments.md diff --git a/content/docs/api-reference/make_checkpoint.md b/content/docs/api-reference/make_checkpoint.md index 0d9132d7e4..4d19241b0d 100644 --- a/content/docs/api-reference/make_checkpoint.md +++ b/content/docs/api-reference/make_checkpoint.md @@ -1,7 +1,6 @@ # dvc.api.make_checkpoint() -Make an -[in-code checkpoint](/doc/user-guide/experiment-management#checkpoints-in-source-code). +Make an [in-code checkpoint](/doc/user-guide/experiment-management/checkpoints). ```py def make_checkpoint() @@ -59,9 +58,7 @@ stages: The code in `iterate.py` will execute continuously increment an integer number saved in `int.txt` (starting at 0). At 0 and every 100 loops, it makes a -checkpoint for [DVC experiments]: - -[dvc experiments]: /doc/user-guide/experiment-management#experiments +checkpoint for `dvc experiments`: ```py import os @@ -143,5 +140,4 @@ $ dvc exp show If we use `dvc exp run` again, the process will start from 200 (since that's what the workspace reflects). -See [Experiment Management](/doc/user-guide/experiment-management) for more -details on managing experiments. +See `dvc experiments` for more details on managing experiments. 
diff --git a/content/docs/command-reference/exp/apply.md b/content/docs/command-reference/exp/apply.md index 5ecbcbdf67..d22f1c2afd 100644 --- a/content/docs/command-reference/exp/apply.md +++ b/content/docs/command-reference/exp/apply.md @@ -20,7 +20,7 @@ can be referenced by name or hash (see `dvc exp run` for details). This is typically used after choosing a target `experiment` with `dvc exp show` or `dvc exp diff`, and before committing it to Git (making it -[persistent](/doc/user-guide/experiment-management#persistent-experiments)). +[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments)). `dvc exp apply` changes any files (code, data, parameters, metrics, etc.) needed to reflect the experiment conditions and diff --git a/content/docs/command-reference/exp/branch.md b/content/docs/command-reference/exp/branch.md index a4ac9a4c11..6b837088fd 100644 --- a/content/docs/command-reference/exp/branch.md +++ b/content/docs/command-reference/exp/branch.md @@ -18,9 +18,9 @@ positional arguments: Makes a named Git [`branch`](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging) containing the target `experiment` (making it -[persistent](/doc/user-guide/experiment-management#persistent-experiments)). For -[checkpoint experiments](/doc/command-reference/exp/run#checkpoints), the new -branch will contain multiple commits (the checkpoints). +[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments)). +For [checkpoint experiments](/doc/command-reference/exp/run#checkpoints), the +new branch will contain multiple commits (the checkpoints). The new `branch` will be based on the experiment's parent commit (`HEAD` at the time that the experiment was run). 
Note that DVC **does not** switch into the diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md index de2277c076..5a86b9b656 100644 --- a/content/docs/command-reference/exp/run.md +++ b/content/docs/command-reference/exp/run.md @@ -18,14 +18,14 @@ positional arguments: ## Description -Provides a way to execute and track experiments in your +Provides a way to execute and track `dvc experiments` in your project without polluting it with unnecessary commits, branches, directories, etc. -> `dvc exp run` is equivalent to `dvc repro` for experiments. It has the same -> behavior when it comes to `targets` and stage execution (restores the -> dependency graph, etc.). See the command [options](#options) for more on the -> differences. +> `dvc exp run` is equivalent to `dvc repro` for experiments. It +> has the same behavior when it comes to `targets` and stage execution (restores +> the dependency graph, etc.). See the command [options](#options) for more on +> the differences. Before running an experiment, you'll probably want to make modifications such as data and code updates, or hyperparameter tuning. For the latter, @@ -44,7 +44,7 @@ option. Experiments are custom [Git references](https://git-scm.com/book/en/v2/Git-Internals-Git-References) (found in `.git/refs/exps`) with a single commit based on `HEAD` (not checked -out by DVC). Note that these commits are not pushed to the Git remote by default +out by DVC). Note that these commits are not pushed to Git remotes by default (see `dvc exp push`). @@ -55,8 +55,8 @@ and compare multiple experiments, use `dvc exp show` or `dvc exp diff` to restore the results of any other experiment instead. Successful experiments can be made -[persistent](/doc/user-guide/experiment-management#persistent-experiments) by -committing them to the Git repo. 
Unnecessary ones can be removed with +[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments) +by committing them to the Git repo. Unnecessary ones can be removed with `dvc exp remove`or `dvc exp gc` (or abandoned). > Note that experiment data will remain in the cache until you use diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json index 27f553b91e..2a2d2bfcef 100644 --- a/content/docs/sidebar.json +++ b/content/docs/sidebar.json @@ -147,6 +147,7 @@ "slug": "experiment-management", "source": "experiment-management/index.md", "children": [ + "dvc-experiments", "running-experiments", "sharing-experiments", "cleaning-experiments", @@ -245,7 +246,7 @@ "slug": "doctor" }, { - "label": "exp", + "label": "experiments", "slug": "exp", "source": "exp/index.md", "children": [ diff --git a/content/docs/user-guide/experiment-management/cleaning-experiments.md b/content/docs/user-guide/experiment-management/cleaning-experiments.md index bd48c09f97..18b4d94ba3 100644 --- a/content/docs/user-guide/experiment-management/cleaning-experiments.md +++ b/content/docs/user-guide/experiment-management/cleaning-experiments.md @@ -2,7 +2,10 @@ Although DVC uses minimal resources to keep track of the experiments, they may clutter tables and the workspace. DVC allows to remove specific experiments from -the workspace or delete all not-yet-persisted experiments at once. +the workspace or delete all not-yet-[persisted] experiments at once. 
+ +[persisted]: + /doc/user-guide/experiment-management/dvc-experiments#persistent-experiments ## Removing specific experiments diff --git a/content/docs/user-guide/experiment-management/dvc-experiments.md b/content/docs/user-guide/experiment-management/dvc-experiments.md new file mode 100644 index 0000000000..45f456acb8 --- /dev/null +++ b/content/docs/user-guide/experiment-management/dvc-experiments.md @@ -0,0 +1,36 @@ +## DVC Experiments + +_New in DVC 2.0_ + +`dvc exp` commands let you automatically track a variation to an established +[data pipeline](/doc/command-reference/dag) baseline. You can create multiple +isolated experiments this way, as well as review, compare, and restore them +later, or roll back to the baseline. The basic workflow goes like this: + +- Modify stage parameters or other dependencies (e.g. input data, + source code) of committed stages. +- [Run experiments] with `dvc exp run` (instead of `repro`) to execute the + pipeline. The results are reflected in your workspace, and + tracked automatically. +- Use `dvc metrics` to identify the best experiment(s). +- Visualize, compare experiments with `dvc exp show` or `dvc exp diff`. Repeat + 🔄 +- Use `dvc exp apply` to roll back to the best one. +- Make the selected experiment persistent by committing its results to Git. This + cleans the slate so you can repeat the process. + +[run experiments]: /doc/user-guide/experiment-management/running-experiments + +## Persistent Experiments + +When your experiments are good enough to save or share, you may want to store +them persistently as Git commits in your repository. + +Whether the results were produced with `dvc repro` directly, or after a +`dvc exp` workflow, `dvc.yaml` and `dvc.lock` will define the experiment as a +new project version. The right outputs (including +[metrics](/doc/command-reference/metrics)) should also be present, or available +via `dvc checkout`. 
+ +Use `dvc exp apply` and `dvc exp branch` to persist experiments in your Git +history. diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index 7841b5a8ab..3203494af9 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -1,89 +1,71 @@ # Experiment Management -_New in DVC 2.0_ - Data science and ML are iterative processes that require a large number of attempts to reach a certain level of a metric. Experimentation is part of the development of data features, hyperspace exploration, deep learning -optimization, etc. DVC helps you codify and manage all of your -experiments, supporting these main approaches: - -1. Create [experiments](#experiments) that derive from your latest project - version without having to track them manually. DVC does that automatically, - letting you list and compare them. The best ones can be made persistent, and - the rest archived. -2. Place in-code [checkpoints](#checkpoints-in-source-code) that mark a series - of variations, forming a deep experiment. DVC helps you capture them at - runtime, and manage them in batches. -3. Make experiments or checkpoints [persistent](#persistent-experiments) by - committing them to your repository. Or create these versions - from scratch like typical project changes. - - At this point you may also want to consider the different - [ways to organize](#organization-patterns) experiments in your project (as - Git branches, as folders, etc.). - -DVC also provides specialized features to codify and analyze experiments. +optimization, etc. + +Some of DVC's base features already help you codify and analyze experiments. [Parameters](/doc/command-reference/params) are simple values you can tweak in a -human-readable text file, which cause different behaviors in your code and -models. 
On the other end, [metrics](/doc/command-reference/metrics) (and +formatted text file; they cause different behaviors in your code and models. On +the other end, [metrics](/doc/command-reference/metrics) (and [plots](/doc/command-reference/plots)) let you define, visualize, and compare -meaningful measures for the experimental results. - -> 👨‍💻 See [Get Started: Experiments](/doc/start/experiments) for a hands-on -> introduction to DVC experiments. +quantitative measures of your results. -## Experiments +
-`dvc exp` commands let you automatically track a variation to an established -[data pipeline](/doc/command-reference/dag). You can create multiple isolated -experiments this way, as well as review, compare, and restore them later, or -roll back to the baseline. The basic workflow goes like this: +## 💡 Run Cache: Automatic Log of Stage Runs -- Modify stage parameters or other dependencies (e.g. input data, - source code) of committed stages. -- Use `dvc exp run` (instead of `repro`) to execute the pipeline. The results - are reflected in your workspace, and tracked automatically. -- Use [metrics](/doc/command-reference/metrics) to identify the best - experiment(s). -- Visualize, compare experiments with `dvc exp show` or `dvc exp diff`. Repeat - 🔄 -- Use `dvc exp apply` to roll back to the best one. -- Make the selected experiment persistent by committing its results to Git. This - cleans the slate so you can repeat the process. - -## Checkpoints in source code +Every time you [reproduce](/doc/command-reference/repro) a pipeline with DVC, it +logs the unique signature of each stage run (in `.dvc/cache/runs` by default). +If it never happened before, the stage command(s) are executed normally. Every +subsequent time a [stage](/doc/command-reference/run) runs under the same +conditions, the previous results can be restored instantly, without wasting time +or computing resources. -To track successive steps in a longer experiment, you can register checkpoints -from your code at runtime. This allows you, for example, to track the progress -in deep learning techniques such as evolving neural networks. +✅ This built-in feature is called run-cache and it can +dramatically improve performance. It's enabled out-of-the-box (can be disabled), +which means DVC is already saving all of your tests and experiments behind the +scene. But there's no easy way to explore it. 
-This kind of experiments track a series of variations (the checkpoints) and its -execution can be stopped and resumed as needed. You interact with them using -`dvc exp run` and its `--rev`, `--reset` options (see also the `checkpoint` -field in `dvc.yaml` `outs`). +
-> 📖 To learn more, see the dedicated -> [Checkpoints](/doc/user-guide/experiment-management/checkpoints) guide. +## DVC Experiments -## Persistent experiments +_New in DVC 2.0_ -When your experiments are good enough to save or share, you may want to store -them persistently as Git commits in your repository. +The `dvc experiments` features are designed to support these main approaches: + +1. [Run] and capture [experiments] that derive from your latest project version + without polluting your Git history. DVC tracks them for you, letting you list + and compare them. The best ones can be made persistent, and the rest left as + history or cleared. +1. [Queue] and process series of experiments based on a parameter search or + other modifications to your baseline. +1. Generate [checkpoints] during your code execution to analyze the internal + progress of deep experiments. DVC captures them at runtime, and can manage + them in batches. +1. Make experiments [persistent] by committing them to your + repository history. + +[run]: /doc/user-guide/experiment-management/running-experiments +[experiments]: /doc/user-guide/experiment-management/dvc-experiments +[queue]: + /doc/user-guide/experiment-management/running-experiments#the-experiments-queue +[checkpoints]: /doc/user-guide/experiment-management/checkpoints +[persistent]: + /doc/user-guide/experiment-management/dvc-experiments#persistent-experiments + +📖 More information in the +[full guide](/doc/user-guide/experiment-management/dvc-experiments). -Whether the results were produced with `dvc repro` directly, or after a -`dvc exp` workflow (refer to previous sections), the `dvc.yaml` and `dvc.lock` -pair in the workspace will codify the experiment as a new project -version. The right outputs (including -[metrics](/doc/command-reference/metrics)) should also be present, or available -via `dvc checkout`. +> 👨‍💻 See [Get Started: Experiments](/doc/start/experiments) for a hands-on +> introduction to DVC experiments. 
-### Organization patterns +### Organization Patterns -DVC takes care of arranging `dvc exp` experiments and the data -cache under the hood. But when it comes to full-blown persistent -experiments, it's up to you to decide how to organize them in your project. -These are the main alternatives: +It's up to you to decide how to organize completed experiments. These are the +main alternatives: - **Git tags and branches** - use the repo's "time dimension" to distribute your experiments. This makes the most sense for experiments that build on each @@ -99,15 +81,6 @@ These are the main alternatives: Completely independent experiments live in separate directories, while their progress can be found in different branches. -## Automatic log of stage runs (run-cache) - -Every time you `dvc repro` pipelines or `dvc exp run` experiments, DVC logs the -unique signature of each stage run (to `.dvc/cache/runs` by default). If it -never happened before, the stage command(s) are executed normally. Every -subsequent time a [stage](/doc/command-reference/run) runs under the same -conditions, the previous results can be restored instantly, without wasting time -or computing resources. - -✅ This built-in feature is called run-cache and it can -dramatically improve performance. It's enabled out-of-the-box (but can be -disabled with the `--no-run-cache` command option). +DVC takes care of arranging `dvc exp` experiments and the data +cache under the hood so there's no need to decide on the above +until your `dvc experiments` are made [persistent]. 
diff --git a/content/docs/user-guide/experiment-management/running-experiments.md b/content/docs/user-guide/experiment-management/running-experiments.md index 6f3a3dc51c..dac1ae0450 100644 --- a/content/docs/user-guide/experiment-management/running-experiments.md +++ b/content/docs/user-guide/experiment-management/running-experiments.md @@ -226,7 +226,8 @@ Note that Git-ignored files/dirs are explicitly excluded from queued/temp runs to avoid committing unwanted files into Git (e.g. once successful experiments are [persisted]). -[persisted]: /doc/user-guide/experiment-management#persistent-experiments +[persisted]: + /doc/user-guide/experiment-management/dvc-experiments#persistent-experiments > 💡 To include untracked files, stage them with `git add` first (before > `dvc exp run`) and `git reset` them afterwards. From 6b7300afe7b78e903027161c237804cb27f27b09 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Sat, 9 Oct 2021 21:54:04 -0400 Subject: [PATCH 02/39] guide: remove checkpoint related changes --- content/docs/api-reference/make_checkpoint.md | 10 +++++++--- content/docs/sidebar.json | 2 +- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/content/docs/api-reference/make_checkpoint.md b/content/docs/api-reference/make_checkpoint.md index 4d19241b0d..0d9132d7e4 100644 --- a/content/docs/api-reference/make_checkpoint.md +++ b/content/docs/api-reference/make_checkpoint.md @@ -1,6 +1,7 @@ # dvc.api.make_checkpoint() -Make an [in-code checkpoint](/doc/user-guide/experiment-management/checkpoints). +Make an +[in-code checkpoint](/doc/user-guide/experiment-management#checkpoints-in-source-code). ```py def make_checkpoint() @@ -58,7 +59,9 @@ stages: The code in `iterate.py` will execute continuously increment an integer number saved in `int.txt` (starting at 0). 
At 0 and every 100 loops, it makes a -checkpoint for `dvc experiments`: +checkpoint for [DVC experiments]: + +[dvc experiments]: /doc/user-guide/experiment-management#experiments ```py import os @@ -140,4 +143,5 @@ $ dvc exp show If we use `dvc exp run` again, the process will start from 200 (since that's what the workspace reflects). -See `dvc experiments` for more details on managing experiments. +See [Experiment Management](/doc/user-guide/experiment-management) for more +details on managing experiments. diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json index 2a2d2bfcef..455e31ecab 100644 --- a/content/docs/sidebar.json +++ b/content/docs/sidebar.json @@ -246,7 +246,7 @@ "slug": "doctor" }, { - "label": "experiments", + "label": "exp", "slug": "exp", "source": "exp/index.md", "children": [ From 6027e155dc7da88636181537873b302852216a32 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Sat, 9 Oct 2021 21:57:46 -0400 Subject: [PATCH 03/39] guide: remove `dvc experiments` long cmd autolinks per https://github.com/iterative/dvc.org/pull/2901 --- content/docs/command-reference/exp/run.md | 2 +- content/docs/user-guide/experiment-management/index.md | 5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md index 5a86b9b656..ef02b21be1 100644 --- a/content/docs/command-reference/exp/run.md +++ b/content/docs/command-reference/exp/run.md @@ -18,7 +18,7 @@ positional arguments: ## Description -Provides a way to execute and track `dvc experiments` in your +Provides a way to execute and track experiments in your project without polluting it with unnecessary commits, branches, directories, etc. 
diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index 3203494af9..866f2e2dc3 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -34,7 +34,8 @@ scene. But there's no easy way to explore it. _New in DVC 2.0_ -The `dvc experiments` features are designed to support these main approaches: +DVC experiment management features are designed to support these main +approaches: 1. [Run] and capture [experiments] that derive from your latest project version without polluting your Git history. DVC tracks them for you, letting you list @@ -83,4 +84,4 @@ main alternatives: DVC takes care of arranging `dvc exp` experiments and the data cache under the hood so there's no need to decide on the above -until your `dvc experiments` are made [persistent]. +until your experiments are made [persistent]. From 8f048993f62693bfe5ef56099d181f468457bc6e Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 11 Oct 2021 15:26:58 -0400 Subject: [PATCH 04/39] guide: move run-cache section back to Exp Mgmt index bottom per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-775621333 --- .../user-guide/experiment-management/index.md | 32 ++++++++----------- 1 file changed, 14 insertions(+), 18 deletions(-) diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index 866f2e2dc3..dff85f8487 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -12,24 +12,6 @@ the other end, [metrics](/doc/command-reference/metrics) (and [plots](/doc/command-reference/plots)) let you define, visualize, and compare quantitative measures of your results. -
- -## 💡 Run Cache: Automatic Log of Stage Runs - -Every time you [reproduce](/doc/command-reference/repro) a pipeline with DVC, it -logs the unique signature of each stage run (in `.dvc/cache/runs` by default). -If it never happened before, the stage command(s) are executed normally. Every -subsequent time a [stage](/doc/command-reference/run) runs under the same -conditions, the previous results can be restored instantly, without wasting time -or computing resources. - -✅ This built-in feature is called run-cache and it can -dramatically improve performance. It's enabled out-of-the-box (can be disabled), -which means DVC is already saving all of your tests and experiments behind the -scene. But there's no easy way to explore it. - -
- ## DVC Experiments _New in DVC 2.0_ @@ -85,3 +67,17 @@ main alternatives: DVC takes care of arranging `dvc exp` experiments and the data cache under the hood so there's no need to decide on the above until your experiments are made [persistent]. + +## Run Cache: Automatic Log of Stage Runs + +Every time you [reproduce](/doc/command-reference/repro) a pipeline with DVC, it +logs the unique signature of each stage run (in `.dvc/cache/runs` by default). +If it never happened before, the stage command(s) are executed normally. Every +subsequent time a [stage](/doc/command-reference/run) runs under the same +conditions, the previous results can be restored instantly, without wasting time +or computing resources. + +✅ This built-in feature is called run-cache and it can +dramatically improve performance. It's enabled out-of-the-box (can be disabled), +which means DVC is already saving all of your tests and experiments behind the +scene. But there's no easy way to explore it. From 0c2bcf59a2cedf9b6dae6a678c929e9232248265 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 1 Nov 2021 17:53:39 -0600 Subject: [PATCH 05/39] guide: Exp Mgmt/ DVC Exps -> Exps Overview per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-775621268 --- content/docs/command-reference/exp/apply.md | 2 +- content/docs/command-reference/exp/branch.md | 2 +- content/docs/command-reference/exp/run.md | 2 +- content/docs/sidebar.json | 2 +- .../experiment-management/cleaning-experiments.md | 2 +- .../{dvc-experiments.md => experiments-overview.md} | 4 +--- content/docs/user-guide/experiment-management/index.md | 6 +++--- .../user-guide/experiment-management/running-experiments.md | 2 +- 8 files changed, 10 insertions(+), 12 deletions(-) rename content/docs/user-guide/experiment-management/{dvc-experiments.md => experiments-overview.md} (97%) diff --git a/content/docs/command-reference/exp/apply.md b/content/docs/command-reference/exp/apply.md index d22f1c2afd..95f34d4840 100644 --- 
a/content/docs/command-reference/exp/apply.md +++ b/content/docs/command-reference/exp/apply.md @@ -20,7 +20,7 @@ can be referenced by name or hash (see `dvc exp run` for details). This is typically used after choosing a target `experiment` with `dvc exp show` or `dvc exp diff`, and before committing it to Git (making it -[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments)). +[persistent](/doc/user-guide/experiment-management/experiments-overview#persistent-experiments)). `dvc exp apply` changes any files (code, data, parameters, metrics, etc.) needed to reflect the experiment conditions and diff --git a/content/docs/command-reference/exp/branch.md b/content/docs/command-reference/exp/branch.md index 6b837088fd..8a0e10705a 100644 --- a/content/docs/command-reference/exp/branch.md +++ b/content/docs/command-reference/exp/branch.md @@ -18,7 +18,7 @@ positional arguments: Makes a named Git [`branch`](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging) containing the target `experiment` (making it -[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments)). +[persistent](/doc/user-guide/experiment-management/experiments-overview#persistent-experiments)). For [checkpoint experiments](/doc/command-reference/exp/run#checkpoints), the new branch will contain multiple commits (the checkpoints). diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md index f96731cb66..f4fe9751de 100644 --- a/content/docs/command-reference/exp/run.md +++ b/content/docs/command-reference/exp/run.md @@ -55,7 +55,7 @@ and compare multiple experiments, use `dvc exp show` or `dvc exp diff` to restore the results of any other experiment instead. 
Successful experiments can be made -[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments) +[persistent](/doc/user-guide/experiment-management/experiments-overview#persistent-experiments) by committing them to the Git repo. Unnecessary ones can be removed with `dvc exp remove`or `dvc exp gc` (or abandoned). diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json index 0379ebd8ac..a2782658df 100644 --- a/content/docs/sidebar.json +++ b/content/docs/sidebar.json @@ -148,7 +148,7 @@ "slug": "experiment-management", "source": "experiment-management/index.md", "children": [ - "dvc-experiments", + "experiments-overview", "running-experiments", "sharing-experiments", "persisting-experiments", diff --git a/content/docs/user-guide/experiment-management/cleaning-experiments.md b/content/docs/user-guide/experiment-management/cleaning-experiments.md index 18b4d94ba3..e84be1e13b 100644 --- a/content/docs/user-guide/experiment-management/cleaning-experiments.md +++ b/content/docs/user-guide/experiment-management/cleaning-experiments.md @@ -5,7 +5,7 @@ clutter tables and the workspace. DVC allows to remove specific experiments from the workspace or delete all not-yet-[persisted] experiments at once. 
[persisted]: - /doc/user-guide/experiment-management/dvc-experiments#persistent-experiments + /doc/user-guide/experiment-management/experiments-overview#persistent-experiments ## Removing specific experiments diff --git a/content/docs/user-guide/experiment-management/dvc-experiments.md b/content/docs/user-guide/experiment-management/experiments-overview.md similarity index 97% rename from content/docs/user-guide/experiment-management/dvc-experiments.md rename to content/docs/user-guide/experiment-management/experiments-overview.md index 45f456acb8..25cefcd253 100644 --- a/content/docs/user-guide/experiment-management/dvc-experiments.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -1,6 +1,4 @@ -## DVC Experiments - -_New in DVC 2.0_ +## DVC Experiments Overview `dvc exp` commands let you automatically track a variation to an established [data pipeline](/doc/command-reference/dag) baseline. You can create multiple diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index dff85f8487..a9e198b958 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -32,15 +32,15 @@ approaches: repository history. [run]: /doc/user-guide/experiment-management/running-experiments -[experiments]: /doc/user-guide/experiment-management/dvc-experiments +[experiments]: /doc/user-guide/experiment-management/experiments-overview [queue]: /doc/user-guide/experiment-management/running-experiments#the-experiments-queue [checkpoints]: /doc/user-guide/experiment-management/checkpoints [persistent]: - /doc/user-guide/experiment-management/dvc-experiments#persistent-experiments + /doc/user-guide/experiment-management/experiments-overview#persistent-experiments 📖 More information in the -[full guide](/doc/user-guide/experiment-management/dvc-experiments). 
+[full guide](/doc/user-guide/experiment-management/experiments-overview). > 👨‍💻 See [Get Started: Experiments](/doc/start/experiments) for a hands-on > introduction to DVC experiments. diff --git a/content/docs/user-guide/experiment-management/running-experiments.md b/content/docs/user-guide/experiment-management/running-experiments.md index a0b0a9e3fd..b14d69be6f 100644 --- a/content/docs/user-guide/experiment-management/running-experiments.md +++ b/content/docs/user-guide/experiment-management/running-experiments.md @@ -232,7 +232,7 @@ to avoid committing unwanted files into Git (e.g. once successful experiments are [persisted]). [persisted]: - /doc/user-guide/experiment-management/dvc-experiments#persistent-experiments + /doc/user-guide/experiment-management/experiments-overview#persistent-experiments > 💡 To include untracked files, stage them with `git add` first (before > `dvc exp run`) and `git reset` them afterwards. From 27afdc1a6e78a65cd4863fe6fbe0d052f1b516e6 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 1 Nov 2021 18:53:10 -0600 Subject: [PATCH 06/39] guide: clear separation between Exp Mgmt index and Overview page rel https://github.com/iterative/dvc.org/pull/2909#discussion_r740633633 --- .../experiments-overview.md | 28 +++++----- .../user-guide/experiment-management/index.md | 51 +++++++++---------- 2 files changed, 40 insertions(+), 39 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 25cefcd253..584504769c 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -1,26 +1,30 @@ ## DVC Experiments Overview -`dvc exp` commands let you automatically track a variation to an established -[data pipeline](/doc/command-reference/dag) baseline. 
You can create multiple -isolated experiments this way, as well as review, compare, and restore them -later, or roll back to the baseline. The basic workflow goes like this: +`dvc exp` commands let you automatically track a variation to a committed +project version (baseline). You can create independent groups of experiments +this way, as well as review, compare, and restore them later. The basic workflow +goes like this: -- Modify stage parameters or other dependencies (e.g. input data, - source code) of committed stages. +- Modify parameters or other dependencies (input data, source code, + stage definitions, etc.) of committed stages. - [Run experiments] with `dvc exp run` (instead of `repro`) to execute the pipeline. The results are reflected in your workspace, and tracked automatically. -- Use `dvc metrics` to identify the best experiment(s). -- Visualize, compare experiments with `dvc exp show` or `dvc exp diff`. Repeat - 🔄 -- Use `dvc exp apply` to roll back to the best one. -- Make the selected experiment persistent by committing its results to Git. This - cleans the slate so you can repeat the process. +- Use [metrics](/doc/command-reference/metrics) to identify the best + experiment(s). +- Visualize and compare experiments with `dvc exp show` or `dvc exp diff`. + Repeat 🔄 +- Make certain experiments [persistent](#persistent-experiments) by committing + their results to Git. This cleans the slate so you can repeat the process + later. [run experiments]: /doc/user-guide/experiment-management/running-experiments +[persistent]: /doc/user-guide/experiment-management/persisting-experiments ## Persistent Experiments +📖 See [full guide][persistent]. + When your experiments are good enough to save or share, you may want to store them persistently as Git commits in your repository. 
diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index a9e198b958..6659575c7b 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -1,8 +1,8 @@ # Experiment Management -Data science and ML are iterative processes that require a large number of -attempts to reach a certain level of a metric. Experimentation is part of the -development of data features, hyperspace exploration, deep learning +Data science and machine learning are iterative processes that require a large +number of attempts to reach a certain level of a metric. Experimentation is part +of the development of data features, hyperspace exploration, deep learning optimization, etc. Some of DVC's base features already help you codify and analyze experiments. @@ -12,35 +12,30 @@ the other end, [metrics](/doc/command-reference/metrics) (and [plots](/doc/command-reference/plots)) let you define, visualize, and compare quantitative measures of your results. -## DVC Experiments +## Experimentation methods in DVC _New in DVC 2.0_ -DVC experiment management features are designed to support these main -approaches: - -1. [Run] and capture [experiments] that derive from your latest project version - without polluting your Git history. DVC tracks them for you, letting you list - and compare them. The best ones can be made persistent, and the rest left as - history or cleared. -1. [Queue] and process series of experiments based on a parameter search or - other modifications to your baseline. -1. Generate [checkpoints] during your code execution to analyze the internal - progress of deep experiments. DVC captures them at runtime, and can manage - them in batches. -1. Make experiments [persistent] by committing them to your - repository history. 
- -[run]: /doc/user-guide/experiment-management/running-experiments +DVC experiment management features build on top of base DVC features to form a +comprehensive framework to organize, execute, manage, and share ML experiments. +They support these main approaches: + +- Compare params and metrics of existing project versions (for example different + Git branches) against each other or against new results in your workspace + (without committing them). + +- [Run and capture] multiple experiments (derived from any project version as + baseline) without polluting your Git history. DVC tracks them for you, letting + you compare and share them. 📖 More info in the [Experiments + Overview][experiments]. + +- Generate [checkpoints] during your code execution to analyze the internal + progress of deep experiments. DVC captures [live metrics](/doc/dvclive) at + runtime, and lets you manage them in batches. + +[run and capture]: /doc/user-guide/experiment-management/running-experiments [experiments]: /doc/user-guide/experiment-management/experiments-overview -[queue]: - /doc/user-guide/experiment-management/running-experiments#the-experiments-queue [checkpoints]: /doc/user-guide/experiment-management/checkpoints -[persistent]: - /doc/user-guide/experiment-management/experiments-overview#persistent-experiments - -📖 More information in the -[full guide](/doc/user-guide/experiment-management/experiments-overview). > 👨‍💻 See [Get Started: Experiments](/doc/start/experiments) for a hands-on > introduction to DVC experiments. @@ -55,10 +50,12 @@ main alternatives: other. Helpful if the Git [revisions](https://git-scm.com/docs/revisions) can be easily visualized, for example with tools [like GitHub](https://docs.github.com/en/github/visualizing-repository-data-with-graphs/viewing-a-repositorys-network). + - **Directories** - the project's "space dimension" can be structured with directories (folders) to organize experiments.
Useful when you want to see all your experiments at the same time (without switching versions) by just exploring the file system. + - **Hybrid** - combining an intuitive directory structure with a good repo branching strategy tends to be the best option for complex projects. Completely independent experiments live in separate directories, while their From 30db819880d6f4034f6ffb94781af8290a341411 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 1 Nov 2021 19:10:25 -0600 Subject: [PATCH 07/39] guide: single guide for Persisting Exps content and fix links --- content/docs/command-reference/exp/apply.md | 11 +++++---- content/docs/command-reference/exp/branch.md | 20 ++++++++-------- content/docs/command-reference/exp/run.md | 9 ++++---- .../cleaning-experiments.md | 3 +-- .../experiments-overview.md | 23 +++---------------- .../user-guide/experiment-management/index.md | 2 ++ .../persisting-experiments.md | 10 ++++---- .../running-experiments.md | 3 +-- 8 files changed, 33 insertions(+), 48 deletions(-) diff --git a/content/docs/command-reference/exp/apply.md b/content/docs/command-reference/exp/apply.md index 95f34d4840..cca0eca7a6 100644 --- a/content/docs/command-reference/exp/apply.md +++ b/content/docs/command-reference/exp/apply.md @@ -19,8 +19,7 @@ been made after the experiment was run (`HEAD` hasn't moved). The `experiment` can be referenced by name or hash (see `dvc exp run` for details). This is typically used after choosing a target `experiment` with `dvc exp show` -or `dvc exp diff`, and before committing it to Git (making it -[persistent](/doc/user-guide/experiment-management/experiments-overview#persistent-experiments)). +or `dvc exp diff`, and before committing it to Git (making it [persistent]). `dvc exp apply` changes any files (code, data, parameters, metrics, etc.) needed to reflect the experiment conditions and results in the workspace.
the experiment in question, in which case they are overwritten (unless `--no-force` is used). -Note that the history of -[checkpoints](/doc/command-reference/exp/run#checkpoints) found in the -`experiment` is **not** preserved when applying and committing it. +Note that the history of [checkpoints] found in the `experiment` is **not** +preserved when applying and committing it. + +[persistent]: /doc/user-guide/experiment-management/persisting-experiments +[checkpoints]: /doc/command-reference/exp/run#checkpoints ## Options diff --git a/content/docs/command-reference/exp/branch.md b/content/docs/command-reference/exp/branch.md index 8a0e10705a..50fa193eb9 100644 --- a/content/docs/command-reference/exp/branch.md +++ b/content/docs/command-reference/exp/branch.md @@ -15,26 +15,28 @@ positional arguments: ## Description -Makes a named Git -[`branch`](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging) -containing the target `experiment` (making it -[persistent](/doc/user-guide/experiment-management/experiments-overview#persistent-experiments)). -For [checkpoint experiments](/doc/command-reference/exp/run#checkpoints), the -new branch will contain multiple commits (the checkpoints). +Makes a named Git [`branch`] containing the target `experiment` (making it +[persistent]). For [checkpoint experiments], the new branch will contain multiple +commits (the checkpoints). The new `branch` will be based on the experiment's parent commit (`HEAD` at the time that the experiment was run). Note that DVC **does not** switch into the new `branch` automatically. `dvc exp branch` is useful to make an experiment persistent without modifying -the workspace, so they can be continued, -[stored, and shared](https://dvc.org/doc/use-cases/sharing-data-and-model-files) -in a normal Git + DVC workflow. +the workspace, so they can be continued, [stored and shared] in a normal Git + +DVC workflow. To switch into the new branch, use `git checkout branch` and `dvc checkout`.
Or use `git merge branch` and `dvc repro` to combine it with your current project version. +[`branch`]: + https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging +[persistent]: /doc/user-guide/experiment-management/persisting-experiments +[checkpoint experiments]: /doc/command-reference/exp/run#checkpoints +[stored and shared]: /doc/use-cases/sharing-data-and-model-files + ## Options - `-h`, `--help` - shows the help message and exit. diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md index f4fe9751de..b059732bec 100644 --- a/content/docs/command-reference/exp/run.md +++ b/content/docs/command-reference/exp/run.md @@ -54,14 +54,15 @@ and compare multiple experiments, use `dvc exp show` or `dvc exp diff` (`plots diff` also accepts experiment names as `revisions`). Use `dvc exp apply` to restore the results of any other experiment instead. -Successful experiments can be made -[persistent](/doc/user-guide/experiment-management/experiments-overview#persistent-experiments) -by committing them to the Git repo. Unnecessary ones can be removed with -`dvc exp remove`or `dvc exp gc` (or abandoned). +Successful experiments can be made [persistent] by committing them to the Git +repo. Unnecessary ones can be removed with `dvc exp remove`or `dvc exp gc` (or +abandoned). > Note that experiment data will remain in the cache until you use > regular `dvc gc` to clean it up. 
+[persistent]: /doc/user-guide/experiment-management/persisting-experiments + ## Checkpoints To track successive steps in a longer or deeper experiment, you can diff --git a/content/docs/user-guide/experiment-management/cleaning-experiments.md b/content/docs/user-guide/experiment-management/cleaning-experiments.md index e84be1e13b..2415553449 100644 --- a/content/docs/user-guide/experiment-management/cleaning-experiments.md +++ b/content/docs/user-guide/experiment-management/cleaning-experiments.md @@ -4,8 +4,7 @@ Although DVC uses minimal resources to keep track of the experiments, they may clutter tables and the workspace. DVC allows to remove specific experiments from the workspace or delete all not-yet-[persisted] experiments at once. -[persisted]: - /doc/user-guide/experiment-management/experiments-overview#persistent-experiments +[persisted]: /doc/user-guide/experiment-management/persisting-experiments ## Removing specific experiments diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 584504769c..58320bd30f 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -1,4 +1,4 @@ -## DVC Experiments Overview +# DVC Experiments Overview `dvc exp` commands let you automatically track a variation to a committed project version (baseline). You can create independent groups of experiments @@ -14,25 +14,8 @@ goes like this: experiment(s). - Visualize and compare experiments with `dvc exp show` or `dvc exp diff`. Repeat 🔄 -- Make certain experiments [persistent](#persistent-experiments) by committing - their results to Git. This cleans the slate so you can repeat the process - later. +- Make certain experiments [persistent] by committing their results to Git. This + cleans the slate so you can repeat the process later. 
[run experiments]: /doc/user-guide/experiment-management/running-experiments [persistent]: /doc/user-guide/experiment-management/persisting-experiments - -## Persistent Experiments - -📖 See [full guide][persistent]. - -When your experiments are good enough to save or share, you may want to store -them persistently as Git commits in your repository. - -Whether the results were produced with `dvc repro` directly, or after a -`dvc exp` workflow, `dvc.yaml` and `dvc.lock` will define the experiment as a -new project version. The right outputs (including -[metrics](/doc/command-reference/metrics)) should also be present, or available -via `dvc checkout`. - -Use `dvc exp apply` and `dvc exp branch` to persist experiments in your Git -history. diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index 6659575c7b..4b0404b37b 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -65,6 +65,8 @@ DVC takes care of arranging `dvc exp` experiments and the data cache under the hood so there's no need to decide on the above until your experiments are made [persistent]. +[persistent]: /doc/user-guide/experiment-management/persisting-experiments + ## Run Cache: Automatic Log of Stage Runs Every time you [reproduce](/doc/command-reference/repro) a pipeline with DVC, it diff --git a/content/docs/user-guide/experiment-management/persisting-experiments.md b/content/docs/user-guide/experiment-management/persisting-experiments.md index f5bebae62e..0b281c6019 100644 --- a/content/docs/user-guide/experiment-management/persisting-experiments.md +++ b/content/docs/user-guide/experiment-management/persisting-experiments.md @@ -1,11 +1,9 @@ # Persisting Experiments -DVC runs experiments outside of the Git stage/commit cycle for quick iteration. 
-When your experiments are good enough to save or share, you may want to store -them persistently as Git commits in your repository. - -In this section, we describe how to bring them to the standard Git workflow with -`dvc exp branch` and `dvc exp apply`. +DVC Experiments run outside of the regular Git workflow for faster iteration and +to avoid polluting your repository's history. Once experiments are +good enough to keep or distribute, you may want to store them persistently as +Git commits. ## Create a Git branch from an experiment diff --git a/content/docs/user-guide/experiment-management/running-experiments.md b/content/docs/user-guide/experiment-management/running-experiments.md index b14d69be6f..84b8633d82 100644 --- a/content/docs/user-guide/experiment-management/running-experiments.md +++ b/content/docs/user-guide/experiment-management/running-experiments.md @@ -231,8 +231,7 @@ Note that Git-ignored files/dirs are explicitly excluded from queued/temp runs to avoid committing unwanted files into Git (e.g. once successful experiments are [persisted]). -[persisted]: - /doc/user-guide/experiment-management/experiments-overview#persistent-experiments +[persistent]: /doc/user-guide/experiment-management/persisting-experiments > 💡 To include untracked files, stage them with `git add` first (before > `dvc exp run`) and `git reset` them afterwards. 
From aa3c5d06dbe631b6b30148f4bac76acd650f1157 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 1 Nov 2021 19:14:38 -0600 Subject: [PATCH 08/39] guide: begin extracting Exp details from Running to Overview rel https://github.com/iterative/dvc.org/pull/2909#discussion_r740633633 --- .../experiment-management/experiments-overview.md | 6 ++++++ .../experiment-management/running-experiments.md | 8 ++++---- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 58320bd30f..ddd69ad690 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -1,5 +1,11 @@ # DVC Experiments Overview +DVC Experiments are captures automatically by DVC when you [run them]. + +[run them]: /doc/user-guide/experiment-management/running-experiments + +## Basic Workflow + `dvc exp` commands let you automatically track a variation to a committed project version (baseline). You can create independent groups of experiments this way, as well as review, compare, and restore them later. The basic workflow diff --git a/content/docs/user-guide/experiment-management/running-experiments.md b/content/docs/user-guide/experiment-management/running-experiments.md index 84b8633d82..946b8209f9 100644 --- a/content/docs/user-guide/experiment-management/running-experiments.md +++ b/content/docs/user-guide/experiment-management/running-experiments.md @@ -1,8 +1,8 @@ # Running Experiments -We explain how DVC codifies and executes experiments, setting their parameters, -using multiple jobs to run them in parallel, and running them in queues, among -other details. +We explain how to execute DVC Experiments, setting their parameters, using +multiple jobs to run them in parallel, and running them in queues, among other +details. 
> 📖 If this is the first time you are introduced into data science > experimentation, you may want to check the basics in @@ -231,7 +231,7 @@ Note that Git-ignored files/dirs are explicitly excluded from queued/temp runs to avoid committing unwanted files into Git (e.g. once successful experiments are [persisted]). -[persistent]: /doc/user-guide/experiment-management/persisting-experiments +[persisted]: /doc/user-guide/experiment-management/persisting-experiments > 💡 To include untracked files, stage them with `git add` first (before > `dvc exp run`) and `git reset` them afterwards. From 77104338b1ba2e8daa7b7578e6a2ee38da71ee63 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 1 Nov 2021 19:39:07 -0600 Subject: [PATCH 09/39] guide: make ToC entry for Run Cache section rel https://github.com/iterative/dvc.org/pull/2909#discussion_r740635208 --- content/docs/user-guide/project-structure/internal-files.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/docs/user-guide/project-structure/internal-files.md b/content/docs/user-guide/project-structure/internal-files.md index 456df8fd1b..a585a34e99 100644 --- a/content/docs/user-guide/project-structure/internal-files.md +++ b/content/docs/user-guide/project-structure/internal-files.md @@ -134,7 +134,7 @@ $ cat .dvc/cache/6f/db5336fce0dbfd669f83065f107551.dir That's how DVC knows that the other two cached files belong in the directory. -### Run-cache +## Run-cache `dvc repro` and `dvc run` by default populate and reutilize a log of stages that have been run in the project. 
It is found in the `runs/` directory inside the From a133f700ab2a65481fbf36ef8cedf89c9e4573d1 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 4 Nov 2021 10:39:09 -0600 Subject: [PATCH 10/39] Update content/docs/user-guide/experiment-management/index.md Co-authored-by: Ivan Shcheklein --- content/docs/user-guide/experiment-management/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index 4b0404b37b..a8c8c0dc4a 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -40,7 +40,7 @@ They support support these main approaches: > 👨‍💻 See [Get Started: Experiments](/doc/start/experiments) for a hands-on > introduction to DVC experiments. -### Organization Patterns +### Organization patterns It's up to you to decide how to organize completed experiments. These are the main alternatives: From dacaf85f1dec0c0cba093cbd404d6ebe67046564 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 16 Nov 2021 19:19:44 -0600 Subject: [PATCH 11/39] [NESTED] guide: Exp implementation details, naming into Overview (#3006) * guide: bring exp implementation details and naming from ref. per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-794881005 * guide: copy edits to exp naming info. --- content/docs/command-reference/exp/run.md | 41 +++++++------------ .../experiments-overview.md | 23 ++++++++++- 2 files changed, 36 insertions(+), 28 deletions(-) diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md index b059732bec..de4aad52cb 100644 --- a/content/docs/command-reference/exp/run.md +++ b/content/docs/command-reference/exp/run.md @@ -32,36 +32,25 @@ data and code updates, or hyperparameter tuning. 
For the latter, you can use the `--set-param` (`-S`) option of this command to change `dvc param` values on-the fly. -Each experiment creates and tracks a project variation based on your -workspace changes. Experiments will have an auto-generated name -like `exp-bfe64` by default, which can be customized using the `--name` (`-n`) -option. +📖 See [DVC Experiments Overview][exp-overview] for more information. -
<details> - -### ⚙️ How does DVC track experiments? - -Experiments are custom -[Git references](https://git-scm.com/book/en/v2/Git-Internals-Git-References) -(found in `.git/refs/exps`) with a single commit based on `HEAD` (not checked -out by DVC). Note that these commits are not pushed to Git remotes by default -(see `dvc exp push`). - -</details>
- -The results of the last `dvc exp run` can be seen in the workspace. To display -and compare multiple experiments, use `dvc exp show` or `dvc exp diff` -(`plots diff` also accepts experiment names as `revisions`). Use `dvc exp apply` -to restore the results of any other experiment instead. +Each experiment creates and tracks a project variation based on the changes in +your workspace. The results of the last `dvc exp run` will be +reflected in the workspace. Experiments will have an auto-generated ID like +`exp-bfe64` by default. A custom name can be given instead, using the `--name` +(`-n`) option. -Successful experiments can be made [persistent] by committing them to the Git -repo. Unnecessary ones can be removed with `dvc exp remove`or `dvc exp gc` (or -abandoned). +To display and compare multiple experiments, use `dvc exp show` or +`dvc exp diff` (`plots diff` also accepts experiment names as `revisions`). Use +`dvc exp apply` to restore the results of any experiment, for example to [commit +them][persisting] to Git. Unnecessary experiments can be removed with +`dvc exp remove` or `dvc exp gc` (or abandoned). -> Note that experiment data will remain in the cache until you use -> regular `dvc gc` to clean it up. +> Note that experiment data will remain in the local cache until +> you use regular `dvc gc` to clean it up.
-[persistent]: /doc/user-guide/experiment-management/persisting-experiments +[exp-overview]: /doc/user-guide/experiment-management/experiments-overview +[persisting]: /doc/user-guide/experiment-management/persisting-experiments ## Checkpoints diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index ddd69ad690..0e8a448898 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -1,10 +1,29 @@ # DVC Experiments Overview -DVC Experiments are captures automatically by DVC when you [run them]. +DVC Experiments are captures automatically by DVC when you [run them]. Each +experiment creates and tracks a project variation based on the changes in your +workspace. Experiments preserve the latest commit in the current +branch (Git `HEAD`) as their parent or _baseline_. + +
<details> + +### ⚙️ How does DVC track experiments? + +Experiments are custom [Git references] (found in `.git/refs/exps`) with a +single commit based on `HEAD` (not checked out by DVC). Note that these commits +are not pushed to Git remotes by default (see `dvc exp push`). + +</details>
+ +Experiments will have an auto-generated ID like `exp-bfe64` by default. A custom +name can be given instead (using the `--name`/`-n` option of `dvc exp run`). + +> ID or name can be used to reference experiments with `dvc exp` subcommands. [run them]: /doc/user-guide/experiment-management/running-experiments +[git references]: https://git-scm.com/book/en/v2/Git-Internals-Git-References -## Basic Workflow +## Basic workflow `dvc exp` commands let you automatically track a variation to a committed project version (baseline). You can create independent groups of experiments From 73175a997ff44cd90ece9ed0f356e437c0fff9da Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 29 Nov 2021 23:14:46 -0600 Subject: [PATCH 12/39] guide: emphasize dvc exps are not part of Git tree in overview rel https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-812064701 --- .../user-guide/experiment-management/experiments-overview.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 0e8a448898..dbe38aa1ec 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -3,7 +3,8 @@ DVC Experiments are captures automatically by DVC when you [run them]. Each experiment creates and tracks a project variation based on the changes in your workspace. Experiments preserve the latest commit in the current -branch (Git `HEAD`) as their parent or _baseline_. +branch (Git `HEAD`) as their parent or _baseline_, but do not form part of the +regular Git tree or workflow (unless you make them [persistent]).
<details> From 112ad8702847fc1c033447ff1fc9cd3da6ca6df4 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 29 Nov 2021 23:17:17 -0600 Subject: [PATCH 13/39] guide: ID->name in dvc exps overview per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-812312391 --- .../experiment-management/experiments-overview.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index dbe38aa1ec..9c86aa2d60 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -16,10 +16,10 @@ are not pushed to Git remotes by default (see `dvc exp push`). </details>
-Experiments will have an auto-generated ID like `exp-bfe64` by default. A custom -name can be given instead (using the `--name`/`-n` option of `dvc exp run`). - -> ID or name can be used to reference experiments with `dvc exp` subcommands. +Experiments will have an auto-generated name like `exp-bfe64` by default. A +custom name can be given instead (using the `--name`/`-n` option of +`dvc exp run`). These names can be used to reference experiments in other +`dvc exp` subcommands. [run them]: /doc/user-guide/experiment-management/running-experiments [git references]: https://git-scm.com/book/en/v2/Git-Internals-Git-References From 9c2a55ce6ab143886f207b6e8f3b9d9b5224a9e5 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 29 Nov 2021 23:20:36 -0600 Subject: [PATCH 14/39] guide: ID->name in other exp guides rel https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-812312391 --- .../user-guide/experiment-management/comparing-experiments.md | 4 ++-- .../experiment-management/persisting-experiments.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/content/docs/user-guide/experiment-management/comparing-experiments.md b/content/docs/user-guide/experiment-management/comparing-experiments.md index 5974896ad9..a58642e9a3 100644 --- a/content/docs/user-guide/experiment-management/comparing-experiments.md +++ b/content/docs/user-guide/experiment-management/comparing-experiments.md @@ -384,8 +384,8 @@ params.yaml train.epochs 10 10 0 ## Compare an experiment with the workspace When you want to compare two experiments, either the baseline experiment in a -commit, branch, tag or an attached experiment with ID, you can supply their -names to `dvc exp diff`. +commit, branch, or tag; or an attached experiment by name, you can supply any of +these references to `dvc exp diff`. 
``` $ dvc exp diff cnn-128 cnn-64 diff --git a/content/docs/user-guide/experiment-management/persisting-experiments.md b/content/docs/user-guide/experiment-management/persisting-experiments.md index 66e54f87ed..3314e36d29 100644 --- a/content/docs/user-guide/experiment-management/persisting-experiments.md +++ b/content/docs/user-guide/experiment-management/persisting-experiments.md @@ -71,7 +71,7 @@ $ dvc exp show --include-params=my_param The results found in the workspace are shown in the respective row. When you want to bring another experiment to the workspace, you can reference it using -it's name or ID, e.g.: +its name, e.g.: ```dvc $ dvc exp apply exp-e6c97 From 9b2902abbb31714d5ae7dc0f201abf2233b48cbc Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 29 Nov 2021 23:26:59 -0600 Subject: [PATCH 15/39] guide: Visualize->Review in exp/overview/basic-workflow per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-812317806 --- .../user-guide/experiment-management/experiments-overview.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 9c86aa2d60..5178c2a133 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -38,10 +38,11 @@ goes like this: tracked automatically. - Use [metrics](/doc/command-reference/metrics) to identify the best experiment(s). -- Visualize and compare experiments with `dvc exp show` or `dvc exp diff`. Repeat 🔄 +- Review and [compare] experiments with `dvc exp show` or `dvc exp diff`. Repeat + 🔄 - Make certain experiments [persistent] by committing their results to Git. This cleans the slate so you can repeat the process later.
[run experiments]: /doc/user-guide/experiment-management/running-experiments +[compare]: /doc/user-guide/experiment-management/comparing-experiments [persistent]: /doc/user-guide/experiment-management/persisting-experiments From 7b9384f17699d8bb249fc627ca8e4447a1e5b634 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 29 Nov 2021 23:29:22 -0600 Subject: [PATCH 16/39] guide: don't say "cleans the slate" in exp/overview/basic-workflow per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-812319814 --- .../user-guide/experiment-management/experiments-overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 5178c2a133..55ff5f5053 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -41,7 +41,7 @@ goes like this: - Review and [compare] experiments with `dvc exp show` or `dvc exp diff`. Repeat 🔄 - Make certain experiments [persistent] by committing their results to Git. This - cleans the slate so you can repeat the process later. + lets you repeat the process from that point. 
[run experiments]: /doc/user-guide/experiment-management/running-experiments [compare]: /doc/user-guide/experiment-management/comparing-experiments From c9493f473b6ea1f725ca0d66a7209fa495e3282c Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 29 Nov 2021 23:33:16 -0600 Subject: [PATCH 17/39] guide: soften params description in exps index per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-812322659 --- content/docs/user-guide/experiment-management/index.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index 91d6ab66a1..d90a3467f5 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -6,9 +6,9 @@ of the development of data features, hyperspace exploration, deep learning optimization, etc. Some of DVC's base features already help you codify and analyze experiments. -[Parameters](/doc/command-reference/params) are simple values you can tweak in a -formatted text file; They cause different behaviors in your code and models. On -the other end, [metrics](/doc/command-reference/metrics) (and +[Parameters](/doc/command-reference/params) are simple values in a formatted +text file which you can tweak and use in your code. On the other end, +[metrics](/doc/command-reference/metrics) (and [plots](/doc/command-reference/plots)) let you define, visualize, and compare quantitative measures of your results. 
From 42454f00063420a2a4da95946daad4539e635edf Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 29 Nov 2021 23:47:22 -0600 Subject: [PATCH 18/39] guide: generalize dvc exps basic workflow --- .../experiment-management/experiments-overview.md | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 55ff5f5053..93f72dbaf0 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -31,18 +31,17 @@ project version (baseline). You can create independent groups of experiments this way, as well as review, compare, and restore them later. The basic workflow goes like this: -- Modify parameters or other dependencies (input data, source code, - stage definitions, etc.) of committed stages. -- [Run experiments] with `dvc exp run` (instead of `repro`) to execute the - pipeline. The results are reflected in your workspace, and - tracked automatically. -- Use [metrics](/doc/command-reference/metrics) to identify the best - experiment(s). -- Review and [compare] experiments with `dvc exp show` or `dvc exp diff`. Repeat +- Modify hyperparameters or other dependencies (input data, source code, + commands to execute, etc.). Leave these changes un-committed in Git. +- [Run experiments] with `dvc exp run` (instead of `repro`). The results are + reflected in your workspace, and tracked automatically. +- Review and [compare] experiments with `dvc exp show` or `dvc exp diff`, using + [metrics](/doc/command-reference/metrics) to identify the best one(s). Repeat 🔄 - Make certain experiments [persistent] by committing their results to Git. This lets you repeat the process from that point. 
[run experiments]: /doc/user-guide/experiment-management/running-experiments +[pipeline]: /doc/user-guide/project-structure/pipelines-files [compare]: /doc/user-guide/experiment-management/comparing-experiments [persistent]: /doc/user-guide/experiment-management/persisting-experiments From bd95136eeeb24d1834d9fa957019668c61e8d8ab Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 30 Nov 2021 00:09:48 -0600 Subject: [PATCH 19/39] guide: Properties section in DVC Exps overview page --- .../experiment-management/experiments-overview.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 93f72dbaf0..9b061296ec 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -16,13 +16,19 @@ are not pushed to Git remotes by default (see `dvc exp push`). -Experiments will have an auto-generated name like `exp-bfe64` by default. A +[run them]: /doc/user-guide/experiment-management/running-experiments +[git references]: https://git-scm.com/book/en/v2/Git-Internals-Git-References + +## Properties + +DVC Experiments will have an auto-generated name like `exp-bfe64` by default. A custom name can be given instead (using the `--name`/`-n` option of `dvc exp run`). These names can be used to reference experiments in other `dvc exp` subcommands. -[run them]: /doc/user-guide/experiment-management/running-experiments -[git references]: https://git-scm.com/book/en/v2/Git-Internals-Git-References +All experiments created by DVC will be associated to the latest commit (Git +`HEAD`) at the time that they were run. This is called the experiment's +_baseline_. 
## Basic workflow From 6162f5a232066b3316c7ff73e1421f5376a27e7e Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 30 Nov 2021 00:10:09 -0600 Subject: [PATCH 20/39] guide: exp init section in Exp Overview page --- .../experiment-management/experiments-overview.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 9b061296ec..20bfde2f43 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -51,3 +51,17 @@ goes like this: [pipeline]: /doc/user-guide/project-structure/pipelines-files [compare]: /doc/user-guide/experiment-management/comparing-experiments [persistent]: /doc/user-guide/experiment-management/persisting-experiments + +## Initialize DVC Experiments on any project + +DVC Experiments features build on basic semantics of DVC projects. +This means that minimal formalities are required, such as codifying a pipeline +with `dvc.yaml` (even if it has a single stage that represents your +entire process). Another typical preparation step is to create or modify a +structured parameters file. + +`dvc exp init` lets you onboard any existing data science project to use DVC +Experiments without having to worry bootstrapping DVC manually. It will prompt +you a few simple questions and create a simple `dvc.yaml` file as well as other +metafiles with sane default values. You can review these and commit +them to Git to begin using DVC Experiments. 
From 5043e64e9f584b99eda81d951f87c8e4a8fb9b56 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 30 Nov 2021 22:00:49 -0600 Subject: [PATCH 21/39] guide: clarify dvc exp implementation --- .../experiment-management/experiments-overview.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 20bfde2f43..9a3c16bf0d 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -10,9 +10,10 @@ regular Git tree or workflow (unless you make them [persistent]). ### ⚙️ How does DVC track experiments? -Experiments are custom [Git references] (found in `.git/refs/exps`) with a -single commit based on `HEAD` (not checked out by DVC). Note that these commits -are not pushed to Git remotes by default (see `dvc exp push`). +Experiments are custom [Git references] (found in `.git/refs/exps`) with one or +more commits based on `HEAD`. These commits are hidden and not checked out by +DVC. Note that these are not pushed to Git remotes by default either (see +`dvc exp push`). 
From 27f01e63510a49589ed80c0ea071319740373ffe Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 30 Nov 2021 22:04:58 -0600 Subject: [PATCH 22/39] guide: expand on Exp Overview motivation per https://github.com/iterative/dvc.org/pull/2909#discussion_r759673758 --- .../experiment-management/experiments-overview.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 9a3c16bf0d..30ee089ed3 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -2,9 +2,12 @@ DVC Experiments are captures automatically by DVC when you [run them]. Each experiment creates and tracks a project variation based on the changes in your -workspace. Experiments preserve the latest commit in the current -branch (Git `HEAD`) as their parent or _baseline_, but do not form part of the -regular Git tree or workflow (unless you make them [persistent]). +workspace. + +Experiments preserve a connection to the latest commit in the current branch +(Git `HEAD`) as their parent or _baseline_, but do not form part of the regular +Git tree or workflow (unless you make them [persistent]). This prevents +polluting Git namespaces and bloating the repo unnecessarily.
From a799743a65f1de48c11cb4ce11d381f1b3d23e8c Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 30 Nov 2021 22:09:49 -0600 Subject: [PATCH 23/39] guide: direct language in Exp Overview/ workflow intro per https://github.com/iterative/dvc.org/pull/2909#discussion_r759676450 --- .../experiments-overview.md | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 30ee089ed3..437a9c6345 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -13,15 +13,14 @@ polluting Git namespaces and bloating the repo unnecessarily. ### ⚙️ How does DVC track experiments? -Experiments are custom [Git references] (found in `.git/refs/exps`) with one or -more commits based on `HEAD`. These commits are hidden and not checked out by -DVC. Note that these are not pushed to Git remotes by default either (see -`dvc exp push`). +Experiments are custom [Git references](/blog/experiment-refs) (found in +`.git/refs/exps`) with one or more commits based on `HEAD`. These commits are +hidden and not checked out by DVC. Note that these are not pushed to Git remotes +by default either (see `dvc exp push`).
[run them]: /doc/user-guide/experiment-management/running-experiments -[git references]: https://git-scm.com/book/en/v2/Git-Internals-Git-References ## Properties @@ -36,10 +35,10 @@ _baseline_. ## Basic workflow -`dvc exp` commands let you automatically track a variation to a committed -project version (baseline). You can create independent groups of experiments -this way, as well as review, compare, and restore them later. The basic workflow -goes like this: +`dvc exp` commands let you automatically track a variation of a project version +(the baseline). You can create independent groups of experiments this way, as +well as review, compare, and restore them later. The basic workflow goes like +this: - Modify hyperparameters or other dependencies (input data, source code, commands to execute, etc.). Leave these changes un-committed in Git. From 59505f6c25c5f452fa1195e8cd40d2086dd24955 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 30 Nov 2021 22:13:39 -0600 Subject: [PATCH 24/39] guide: mention metrics in exp init intro (Exp Overview) per https://github.com/iterative/dvc.org/pull/2909#discussion_r759679609 --- .../experiment-management/experiments-overview.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 437a9c6345..d7619040ce 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -1,8 +1,8 @@ # DVC Experiments Overview -DVC Experiments are captures automatically by DVC when you [run them]. Each -experiment creates and tracks a project variation based on the changes in your -workspace. +DVC Experiments are captured automatically by DVC when [run]. Each experiment +creates and tracks a variation of your data science project based on the changes +in your workspace. 
Experiments preserve a connection to the latest commit in the current branch (Git `HEAD`) as their parent or _baseline_, but do not form part of the regular @@ -60,8 +60,9 @@ this: DVC Experiments features build on basic semantics of DVC projects. This means that minimal formalities are required, such as codifying a pipeline with `dvc.yaml` (even if it has a single stage that represents your -entire process). Another typical preparation step is to create or modify a -structured parameters file. +entire process). Other typical preparation step are to write (or update) a +structured parameters file, and to track metrics output by your +code or ML models. `dvc exp init` lets you onboard any existing data science project to use DVC Experiments without having to worry bootstrapping DVC manually. It will prompt From 3d0bede8f08a8377704c9d391f3d34848dceefd2 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 30 Nov 2021 22:19:06 -0600 Subject: [PATCH 25/39] guide: intro exp init before giving specific examples of what it does per https://github.com/iterative/dvc.org/pull/2909#discussion_r759680948 --- .../experiments-overview.md | 22 +++++++++++-------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index d7619040ce..13e82222ea 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -57,15 +57,19 @@ this: ## Initialize DVC Experiments on any project -DVC Experiments features build on basic semantics of DVC projects. -This means that minimal formalities are required, such as codifying a pipeline -with `dvc.yaml` (even if it has a single stage that represents your -entire process). 
Other typical preparation step are to write (or update) a -structured parameters file, and to track metrics output by your -code or ML models. +DVC Experiments build on basic semantics of DVC projects. This +means that minimal formalities are required. `dvc exp init` lets you onboard any existing data science project to use DVC Experiments without having to worry bootstrapping DVC manually. It will prompt -you a few simple questions and create a simple `dvc.yaml` file as well as other -metafiles with sane default values. You can review these and commit -them to Git to begin using DVC Experiments. +you wth a few simple questions and create a basic `dvc.yaml` file, as well as +other metafiles with sane default values. You can review these +files and commit them to Git to begin using DVC Experiments quickly. + +One of the important steps this takes care of is to [codify a pipeline] (even if +it has a single stage that represents your entire process). Other +typical preparation step are to write (or update) a structured +parameters file, and to track metrics output by your +code or ML models. + +[codify a pipeline]: /doc/user-guide/project-structure/pipelines-files From db2d610edbe818065803393557c1ae8387acb3bb Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 1 Dec 2021 09:05:15 -0600 Subject: [PATCH 26/39] guide: hint foreach stages for hybrid exp org pattern rel. 
https://github.com/iterative/dvc.org/pull/2909#discussion_r759851040 --- content/docs/user-guide/experiment-management/index.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index d90a3467f5..4125c23a39 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -58,8 +58,12 @@ main alternatives: - **Hybrid** - combining an intuitive directory structure with a good repo branching strategy tends to be the best option for complex projects. - Completely independent experiments live in separate directories, while their - progress can be found in different branches. + Completely independent experiments live in separate directories (and can be + generated with [`foreach` stages], for example), while their progress can be + found in different branches. + +[`foreach` stages]: + /doc/user-guide/project-structure/pipelines-files#foreach-stages DVC takes care of arranging `dvc exp` experiments and the data cache under the hood so there's no need to decide on the above From f6eef7956f36cc99e46dc9a9a29f26d817146aa8 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 1 Dec 2021 09:43:28 -0600 Subject: [PATCH 27/39] guide: exp mgmt index copy edits --- .../docs/user-guide/experiment-management/index.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index 4125c23a39..cd8d78ceb7 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -20,18 +20,18 @@ DVC experiment management features build on top of base DVC features to form a comprehensive framework to organize, execute, manage, and share ML experiments. 
They support support these main approaches: -- Compare params and metrics of existing project versions (for example different - Git branches) against each other or against new results in your workspace - (without committing them). +- Compare parameters and metrics of existing project versions (for example + different Git branches) against each other or against new, uncommitted results + in your workspace. One tool to do so is `dvc exp diff`. - [Run and capture] multiple experiments (derived from any project version as baseline) without polluting your Git history. DVC tracks them for you, letting you compare and share them. 📖 More info in the [Experiments Overview][experiments]. -- Generate [checkpoints] during your code execution to analyze the internal - progress of deep experiments. DVC captures [live metrics](/doc/dvclive) at - runtime, and lets you manage them in batches. +- Generate [checkpoints] at runtime to keep track of the internal progress of + deeper experiments. DVC captures [live metrics](/doc/dvclive), which you can + manage in batches. [run and capture]: /doc/user-guide/experiment-management/running-experiments [experiments]: /doc/user-guide/experiment-management/experiments-overview From c68fc784534c9bf6a2d9276a4a2506878b04769b Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 1 Dec 2021 10:49:11 -0600 Subject: [PATCH 28/39] guide: mention label-based exp organization rel. 
https://docs.google.com/presentation/d/1C_owNoC72GvrpyMGlonHEYJ9I2rl2SLHkZQDMx0eT7A/edit#slide=id.gcb78e52e40_0_635 --- content/docs/user-guide/experiment-management/index.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md index cd8d78ceb7..b501fd0a16 100644 --- a/content/docs/user-guide/experiment-management/index.md +++ b/content/docs/user-guide/experiment-management/index.md @@ -62,13 +62,17 @@ main alternatives: generated with [`foreach` stages], for example), while their progress can be found in different branches. -[`foreach` stages]: - /doc/user-guide/project-structure/pipelines-files#foreach-stages +- **Labels** - in general, you can record experiments in a separate system and + structure them using custom labeling. This is typical in dedicated experiment + tracking tools. A possible problem with this approach is that it's easy to + lose the connection between your project history and the experiments logged. DVC takes care of arranging `dvc exp` experiments and the data cache under the hood so there's no need to decide on the above until your experiments are made [persistent]. +[`foreach` stages]: + /doc/user-guide/project-structure/pipelines-files#foreach-stages [persistent]: /doc/user-guide/experiment-management/persisting-experiments ## Run Cache: Automatic Log of Stage Runs From 9fd3b3aa2269d71d17ca25e794f29d44d7792393 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 6 Dec 2021 19:41:44 -0600 Subject: [PATCH 29/39] guide: hide exp naming section in overview page and other details per https://github.com/iterative/dvc.org/pull/2909#discussion_r760506777 et al. 
--- .../experiments-overview.md | 23 ++++++++----------- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 13e82222ea..41ed8fd362 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -9,6 +9,8 @@ Experiments preserve a connection to the latest commit in the current branch Git tree or workflow (unless you make them [persistent]). This prevents polluting Git namespaces and bloating the repo unnecessarily. +[run]: /doc/user-guide/experiment-management/running-experiments +
### ⚙️ How does DVC track experiments? @@ -18,20 +20,13 @@ Experiments are custom [Git references](/blog/experiment-refs) (found in hidden and not checked out by DVC. Note that these are not pushed to Git remotes by default either (see `dvc exp push`). -
- -[run them]: /doc/user-guide/experiment-management/running-experiments +Note that DVC Experiments require a unique name to identify them. DVC will +usually auto-generate one by default, such as `exp-bfe64` (based on the +experiment's hash). A custom name can be set instead, using the `--name`/`-n` +option of `dvc exp run`. These names can be used to reference experiments in +other `dvc exp` subcommands. -## Properties - -DVC Experiments will have an auto-generated name like `exp-bfe64` by default. A -custom name can be given instead (using the `--name`/`-n` option of -`dvc exp run`). These names can be used to reference experiments in other -`dvc exp` subcommands. - -All experiments created by DVC will be associated to the latest commit (Git -`HEAD`) at the time that they were run. This is called the experiment's -_baseline_. + ## Basic workflow @@ -62,7 +57,7 @@ means that minimal formalities are required. `dvc exp init` lets you onboard any existing data science project to use DVC Experiments without having to worry bootstrapping DVC manually. It will prompt -you wth a few simple questions and create a basic `dvc.yaml` file, as well as +you with a few simple questions and create a basic `dvc.yaml` file, as well as other metafiles with sane default values. You can review these files and commit them to Git to begin using DVC Experiments quickly. 
From f241901d71cc0a743ce165392718f2c1974a4153 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 6 Dec 2021 19:53:47 -0600 Subject: [PATCH 30/39] guide: mention `exp init -i` in Overview per https://github.com/iterative/dvc.org/pull/2909#discussion_r760509273 --- .../experiments-overview.md | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 41ed8fd362..12914d3862 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -37,15 +37,14 @@ this: - Modify hyperparameters or other dependencies (input data, source code, commands to execute, etc.). Leave these changes un-committed in Git. -- [Run experiments] with `dvc exp run` (instead of `repro`). The results are - reflected in your workspace, and tracked automatically. +- [Run experiments][run] with `dvc exp run` (instead of `repro`). The results + are reflected in your workspace, and tracked automatically. - Review and [compare] experiments with `dvc exp show` or `dvc exp diff`, using [metrics](/doc/command-reference/metrics) to identify the best one(s). Repeat 🔄 - Make certain experiments [persistent] by committing their results to Git. This lets you repeat the process from that point. -[run experiments]: /doc/user-guide/experiment-management/running-experiments [pipeline]: /doc/user-guide/project-structure/pipelines-files [compare]: /doc/user-guide/experiment-management/comparing-experiments [persistent]: /doc/user-guide/experiment-management/persisting-experiments @@ -56,10 +55,11 @@ DVC Experiments build on basic semantics of DVC projects. This means that minimal formalities are required. `dvc exp init` lets you onboard any existing data science project to use DVC -Experiments without having to worry bootstrapping DVC manually. 
It will prompt -you with a few simple questions and create a basic `dvc.yaml` file, as well as -other metafiles with sane default values. You can review these -files and commit them to Git to begin using DVC Experiments quickly. +Experiments without having to worry bootstrapping DVC manually. This creates a +simple `dvc.yaml` file for you, as well as other other metafiles +with sane default values. For more control `dvc exp init --interactive` (or +`-i`) will prompt you with a few simple questions to populate the aforementioned +DVC metafiles. One of the important steps this takes care of is to [codify a pipeline] (even if it has a single stage that represents your entire process). Other @@ -67,4 +67,7 @@ typical preparation step are to write (or update) a structured parameters file, and to track metrics output by your code or ML models. +You can review these files and commit them to Git to begin using DVC Experiments +quickly. Now you can move on to [running your experiments][run]. + [codify a pipeline]: /doc/user-guide/project-structure/pipelines-files From e122b0a385b7a24f15c9280fc9b5dd57c3cbece4 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Mon, 6 Dec 2021 19:55:10 -0600 Subject: [PATCH 31/39] guide: typo fix per https://github.com/iterative/dvc.org/pull/2909#discussion_r760512581 --- .../experiment-management/experiments-overview.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 12914d3862..1164582308 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -55,11 +55,11 @@ DVC Experiments build on basic semantics of DVC projects. This means that minimal formalities are required. 
`dvc exp init` lets you onboard any existing data science project to use DVC -Experiments without having to worry bootstrapping DVC manually. This creates a -simple `dvc.yaml` file for you, as well as other other metafiles -with sane default values. For more control `dvc exp init --interactive` (or -`-i`) will prompt you with a few simple questions to populate the aforementioned -DVC metafiles. +Experiments without having to worry about bootstrapping DVC manually. This +creates a simple `dvc.yaml` file for you, as well as other other +metafiles with sane default values. For more control +`dvc exp init --interactive` (or `-i`) will prompt you with a few simple +questions to populate the aforementioned DVC metafiles. One of the important steps this takes care of is to [codify a pipeline] (even if it has a single stage that represents your entire process). Other From 73d510d09caff62fc670676d76f1e9bf7a7939eb Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 7 Dec 2021 12:02:58 -0600 Subject: [PATCH 32/39] ref: exp apply copy edits per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-825273678 --- content/docs/command-reference/exp/apply.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/content/docs/command-reference/exp/apply.md b/content/docs/command-reference/exp/apply.md index d16700adf6..c012e4ef12 100644 --- a/content/docs/command-reference/exp/apply.md +++ b/content/docs/command-reference/exp/apply.md @@ -20,20 +20,20 @@ be referenced by name or hash (see `dvc exp run` for details). Specifically, `dvc exp apply` changes any files (code, data, parameters, metrics, etc.) needed to reflect the -experiment conditions and results in the workspace. +experiment conditions and results in the workspace. Current changes to the +workspace are preserved except if they conflict with the experiment in question. 
-⚠️ Current changes to the workspace are preserved except if they conflict with -the experiment in question, in which case they are overwritten (unless -`--no-force` is used). +⚠️ Conflicting changes in the workspace are overwritten unless +`--no-force` is used. This is typically used after choosing a target `experiment` with `dvc exp show` or `dvc exp diff`, and before committing it to Git (making it [persistent]. -Note that the history of [checkpoints] found in the `experiment` is **not** -preserved when applying and committing it. +> Note that if a history of [checkpoints] is found in the `experiment`, it will +> **not** be preserved when applying and committing it. [persistent]: /doc/user-guide/experiment-management/persisting-experiments -[checkpoints]: /doc/command-reference/exp/run#checkpoints +[checkpoints]: /doc/user-guide/experiment-management/checkpoints ## Options From 9d43ca6d04fb32dfe60d90ccb7e44f53160423ca Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 7 Dec 2021 12:05:41 -0600 Subject: [PATCH 33/39] ref: mention init before exp init per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-825273678 --- content/docs/command-reference/exp/init.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/content/docs/command-reference/exp/init.md b/content/docs/command-reference/exp/init.md index 472626d1af..420b9bdb63 100644 --- a/content/docs/command-reference/exp/init.md +++ b/content/docs/command-reference/exp/init.md @@ -3,6 +3,8 @@ Codify project using [DVC metafiles](/doc/user-guide/project-structure) to run [experiments](/doc/user-guide/experiment-management). +> Requires having used `dvc init` to create a DVC repository. 
+ ## Synopsis ```usage From 24c967daa64f098f6a5cc859c45e5ec122ec122b Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 7 Dec 2021 12:30:15 -0600 Subject: [PATCH 34/39] guide: correct info about exp init in Exp Overview per pending comments in https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-820716969 --- content/docs/command-reference/exp/init.md | 3 +- .../experiments-overview.md | 28 +++++++++---------- 2 files changed, 15 insertions(+), 16 deletions(-) diff --git a/content/docs/command-reference/exp/init.md b/content/docs/command-reference/exp/init.md index 420b9bdb63..5e71186389 100644 --- a/content/docs/command-reference/exp/init.md +++ b/content/docs/command-reference/exp/init.md @@ -3,7 +3,8 @@ Codify project using [DVC metafiles](/doc/user-guide/project-structure) to run [experiments](/doc/user-guide/experiment-management). -> Requires having used `dvc init` to create a DVC repository. +> Requires a DVC repository, created with `git init` and +> `dvc init`. ## Synopsis diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 1164582308..4bf51f9f59 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -54,20 +54,18 @@ this: DVC Experiments build on basic semantics of DVC projects. This means that minimal formalities are required. -`dvc exp init` lets you onboard any existing data science project to use DVC -Experiments without having to worry about bootstrapping DVC manually. This -creates a simple `dvc.yaml` file for you, as well as other other -metafiles with sane default values. For more control -`dvc exp init --interactive` (or `-i`) will prompt you with a few simple -questions to populate the aforementioned DVC metafiles. 
- -One of the important steps this takes care of is to [codify a pipeline] (even if -it has a single stage that represents your entire process). Other -typical preparation step are to write (or update) a structured -parameters file, and to track metrics output by your -code or ML models. - -You can review these files and commit them to Git to begin using DVC Experiments -quickly. Now you can move on to [running your experiments][run]. +`dvc exp init` lets you quickly onboard an existing data science project to use +DVC Experiments, without having to worry about bootstrapping DVC manually. You +can either supply a `command` to execute your experiments or use the +`--interactive` flag (`-i`) to be prompted for that and other optional +customizations. + +This creates a simple `dvc.yaml` file for you. It uses sane default locations +for your project's dependencies (data, parameters, source code) and +outputs (ML models or other artifacts, metrics, etc.) +-- which you can customize via `-i` or other options of `dvc exp init`. + +You can review the results (and commit them to Git) to begin using DVC +Experiments. Now you can move on to [running your experiments][run] (next). [codify a pipeline]: /doc/user-guide/project-structure/pipelines-files From 439050ec0b5e89722aba3fa01c3960ffd40ad25c Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 7 Dec 2021 12:36:43 -0600 Subject: [PATCH 35/39] ref: link from exp init to corresponding guide --- content/docs/command-reference/exp/init.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/content/docs/command-reference/exp/init.md b/content/docs/command-reference/exp/init.md index 5e71186389..f674eccbdc 100644 --- a/content/docs/command-reference/exp/init.md +++ b/content/docs/command-reference/exp/init.md @@ -35,6 +35,11 @@ training of machine learning models. This command is intended to be a quick way to start running experiments. To create more complex stages and pipeliens, use `dvc stage add`. 
+> 📖 More context in [Experiments Overview]. + +[experiments overview]: + /doc/user-guide/experiment-management/experiments-overview + ### The `command` argument The `command` argument is optional, if you are using `--interactive` mode. The From 3af2f9acd97684f318d46f84be86980e94cab303 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 7 Dec 2021 19:23:25 -0600 Subject: [PATCH 36/39] guide: make exp intro more concrete per https://github.com/iterative/dvc.org/pull/2909#discussion_r764393113 --- .../user-guide/experiment-management/experiments-overview.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 4bf51f9f59..bf86b550a1 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -6,8 +6,8 @@ in your workspace. Experiments preserve a connection to the latest commit in the current branch (Git `HEAD`) as their parent or _baseline_, but do not form part of the regular -Git tree or workflow (unless you make them [persistent]). This prevents -polluting Git namespaces and bloating the repo unnecessarily. +Git tree (unless you make them [persistent]). This prevents bloating your repo +with temporary commits and branches. 
[run]: /doc/user-guide/experiment-management/running-experiments From 12f87975baecb59c335baf9a0cb0141408342505 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Tue, 7 Dec 2021 20:06:03 -0600 Subject: [PATCH 37/39] guide: rewrite exp init section of Exps Overview page per https://github.com/iterative/dvc.org/pull/2909#pullrequestreview-825813926 --- .../experiments-overview.md | 31 ++++++++++--------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index bf86b550a1..82251c59e0 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -45,27 +45,28 @@ this: - Make certain experiments [persistent] by committing their results to Git. This lets you repeat the process from that point. -[pipeline]: /doc/user-guide/project-structure/pipelines-files [compare]: /doc/user-guide/experiment-management/comparing-experiments [persistent]: /doc/user-guide/experiment-management/persisting-experiments ## Initialize DVC Experiments on any project -DVC Experiments build on basic semantics of DVC projects. This -means that minimal formalities are required. +To use DVC Experiments you need a DVC project with a minimal +structure and configuration. To avoid having to bootstrap DVC manually, the +`dvc exp init` command lets you quickly onboard an existing project to the DVC +Experiments workflow. -`dvc exp init` lets you quickly onboard an existing data science project to use -DVC Experiments, without having to worry about bootstrapping DVC manually. You -can either supply a `command` to execute your experiments or use the -`--interactive` flag (`-i`) to be prompted for that and other optional -customizations. +It will create a simple `dvc.yaml` metafile, which codifies your planned +experiments. 
This includes the locations for expected dependencies +(data, parameters, source code) and outputs (ML models, +metrics, etc.). These assume [sane defaults] but can be customized +with the options of `dvc exp init`. -This creates a simple `dvc.yaml` file for you. It uses sane default locations -for your project's dependencies (data, parameters, source code) and -outputs (ML models or other artifacts, metrics, etc.) --- which you can customize via `-i` or other options of `dvc exp init`. +💡 We recommend adding the `-i` flag to use its `--interactive` mode. This will +ask you how to run the experiments, and guide you through customizing the +aforementioned locations (optional). -You can review the results (and commit them to Git) to begin using DVC -Experiments. Now you can move on to [running your experiments][run] (next). +You can review the resulting changes to your repo (and commit them to Git) to +begin using DVC Experiments. Now you can move on to [running experiments][run] +(next). 
-[codify a pipeline]: /doc/user-guide/project-structure/pipelines-files +[sane defaults]: /doc/command-reference/exp/init#description From 8aed62227f3abbd64e3524f5f6cd289f6d9d7682 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Fri, 10 Dec 2021 10:49:27 -0700 Subject: [PATCH 38/39] ref: roll back unrelated ref changes (moved to ref/exp-misc) --- content/docs/command-reference/exp/apply.md | 26 +++++------ content/docs/command-reference/exp/branch.md | 20 ++++----- content/docs/command-reference/exp/run.md | 45 +++++++++++--------- 3 files changed, 45 insertions(+), 46 deletions(-) diff --git a/content/docs/command-reference/exp/apply.md b/content/docs/command-reference/exp/apply.md index c012e4ef12..1bc6b16767 100644 --- a/content/docs/command-reference/exp/apply.md +++ b/content/docs/command-reference/exp/apply.md @@ -15,25 +15,21 @@ positional arguments: ## Description Restores an `experiment` into the workspace as long as no more Git commits have -been made after the target experiment (`HEAD` hasn't moved). The experiment can -be referenced by name or hash (see `dvc exp run` for details). +been made after the target experiment (`HEAD` hasn't moved). The `experiment` +can be referenced by name or hash (see `dvc exp run` for details). This changes +any files (code, data, parameters, metrics, etc.) +needed to reflect the experiment conditions and results in the workspace. -Specifically, `dvc exp apply` changes any files (code, data, -parameters, metrics, etc.) needed to reflect the -experiment conditions and results in the workspace. Current changes to the -workspace are preserved except if they conflict with the experiment in question. - -⚠️ Conflicting changes in the workspace are overwritten unless unless -`--no-force` is used. +⚠️ Conflicting changes in the workspace are overwritten unless `--no-force` is +used. 
This is typically used after choosing a target `experiment` with `dvc exp show` -or `dvc exp diff`, and before committing it to Git (making it [persistent]. - -> Note that if a history of [checkpoints] is found in the `experiment`, it will -> **not** be preserved when applying and committing it. +or `dvc exp diff`, and before committing it to Git (making it +[persistent](/doc/user-guide/experiment-management#persistent-experiments)). -[persistent]: /doc/user-guide/experiment-management/persisting-experiments -[checkpoints]: /doc/user-guide/experiment-management/checkpoints +Note that the history of +[checkpoints](/doc/command-reference/exp/run#checkpoints) found in the +`experiment` is **not** preserved when applying and committing it. ## Options diff --git a/content/docs/command-reference/exp/branch.md b/content/docs/command-reference/exp/branch.md index 50fa193eb9..a4ac9a4c11 100644 --- a/content/docs/command-reference/exp/branch.md +++ b/content/docs/command-reference/exp/branch.md @@ -15,28 +15,26 @@ positional arguments: ## Description -Makes a named Git [`branch`] containing the target `experiment` (making it -[persistent]. For [checkpoint experiments], the new branch will contain multiple -commits (the checkpoints). +Makes a named Git +[`branch`](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging) +containing the target `experiment` (making it +[persistent](/doc/user-guide/experiment-management#persistent-experiments)). For +[checkpoint experiments](/doc/command-reference/exp/run#checkpoints), the new +branch will contain multiple commits (the checkpoints). The new `branch` will be based on the experiment's parent commit (`HEAD` at the time that the experiment was run). Note that DVC **does not** switch into the new `branch` automatically. `dvc exp branch` is useful to make an experiment persistent without modifying -the workspace, so they can be continued, [stored and shared] in a normal Git + -DVC workflow. 
+the workspace, so they can be continued, +[stored, and shared](https://dvc.org/doc/use-cases/sharing-data-and-model-files) +in a normal Git + DVC workflow. To switch into the new branch, use `git checkout branch` and `dvc checkout`. Or use `git merge branch` and `dvc repro` to combine it with your current project version. -[`branch`]: - https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging -[persistent]: /doc/user-guide/experiment-management/persisting-experiments -[checkpoint experiments]: /doc/command-reference/exp/run#checkpoints -[stored and shared]: /doc/use-cases/sharing-data-and-model-files - ## Options - `-h`, `--help` - shows the help message and exit. diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md index 37cdc2de8e..869f9a7b8a 100644 --- a/content/docs/command-reference/exp/run.md +++ b/content/docs/command-reference/exp/run.md @@ -22,40 +22,45 @@ Provides a way to execute and track experiments in your project without polluting it with unnecessary commits, branches, directories, etc. -> `dvc exp run` is equivalent to `dvc repro` for experiments. It -> has the same behavior when it comes to `targets` and stage execution (restores -> the dependency graph, etc.). See the command [options](#options) for more on -> the differences. +> `dvc exp run` is equivalent to `dvc repro` for experiments. It has the same +> behavior when it comes to `targets` and stage execution (restores the +> dependency graph, etc.). See the command [options](#options) for more on the +> differences. Before running an experiment, you'll probably want to make modifications such as data and code updates, or hyperparameter tuning. For the latter, you can use the `--set-param` (`-S`) option of this command to change `dvc param` values on-the fly. -📖 See [DVC Experiments](/doc/user-guide/experiment-management) for more -information. - Each experiment creates and tracks a project variation based on your workspace changes. 
Experiments will have a unique, auto-generated name like `exp-bfe64` by default, which can be customized using the `--name` (`-n`) option. -Each experiment creates and tracks a project variation based on the changes in -your workspace. The results of the last `dvc exp run` will be -reflected in the workspace. Experiments will have an auto-generated ID like -`exp-bfe64` by default. A custom name can be given instead, using the `--name` -(`-n`) option +
+ +### ⚙️ How does DVC track experiments? -To display and compare multiple experiments, use `dvc exp show` or -`dvc exp diff` (`plots diff` also accepts experiment names as `revisions`). Use -`dvc exp apply` to restore the results of any experiment, for example to [commit -them][persisting] to Git. Unnecessary experiments can be removed with -`dvc exp remove`or `dvc exp gc` (or abandoned). +Experiments are custom +[Git references](https://git-scm.com/book/en/v2/Git-Internals-Git-References) +(found in `.git/refs/exps`) with a single commit based on `HEAD` (not checked +out by DVC). Note that these commits are not pushed to the Git remote by default +(see `dvc exp push`). -> Note that experiment data will remain in the local cache until -> you use regular `dvc gc` to clean it up. +
+ +The results of the last `dvc exp run` can be seen in the workspace. To display +and compare multiple experiments, use `dvc exp show` or `dvc exp diff` +(`plots diff` also accepts experiment names as `revisions`). Use `dvc exp apply` +to restore the results of any other experiment instead. + +Successful experiments can be made +[persistent](/doc/user-guide/experiment-management#persistent-experiments) by +committing them to the Git repo. Unnecessary ones can be removed with +`dvc exp remove` or `dvc exp gc` (or abandoned). -[persisting]: /doc/user-guide/experiment-management/persisting-experiments +> Note that experiment data will remain in the cache until you use +> regular `dvc gc` to clean it up. ## Checkpoints From c088a067fa0c4b7d7c7efd51140d8c215443301a Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Fri, 10 Dec 2021 10:52:22 -0700 Subject: [PATCH 39/39] guide: roll back unrelated changes (moved to #3080) --- .../user-guide/experiment-management/comparing-experiments.md | 4 ++-- content/docs/user-guide/project-structure/internal-files.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/content/docs/user-guide/experiment-management/comparing-experiments.md b/content/docs/user-guide/experiment-management/comparing-experiments.md index a58642e9a3..5974896ad9 100644 --- a/content/docs/user-guide/experiment-management/comparing-experiments.md +++ b/content/docs/user-guide/experiment-management/comparing-experiments.md @@ -384,8 +384,8 @@ params.yaml train.epochs 10 10 0 ## Compare an experiment with the workspace When you want to compare two experiments, either the baseline experiment in a -commit, branch, or tag; or an attached experiment by name, you can supply any of -these references to `dvc exp diff`. 
``` $ dvc exp diff cnn-128 cnn-64 diff --git a/content/docs/user-guide/project-structure/internal-files.md b/content/docs/user-guide/project-structure/internal-files.md index 9339a385a1..3a77648119 100644 --- a/content/docs/user-guide/project-structure/internal-files.md +++ b/content/docs/user-guide/project-structure/internal-files.md @@ -134,7 +134,7 @@ $ cat .dvc/cache/6f/db5336fce0dbfd669f83065f107551.dir That's how DVC knows that the other two cached files belong in the directory. -## Run-cache +### Run-cache `dvc exp run` and `dvc repro` by default populate and reutilize a log of stages that have been run in the project. It is found in the `runs/` directory inside