guide: clear separation between Exp Mgmt index and Overview page
jorgeorpinel committed Nov 2, 2021
1 parent 0c2bcf5 commit d2fc52c
Showing 2 changed files with 40 additions and 39 deletions.
@@ -1,26 +1,30 @@
## DVC Experiments Overview

`dvc exp` commands let you automatically track a variation to an established
[data pipeline](/doc/command-reference/dag) baseline. You can create multiple
isolated experiments this way, as well as review, compare, and restore them
later, or roll back to the baseline. The basic workflow goes like this:
`dvc exp` commands let you automatically track a variation to a committed
project version (baseline). You can create independent groups of experiments
this way, as well as review, compare, and restore them later. The basic workflow
goes like this:

- Modify stage <abbr>parameters</abbr> or other dependencies (e.g. input data,
source code) of committed stages.
- Modify <abbr>parameters</abbr> or other dependencies (input data, source code,
stage definitions, etc.) of committed stages.
- [Run experiments] with `dvc exp run` (instead of `repro`) to execute the
pipeline. The results are reflected in your <abbr>workspace</abbr>, and
tracked automatically.
- Use `dvc metrics` to identify the best experiment(s).
- Visualize, compare experiments with `dvc exp show` or `dvc exp diff`. Repeat
🔄
- Use `dvc exp apply` to roll back to the best one.
- Make the selected experiment persistent by committing its results to Git. This
cleans the slate so you can repeat the process.
- Use [metrics](/doc/command-reference/metrics) to identify the best
experiment(s).
- Visualize and compare experiments with `dvc exp show` or `dvc exp diff`.
Repeat 🔄
- Make certain experiments [persistent](#persistent-experiments) by committing
their results to Git. This cleans the slate so you can repeat the process
later.

[run experiments]: /doc/user-guide/experiment-management/running-experiments
[persistent]: /doc/user-guide/experiment-management/persisting-experiments
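The workflow listed above can be sketched as a short shell session. This is only an illustration, assuming a DVC 2.x project with a committed pipeline; the parameter `train.lr`, the experiment name `exp-name`, and the commit message are hypothetical placeholders. The `run` helper only prints each command unless `DRY_RUN=0` is set, so the sketch can be read or executed safely outside a real project.

```shell
#!/bin/sh
# Hedged sketch of the experiment workflow (not an official recipe).
# Assumptions: DVC 2.x, a committed pipeline, a hypothetical `train.lr` param.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

exp_workflow() {
  # Run a variation of the committed baseline with one parameter changed.
  run dvc exp run --set-param train.lr=0.01
  # Review and compare the tracked experiments.
  run dvc exp show
  run dvc exp diff
  # Restore the chosen experiment into the workspace...
  run dvc exp apply "$1"
  # ...and make it persistent by committing its results to Git.
  run git add dvc.lock params.yaml
  run git commit -m "Persist experiment $1"
}

exp_workflow "${1:-exp-name}"
```

In a real project, the name passed to `dvc exp apply` would be taken from the `dvc exp show` table, and the files staged with `git add` depend on what the pipeline actually changes.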

## Persistent Experiments

📖 See [full guide][persistent].

When your experiments are good enough to save or share, you may want to store
them persistently as Git commits in your <abbr>repository</abbr>.

51 changes: 24 additions & 27 deletions content/docs/user-guide/experiment-management/index.md
@@ -1,8 +1,8 @@
# Experiment Management

Data science and ML are iterative processes that require a large number of
attempts to reach a certain level of a metric. Experimentation is part of the
development of data features, hyperspace exploration, deep learning
Data science and machine learning are iterative processes that require a large
number of attempts to reach a certain level of a metric. Experimentation is part
of the development of data features, hyperspace exploration, deep learning
optimization, etc.

Some of DVC's base features already help you codify and analyze experiments.
@@ -12,35 +12,30 @@ the other end, [metrics](/doc/command-reference/metrics) (and
[plots](/doc/command-reference/plots)) let you define, visualize, and compare
quantitative measures of your results.

## DVC Experiments
## Experimentation methods in DVC

_New in DVC 2.0_

DVC experiment management features are designed to support these main
approaches:

1. [Run] and capture [experiments] that derive from your latest project version
without polluting your Git history. DVC tracks them for you, letting you list
and compare them. The best ones can be made persistent, and the rest left as
history or cleared.
1. [Queue] and process series of experiments based on a parameter search or
other modifications to your baseline.
1. Generate [checkpoints] during your code execution to analyze the internal
progress of deep experiments. DVC captures them at runtime, and can manage
them in batches.
1. Make experiments [persistent] by committing them to your
<abbr>repository</abbr> history.

[run]: /doc/user-guide/experiment-management/running-experiments
DVC experiment management features build on top of base DVC features to form a
comprehensive framework to organize, execute, manage, and share ML experiments.
They support these main approaches:

- Compare params and metrics of existing project versions (for example different
Git branches) against each other or against new results in your workspace
(without committing them).

- [Run and capture] multiple experiments (derived from any project version as
baseline) without polluting your Git history. DVC tracks them for you, letting
you compare and share them. 📖 More info in the [Experiments
Overview][experiments].

- Generate [checkpoints] during your code execution to analyze the internal
progress of deep experiments. DVC captures [live metrics](/doc/dvclive) at
runtime, and lets you manage them in batches.
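
The first approach above (comparing params and metrics across project versions) might look like the following sketch. It assumes DVC 2.x; the revision names `main` and `feature-branch` are hypothetical. As before, the `run` helper only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hedged sketch: compare params and metrics between Git revisions.
# Assumptions: DVC 2.x; `main` and `feature-branch` are placeholder revisions.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Compare two committed project versions against each other.
compare_versions() {
  run dvc params diff "$1" "$2"
  run dvc metrics diff "$1" "$2"
}

# Compare uncommitted workspace results against the last commit (HEAD).
compare_workspace() {
  run dvc params diff
  run dvc metrics diff
}

compare_versions main feature-branch
compare_workspace
```

With no revisions given, `dvc params diff` and `dvc metrics diff` compare the workspace against `HEAD`, which matches the "new results in your workspace (without committing them)" case.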

[run and capture]: /doc/user-guide/experiment-management/running-experiments
[experiments]: /doc/user-guide/experiment-management/experiments-overview
[queue]:
/doc/user-guide/experiment-management/running-experiments#the-experiments-queue
[checkpoints]: /doc/user-guide/experiment-management/checkpoints
[persistent]:
/doc/user-guide/experiment-management/experiments-overview#persistent-experiments

📖 More information in the
[full guide](/doc/user-guide/experiment-management/experiments-overview).

> πŸ‘¨β€πŸ’» See [Get Started: Experiments](/doc/start/experiments) for a hands-on
> introduction to DVC experiments.
@@ -55,10 +50,12 @@ main alternatives:
other. Helpful if the Git [revisions](https://git-scm.com/docs/revisions) can
be easily visualized, for example with tools
[like GitHub](https://docs.github.com/en/github/visualizing-repository-data-with-graphs/viewing-a-repositorys-network).

- **Directories** - the project's "space dimension" can be structured with
directories (folders) to organize experiments. Useful when you want to see all
your experiments at the same time (without switching versions) by just
exploring the file system.

- **Hybrid** - combining an intuitive directory structure with a good repo
branching strategy tends to be the best option for complex projects.
Completely independent experiments live in separate directories, while their