guide: expand Experiments guide #2654

Closed
wants to merge 20 commits into from
20 commits
5008c55
guide: split Experiments (index) into sub-pages
jorgeorpinel Jul 21, 2021
ff85352
Merge branch 'master' into guide/exps
jorgeorpinel Jul 28, 2021
923040f
case: keep Persistent Exps in basic page
jorgeorpinel Jul 29, 2021
3ae85e5
cases: keep Run-cache in basic Exps page
jorgeorpinel Jul 29, 2021
29b17b2
guide: edit Exp Mgmt index (intro)
jorgeorpinel Jul 29, 2021
e21fef4
guide: edit basic Exps page inc. persisting them
jorgeorpinel Jul 29, 2021
c21dbe3
Merge branch 'master' into guide/exps
jorgeorpinel Aug 4, 2021
d8f2d7c
guide: rename DVC Exps, remove Org Exps page
jorgeorpinel Aug 4, 2021
1337453
guide: bash -> dvc in EM/Checkpoints
jorgeorpinel Aug 4, 2021
8d93521
guide: fix exps link
jorgeorpinel Aug 4, 2021
90f3042
Merge branch 'master' into guide/exps
jorgeorpinel Aug 11, 2021
fb4663c
Merge branch 'master' into guide/exps
jorgeorpinel Aug 18, 2021
d1422b1
guide: consolidate Exp Sharing intro (#2711)
jorgeorpinel Aug 18, 2021
532df56
Merge branch 'master' into guide/exps
jorgeorpinel Aug 18, 2021
0581991
guide: summarize Exp Sharing titles and examples (#2719)
jorgeorpinel Aug 20, 2021
e6d4eca
Merge branch 'master' into guide/exps
jorgeorpinel Oct 4, 2021
ec2ac41
Merge branch 'guide/exps' of github.com:iterative/dvc.org into guide/…
jorgeorpinel Oct 4, 2021
2e4a512
Merge branch 'master' into guide/exps
jorgeorpinel Oct 6, 2021
e4f4024
exp: fix links to old guides
jorgeorpinel Oct 6, 2021
581a9a9
guide: review links to Persistent Exps and Checkpoints info
jorgeorpinel Oct 6, 2021
3 changes: 1 addition & 2 deletions content/docs/api-reference/make_checkpoint.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
# dvc.api.make_checkpoint()

Make an
[in-code checkpoint](/doc/user-guide/experiment-management#checkpoints-in-source-code).
Make an [in-code checkpoint](/doc/user-guide/experiment-management/checkpoints).

```py
def make_checkpoint()
2 changes: 1 addition & 1 deletion content/docs/command-reference/exp/apply.md
@@ -20,7 +20,7 @@ can be referenced by name or hash (see `dvc exp run` for details).

This is typically used after choosing a target `experiment` with `dvc exp show`
or `dvc exp diff`, and before committing it to Git (making it
[persistent](/doc/user-guide/experiment-management#persistent-experiments)).
[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments)).

`dvc exp apply` changes any files (code, data, <abbr>parameters</abbr>,
<abbr>metrics</abbr>, etc.) needed to reflect the experiment conditions and
6 changes: 3 additions & 3 deletions content/docs/command-reference/exp/branch.md
@@ -18,9 +18,9 @@ positional arguments:
Makes a named Git
[`branch`](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging)
containing the target `experiment` (making it
[persistent](/doc/user-guide/experiment-management#persistent-experiments)). For
[checkpoint experiments](/doc/command-reference/exp/run#checkpoints), the new
branch will contain multiple commits (the checkpoints).
[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments)).
For [checkpoint experiments](/doc/command-reference/exp/run#checkpoints), the
new branch will contain multiple commits (the checkpoints).

The new `branch` will be based on the experiment's parent commit (`HEAD` at the
time that the experiment was run). Note that DVC **does not** switch into the
6 changes: 4 additions & 2 deletions content/docs/command-reference/exp/pull.md
@@ -17,8 +17,10 @@ positional arguments:

## Description

The `dvc exp push` and `dvc exp pull` commands are the means for sharing
experiments across <abbr>repository</abbr> copies via Git (and DVC) remotes.
The `dvc exp push` and `dvc exp pull` commands are the means for [sharing
experiments] across <abbr>repository</abbr> copies via Git and DVC remotes.

[sharing experiments]: /doc/user-guide/experiment-management/sharing-experiments

> Plain `git push` and `git fetch` don't work with `dvc experiments` because
> these are saved under custom Git references. See **How does DVC track
6 changes: 4 additions & 2 deletions content/docs/command-reference/exp/push.md
@@ -17,8 +17,10 @@ positional arguments:

## Description

The `dvc exp push` and `dvc exp pull` commands are the means for sharing
experiments across <abbr>repository</abbr> copies via Git (and DVC) remotes.
The `dvc exp push` and `dvc exp pull` commands are the means for [sharing
experiments] across <abbr>repository</abbr> copies via Git and DVC remotes.

[sharing experiments]: /doc/user-guide/experiment-management/sharing-experiments

> Plain `git push` and `git fetch` don't work with `dvc experiments` because
> these are saved under custom Git references. See **How does DVC track
6 changes: 3 additions & 3 deletions content/docs/command-reference/exp/run.md
@@ -44,7 +44,7 @@ option.
Experiments are custom
[Git references](https://git-scm.com/book/en/v2/Git-Internals-Git-References)
(found in `.git/refs/exps`) with a single commit based on `HEAD` (not checked
out by DVC). Note that these commits are not pushed to the Git remote by default
out by DVC). Note that these commits are not pushed to Git remotes by default
(see `dvc exp push`).
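
To make the reference layout concrete, the following Python sketch walks `.git/refs/exps` and collects whatever loose refs it finds. It is an illustration only: the helper name `list_experiment_refs` is made up for this example, it assumes a regular `.git` directory at the project root, and it will miss any refs that Git has packed into `.git/packed-refs`.

```python
from pathlib import Path


def list_experiment_refs(repo_root="."):
    """Return {ref_name: commit_sha} for loose refs under .git/refs/exps.

    Hypothetical helper for illustration; it is not part of DVC.
    """
    refs_dir = Path(repo_root) / ".git" / "refs" / "exps"
    refs = {}
    if not refs_dir.is_dir():
        return refs  # no experiments yet, or the refs are stored elsewhere
    for path in sorted(refs_dir.rglob("*")):
        if path.is_file():
            # Each loose ref is a small text file holding a commit hash.
            name = "refs/exps/" + path.relative_to(refs_dir).as_posix()
            refs[name] = path.read_text().strip()
    return refs


# Usage (inside a project that has experiments):
#   for name, sha in list_experiment_refs().items():
#       print(sha[:7], name)
```

Since these refs live outside `refs/heads` and `refs/tags`, it is unsurprising that regular Git commands do not transfer them by default.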

</details>
@@ -55,8 +55,8 @@ and compare multiple experiments, use `dvc exp show` or `dvc exp diff`
to restore the results of any other experiment instead.

Successful experiments can be made
[persistent](/doc/user-guide/experiment-management#persistent-experiments) by
committing them to the Git repo. Unnecessary ones can be removed with
[persistent](/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments)
by committing them to the Git repo. Unnecessary ones can be removed with
`dvc exp remove` or `dvc exp gc` (or abandoned).

> Note that experiment data will remain in the <abbr>cache</abbr> until you use
1 change: 1 addition & 0 deletions content/docs/sidebar.json
@@ -147,6 +147,7 @@
"slug": "experiment-management",
"source": "experiment-management/index.md",
"children": [
"dvc-experiments",
"running-experiments",
"sharing-experiments",
"cleaning-experiments",
5 changes: 3 additions & 2 deletions content/docs/user-guide/basic-concepts/experiment.md
@@ -1,11 +1,12 @@
---
name: Experiment
match: [experiment, experiments]
match: [experiment, experiments, 'DVC experiments']
tooltip: >-
An attempt to reach desired/better/interesting results during data pipelining
or ML model development. DVC is designed to help [manage
experiments](/doc/start/experiments), having [built-in
mechanisms](/doc/user-guide/experiment-management) like the
[run-cache](/doc/user-guide/project-structure/internal-files#run-cache) and
the `dvc experiments` commands (available on DVC 2.0 and above).
the [`dvc experiments`](/doc/command-reference/exp) commands (available on DVC
2.0 and above).
---
88 changes: 41 additions & 47 deletions content/docs/user-guide/experiment-management/checkpoints.md
@@ -1,27 +1,32 @@
# Checkpoints

ML checkpoints are an important part of deep learning because ML engineers like
to save the model files at certain points during a training process.
_New in DVC 2.0_

With DVC experiments and checkpoints, you can:
To track successive steps in a longer experiment, you can register checkpoints
from your code at runtime. This is especially helpful in machine learning, for
example to track the progress in deep learning techniques such as evolving
neural networks.

- Implement the best practice in deep learning to save your model weights as
_Checkpoint experiments_ track a series of variations (the checkpoints) and
their execution can be stopped and resumed as needed. You interact with them
using the `--rev` and `--reset` options of `dvc exp run` (see also the
`checkpoint` field in `dvc.yaml` `outs`). They can help you

- implement the best practice in deep learning to save your model weights as
checkpoints.
- Track all code and data changes corresponding to the checkpoints.
- See when metrics start diverging and revert to the optimal checkpoint.
- Automate the process of tracking every training epoch.
- track all code and data changes corresponding to the checkpoints.
- see when metrics start diverging and revert to the optimal checkpoint.
- automate the process of tracking every training epoch.
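
A minimal sketch of how registering checkpoints from code might look in Python, assuming DVC 2.x where `dvc.api.make_checkpoint()` is available. The `train` function, its toy arithmetic, and the `model.txt` path are placeholders invented for illustration; in a real project the model file would also need to be listed as a `checkpoint` output in `dvc.yaml`, and the script would be launched through `dvc exp run`.

```python
from pathlib import Path

try:
    # Intended to be called from code that is executed by `dvc exp run`.
    from dvc.api import make_checkpoint
except ImportError:
    def make_checkpoint():
        """Fallback no-op so this sketch also runs without DVC installed."""


def train(epochs, model_path="model.txt", checkpoint=make_checkpoint):
    """Toy training loop (hypothetical) that registers one checkpoint per epoch."""
    weights = 0.0
    for epoch in range(1, epochs + 1):
        weights += 1.0 / epoch  # stand-in for a real training step
        # Save the model first, so the checkpoint captures the new state.
        Path(model_path).write_text(f"{weights:.6f}\n")
        checkpoint()
    return weights
```

Writing the output before calling the checkpoint function matters here: the checkpoint is meant to record the files as they are at that moment.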

[The way checkpoints are implemented by DVC](/blog/experiment-refs) utilizes
_ephemeral_ experiment commits and experiment branches within DVC. They are
created using the metadata from experiments and are tracked with the `exps`
custom Git reference.
> Experiments and checkpoints are [implemented](/blog/experiment-refs) with
> hidden Git experiment commits and branches.

You can add experiments to your Git history by committing the experiment you
want to track, which you'll see later in this tutorial.
Like with regular experiments, checkpoints can become persistent by
[committing them to Git](#committing-checkpoints-to-git).

This tutorial is going to cover how to implement checkpoints in an ML project
using DVC. We're going to train a model to identify handwritten digits based on
the MNIST dataset.
This guide covers how to implement checkpoints in an ML project using DVC. We're
going to train a model to identify handwritten digits based on the MNIST
dataset.
Member
Not this PR: below: Setting up the project ... should we do ## -> ###?

This comment was marked as resolved.


https://youtu.be/PcDo-hCvYpw

@@ -32,15 +37,15 @@ https://youtu.be/PcDo-hCvYpw
You can follow along with the steps here or you can clone the repo directly from
GitHub and play with it. To clone the repo, run the following commands.

```bash
```dvc
$ git clone https://github.com/iterative/checkpoints-tutorial
$ cd checkpoints-tutorial
```

It is highly recommended you create a virtual environment for this example. You
can do that by running:

```bash
```dvc
$ python3 -m venv .venv
```

@@ -53,7 +58,7 @@ following commands.
Once you have your environment set up, you can install the dependencies by
running:

```bash
```dvc
$ pip install -r requirements.txt
```

@@ -64,9 +69,9 @@ everything you need to get started with experiments and checkpoints.

## Setting up a DVC pipeline

DVC versions data and it also can version the machine learning model weights
file as checkpoints during the training process. To enable this, you will need
to set up a DVC pipeline to train your model.
DVC versions data, and it can also version the ML model weights file as
checkpoints during the training process. To enable this, you will need to set up
a DVC pipeline to train your model.

Adding a DVC pipeline only takes a few commands. At the root of the project,
run:
@@ -130,7 +135,7 @@ stages:
Before we go any further, this is a great point to add these changes to your Git
history. You can do that with the following commands:

```bash
```dvc
$ git add .
$ git commit -m "created DVC pipeline"
```
@@ -427,39 +432,28 @@ new set of checkpoints under a new experiment branch.
└─────────────────────────┴──────────┴──────┴─────────┴────────┴────────┴────────┴──────────────┘
```

## Adding checkpoints to Git
## Committing checkpoints to Git

When you terminate training, you'll see a few commands in the terminal that will
allow you to add these changes to Git.
allow you to add these changes to Git, making them [persistent]:

```
[persistent]:
/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments

```dvc
To track the changes with git, run:

git add dvclive.json dvc.yaml .gitignore train.py dvc.lock

Reproduced experiment(s): exp-263da
Experiment results have been applied to your workspace.

To promote an experiment to a Git branch run:

dvc exp branch <exp>
```

You can run the following command to save your experiments to the Git history.

```bash
$ git add dvclive.json dvc.yaml .gitignore train.py dvc.lock
...
```

You can take a look at what will be committed to your Git history by running:
Running the command above will stage the checkpoint experiment with Git. You can
take a look at what would be committed first with `git status`. You should see
something similar to this in your terminal:

```bash
```dvc
$ git status
```

You should see something similar to this in your terminal.

```
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
new file: .gitignore
@@ -476,9 +470,9 @@ Untracked files:
predictions.json
```

All that's left is to commit these changes with the following command:
All that's left to do is to `git commit` the changes:

```bash
```dvc
$ git commit -m 'saved files from experiment'
```

@@ -2,7 +2,10 @@

Although DVC uses minimal resources to keep track of the experiments, they may
clutter tables and the workspace. DVC lets you remove specific experiments from
the workspace or delete all not-yet-persisted experiments at once.
the workspace or delete all not-yet-[persisted] experiments at once.

[persisted]:
/doc/user-guide/experiment-management/dvc-experiments#persistent-experiments

## Removing specific experiments

36 changes: 36 additions & 0 deletions content/docs/user-guide/experiment-management/dvc-experiments.md
@@ -0,0 +1,36 @@
## DVC Experiments
Member
it should be clear that those are DVC experiments?

also it's clear that this is about experiments

what is the actual intention behind this page?

Member
also, renaming page + copy editing makes it almost impossible to review

Contributor Author
Currently the section in the index is just called Experiments and I'm happy to rename it back to that. I just thought it would be clearer to "brand them" since they're special. We also say things like DVC Project, DVC Cache, and DVC Remote.

> actual intention behind this page?

"...dedicated DVC Experiments page to explain what experiments are and potentially go into implementation details (in the future)." 🙂

> renaming page + copy editing makes it almost impossible to review

This is a brand new page. No file was renamed in this PR. Some of its info was extracted from the index.md.

Contributor Author
@jorgeorpinel jorgeorpinel Oct 6, 2021
p.s. I realize it's a huge PR. That's because we wanted to try nesting PRs, but it resulted in this one being too big and losing track of which changes we already approved... But the PR description has a list of changes done with links to the files updated (4 major ones). Everything else is small copy edits and link updates.

Contributor Author
> clearer to "brand them" since they're special. We also say things like DVC Project, DVC Cache

In fact we already call it "DVC Experiments" in some places e.g. there's a few instances in https://github.com/iterative/dvc.org/pull/2901/files#diff-8bac2dee1e13766aefa536c3d2fa38296dc6b09ba750fd71c9eedde61df8d1b9


_New in DVC 2.0_

`dvc exp` commands let you automatically track a variation to an established
[data pipeline](/doc/command-reference/dag) baseline. You can create multiple
isolated experiments this way, as well as review, compare, and restore them
later, or roll back to the baseline. The basic workflow goes like this:

- Modify stage <abbr>parameters</abbr> or other dependencies (e.g. input data,
source code) of committed stages.
- [Run experiments] with `dvc exp run` (instead of `repro`) to execute the
pipeline. The results are reflected in your <abbr>workspace</abbr>, and
tracked automatically.
- Use `dvc metrics` to identify the best experiment(s).
- Visualize and compare experiments with `dvc exp show` or `dvc exp diff`. Repeat
🔄
- Use `dvc exp apply` to roll back to the best one.
- Make the selected experiment persistent by committing its results to Git. This
cleans the slate so you can repeat the process.

[run experiments]: /doc/user-guide/experiment-management/running-experiments
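
The first steps of this workflow could be scripted roughly as follows. This is a sketch, not part of DVC: `experiment_commands` is a hypothetical helper and `train.lr` in the usage comment is an assumed parameter name. It only builds the command lines (using the `--set-param` option of `dvc exp run`), so they can be inspected before anything is executed.

```python
def experiment_commands(params):
    """Build argv lists for one iteration: run an experiment, then compare.

    Hypothetical helper; `params` maps parameter names to new values.
    """
    run = ["dvc", "exp", "run"]
    for name, value in params.items():
        # --set-param sets a parameter value on the fly for this experiment
        run += ["--set-param", f"{name}={value}"]
    return [
        run,
        ["dvc", "exp", "show"],  # table of experiments, params, and metrics
        ["dvc", "exp", "diff"],  # changes relative to the baseline
    ]


# Usage (inside a DVC project, with an assumed parameter name):
#   import subprocess
#   for cmd in experiment_commands({"train.lr": 0.01}):
#       subprocess.run(cmd, check=True)
```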

## Persistent Experiments

When your experiments are good enough to save or share, you may want to store
them persistently as Git commits in your <abbr>repository</abbr>.

Whether the results were produced with `dvc repro` directly, or after a
`dvc exp` workflow, `dvc.yaml` and `dvc.lock` will define the experiment as a
new project version. The right <abbr>outputs</abbr> (including
[metrics](/doc/command-reference/metrics)) should also be present, or available
via `dvc checkout`.

Use `dvc exp apply` and `dvc exp branch` to persist experiments in your Git
history.
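
The two routes mentioned above can be sketched as command lists. This is an illustrative helper under stated assumptions, not DVC's own API: `persist_commands` is a made-up name, the experiment name `exp-263da` is borrowed from the checkpoints example output, and `git add .` is used for brevity where a real project would likely stage specific files.

```python
def persist_commands(experiment, branch=None, message="Persist experiment"):
    """Return argv lists that make an experiment persistent in Git.

    Hypothetical helper. With `branch`, the experiment is promoted to a new
    Git branch; otherwise it is applied to the workspace and committed.
    """
    if branch is not None:
        # Creates the branch from the experiment; DVC does not switch to it.
        return [["dvc", "exp", "branch", experiment, branch]]
    return [
        ["dvc", "exp", "apply", experiment],  # restore results into the workspace
        ["git", "add", "."],                  # stage the changed files
        ["git", "commit", "-m", message],     # record them as a regular commit
    ]


# Usage:
#   persist_commands("exp-263da")                       # apply + commit
#   persist_commands("exp-263da", branch="best-model")  # promote to a branch
```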