guide: Cleaning Up Experiments (iterative#2631)
* added section headers

* added another section header

* Added `dvc exp remove` example

* added section on remote experiment removal

* added dvc exp gc --workspace example

* added info about deleting all experiments in a remote

* new sections and examples for `--all-branches` and `--queue`

* prettified

* removed copied content from GS:Experiments

* example for --all-tags added

* Restyled by prettier (iterative#2710)

Co-authored-by: Restyled.io <[email protected]>

* updates after review

* reviewed and added example for remote experiment deletion

* Restyled by prettier (iterative#2726)

Co-authored-by: Restyled.io <[email protected]>

* guide: list Cleaning Exps in the nav

* guide: copy edits to Exps Cleanup

* guide: format Exp Cleanup

Co-authored-by: Emre Sahin <[email protected]>
Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com>
Co-authored-by: Restyled.io <[email protected]>
Co-authored-by: Jorge Orpinel <[email protected]>
Co-authored-by: Jorge Orpinel <[email protected]>
6 people authored and karajan1001 committed Sep 29, 2021
1 parent 08f3a7b commit 6481a95
Showing 2 changed files with 249 additions and 0 deletions.
1 change: 1 addition & 0 deletions content/docs/sidebar.json
@@ -145,6 +145,7 @@
"children": [
"running-experiments",
"sharing-experiments",
"cleaning-experiments",
"checkpoints"
]
},
248 changes: 248 additions & 0 deletions content/docs/user-guide/experiment-management/cleaning-experiments.md
@@ -0,0 +1,248 @@
# Cleaning Up Experiments

Although DVC uses minimal resources to keep track of experiments, they may
clutter tables and the workspace. DVC lets you remove specific experiments from
the workspace or delete all not-yet-persisted experiments at once.

## Removing specific experiments

When you want to discard experiments by name, use `dvc exp remove` and supply
one or more experiment names.

```dvc
$ dvc exp list
main:
cnn-32
cnn-64
cnn-128
$ dvc exp remove cnn-32 cnn-64
$ dvc exp list
main:
cnn-128
```
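
When many experiments share a naming scheme, it can be convenient to select
their names with a pattern before passing them to `dvc exp remove`. The
following is a minimal sketch, assuming the `dvc exp list` output format shown
above (a revision header ending in `:` followed by one experiment name per
line). `exp_names_matching` is a hypothetical helper, not part of DVC.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DVC): read `dvc exp list` output on stdin
# and print the experiment names that match the shell glob given as $1.
exp_names_matching() {
  local pattern="$1" name
  # `read` trims the indentation; the `||` clause handles a last line
  # without a trailing newline.
  while read -r name || [ -n "$name" ]; do
    case "$name" in
      '' | *:) continue ;;               # blank lines and headers such as "main:"
      $pattern) printf '%s\n' "$name" ;; # unquoted on purpose: matched as a glob
    esac
  done
}

# Demo on the sample listing above, so the sketch runs without a DVC repo.
printf 'main:\n\tcnn-32\n\tcnn-64\n\tcnn-128\n' | exp_names_matching 'cnn-6*'
# prints: cnn-64
```

In a repository, the selected names could then be removed with something like
`dvc exp remove $(dvc exp list | exp_names_matching 'cnn-6*')`. It is worth
checking the selection first, since removal is not easily undone.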

## Removing multiple experiments

After you've completed a set of experiments, it may be easier to decide which of
them to keep rather than which to remove. You can use `dvc exp gc` to select a
set of experiments to keep; the rest are _garbage collected_.

This command takes a _scope_ flag: `--workspace`, `--all-branches`,
`--all-tags`, or `--all-commits`. In garbage collection, the scope determines
the experiments to _keep_, i.e., experiments outside the scope of the given
flag are removed.

### Keeping experiments in the workspace

Supplying the `--workspace` flag to `dvc exp gc` causes all experiments to be
removed **except** those in the current workspace.

```dvc
$ dvc exp list --all
main:
exp-aaa000
exp-aaa111
exp-aaa222
other:
exp-bbb333
exp-bbb444
another:
exp-ccc555
exp-ccc666
exp-ccc777
```

Issuing `dvc exp gc --workspace` removes the experiments in the `other` and
`another` branches in this example.

```dvc
$ dvc exp gc --workspace
$ dvc exp list --all
main:
exp-aaa000
exp-aaa111
exp-aaa222
```
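
Because garbage collection removes everything outside the scope, it can help to
preview the difference first. The sketch below compares two saved listings and
prints the names found only in the first one. It assumes that the default
`dvc exp list` output corresponds to what `--workspace` keeps, which is an
approximation rather than a guarantee. `exps_only_in_first` is a hypothetical
helper, not part of DVC.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DVC): print the experiment names that
# appear in the listing file $1 but not in the listing file $2.
exps_only_in_first() {
  awk '
    { gsub(/^[ \t]+|[ \t]+$/, "") }   # trim indentation
    $0 == "" || /:$/ { next }         # skip blanks and headers such as "main:"
    FILENAME == ARGV[1] {             # first listing: remember names in order
      if (!($0 in seen)) { seen[$0] = 1; order[++n] = $0 }
      next
    }
    { delete seen[$0] }               # also in the second listing: it is kept
    END { for (i = 1; i <= n; i++) if (order[i] in seen) print order[i] }
  ' "$1" "$2"
}

# Demo with sample listings, so the sketch runs without a DVC repo.
all="$(mktemp)"; kept="$(mktemp)"
printf 'main:\n\texp-aaa000\nother:\n\texp-bbb333\n\texp-bbb444\n' > "$all"
printf 'main:\n\texp-aaa000\n' > "$kept"
exps_only_in_first "$all" "$kept"   # prints exp-bbb333 and exp-bbb444
rm -f "$all" "$kept"
```

In a repository, the two files could be produced with
`dvc exp list --all > all.txt` and `dvc exp list > kept.txt` before running
`dvc exp gc --workspace`.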

### Keeping experiments in all branches

DVC can create a branch for an experiment using the `dvc exp branch` command.

In cases where you want to clean up all experiments _except_ those in the
branches, you can use the `--all-branches` flag.

```dvc
$ dvc exp show --all-branches
```

```dvctable
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ white:**Experiment** ┃ white:**Created** ┃ yellow:**acc** ┃ blue:**model.conv_units** ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace               │ -            │ -      │ 64               │
│ cnn-48                  │ 09:11 AM     │ 0.9131 │ 48               │
│ main                    │ Jul 21, 2021 │ 0.9189 │ 16               │
│ ├── dac711b [cnn-32]    │ 09:16 AM     │ 0.9152 │ 32               │
│ ├── 7cd3ae7 [cnn-48]    │ 09:11 AM     │ 0.9131 │ 48               │
│ ├── ab585b5 [cnn-24]    │ 09:06 AM     │ 0.9135 │ 24               │
│ ├── 7d51b55 [exp-44136] │ 09:01 AM     │ 0.9151 │ 16               │
│ └── 7feaa1c [exp-78ede] │ Aug 02, 2021 │ 0.9151 │ 16               │
│ 8583124                 │ Jul 20, 2021 │ 0.9132 │ 17               │
└─────────────────────────┴──────────────┴────────┴──────────────────┘
```

Supplying `--all-branches` keeps only the experiments in branch tips. Any
experiment that's not promoted to a branch is removed this way.

```dvc
$ dvc exp gc --all-branches
WARNING: This will remove all experiments except those derived from the workspace and all git branches of the current repo.
Are you sure you want to proceed? [y/n] y
Removed 6 experiments. To remove unused cache files use 'dvc gc'.
```

The resulting `dvc exp show` table is as follows:

```dvctable
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Experiment              ┃ Created      ┃ acc    ┃ model.conv_units ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace               │ -            │ -      │ 64               │
│ cnn-48                  │ 09:11 AM     │ 0.9131 │ 48               │
│ main                    │ Jul 21, 2021 │ 0.9189 │ 16               │
│ 8583124                 │ Jul 20, 2021 │ 0.9132 │ 17               │
└─────────────────────────┴──────────────┴────────┴──────────────────┘
```

### Keeping experiments in all tags

When you tag experiment commits with `git tag`, `dvc exp show --all-tags`
presents them along with the tags. If you want to delete _all experiments
without tags_, you can issue a `dvc exp gc --all-tags` command.

```dvc
$ dvc exp show --all-tags
```

```dvctable
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Experiment              ┃ acc    ┃ model.conv_units  ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ workspace               │ 0.9067 │ 16                │
│ ├── 2fc4f81 [exp-a1b3c4]│ 0.9037 │ 48                │
│ └── 21beb69 [exp-d4e3ff]│ 0.9367 │ 128               │
│ my-experiments          │ 0.9067 │ 16                │
│ ├── 2fc4f81 [cnn-32]    │ 0.9067 │ 32                │
│ ├── 5bc84a3 [cnn-64]    │ 0.9158 │ 64                │
│ ├── 206cba6 [cnn-96]    │ 0.9260 │ 96                │
│ └── 21beb69 [cnn-128]   │ 0.9379 │ 128               │
└─────────────────────────┴────────┴───────────────────┘
```

```dvc
$ dvc exp gc --all-tags
$ dvc exp show --all-tags
```

```dvctable
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Experiment              ┃ acc    ┃ model.conv_units  ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ workspace               │ 0.9067 │ 16                │
│ my-experiments          │ 0.9067 │ 16                │
│ ├── 2fc4f81 [cnn-32]    │ 0.9067 │ 32                │
│ ├── 5bc84a3 [cnn-64]    │ 0.9158 │ 64                │
│ ├── 206cba6 [cnn-96]    │ 0.9260 │ 96                │
│ └── 21beb69 [cnn-128]   │ 0.9379 │ 128               │
└─────────────────────────┴────────┴───────────────────┘
```

### Keeping experiments in all commits

When you want to delete _all the experiments not associated with a Git commit_,
you can do so with the `--all-commits` flag. It removes the experiments that
are not derived from any commit in the Git history.

### Deleting experiment-related objects in the DVC cache

Note that `dvc exp gc` and `dvc exp remove` don't delete any objects in the DVC
<abbr>cache</abbr>. To remove the cache objects related to the experiments
(e.g. model files, intermediate artifacts), you can use the `dvc gc` command.

`dvc gc` accepts the same _scoping_ flags (`--workspace`, `--all-branches`,
etc.). After a `dvc exp gc --workspace` command, you can run
`dvc gc --workspace` to remove all the experiment artifacts from the cache as
well.
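
The two-step cleanup above can be sketched as a small wrapper that applies one
scope flag to both commands. `clean_exps_and_cache` and the `DRY_RUN` variable
are hypothetical names used for illustration, not part of DVC. Keep in mind
that `dvc gc` removes every cached object outside the given scope, not only
those that belonged to experiments, so the sketch defaults to printing the
commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of DVC): apply one scope flag to both
# `dvc exp gc` and `dvc gc`. DRY_RUN defaults to 1, which only prints the
# commands; set DRY_RUN=0 to run them (both still ask for confirmation).
clean_exps_and_cache() {
  local scope="${1:---workspace}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'dvc exp gc %s\n' "$scope"
    printf 'dvc gc %s\n' "$scope"
  else
    # Only collect the cache if the experiment cleanup succeeded.
    dvc exp gc "$scope" && dvc gc "$scope"
  fi
}

clean_exps_and_cache --all-branches
# prints:
#   dvc exp gc --all-branches
#   dvc gc --all-branches
```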

## Removing experiments from remotes

As you push experiments with `dvc exp push`, remotes may become cluttered with
experiment references.

DVC doesn't provide a shortcut for cleaning up experiments in remotes, but you
can use Git plumbing commands to remove experiment references from remotes.

First, get the list of experiments with their hash values.

```dvc
$ git ls-remote origin 'refs/exps/*'
98b237f8d8da0964c9fa60c6b27f1dd4a214dabf refs/exps/17/2b1b9c885f10a73c76bd457a04878bee0e6d6f/exp-7424d
794854926931e84ebc90e829dbc09b3085391659 refs/exps/19/0e697aa566482ebdb8bdb65401382e76b1bce5/exp-ec039
```

Then use `git push -d` to delete an experiment reference, as with any other Git
reference:

```dvc
$ git push -d origin refs/exps/17/2b1b9c885f10a73c76bd457a04878bee0e6d6f/exp-7424d
```

If you want to delete **all** experiments in a remote, you can use a loop:

```dvc
$ git ls-remote origin 'refs/exps/*' | cut -f 2 | while read exppath ; do
    git push -d origin "${exppath}"
done
```
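
To delete only some of the remote experiments, the reference list can be
filtered before it reaches `git push -d`. The following is a minimal sketch,
assuming the tab-separated `git ls-remote` output shown above.
`exp_refs_matching` is a hypothetical helper, not part of DVC or Git.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DVC or Git): read `git ls-remote` output
# on stdin (hash, TAB, ref name) and print the refs whose experiment name,
# i.e. the last path component, matches the shell glob given as $1.
exp_refs_matching() {
  local pattern="$1" ref
  cut -f 2 | while read -r ref || [ -n "$ref" ]; do
    case "${ref##*/}" in
      $pattern) printf '%s\n' "$ref" ;;  # unquoted on purpose: matched as a glob
    esac
  done
}

# Demo on the sample output above, so the sketch runs without a remote.
printf '%s\t%s\n' \
  98b237f8d8da0964c9fa60c6b27f1dd4a214dabf refs/exps/17/2b1b9c885f10a73c76bd457a04878bee0e6d6f/exp-7424d \
  794854926931e84ebc90e829dbc09b3085391659 refs/exps/19/0e697aa566482ebdb8bdb65401382e76b1bce5/exp-ec039 |
  exp_refs_matching 'exp-74*'
# prints: refs/exps/17/2b1b9c885f10a73c76bd457a04878bee0e6d6f/exp-7424d
```

Printing the selection first gives a chance to review it. The matching refs
could then be deleted with
`git ls-remote origin 'refs/exps/*' | exp_refs_matching 'exp-74*' | while read -r ref; do git push -d origin "$ref"; done`.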

## Removing queued experiments

When you've created experiments to be run in the queue with
`dvc exp run --queue` and later decide not to run them, you can remove them with
`dvc exp remove --queue`.

```dvc
$ dvc exp run --queue -S param=10
Queued experiment '7b83744' for future execution.
$ dvc exp run --queue -S param=20
Queued experiment '68808d5' for future execution.
$ dvc exp show
```

```dvctable
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Experiment   ┃ Created      ┃ param ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━┩
│ workspace    │ -            │ -     │
│ 04abbb7      │ Jul 21, 2021 │ -     │
│ ├── *68808d5 │ 12:05 PM     │ 20    │
│ └── *7b83744 │ 12:05 PM     │ 10    │
└──────────────┴──────────────┴───────┘
```

You can delete these queued experiments with `dvc exp remove --queue`.

```dvc
$ dvc exp remove --queue
$ dvc exp show
```

```dvctable
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Experiment   ┃ Created      ┃ param ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━┩
│ workspace    │ -            │ -     │
│ 04abbb7      │ Jul 21, 2021 │ -     │
└──────────────┴──────────────┴───────┘
```
