ref: exp Examples, etc. #2259

Merged 24 commits on Mar 17, 2021
c975a86
ref: dump exp show Examples
jorgeorpinel Mar 1, 2021
178b110
ref: dump exp apply Examples
jorgeorpinel Mar 1, 2021
77b96f1
ref: copy edit dumped exp show Examples
jorgeorpinel Mar 1, 2021
a0ccd63
ref: copy edit dumped exp apply Example
jorgeorpinel Mar 1, 2021
7f41d83
ref: dump and copy edit exp branch Example
jorgeorpinel Mar 1, 2021
baa601c
ref: update exp Examples sample repo tags
jorgeorpinel Mar 1, 2021
88686f3
ref: dump and copy edit exp diff Example
jorgeorpinel Mar 1, 2021
fb37dcd
ref: dump and copy edit exp gc Examples
jorgeorpinel Mar 1, 2021
70e1cbe
ref: dump and edit exp list Examples
jorgeorpinel Mar 2, 2021
a204f41
ref: dump and copy edit exp push/pull Examples
jorgeorpinel Mar 2, 2021
f0187a0
Merge branch 'master' into ref/exp/examples
jorgeorpinel Mar 8, 2021
0604483
ref: first exp run Example
jorgeorpinel Mar 11, 2021
e62e1e0
ref: exp run --set-param example
jorgeorpinel Mar 11, 2021
22ec5d6
ref: basic exp remove example
jorgeorpinel Mar 11, 2021
ae55ca5
Merge branch 'master' into ref/exp/examples
jorgeorpinel Mar 13, 2021
759b12f
ref: tooltip for run
jorgeorpinel Mar 14, 2021
e3f01b3
Merge branch 'master' into ref/exp/examples
jorgeorpinel Mar 14, 2021
e946710
Merge branch 'master' into ref/exp/examples
jorgeorpinel Mar 16, 2021
ffde787
ref: fix exp apply Example
jorgeorpinel Mar 16, 2021
c6af90f
Update content/docs/command-reference/exp/branch.md
jorgeorpinel Mar 17, 2021
c2713c4
Update content/docs/command-reference/exp/list.md
jorgeorpinel Mar 17, 2021
d2a389f
Update content/docs/command-reference/exp/run.md
jorgeorpinel Mar 17, 2021
7b442d5
Restyled by prettier
restyled-commits Mar 17, 2021
a99f821
Merge pull request #2308 from iterative/restyled/ref/exp/examples
jorgeorpinel Mar 17, 2021
85 changes: 85 additions & 0 deletions content/docs/command-reference/exp/apply.md
@@ -38,3 +38,88 @@ the current Git commit.

- `-v`, `--verbose` - displays detailed tracing information from executing the
`dvc pull` command.

## Example: Make an experiment persistent

> This example is based on our
> [Get Started](/doc/tutorials/get-started/experiments), where you can find the
> actual source code.

Let's say we have run 3 experiments in our project:

```dvc
$ dvc exp show --include-params=featurize
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Experiment ┃ Created ┃ auc ┃ featurize.max_features ┃ featurize.ngrams ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace │ - │ 0.61314 │ 1500 │ 2 │
│ 10-bigrams-experiment │ Jun 20, 2020 │ 0.61314 │ 1500 │ 2 │
│ ├── exp-e6c97 │ Oct 21, 2020 │ 0.69830 │ 2000 │ 2 │
│ ├── exp-1dad0 │ Oct 09, 2020 │ 0.57756 │ 1200 │ 2 │
│ └── exp-1df77 │ Oct 09, 2020 │ 0.51676 │ 500 │ 2 │
└───────────────────────┴──────────────┴─────────┴────────────────────────┴──────────────────┘
```

Since `exp-e6c97` has the best `auc`, we may want to commit it to our project
(this is what we mean by making it persistent):

```dvc
$ dvc exp apply exp-e6c97
Changes for experiment 'exp-e6c97' have been applied...
```
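
The choice of `exp-e6c97` amounts to taking the maximum `auc` over the experiments in the table. As a minimal sketch (the dictionary and variable names are illustrative and not part of DVC; the values are copied by hand from the `dvc exp show` output above), this could look like:

```python
# AUC values copied by hand from the `dvc exp show` table above.
auc_by_experiment = {
    "exp-e6c97": 0.69830,
    "exp-1dad0": 0.57756,
    "exp-1df77": 0.51676,
}

# Pick the experiment name whose AUC is highest.
best = max(auc_by_experiment, key=auc_by_experiment.get)
print(best)
```

For a metric where lower is better (a loss, for example), `min` would be used instead.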

We can inspect what changed in the workspace with Git,

```dvc
$ git status
On branch master
Changes not staged for commit:
modified: dvc.lock
modified: params.yaml
modified: scores.json
$ git diff params.yaml
```

```git
@@ -3,7 +3,7 @@ prepare:
featurize:
- max_features: 1500
+ max_features: 2000
ngrams: 2
```

and with DVC:

```dvc
$ dvc status
Data and pipelines are up to date.
$ dvc diff
Modified:
data/features/
data/features/test.pkl
data/features/train.pkl
model.pkl
files summary: 0 added, 0 deleted, 3 modified, 0 not in cache
```

To finish making this experiment persistent, we commit the changes to the repo:

```dvc
$ git add .
$ git commit -m "persist exp-e6c97"
```

We can now see that the experiment is the new tip of our master branch:

```dvc
$ dvc exp show --include-params=featurize
┏━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Experiment ┃ Created ┃ auc ┃ featurize.max_features ┃ featurize.ngrams ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace │ - │ 0.69830 │ 2000 │ 2 │
│ master │ 04:31 PM │ 0.69830 │ 2000 │ 2 │
└────────────┴──────────┴─────────┴────────────────────────┴──────────────────┘
```

Note that all the other experiments are based on a previous commit, so
`dvc exp show` won't display them by default (but they're still saved).
42 changes: 42 additions & 0 deletions content/docs/command-reference/exp/branch.md
@@ -48,3 +48,45 @@ To switch into the new branch, use `git checkout branch` and `dvc checkout`.

- `-v`, `--verbose` - displays detailed tracing information from executing the
`dvc pull` command.

## Example: Make a persistent branch from an experiment

> This example is based on our
> [Get Started](/doc/tutorials/get-started/experiments), where you can find the
> actual source code.

Let's say we have run 3 experiments in our project:

```dvc
$ dvc exp show --include-params=featurize
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Experiment ┃ Created ┃ auc ┃ featurize.max_features ┃ featurize.ngrams ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace │ - │ 0.61314 │ 1500 │ 2 │
│ 10-bigrams-experiment │ Jun 20, 2020 │ 0.61314 │ 1500 │ 2 │
│ ├── exp-e6c97 │ Oct 21, 2020 │ 0.61314 │ 1500 │ 2 │
│ ├── exp-1dad0 │ Oct 09, 2020 │ 0.57756 │ 2000 │ 2 │
│ └── exp-1df77 │ Oct 09, 2020 │ 0.51676 │ 500 │ 2 │
└───────────────────────┴──────────────┴─────────┴────────────────────────┴──────────────────┘
```

We may want to branch off `exp-1dad0` for a separate experimentation process
(based on 2000 `max_features`).

```dvc
$ dvc exp branch exp-1dad0 maxf-2000
Git branch 'maxf-2000' has been created from experiment 'exp-1dad0'.
To switch to the new branch run:
git checkout maxf-2000
```

We can inspect the result with Git:

```dvc
$ git branch
* master
maxf-2000
```

`maxf-2000` can now be checked out, merged, rebased, pushed, etc. like any other
Git branch.
56 changes: 56 additions & 0 deletions content/docs/command-reference/exp/diff.md
@@ -75,3 +75,59 @@ all the current experiments (without comparisons).
problems arise, otherwise 1.

- `-v`, `--verbose` - displays detailed tracing information.

## Examples

> This example is based on our
> [Get Started](/doc/tutorials/get-started/experiments), where you can find the
> actual source code.

Let's say we have run 3 experiments in our project:

```dvc
$ dvc exp show --include-params=featurize
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Experiment ┃ Created ┃ auc ┃ featurize.max_features ┃ featurize.ngrams ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace │ - │ 0.61314 │ 1500 │ 2 │
│ 10-bigrams-experiment │ Jun 20, 2020 │ 0.61314 │ 1500 │ 2 │
│ ├── exp-e6c97 │ Oct 21, 2020 │ 0.61314 │ 1500 │ 2 │
│ ├── exp-1dad0 │ Oct 09, 2020 │ 0.57756 │ 2000 │ 2 │
│ └── exp-1df77 │ Oct 09, 2020 │ 0.51676 │ 500 │ 2 │
└───────────────────────┴──────────────┴─────────┴────────────────────────┴──────────────────┘
```

Since we haven't made any changes to the workspace, we can compare `exp-1dad0`
to its baseline (`10-bigrams-experiment`, current `HEAD`) like this:

```dvc
$ dvc exp diff exp-1dad0
Path Metric Value Change
scores.json auc 0.61314 0.035575
Path Param Value Change
params.yaml featurize.max_features 1500 -500
```

To compare two specific experiments (values are shown for the second one by
default):

```dvc
$ dvc exp diff exp-1dad0 exp-1df77
Path Metric Value Change
scores.json auc 0.51676 -0.060799
Path Param Value Change
params.yaml featurize.max_features 500 -1500
```

To compare an experiment to the
[`7-ml-pipeline`](https://github.com/iterative/example-get-started/releases/tag/7-ml-pipeline)
tag (or any other [revision](https://git-scm.com/docs/revisions)):

```dvc
$ dvc exp diff exp-1dad0 7-ml-pipeline
Path Metric Value Change
scores.json auc None diff not supported
Path Param Value Change
params.yaml featurize.max_features 500 -1500
params.yaml featurize.ngrams 1 -1
```
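
The `Change` column in the comparisons above appears to be the value in the second revision minus the value in the first. The sketch below reproduces that arithmetic under this assumption; the `change` helper is hypothetical and not DVC's implementation, and the inputs are the rounded values from the `dvc exp show` table, so the `auc` result only approximates the `-0.060799` printed by DVC (which presumably works from unrounded metrics).

```python
from typing import Optional, Union

Number = Union[int, float]


def change(old: Optional[Number], new: Optional[Number]) -> Union[Number, str]:
    """Return new - old, or a placeholder when either side is missing."""
    if old is None or new is None:
        return "diff not supported"
    return new - old


# Values copied by hand from the `dvc exp show` table above.
exp_1dad0 = {"auc": 0.57756, "featurize.max_features": 2000}
exp_1df77 = {"auc": 0.51676, "featurize.max_features": 500}

# Print the second revision's value followed by the change.
for key in exp_1dad0:
    print(key, exp_1df77[key], change(exp_1dad0[key], exp_1df77[key]))

# A revision without the metric (like `7-ml-pipeline`) has no numeric change.
print(change(exp_1dad0["auc"], None))
```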
70 changes: 70 additions & 0 deletions content/docs/command-reference/exp/gc.md
@@ -58,3 +58,73 @@ separately to delete it.

- `-v`, `--verbose` - displays detailed tracing information from executing the
`dvc pull` command.

## Examples

> This example is based on our
> [Get Started](/doc/tutorials/get-started/experiments), where you can find the
> actual source code.

Let's say we have the following project, and have just
[applied](/doc/command-reference/exp/apply) and committed `exp-1dad0` (current
`HEAD` of `master`):

```dvc
$ dvc exp show --all-commits --include-params=featurize
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Experiment ┃ Created ┃ auc ┃ featurize.max_features ┃ featurize.ngrams ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace │ - │ 0.57756 │ 2000 │ 2 │
│ master │ 05:39 PM │ 0.57756 │ 2000 │ 2 │
│ 10-bigrams-experiment │ Jun 20, 2020 │ 0.61314 │ 1500 │ 2 │
│ ├── exp-e6c97 │ Oct 21, 2020 │ 0.61314 │ 1500 │ 2 │
│ ├── exp-1dad0 │ Oct 09, 2020 │ 0.57756 │ 2000 │ 2 │
│ └── exp-1df77 │ Oct 09, 2020 │ 0.51676 │ 500 │ 2 │
│ 9-bigrams-model │ Jun 20, 2020 │ 0.54175 │ 1500 │ 2 │
│ └── exp-069d9 │ Sep 24, 2020 │ 0.51076 │ 2500 │ 2 │
│ 8-evaluation │ Jun 20, 2020 │ 0.54175 │ 500 │ 1 │
│ 7-ml-pipeline │ Jun 20, 2020 │ - │ 500 │ 1 │
...
│ 0-git-init │ Jun 20, 2020 │ - │ 1500 │ 2 │
└───────────────────────┴──────────────┴─────────┴────────────────────────┴──────────────────┘
```

If we consider all the other experiments unnecessary, we can delete them like
this:

```dvc
$ dvc exp gc -w
WARNING: This will remove all experiments except ...
Are you sure you want to proceed? [y/n] y
Removed 4 experiments. To remove unused cache files use 'dvc gc'.
```
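
The count of 4 can be checked against the table. The sketch below assumes that `-w` keeps only experiments derived from the current `HEAD` (here `master`) and removes the rest; the data structure is illustrative and filled in by hand from the `dvc exp show --all-commits` output above.

```python
# Experiments grouped by the commit they were derived from, as listed in
# the `dvc exp show --all-commits` table above.
experiments_by_baseline = {
    "master": [],
    "10-bigrams-experiment": ["exp-e6c97", "exp-1dad0", "exp-1df77"],
    "9-bigrams-model": ["exp-069d9"],
}

# Assumption: `-w` keeps only experiments derived from the current HEAD.
kept_baselines = {"master"}

# Everything hanging off any other commit is collected for removal.
removed = [
    name
    for baseline, names in experiments_by_baseline.items()
    if baseline not in kept_baselines
    for name in names
]
print(len(removed), removed)
```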

We can confirm that all the previous experiments are gone:

```dvc
$ dvc exp show --all-commits --include-params=featurize
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Experiment ┃ Created ┃ auc ┃ featurize.max_features ┃ featurize.ngrams ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ workspace │ - │ 0.57756 │ 2000 │ 2 │
│ master │ 05:39 PM │ 0.57756 │ 2000 │ 2 │
│ 10-bigrams-experiment │ Jun 20, 2020 │ 0.61314 │ 1500 │ 2 │
│ 9-bigrams-model │ Jun 20, 2020 │ 0.54175 │ 1500 │ 2 │
...
│ 0-git-init │ Jun 20, 2020 │ - │ 1500 │ 2 │
└───────────────────────┴──────────────┴─────────┴────────────────────────┴──────────────────┘
```

To remove any <abbr>cached</abbr> data that was associated with the deleted
experiments and is no longer needed in the project, we can use regular `dvc gc`
(with the appropriate options):

```dvc
$ dvc gc --all-commits
WARNING: This will remove all cache except ...
Are you sure you want to proceed? [y/n] y
...
```

> Note the use of `--all-commits` to ensure that we do not garbage collect files
> or directories referenced in remaining commits in the repo.
51 changes: 50 additions & 1 deletion content/docs/command-reference/exp/list.md
@@ -20,7 +20,8 @@ limited to experiment names and with very simple formatting. See also
`dvc exp run`.

If a working `git_remote` name (e.g. `origin`) or valid Git repo's URL is
provided, lists experiments in that <abbr>repository</abbr> instead (if any).
provided, lists experiments in that <abbr>repository</abbr> instead (if any,
based on the `dvc remote default`).

> Note that this utility doesn't require an existing <abbr>DVC project</abbr> to
> run from when a `git_remote` URL is given.
@@ -45,3 +46,51 @@ options below).

- `-v`, `--verbose` - displays detailed tracing information from executing the
`dvc pull` command.

## Examples

> This example is based on our
> [Get Started](/doc/tutorials/get-started/experiments), where you can find the
> actual source code.

Let's say we have run 3 experiments in our project. You can quickly list the
available experiments with this command:

```dvc
$ dvc exp list --all
10-bigrams-experiment:
exp-e6c97
exp-1dad0
exp-1df77
```

> Contrast this with the full table
> [displayed by `dvc exp show`](/doc/command-reference/exp/show#examples).
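
Because the `dvc exp list --all` output is so plain, it is easy to post-process. The following is a rough sketch of a parser, assuming the format shown in the listing (a baseline name ending in `:` followed by one experiment name per line); `parse_exp_list` is a hypothetical helper and not part of DVC.

```python
from typing import Dict, List


def parse_exp_list(output: str) -> Dict[str, List[str]]:
    """Group experiment names under the baseline line that precedes them."""
    groups: Dict[str, List[str]] = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A line such as "10-bigrams-experiment:" starts a new group.
            current = line[:-1]
            groups[current] = []
        elif current is not None:
            groups[current].append(line)
    return groups


sample = """\
10-bigrams-experiment:
exp-e6c97
exp-1dad0
exp-1df77
"""
print(parse_exp_list(sample))
```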

You can also list experiments in any DVC repo with `dvc exp list`:

```dvc
$ dvc exp list --all git@github.com:iterative/example-get-started.git
10-bigrams-experiment:
exp-e6c97
exp-86dd6
```

Review comments on the `dvc exp list --all` line above:

Contributor: Do we need to actually push some experiments to the repo to support this? I don't think it will work right now, which could cause confusion.

Contributor Author: Maybe to one of the exp-specific repos you published instead (for the next iteration). Do they already have some exps up?

Contributor: I haven't pushed any experiments since they need a data remote to save the experiment data. However, there is a read-only remote associated with example-get-started, so I could add some there. Let me know what you think.

Contributor Author: Yep, having some exps in there would be nice. That remote is read-only in its HTTP form but you can use the S3 config to upload stuff as you probably already know from modifying the generator script 🙂

Contributor: Sounds good, I just pushed some experiments up to https://github.com/iterative/example-get-started.

Contributor Author: Amazing, thanks! 🙏

We can see that two experiments are available in
[the DVC repo](https://github.com/iterative/example-get-started).

If we're currently in a local clone of the repo, we can also use the
[Git remote](https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes)
name instead:

```dvc
$ git remote -v
origin git@github.com:iterative/example-get-started.git
$ dvc exp list --all origin
10-bigrams-experiment:
exp-e6c97
exp-86dd6
```

In this context, `dvc exp pull` can download the experiments if needed, and
`dvc exp push` can upload any local ones we wish to share.
25 changes: 25 additions & 0 deletions content/docs/command-reference/exp/pull.md
@@ -70,3 +70,28 @@ given with `--remote`.

- `-v`, `--verbose` - displays detailed tracing information from executing the
`dvc pull` command.

## Examples

> This example is based on our
> [Get Started](/doc/tutorials/get-started/experiments), where you can find the
> actual source code.

Let's say we have cloned a DVC repository, and would like to fetch an experiment
that someone else shared (see also `dvc exp list`).

```dvc
$ dvc exp list --all origin
master:
exp-e6c97
$ dvc exp pull origin exp-e6c97
Pulled experiment 'exp-e6c97' from Git remote 'origin'.
```

We can now see that the experiment exists in the local repo:

```dvc
$ dvc exp list --all
master:
exp-e6c97
```
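
When several experiments have been shared, the same two listings can be compared to decide what to pull. The sketch below only builds the `dvc exp pull <git_remote> <experiment>` command lines for experiments that are missing locally; `pull_commands` is a hypothetical helper, and the experiment names are assumed to have been read from `dvc exp list --all origin` and `dvc exp list --all`.

```python
from typing import Iterable, List


def pull_commands(remote: str, remote_exps: Iterable[str],
                  local_exps: Iterable[str]) -> List[List[str]]:
    """Build one `dvc exp pull` argv per experiment missing locally."""
    local = set(local_exps)
    return [
        ["dvc", "exp", "pull", remote, name]
        for name in remote_exps
        if name not in local
    ]


# Names as listed on the remote versus an empty local listing.
commands = pull_commands("origin", ["exp-e6c97"], [])
for argv in commands:
    # Each argv could be passed to subprocess.run(argv, check=True) to execute it.
    print(" ".join(argv))
```

Keeping the commands as argument lists rather than strings avoids shell quoting issues if they are later run with `subprocess`.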
31 changes: 31 additions & 0 deletions content/docs/command-reference/exp/push.md
@@ -66,3 +66,34 @@ given with `--remote`.

- `-v`, `--verbose` - displays detailed tracing information from executing the
`dvc pull` command.

## Examples

> This example is based on our
> [Get Started](/doc/tutorials/get-started/experiments), where you can find the
> actual source code.

Let's say we have run 3 experiments in our project:

```dvc
$ dvc exp list --all
10-bigrams-experiment:
exp-e6c97
exp-1dad0
exp-1df77
```

We would now like to share one of them with others via the Git remote:

```dvc
$ dvc exp push origin exp-e6c97
Pushed experiment 'exp-e6c97' to Git remote 'origin'.
```

We can now see that the experiment exists in the remote repo:

```dvc
$ dvc exp list --all origin
master:
exp-e6c97
```