guide: Persisting experiments #2845

Merged
merged 19 commits into from
Oct 24, 2021
1 change: 1 addition & 0 deletions content/docs/sidebar.json
@@ -149,6 +149,7 @@
"children": [
"running-experiments",
"sharing-experiments",
"persisting-experiments",
"cleaning-experiments",
"checkpoints"
]
@@ -0,0 +1,95 @@
# Persisting Experiments

DVC runs experiments outside of the Git stage/commit cycle for quick iteration.
Contributor

This motivation is not very clear. Let's just use the text from https://dvc.org/doc/user-guide/experiment-management#persistent-experiments probably

When your experiments are good enough to save or share, you may want to store them persistently as Git commits in your repository.

Contributor @jorgeorpinel, Oct 6, 2021

BTW what should happen to that existing section of the index page? Make it into a paragraph? Link here? Remove it?

Contributor Author

The current paragraph mentions repro and that might be confusing. I think linking here is the easiest way.

Contributor @jorgeorpinel, Oct 12, 2021

Unresolving. This is not addressed at all:

  1. The motivation is not clear. Why "persist" experiments? (or whatever terminology we end up using) — rel. guide: Persisting experiments #2845 (comment) below.
  2. What do we do about https://dvc.org/doc/user-guide/experiment-management#persistent-experiments? Let's deal with that and check any other mentions about persisting experiments in the guide, please.

Contributor Author

The current paragraph is this:

[image: screenshot of the current paragraph]

The second paragraph is not relevant at all to experiments or persisting as we discuss here. We can link from the first paragraph to this document.

Contributor @jorgeorpinel, Oct 15, 2021

Yes, so I was just referring to the first paragraph. That would be a better motivation/intro here, I think, and then we can pretty much remove that whole section in the index (just mention it somewhere and link here).

  • We can extract this to a follow-up PR.

Contributor Author

I think doing that revision separately is a better approach.

Contributor @jorgeorpinel, Oct 24, 2021

Sure. But please don't mark conversations that are not resolved as resolved. How will we know what to carry over? Unresolving

It's also very hard to find them again once resolved. Not the greatest GH feature but let's be careful regardless.

When your experiments are good enough to save or share, you may want to store
them persistently as Git commits in your repository.

In this section, we describe how to bring them to the standard Git workflow with
`dvc exp branch` and `dvc exp apply`.

## Create a Git branch from an experiment

You can use `dvc exp branch` to create a new branch from an experiment, and keep
all its code and artifacts separate from your current <abbr>workspace</abbr>.

```dvc
$ dvc exp show --include-params=my_param
```

```dvctable
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┓
┃ neutral:**Experiment** ┃ neutral:**Created** ┃ metric:**auc** ┃ param:**my_param** ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━┩
│ workspace │ - │ 0.61314 │ 3 │
│ new-experiments │ Oct 19, 2020 │ 0.61314 │ 3 │
Contributor

Suggested change
- │ new-experiments │ Oct 19, 2020 │ 0.61314 │ 3 │
+ │ baseline │ Oct 19, 2020 │ 0.61314 │ 3 │

Contributor

Or just do master which is what's usually shown without tag names.

Contributor Author

This is not baseline, we are doing new experiments here. And baseline is used as the default experiment attached to a commit. That might confuse the reader.

Contributor @jorgeorpinel, Oct 15, 2021

But all experiments have a parent commit, isn't that a link of "baseline"? OK so master if not?

  • Minor 💅 but
Suggested change
- │ new-experiments │ Oct 19, 2020 │ 0.61314 │ 3 │
+ │ main │ Oct 19, 2020 │ 0.61314 │ 3 │

Contributor

I don't understand. You are already using "baseline" in #2862 (review) 🙂 Let's just use that here too?

Contributor

Recap on this. It's minor and you have specific reasons to name this something artificial. I'd just prefer if it can be something realistic.

Contributor Author

Being too realistic like main may confuse readers. I tend to keep it looking like artificial and associated with experimentation as much as possible. It's a bit ugly and verbose but new-experiments seems to do the job.

Contributor @jorgeorpinel, Oct 24, 2021

Yeah, I just don't think it makes sense to run experiments from a version called "experiments". Seems like a misunderstanding in terms of the way to organize them. Probably not very meaningful in this doc though, moving on.

│ ├── exp-e6c97 │ Oct 20, 2020 │ 0.69830 │ 2 │
│ └── exp-1df77 │ Oct 22, 2020 │ 0.51676 │ 1 │
└───────────────────────┴──────────────┴─────────┴────────────┘
```

Suppose you want to continue to work on `exp-e6c97` in a separate branch. You
can create a new Git branch by specifying the experiment and giving a new name
for it:

```dvc
$ dvc exp branch exp-e6c97 my-branch
Git branch 'my-branch' has been created from experiment 'exp-e6c97'.
To switch to the new branch run:
git checkout my-branch
```

Note that DVC doesn't switch into the new branch. You can create one or more
branches from the existing experiments, and switch into any one manually like
this:

```dvc
$ git checkout my-branch
$ dvc checkout
```

Your workspace now contains all the files from the experiment.
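
The two steps above (create the branch, then switch to it) could be wrapped in a small shell function. This is only a sketch under stated assumptions: `branch_from_experiment` and the `DRY_RUN` variable are hypothetical names introduced here for illustration, not DVC features. With `DRY_RUN` set, the commands are printed instead of executed, which is a cheap way to check what would run.

```shell
# Hypothetical helper, not part of DVC: create a Git branch from an
# experiment and switch the workspace to it.
branch_from_experiment() {
  exp="$1"      # experiment name, e.g. exp-e6c97
  branch="$2"   # name for the new Git branch
  # With DRY_RUN set, print each command instead of running it.
  if [ -n "${DRY_RUN:-}" ]; then run="echo"; else run=""; fi
  $run dvc exp branch "$exp" "$branch" &&
    $run git checkout "$branch" &&
    $run dvc checkout
}
```

For example, `DRY_RUN=1 branch_from_experiment exp-e6c97 my-branch` lists the three commands without touching the repository.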

## Bring experiment results to your workspace

Typically, `dvc exp run` leaves the experiment results in your workspace for
convenience. However, you may have run multiple experiments and wish to go back
to a specific one. In this case, you can restore a previous experiment's results
with `dvc exp apply`. Let's see an example:

```dvc
$ dvc exp show --include-params=my_param
```

```dvctable
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┓
┃ neutral:**Experiment** ┃ neutral:**Created** ┃ metric:**auc** ┃ param:**my_param** ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━┩
│ workspace │ - │ 0.61314 │ 3 │
│ new-experiments │ Oct 19, 2020 │ 0.61314 │ 3 │
│ ├── exp-e6c97 │ Oct 20, 2020 │ 0.69830 │ 2 │
│ └── exp-1df77 │ Oct 22, 2020 │ 0.51676 │ 1 │
└───────────────────────┴──────────────┴─────────┴────────────┘
```

The results found in the workspace are shown in the respective row. When you
want to bring another experiment to the workspace, you can reference it using
its name or ID, e.g.:

```dvc
$ dvc exp apply exp-e6c97
Changes for experiment 'exp-e6c97' have been applied...
```

> ⚠️ Note that `dvc exp apply` requires your project version (Git `HEAD`) to be
> the same as when the experiment was run.

Now, if you list the experiments again with `dvc exp show`, you'll see that the
workspace contains the results of `exp-e6c97`.

You can use standard Git commands (e.g. `git add/commit/push`) to version this
experiment directly in the <abbr>repository</abbr>. DVC-tracked data and
Comment on lines +87 to +90
Contributor @jorgeorpinel, Oct 6, 2021

💅

Suggested change
- workspace contains the results of `exp-e6c97`.
- You can use standard Git commands (e.g. `git add/commit/push`) to version this
- experiment directly in the <abbr>repository</abbr>. DVC-tracked data and
+ workspace contains the results of `exp-e6c97`.
+ You can use standard Git commands (e.g. `git add/commit/push`) to version this
+ experiment directly in the <abbr>repository</abbr> as a regular project version. DVC-tracked data and

Needs formatting

Contributor

Unresolving. These sentences are part of the same idea, should be in the same paragraph IMO. Minor, but let's carry over

artifacts are already in the DVC cache, and the rest (params, code and config
files, etc.) can be stored in Git.

> Please note that you need to `dvc push` in order to share or back up the DVC
> cache contents.
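
Put together, applying an experiment and persisting it might look like the sketch below. It is an illustration rather than part of the guide: `persist_experiment` and `DRY_RUN` are hypothetical names, and it assumes that a DVC remote is configured, that the current Git branch has an upstream, and that staging every change with `git add .` is acceptable in your project. With `DRY_RUN` set, the commands are only printed.

```shell
# Hypothetical helper, not part of DVC: apply an experiment, commit it
# with Git, and push both the Git history and the DVC cache contents.
persist_experiment() {
  exp="$1"                             # experiment name, e.g. exp-e6c97
  msg="${2:-Persist experiment $exp}"  # commit message (default is an assumption)
  # With DRY_RUN set, print each command instead of running it.
  if [ -n "${DRY_RUN:-}" ]; then run="echo"; else run=""; fi
  $run dvc exp apply "$exp" &&
    $run git add . &&
    $run git commit -m "$msg" &&
    $run git push &&
    $run dvc push
}
```

Running `DRY_RUN=1 persist_experiment exp-e6c97` first shows the sequence before anything is committed or pushed.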