guide: revisit Exp Sharing #2908

Merged on Dec 23, 2021 (81 commits)

Commits
5008c55
guide: split Experiments (index) into sub-pages
jorgeorpinel Jul 21, 2021
ff85352
Merge branch 'master' into guide/exps
jorgeorpinel Jul 28, 2021
923040f
case: keep Persistent Exps in basic page
jorgeorpinel Jul 29, 2021
3ae85e5
cases: keep Run-cache in basic Exps page
jorgeorpinel Jul 29, 2021
29b17b2
guide: edit Exp Mgmt index (intro)
jorgeorpinel Jul 29, 2021
e21fef4
guide: edit basic Exps page inc. persisting them
jorgeorpinel Jul 29, 2021
c21dbe3
Merge branch 'master' into guide/exps
jorgeorpinel Aug 4, 2021
d8f2d7c
guide: rename DVC Exps, remove Org Exps page
jorgeorpinel Aug 4, 2021
1337453
guide: bash -> dvc in EM/Checkpoints
jorgeorpinel Aug 4, 2021
8d93521
guide: fix exps link
jorgeorpinel Aug 4, 2021
90f3042
Merge branch 'master' into guide/exps
jorgeorpinel Aug 11, 2021
44a1614
guide: summarize Sharing Exps intro
jorgeorpinel Aug 11, 2021
3dbcb5b
ref: link from exp push/pull to Exp Sharing guide
jorgeorpinel Aug 15, 2021
3fff051
Update content/docs/user-guide/experiment-management/sharing-experime…
jorgeorpinel Aug 15, 2021
c419fe6
guide: rename Exp Sharing sections
jorgeorpinel Aug 15, 2021
c473e51
guide: summarize Exp Sharing examples
jorgeorpinel Aug 15, 2021
1da6bd4
guide: link from Exp Mgmt index to Sharing
jorgeorpinel Aug 15, 2021
3207027
Merge branch 'guide/exps-sharing' of github.com:iterative/dvc.org int…
jorgeorpinel Aug 15, 2021
e115cff
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Aug 17, 2021
75d3280
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Aug 18, 2021
ad193b9
guide: ~~isolate~~ from link to Exp Sharing
jorgeorpinel Aug 18, 2021
7463e85
Update content/docs/user-guide/experiment-management/sharing-experime…
jorgeorpinel Aug 18, 2021
2744a97
guide: mention only SSH Git URLs support exp sharing
jorgeorpinel Aug 18, 2021
fb4663c
Merge branch 'master' into guide/exps
jorgeorpinel Aug 18, 2021
836f0a5
Merge branch 'guide/exps' into guide/exps-sharing
jorgeorpinel Aug 18, 2021
2e33799
guide: update dvc remote example in sharing exps
jorgeorpinel Aug 18, 2021
c60b5fe
yarn format some files
jorgeorpinel Aug 18, 2021
d1422b1
guide: consolidate Exp Sharing intro (#2711)
jorgeorpinel Aug 18, 2021
532df56
Merge branch 'master' into guide/exps
jorgeorpinel Aug 18, 2021
a53f7db
Merge branch 'guide/exps' into guide/exps-sharing
jorgeorpinel Aug 18, 2021
3e5a3a8
Merge branch 'guide/exps-sharing' into guide/exps-sharing-examples
jorgeorpinel Aug 18, 2021
8b6a3f6
prettier sharing-experiments.md
jorgeorpinel Aug 18, 2021
209e848
Update content/docs/user-guide/experiment-management/sharing-experime…
jorgeorpinel Aug 20, 2021
fd44d05
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Oct 8, 2021
1237674
guide: roll back wrong files
jorgeorpinel Oct 8, 2021
ecbc8cb
guide: roll back Exp Mgmt index...
jorgeorpinel Oct 8, 2021
5f1f8a8
guide: link to Sharing Exps from index
jorgeorpinel Oct 8, 2021
fdd38e2
guide: Listing exps on remotes
jorgeorpinel Oct 9, 2021
1c46961
guide: don't mention Git here...
jorgeorpinel Oct 9, 2021
0f1b7ef
guide: clarify that git is needed for exps and sharing
jorgeorpinel Oct 9, 2021
bcac63f
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Oct 11, 2021
dcd6986
guide: clarify note on Git requirement for DVC Exps
jorgeorpinel Oct 11, 2021
5e8dc5e
guide: simplify Sharing Exps intro (rel Git)
jorgeorpinel Oct 11, 2021
5b3306a
guide: rename exp list -r section
jorgeorpinel Oct 11, 2021
7243b13
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Oct 17, 2021
003d38a
copy edit
jorgeorpinel Oct 17, 2021
d2595aa
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Oct 27, 2021
69476ed
cases: simplify note about requiring Git
jorgeorpinel Oct 27, 2021
f7c94a6
guide: emoji for example in Sharing Exps
jorgeorpinel Oct 27, 2021
ad1f508
guide: clarify note about Git-DVC repo required for Exps
jorgeorpinel Oct 27, 2021
3bff416
Update content/docs/user-guide/experiment-management/sharing-experime…
jorgeorpinel Oct 27, 2021
bf6a58b
guide: another example emoji en Sharing Exps
jorgeorpinel Oct 27, 2021
9bb3a2a
Restyled by prettier (#2972)
restyled-io[bot] Oct 27, 2021
33fe7e6
Merge branch 'guide/exps-sharing' of github.com:iterative/dvc.org int…
jorgeorpinel Oct 27, 2021
4e3e5b9
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Nov 1, 2021
9f66ee1
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Nov 4, 2021
32ccd25
guide: list exps in Comparing guide, linked from Sharing
jorgeorpinel Nov 4, 2021
73fccdd
guide: address feedback from
jorgeorpinel Nov 7, 2021
dad7052
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Nov 8, 2021
20943b5
guide: rephrase Git history exps org
jorgeorpinel Nov 8, 2021
9785f80
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Nov 10, 2021
fef15fe
guide:address Exp sharing feedback from
jorgeorpinel Nov 11, 2021
c0876ef
guide: update Git remote auth limitation wording
jorgeorpinel Nov 11, 2021
f2e6f8c
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Nov 13, 2021
0552ef8
guide: more copy edits on Exp Sharing and Comparing
jorgeorpinel Nov 15, 2021
2886a3b
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Nov 17, 2021
15b93d1
guide: clarify `exp list` remote info
jorgeorpinel Nov 17, 2021
8fb73c3
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Dec 3, 2021
789dad3
guide: un0hide exp sharing details
jorgeorpinel Dec 3, 2021
e58b4bf
guide: move multi-exp share example to how-to
jorgeorpinel Dec 3, 2021
429733d
guide: simplify Exp Sharing intro, add diagram
jorgeorpinel Dec 3, 2021
a3b0541
guide: fix SSH URLS link in Exp Sharing...
jorgeorpinel Dec 3, 2021
06748f7
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Dec 15, 2021
ebef902
exp: roll back unrelated changes
jorgeorpinel Dec 15, 2021
15cb714
guide: Git -> Git remote
jorgeorpinel Dec 16, 2021
675aace
guide: improve Sharing exp intro
jorgeorpinel Dec 17, 2021
a9988b0
exp push/pull: remove --remote and --jobs details from guide and ref …
jorgeorpinel Dec 17, 2021
3b7a7e9
guide: remove Sharing Exps example
jorgeorpinel Dec 17, 2021
c7c1186
Merge branch 'master' into guide/exps-sharing
jorgeorpinel Dec 22, 2021
0fb52cb
guide: simplify Sharing Exps intro
jorgeorpinel Dec 22, 2021
d6de4f5
guide: add exp pull to diagram in Sharing Exps
jorgeorpinel Dec 22, 2021
8 changes: 4 additions & 4 deletions content/docs/command-reference/exp/pull.md
@@ -17,8 +17,10 @@ positional arguments:

## Description

The `dvc exp push` and `dvc exp pull` commands are the means for sharing
experiments across <abbr>repository</abbr> copies via Git (and DVC) remotes.
The `dvc exp push` and `dvc exp pull` commands are the means for [sharing
experiments] across <abbr>repository</abbr> copies via Git and DVC remotes.

[sharing experiments]: /doc/user-guide/experiment-management/sharing-experiments

> Plain `git push` and `git fetch` don't work with experiments because these are
> saved under custom Git references. See **How does DVC track experiments?** in
@@ -35,8 +37,6 @@ your local experiments.
By default, this command will also try to [pull](/doc/command-reference/pull)
all <abbr>cached</abbr> data associated with the experiment to DVC
[remote storage](/doc/command-reference/remote), unless `--no-cache` is used.
The default remote is used (see `dvc remote default`) unless a specific one is
given with `--remote`.

> 💡 Note that `git push <git_remote> --delete <experiment>` can be used to
> delete a pushed experiment.
8 changes: 4 additions & 4 deletions content/docs/command-reference/exp/push.md
@@ -17,8 +17,10 @@ positional arguments:

## Description

The `dvc exp push` and `dvc exp pull` commands are the means for sharing
experiments across <abbr>repository</abbr> copies via Git (and DVC) remotes.
The `dvc exp push` and `dvc exp pull` commands are the means for [sharing
experiments] across <abbr>repository</abbr> copies via Git and DVC remotes.

[sharing experiments]: /doc/user-guide/experiment-management/sharing-experiments

> Plain `git push` and `git fetch` don't work with experiments because these are
> saved under custom Git references. See **How does DVC track experiments?** in
@@ -35,8 +37,6 @@ to see experiments in the remote.
This command will also try to [push](/doc/command-reference/push) all
<abbr>cached</abbr> data associated with the experiment to DVC
[remote storage](/doc/command-reference/remote), unless `--no-cache` is used.
The default remote is used (see `dvc remote default`) unless a specific one is
given with `--remote`.

## Options

@@ -41,8 +41,9 @@ refs/tags/baseline-experiment:
cnn-64
```

This command lists remote experiments originated from `HEAD`. You can add any
other options to the remote command, including `--all` (see previous section).
This command lists remote experiments based on that repo's `HEAD`. You can use
`--all` to list all experiments, or add any other supported option to the remote
`dvc exp list` command.

[shared]: /doc/user-guide/experiment-management/sharing-experiments

216 changes: 57 additions & 159 deletions content/docs/user-guide/experiment-management/sharing-experiments.md
@@ -1,199 +1,97 @@
# Sharing Experiments

There are two types of remotes that can store experiments. Git remotes are
distributed copies of the Git repository, for example on GitHub or GitLab.
In a regular Git workflow, <abbr>DVC repository</abbr> versions are typically
synchronized among team members. And [DVC Experiments] are internally connected
to this commit history. But to avoid cluttering everyone's copies of the repo,
by default experiments will only exist in the local environment where they were
[created].

[DVC remotes](/doc/command-reference/remote) on the other hand are
storage-specific locations (e.g. Amazon S3 or Google Drive) which we can
configure with `dvc remote`. DVC uses them to store and fetch large files that
don't normally fit inside Git repos.
You must explicitly save or share experiments individually on other locations.
This is done similarly to [sharing regular project versions], by synchronizing
with DVC and Git remotes. But DVC takes care of pushing and pulling to/from Git
remotes in the case of experiments.

DVC needs both kinds of remotes for backing up and sharing experiments.
```
┌────────────────┐      ┌────────────────┐
├────────────────┤      │                │  Remote locations
│   DVC remote   │      │   Git remote   │
│    storage     │      ├────────────────┤
└────────────────┘      └────────────────┘
        ▲                       ▲
        │     dvc exp push      │
        │     dvc exp pull      │
        ▼                       ▼
┌─────────────────┐     ┌────────────────┐
│                 │     │    Code and    │
│   Cached data   │     │   metafiles    │  Local project
└─────────────────┘     └────────────────┘
```

Experiment files that are normally tracked in Git (like code versions) are
shared using Git remotes, and files or directories tracked with DVC (like
datasets) are shared using DVC remotes.
> Specifically, data, models, etc. are tracked and <abbr>cached</abbr> by DVC
> and thus will be transferred to/from
> [remote storage](/doc/command-reference/remote) (e.g. Amazon S3 or Google
> Drive). Small files like [DVC metafiles](/doc/user-guide/project-structure)
> and code are tracked by Git, so DVC pushes and pulls them to/from your
> existing [Git remotes].

> See [Git remotes guide] and `dvc remote add` for information on setting them
> up.
[dvc experiments]: /doc/user-guide/experiment-management/experiments-overview
[created]: /doc/user-guide/experiment-management/running-experiments
[sharing regular project versions]: /doc/use-cases/sharing-data-and-model-files
[git remotes]: https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes

[git remotes guide]:
https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes
## Preparation

Normally, there should already be a Git remote called `origin` when you clone a
repo. Use `git remote -v` to list your Git remotes:
Make sure that you have the necessary remotes set up. Let's confirm with
`git remote -v` and `dvc remote list`:

```dvc
$ git remote -v
origin https://github.com/iterative/example-dvc-experiments (fetch)
origin https://github.com/iterative/example-dvc-experiments (push)
```

Similarly, you can see the DVC remotes in your project using `dvc remote list`:
origin git@github.com:iterative/get-started-experiments.git (fetch)
origin git@github.com:iterative/get-started-experiments.git (push)

```dvc
$ dvc remote list
storage https://remote.dvc.org/example-dvc-experiments
```

## Uploading experiments to remotes

You can upload an experiment and its files to both remotes using `dvc exp push`
(requires the Git remote name and experiment name as arguments).

```dvc
$ dvc exp push origin exp-abc123
storage s3://mybucket/my-dvc-store
```
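
The check described in this section could be scripted. Below is a minimal sketch, assuming that `git remote` and `dvc remote list` print nothing when no remote is configured; `check_remotes` is a hypothetical helper name, not a DVC command.

```shell
#!/bin/sh
# check_remotes: hypothetical helper (not part of Git or DVC).
# Succeeds only if at least one Git remote and one DVC remote are listed.
check_remotes() {
  git_remotes=$(git remote) || return 1
  dvc_remotes=$(dvc remote list) || return 1
  if [ -z "$git_remotes" ]; then
    echo "No Git remote found (see 'git remote add')" >&2
    return 1
  fi
  if [ -z "$dvc_remotes" ]; then
    echo "No DVC remote found (see 'dvc remote add')" >&2
    return 1
  fi
  echo "Git and DVC remotes are configured"
}
```

Running `check_remotes` from the project root before `dvc exp push` or `dvc exp pull` gives an early, readable failure instead of an error halfway through a transfer.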

> Use `dvc exp show` to find experiment names.

This pushes the necessary DVC-tracked files from the cache to the default DVC
remote (similar to `dvc push`). You can prevent this behavior by using the
`--no-cache` option to the command above.

If there's no default DVC remote, it will ask you to define one with
`dvc remote default`. If you don't want a default remote, or if you want to use
a different remote, you can specify one with the `--remote` (`-r`) option.

DVC can use multiple threads to upload files (4 per CPU core by default). You
can set the number with `--jobs` (`-j`). Please note that increases in
performance also depend on the connection bandwidth and remote configurations.
> ⚠️ Note that DVC can only authenticate with Git remotes using [SSH URLs].

> 📖 See also the [run-cache] mechanism.
[ssh urls]:
https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols#_the_protocols
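
Given the SSH limitation noted above, it may help to check a Git remote URL before sharing. The following is a rough heuristic sketch, not DVC's own logic: `is_ssh_url` is a hypothetical helper that treats `ssh://` URLs and the scp-like `user@host:path` form as SSH.

```shell
#!/bin/sh
# is_ssh_url: hypothetical helper (a heuristic, not how DVC decides).
# Returns 0 for ssh:// URLs and scp-like user@host:path URLs, 1 otherwise.
is_ssh_url() {
  case "$1" in
    ssh://*) return 0 ;;
    *://*)   return 1 ;;  # any other scheme, e.g. https://
    *@*:*)   return 0 ;;  # scp-like syntax: user@host:path
    *)       return 1 ;;
  esac
}
```

For example, `is_ssh_url "$(git remote get-url origin)" || echo "origin is not an SSH URL"` would flag an HTTPS remote such as the one in the older example.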

[run-cache]: /doc/user-guide/project-structure/internal-files#run-cache
## Uploading experiments

## Listing experiments remotely
You can upload an experiment with all of its files and data using
`dvc exp push`, which takes a Git remote name and an experiment ID or name as
arguments.

In order to list experiments in a DVC project, you can use the `dvc exp list`
command. With no command line options, it lists the experiments in the current
project.

You can supply a Git remote name to list the experiments:
> 💡 You can use `dvc exp show` to find experiment names.

```dvc
$ dvc exp list origin
main:
cnn-128
cnn-32
cnn-64
cnn-96
$ dvc exp push origin exp-abc123
```
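
To confirm that a push reached the Git remote, the push can be followed by a remote listing. This is a sketch under the assumption that `dvc exp list <git_remote> --all --names-only` prints one experiment name per line; `push_and_verify` is a hypothetical helper name.

```shell
#!/bin/sh
# push_and_verify: hypothetical helper (not part of DVC).
# Usage: push_and_verify <git_remote> <experiment>
# Pushes one experiment, then checks that its name is listed on the remote.
push_and_verify() {
  remote=$1
  expname=$2
  dvc exp push "$remote" "$expname" || return 1
  # -F: fixed string, -x: match the whole line, -q: quiet
  dvc exp list "$remote" --all --names-only | grep -Fxq -- "$expname"
}
```

For instance, `push_and_verify origin exp-abc123 && echo "shared"` prints `shared` only when the experiment name shows up in the remote listing.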

Note that by default this only lists experiments derived from the current commit
(local `HEAD` or default remote branch). You can list all the experiments
(derived from every branch and commit) with the `--all` option:
Once pushed, you can easily [list remote experiments] (with `dvc exp list`). To
pus

```dvc
$ dvc exp list origin --all
0b5bedd:
exp-9edbe
0f73830:
exp-280e9
exp-4cd96
...
main:
cnn-128
...
```
> See also [How to Share Many Experiments][share many].

When you don't need to see the parent commits, you can list experiment names
only, with `--names-only`:

```dvc
$ dvc exp list origin --names-only
cnn-128
cnn-32
cnn-64
cnn-96
```
[list remote experiments]:
/doc/user-guide/experiment-management/comparing-experiments#list-experiments-saved-remotely
[share many]: /doc/user-guide/how-to/share-many-experiments

## Downloading experiments from remotes
## Downloading experiments

When you clone a DVC repository, it doesn't fetch any experiments by default. In
order to get them, use `dvc exp pull` (with the Git remote and the experiment
name), for example:

```dvc
$ dvc exp pull origin cnn-64
$ dvc exp pull origin cnn-32
```

This pulls all the necessary files from both remotes. Again, you need to have
both of these configured (see this
[earlier section](#prepare-remotes-to-share-experiments)).

You can specify a remote to pull from with `--remote` (`-r`).

DVC can use multiple threads to download files (4 per CPU core typically). You
can set the number with `--jobs` (`-j`).

If an experiment being pulled already exists in the local project, DVC won't
overwrite it unless you supply `--force`.

### Example: Pushing or pulling multiple experiments

You can create a loop to upload or download all experiments like this:

```dvc
$ dvc exp list --all --names-only | while read -r expname ; do \
    dvc exp pull origin "${expname}" ; \
done
```

> Without `--all`, only the experiments derived from the current commit will be
> pushed/pulled.

## Example: Creating a directory for an experiment

A good way to isolate experiments is to create a separate home directory for
each one.

> Another alternative is to use `dvc exp apply` and `dvc exp branch`, but here
> we'll see how to use `dvc exp pull` to copy an experiment.

Suppose there is a <abbr>DVC repository</abbr> in `~/my-project` with multiple
experiments. Let's create a copy of experiment `exp-abc12` from there.

First, clone the repo into another directory:

```dvc
$ git clone ~/my-project ~/my-experiment
$ cd ~/my-experiment
```

Git sets the `origin` remote of the cloned repo to `~/my-project`, so you can
see all your experiments from `~/my-experiment` like this:

```dvc
$ dvc exp list origin
main:
exp-abc12
...
```

If there is no DVC remote in the original repository, you can define its
<abbr>cache</abbr> as the clone's `dvc remote`:

```dvc
$ dvc remote add --local --default storage ~/my-project/.dvc/cache
```

> ⚠️ `--local` is important here, so that the configuration change doesn't get
> to the original repo accidentally.

If there's a DVC remote for the project, assuming the experiments have been
pushed there, you can pull the one in question:

```dvc
$ dvc exp pull origin exp-abc12
```

Then we can `dvc exp apply` this experiment and get a <abbr>workspace</abbr>
that contains all of its files:

```dvc
$ dvc exp apply exp-abc12
```

Now you have a dedicated directory for your experiment, containing all its
artifacts!
19 changes: 19 additions & 0 deletions content/docs/user-guide/how-to/share-many-experiments.md
@@ -0,0 +1,19 @@
# How to Share Many Experiments

`dvc exp push` and `dvc exp pull` allow us to [share experiments] between
repositories via existing DVC and Git remotes. However, each command works on
one experiment at a time.

Here's a simple shell loop to push or pull all experiments (Linux):

```dvc
$ dvc exp list --all --names-only | while read -r expname ; do \
    dvc exp pull origin "${expname}" ; \
done
```

> 📖 See [Listing Experiments] for more info on `dvc exp list`.

[share experiments]: /doc/user-guide/experiment-management/sharing-experiments
[listing experiments]:
/doc/user-guide/experiment-management/comparing-experiments#list-experiments-in-the-project
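
The loop above can also be wrapped in a small reusable function. The following is a sketch rather than an official recipe: `share_all` is a hypothetical helper that reads experiment names from standard input (one per line) and runs `dvc exp push` or `dvc exp pull` for each.

```shell
#!/bin/sh
# share_all: hypothetical helper (not part of DVC).
# Usage: <command printing experiment names> | share_all push|pull <git_remote>
share_all() {
  direction=$1
  remote=$2
  case "$direction" in
    push|pull) ;;
    *) echo "usage: share_all push|pull <git_remote>" >&2; return 2 ;;
  esac
  failed=0
  while IFS= read -r expname; do
    # Skip blank lines in the input
    [ -n "$expname" ] || continue
    # Redirect stdin so dvc cannot consume the remaining names;
    # remember any failure but keep going
    dvc exp "$direction" "$remote" "$expname" </dev/null || failed=1
  done
  return "$failed"
}
```

To push, the names would come from the local project: `dvc exp list --all --names-only | share_all push origin`. To pull, they would come from the Git remote: `dvc exp list origin --all --names-only | share_all pull origin`. The function returns non-zero if any single transfer failed.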