Commit
* guide: split Experiments (index) into sub-pages * case: keep Persistent Exps in basic page * cases: keep Run-cache in basic Exps page * guide: edit Exp Mgmt index (intro) * guide: edit basic Exps page inc. persisting them and move run-cache to guide intro (index) * guide: rename DVC Exps, remove Org Exps page * guide: bash -> dvc in EM/Checkpoints * guide: fix exps link * guide: summarize Sharing Exps intro * ref: link from exp push/pull to Exp Sharing guide * Update content/docs/user-guide/experiment-management/sharing-experiments.md * guide: rename Exp Sharing sections * guide: summarize Exp Sharing examples * guide: link from Exp Mgmt index to Sharing * guide: ~~isolate~~ from link to Exp Sharing per #2711 (review) * Update content/docs/user-guide/experiment-management/sharing-experiments.md Co-authored-by: David de la Iglesia Castro <[email protected]> * guide: mention only SSH Git URLs support exp sharing per #2711 (review) * guide: update dvc remote example in sharing exps * yarn format some files per https://app.circleci.com/pipelines/github/iterative/dvc.org/10086/workflows/9b1bf89f-a432-49f2-9a20-72fe77dd4102/jobs/10145 * guide: consolidate Exp Sharing intro (#2711) * guide: summarize Sharing Exps intro * ref: link from exp push/pull to Exp Sharing guide * Update content/docs/user-guide/experiment-management/sharing-experiments.md * guide: link from Exp Mgmt index to Sharing * guide: ~~isolate~~ from link to Exp Sharing per #2711 (review) * Update content/docs/user-guide/experiment-management/sharing-experiments.md Co-authored-by: David de la Iglesia Castro <[email protected]> * guide: mention only SSH Git URLs support exp sharing per #2711 (review) * guide: update dvc remote example in sharing exps * yarn format some files per https://app.circleci.com/pipelines/github/iterative/dvc.org/10086/workflows/9b1bf89f-a432-49f2-9a20-72fe77dd4102/jobs/10145 Co-authored-by: David de la Iglesia Castro <[email protected]> * prettier sharing-experiments.md * Update 
content/docs/user-guide/experiment-management/sharing-experiments.md Co-authored-by: Casper da Costa-Luis <[email protected]> * guide: roll back wrong files * guide: roll back Exp Mgmt index... * guide: link to Sharing Exps from index * guide: Listing exps on remotes per #2908 (review) * guide: don't mention Git here... per #2908 (review) * guide: clarify that git is needed for exps and sharing per #2908 (review) * guide: clarify note on Git requirement for DVC Exps per #2908 (review) * guide: simplify Sharing Exps intro (rel Git) per #2908 (review) * guide: rename exp list -r section * copy edit * cases: simplify note about requiring Git per #2908 (review) * guide: emoji for example in Sharing Exps per #2908 (comment) * guide: clarify note about Git-DVC repo required for Exps per #2908 (review) * Update content/docs/user-guide/experiment-management/sharing-experiments.md * guide: another example emoji en Sharing Exps * Restyled by prettier (#2972) Co-authored-by: Restyled.io <[email protected]> * guide: list exps in Comparing guide, linked from Sharing per #2908 (comment) * guide: address feedback from #2908 (review) and below * guide: rephrase Git history exps org per #2908 (review) * guide:address Exp sharing feedback from from #2908 (review) and below * guide: update Git remote auth limitation wording per #2908 (comment) * guide: more copy edits on Exp Sharing and Comparing * guide: clarify `exp list` remote info per #2908 (review) * guide: un0hide exp sharing details per #2908 (review) * guide: move multi-exp share example to how-to per #2908 (review) * guide: simplify Exp Sharing intro, add diagram per should be focusing more on explaining (in simple terms, with diagrams) how it works * guide: fix SSH URLS link in Exp Sharing... * exp: roll back unrelated changes * guide: Git -> Git remote per #2908 (review) * guide: improve Sharing exp intro per #2908 (review) * exp push/pull: remove --remote and --jobs details from guide and ref descs. rel. 
#2908 (comment) * guide: remove Sharing Exps example per #2908 (comment) * guide: simplify Sharing Exps intro per #2908 (review) * guide: add exp pull to diagram in Sharing Exps per #2908 (comment) Co-authored-by: David de la Iglesia Castro <[email protected]> Co-authored-by: Casper da Costa-Luis <[email protected]> Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com> Co-authored-by: Restyled.io <[email protected]>
1 parent
33a303c
commit 3fab9cd
Showing
5 changed files
with
87 additions
and
169 deletions.
216 changes: 57 additions & 159 deletions
content/docs/user-guide/experiment-management/sharing-experiments.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,199 +1,97 @@ | ||
# Sharing Experiments | ||
|
||
There are two types of remotes that can store experiments. Git remotes are | ||
distributed copies of the Git repository, for example on GitHub or GitLab. | ||
In a regular Git workflow, <abbr>DVC repository</abbr> versions are typically | ||
synchronized among team members. And [DVC Experiments] are internally connected | ||
to this commit history. But to avoid cluttering everyone's copies of the repo, | ||
by default experiments will only exist in the local environment where they were | ||
[created]. | ||
|
||
[DVC remotes](/doc/command-reference/remote) on the other hand are | ||
storage-specific locations (e.g. Amazon S3 or Google Drive) which we can | ||
configure with `dvc remote`. DVC uses them to store and fetch large files that | ||
don't normally fit inside Git repos. | ||
You must explicitly save or share experiments individually on other locations. | ||
This is done similarly to [sharing regular project versions], by synchronizing | ||
with DVC and Git remotes. But DVC takes care of pushing and pulling to/from Git | ||
remotes in the case of experiments. | ||
|
||
DVC needs both kinds of remotes for backing up and sharing experiments. | ||
``` | ||
┌────────────────┐ ┌────────────────┐ | ||
├────────────────┤ │ │ Remote locations | ||
│ DVC remote │ │ Git remote │ | ||
│ storage │ ├────────────────┤ | ||
└────────────────┘ └────────────────┘ | ||
▲ ▲ | ||
│ dvc exp push │ | ||
│ dvc exp pull │ | ||
▼ ▼ | ||
┌─────────────────┐ ┌────────────────┐ | ||
│ │ │ Code and │ | ||
│ Cached data │ │ metafiles │ Local project | ||
└─────────────────┘ └────────────────┘ | ||
``` | ||
|
||
Experiment files that are normally tracked in Git (like code versions) are | ||
shared using Git remotes, and files or directories tracked with DVC (like | ||
datasets) are shared using DVC remotes. | ||
> Specifically, data, models, etc. are tracked and <abbr>cached</abbr> by DVC | ||
> and thus will be transferred to/from | ||
> [remote storage](/doc/command-reference/remote) (e.g. Amazon S3 or Google | ||
> Drive). Small files like [DVC metafiles](/doc/user-guide/project-structure) | ||
> and code are tracked by Git, so DVC pushes and pulls them to/from your | ||
> existing [Git remotes]. | ||
> See [Git remotes guide] and `dvc remote add` for information on setting them | ||
> up. | ||
[dvc experiments]: /doc/user-guide/experiment-management/experiments-overview | ||
[created]: /doc/user-guide/experiment-management/running-experiments | ||
[sharing regular project versions]: /doc/use-cases/sharing-data-and-model-files | ||
[git remotes]: https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes | ||
|
||
[git remotes guide]: | ||
https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes | ||
## Preparation | ||
|
||
Normally, there should already be a Git remote called `origin` when you clone a | ||
repo. Use `git remote -v` to list your Git remotes: | ||
Make sure that you have the necessary remotes set up. Let's confirm with | ||
`git remote -v` and `dvc remote list`: | ||
|
||
```dvc | ||
$ git remote -v | ||
origin https://github.com/iterative/example-dvc-experiments (fetch) | ||
origin https://github.com/iterative/example-dvc-experiments (push) | ||
``` | ||
|
||
Similarly, you can see the DVC remotes in your project using `dvc remote list`: | ||
origin [email protected]:iterative/get-started-experiments.git (fetch) | ||
origin [email protected]:iterative/get-started-experiments.git (push) | ||
```dvc | ||
$ dvc remote list | ||
storage https://remote.dvc.org/example-dvc-experiments | ||
``` | ||
|
||
## Uploading experiments to remotes | ||
|
||
You can upload an experiment and its files to both remotes using `dvc exp push` | ||
(requires the Git remote name and experiment name as arguments). | ||
|
||
```dvc | ||
$ dvc exp push origin exp-abc123 | ||
storage s3://mybucket/my-dvc-store | ||
``` | ||
|
||
> Use `dvc exp show` to find experiment names. | ||
This pushes the necessary DVC-tracked files from the cache to the default DVC | ||
remote (similar to `dvc push`). You can prevent this behavior by using the | ||
`--no-cache` option to the command above. | ||
|
||
If there's no default DVC remote, it will ask you to define one with | ||
`dvc remote default`. If you don't want a default remote, or if you want to use | ||
a different remote, you can specify one with the `--remote` (`-r`) option. | ||
|
||
DVC can use multiple threads to upload files (4 per CPU core by default). You | ||
can set the number with `--jobs` (`-j`). Please note that increases in | ||
performance also depend on the connection bandwidth and remote configurations. | ||
> ⚠️ Note that DVC can only authenticate with Git remotes using [SSH URLs]. | ||
> 📖 See also the [run-cache] mechanism. | ||
[ssh urls]: | ||
https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols#_the_protocols | ||
|
||
[run-cache]: /doc/user-guide/project-structure/internal-files#run-cache | ||
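The preparation step above can be sketched as a small pre-flight check. This is only an illustration: `check_remotes` is a hypothetical helper name (not part of DVC or Git), and it assumes that `git remote` prints one remote name per line and that `dvc remote list` prints nothing when no DVC remote is configured.

```shell
# Hypothetical helper (not a DVC command): verify that the given Git remote
# exists and that at least one DVC remote is configured.
check_remotes() {
  git_remote="$1"   # Git remote name, e.g. origin

  # `git remote` lists remote names, one per line; match the whole line.
  if ! git remote | grep -qx -- "$git_remote"; then
    echo "missing Git remote: $git_remote" >&2
    return 1
  fi

  # Assumption: `dvc remote list` produces no output when nothing is set up.
  if [ -z "$(dvc remote list)" ]; then
    echo "no DVC remote configured" >&2
    return 1
  fi
}

# Usage (inside a DVC repository):
#   check_remotes origin && dvc exp push origin exp-abc123
```

The check only confirms that remotes are defined; it does not verify credentials or that the Git remote uses an SSH URL.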
## Uploading experiments | ||
|
||
## Listing experiments remotely | ||
You can upload an experiment with all of its files and data using | ||
`dvc exp push`, which takes a Git remote name and an experiment ID or name as | ||
arguments. | ||
|
||
In order to list experiments in a DVC project, you can use the `dvc exp list` | ||
command. With no command line options, it lists the experiments in the current | ||
project. | ||
|
||
You can supply a Git remote name to list the experiments: | ||
> 💡 You can use `dvc exp show` to find experiment names. | ||
```dvc | ||
$ dvc exp list origin | ||
main: | ||
cnn-128 | ||
cnn-32 | ||
cnn-64 | ||
cnn-96 | ||
$ dvc exp push origin exp-abc123 | ||
``` | ||
|
||
Note that by default this only lists experiments derived from the current commit | ||
(local `HEAD` or default remote branch). You can list all the experiments | ||
(derived from every branch and commit) with the `--all` option: | ||
Once pushed, you can easily [list remote experiments] (with `dvc exp list`). To | ||
pus | ||
|
||
```dvc | ||
$ dvc exp list origin --all | ||
0b5bedd: | ||
exp-9edbe | ||
0f73830: | ||
exp-280e9 | ||
exp-4cd96 | ||
... | ||
main: | ||
cnn-128 | ||
... | ||
``` | ||
> See also [How to Share Many Experiments][share many]. | ||
When you don't need to see the parent commits, you can list experiment names | ||
only, with `--names-only`: | ||
|
||
```dvc | ||
$ dvc exp list origin --names-only | ||
cnn-128 | ||
cnn-32 | ||
cnn-64 | ||
cnn-96 | ||
``` | ||
[list remote experiments]: | ||
/doc/user-guide/experiment-management/comparing-experiments#list-experiments-saved-remotely | ||
[share many]: /doc/user-guide/how-to/share-many-experiments | ||
|
||
## Downloading experiments from remotes | ||
## Downloading experiments | ||
|
||
When you clone a DVC repository, it doesn't fetch any experiments by default. In | ||
order to get them, use `dvc exp pull` (with the Git remote and the experiment | ||
name), for example: | ||
|
||
```dvc | ||
$ dvc exp pull origin cnn-64 | ||
$ dvc exp pull origin cnn-32 | ||
``` | ||
|
||
This pulls all the necessary files from both remotes. Again, you need to have | ||
both of these configured (see this | ||
[earlier section](#prepare-remotes-to-share-experiments)). | ||
|
||
You can specify a remote to pull from with `--remote` (`-r`). | ||
|
||
DVC can use multiple threads to download files (4 per CPU core typically). You | ||
can set the number with `--jobs` (`-j`). | ||
|
||
If an experiment being pulled already exists in the local project, DVC won't | ||
overwrite it unless you supply `--force`. | ||
|
||
### Example: Pushing or pulling multiple experiments | ||
|
||
You can create a loop to upload or download all experiments like this: | ||
|
||
```dvc | ||
$ dvc exp list --all --names-only | while read -r expname ; do \ | ||
dvc exp pull origin ${expname} ; \ | ||
done | ||
``` | ||
|
||
> Without `--all`, only the experiments derived from the current commit will be | ||
> pushed/pulled. | ||
## Example: Creating a directory for an experiment | ||
|
||
A good way to isolate experiments is to create a separate home directory for | ||
each one. | ||
|
||
> Another alternative is to use `dvc exp apply` and `dvc exp branch`, but here | ||
> we'll see how to use `dvc exp pull` to copy an experiment. | ||
Suppose there is a <abbr>DVC repository</abbr> in `~/my-project` with multiple | ||
experiments. Let's create a copy of experiment `exp-abc12` from there. | ||
|
||
First, clone the repo into another directory: | ||
|
||
```dvc | ||
$ git clone ~/my-project ~/my-experiment | ||
$ cd ~/my-experiment | ||
``` | ||
|
||
Git sets the `origin` remote of the cloned repo to `~/my-project`, so you can | ||
see all your experiments from `~/my-experiment` like this: | ||
|
||
```dvc | ||
$ dvc exp list origin | ||
main: | ||
exp-abc12 | ||
... | ||
``` | ||
|
||
If there is no DVC remote in the original repository, you can define its | ||
<abbr>cache</abbr> as the clone's `dvc remote`: | ||
|
||
```dvc | ||
$ dvc remote add --local --default storage ~/my-project/.dvc/cache | ||
``` | ||
|
||
> ⚠️ `--local` is important here, so that the configuration change doesn't get | ||
> to the original repo accidentally. | ||
If there's a DVC remote for the project, assuming the experiments have been | ||
pushed there, you can pull the one in question: | ||
|
||
```dvc | ||
$ dvc exp pull origin exp-abc12 | ||
``` | ||
|
||
Then we can `dvc exp apply` this experiment and get a <abbr>workspace</abbr> that | ||
contains all of its files: | ||
|
||
```dvc | ||
$ dvc exp apply exp-abc12 | ||
``` | ||
|
||
Now you have a dedicated directory for your experiment, containing all its | ||
artifacts! |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# How to Share Many Experiments | ||
|
||
`dvc exp push` and `dvc exp pull` allow us to [share experiments] between | ||
repositories via existing DVC and Git remotes. These however work on individual | ||
experiments. | ||
|
||
Here's a simple shell loop to push or pull all experiments (Linux): | ||
|
||
```dvc | ||
$ dvc exp list --all --names-only | while read -r expname ; do \ | ||
dvc exp pull origin ${expname} ; \ | ||
done | ||
``` | ||
|
||
> 📖 See [Listing Experiments] for more info on `dvc exp list`. | ||
[share experiments]: /doc/user-guide/experiment-management/sharing-experiments | ||
[listing experiments]: | ||
/doc/user-guide/experiment-management/comparing-experiments#list-experiments-in-the-project |
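The how-to above shows the loop only for pulling. A minimal sketch of the same idea, covering both directions, might look like this. `share_all` is a hypothetical helper name (not a DVC command); it reads experiment names from standard input and runs `dvc exp push` or `dvc exp pull` for each one, stopping at the first failure.

```shell
# Hypothetical helper (not a DVC command): push or pull every experiment
# whose name arrives on standard input, one name per line.
share_all() {
  action="$1"   # "push" or "pull"
  remote="$2"   # Git remote name, e.g. origin

  while read -r expname; do
    # Skip blank lines in the listing.
    [ -n "$expname" ] || continue
    # Redirect stdin so the command cannot consume the remaining names.
    dvc exp "$action" "$remote" "$expname" < /dev/null || return 1
  done
}

# Usage (inside a DVC repository):
#   dvc exp list --all --names-only | share_all push origin
#   dvc exp list origin --all --names-only | share_all pull origin
```

When pulling, listing from the Git remote (`dvc exp list origin ...`) is probably what you want, since experiments that exist only on the remote would not appear in the local listing.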