From 9cca53005187884899765bab8a6b7725bf7754c2 Mon Sep 17 00:00:00 2001 From: dberenbaum Date: Fri, 21 Jul 2023 16:44:48 -0400 Subject: [PATCH 1/5] dvclive: lightning log_model --- .../ml-frameworks/pytorch-lightning.md | 48 ++++++++++++------- .../start/experiments/experiment-tracking.md | 13 ++--- 2 files changed, 33 insertions(+), 28 deletions(-) diff --git a/content/docs/dvclive/ml-frameworks/pytorch-lightning.md b/content/docs/dvclive/ml-frameworks/pytorch-lightning.md index efff941dec..c813a7b168 100644 --- a/content/docs/dvclive/ml-frameworks/pytorch-lightning.md +++ b/content/docs/dvclive/ml-frameworks/pytorch-lightning.md @@ -61,6 +61,23 @@ checkpointing at all as described in the - `prefix` - (`None` by default) - string that adds to each metric name. +- `log_model` - (`False` by default) - use + [`live.log_artifact()`](/doc/dvclive/live/log_artifact) to log checkpoints + created by + [`ModelCheckpoint`](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html) + and annotate the best checkpoint with `type=model` and `name=best` for use in + [Studio model registry]. DVCLive will cache the checkpoint + directory and delete checkpoints from previous experiments, but + you can recover the checkpoints from any experiment using DVC. + + - if `log_model == 'all'`, checkpoints are logged during training. + + - if `log_model == True`, checkpoints are logged at the end of training, + except when `save_top_k == -1` which also logs every checkpoint during + training. + + - if `log_model == False` (default), no checkpoint is logged. + - `experiment` - (`None` by default) - [`Live`](/doc/dvclive/live) object to be used instead of initializing a new one. @@ -69,6 +86,17 @@ checkpointing at all as described in the ## Examples +- Using `log_model` to save the checkpoints and annotate the best one for use in + [Studio model registry]. 
+ +```python +from dvclive.lightning import DVCLiveLogger + +trainer = Trainer( + logger=DVCLiveLogger(save_dvc_exp=True, log_model=True)) +trainer.fit(model) +``` + - Using `experiment` to pass an existing [`Live`] instance. ```python @@ -93,24 +121,6 @@ trainer = Trainer( trainer.fit(model) ``` -- Using [`live.log_artifact()`](/doc/dvclive/live/log_artifact) to save the - [best checkpoint](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html). - -```python -with Live(save_dvc_exp=True) as live: - checkpoint = ModelCheckpoint(dirpath="mymodel") - trainer = Trainer( - logger=DVCLiveLogger(experiment=live), - callbacks=checkpoint - ) - trainer.fit(model) - live.log_artifact( - checkpoint.best_model_path, - type="model", - name="lightning-model" - ) -``` - ## Output format Each metric will be logged to: @@ -140,3 +150,5 @@ dvclive/metrics/train/epoch/metric.tsv ``` [`live`]: /doc/dvclive/live +[studio model registry]: + /doc/studio/user-guide/model-registry/what-is-a-model-registry diff --git a/content/docs/start/experiments/experiment-tracking.md b/content/docs/start/experiments/experiment-tracking.md index efde7dcb3f..9a9c871e36 100644 --- a/content/docs/start/experiments/experiment-tracking.md +++ b/content/docs/start/experiments/experiment-tracking.md @@ -39,20 +39,13 @@ from dvclive import Live from dvclive.lightning import DVCLiveLogger ... 
-with Live(save_dvc_exp=True) as live: - checkpoint = ModelCheckpoint(dirpath="mymodel") trainer = Trainer( logger=DVCLiveLogger( - experiment=live - ), - callbacks=checkpoint + save_dvc_exp=True, + log_model=True + ) ) trainer.fit(model) - live.log_artifact( - checkpoint.best_model_path, - type="model", - name="lightning-model" - ) ``` From c04ee1bb344a96a829022ae3f396bc11f5cf5ae8 Mon Sep 17 00:00:00 2001 From: dberenbaum Date: Sat, 22 Jul 2023 10:16:35 -0400 Subject: [PATCH 2/5] expand lightning log_model examples --- content/docs/dvclive/live/log_artifact.md | 16 +++--- .../ml-frameworks/pytorch-lightning.md | 53 +++++++++++++++---- 2 files changed, 52 insertions(+), 17 deletions(-) diff --git a/content/docs/dvclive/live/log_artifact.md b/content/docs/dvclive/live/log_artifact.md index b0c736766c..752e5fa4b4 100644 --- a/content/docs/dvclive/live/log_artifact.md +++ b/content/docs/dvclive/live/log_artifact.md @@ -36,9 +36,10 @@ with Live() as live: ## Description -Uses `dvc add` to track `path` with DVC, generating a `{path}.dvc` file. When -combined with [`save_dvc_exp=True`](/doc/dvclive#initialize-dvclive), it will -ensure that `{path}.dvc` is included in the experiment. +Uses `dvc add` to [track] `path` with DVC, saving it to the DVC +cache and generating a `{path}.dvc` file. When combined with +[`save_dvc_exp=True`](/doc/dvclive#initialize-dvclive), it will ensure that +`{path}.dvc` is included in the experiment. If `Live` was initialized with `dvcyaml=True` (which is the default) and you include any of the optional metadata fields (`type`, `name`, `desc`, `labels`, @@ -71,12 +72,13 @@ the metadata passed as arguments to the corresponding `dvc.yaml`. Passing artifact. Useful if you don't want to track the original path in your repo (for example, it is outside the repo or in a Git-ignored directory). -- `cache` - cache the files with DVC to - [track](/doc/dvclive/how-it-works#track-large-artifacts-with-dvc) them outside - of Git. 
Defaults to `True`, but set to `False` if you want to annotate - metadata about the artifact without storing a copy in the DVC cache. +- `cache` - cache the files with DVC to [track] them outside of + Git. Defaults to `True`, but set to `False` if you want to annotate metadata + about the artifact without storing a copy in the DVC cache. ## Exceptions - `dvclive.error.InvalidDataTypeError` - thrown if the provided `path` does not have a supported type. + +[track]: /doc/dvclive/how-it-works#track-large-artifacts-with-dvc diff --git a/content/docs/dvclive/ml-frameworks/pytorch-lightning.md b/content/docs/dvclive/ml-frameworks/pytorch-lightning.md index c813a7b168..9438250be8 100644 --- a/content/docs/dvclive/ml-frameworks/pytorch-lightning.md +++ b/content/docs/dvclive/ml-frameworks/pytorch-lightning.md @@ -63,12 +63,8 @@ checkpointing at all as described in the - `log_model` - (`False` by default) - use [`live.log_artifact()`](/doc/dvclive/live/log_artifact) to log checkpoints - created by - [`ModelCheckpoint`](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html) - and annotate the best checkpoint with `type=model` and `name=best` for use in - [Studio model registry]. DVCLive will cache the checkpoint - directory and delete checkpoints from previous experiments, but - you can recover the checkpoints from any experiment using DVC. + created by [`ModelCheckpoint`]. See + [Log model checkpoints](#log-model-checkpoints). - if `log_model == 'all'`, checkpoints are logged during training. @@ -86,17 +82,52 @@ checkpointing at all as described in the ## Examples -- Using `log_model` to save the checkpoints and annotate the best one for use in - [Studio model registry]. +### Log model checkpoints + +Use `log_model` to save the checkpoints. DVCLive will first delete checkpoints +from previous experiments (since DVC tracks checkpoints per +experiment) and then cache the entire checkpoints directory using +`Live.log_artifact()`. 
At the end of training, DVCLive will annotate the +[`best_model_path`][`ModelCheckpoint`] with `type=model` and `name=best` for use +in [Studio model registry]. + +- Save updates to the checkpoints directory at the end of training: ```python from dvclive.lightning import DVCLiveLogger -trainer = Trainer( - logger=DVCLiveLogger(save_dvc_exp=True, log_model=True)) +logger = DVCLiveLogger(save_dvc_exp=True, log_model=True) +trainer = Trainer(logger=logger) +trainer.fit(model) +``` + +- Save updates to the checkpoints directory whenever a new checkpoint is saved: + +```python +from dvclive.lightning import DVCLiveLogger + +logger = DVCLiveLogger(save_dvc_exp=True, log_model="all") +trainer = Trainer(logger=logger) trainer.fit(model) ``` +- Use a custom `ModelCheckpoint`: + +```python +from dvclive.lightning import DVCLiveLogger + +logger = DVCLiveLogger(save_dvc_exp=True, log_model=True) +checkpoint_callback = ModelCheckpoint( + dirpath="model", + monitor="val_acc", + mode="max", +) +trainer = Trainer(logger=logger, callbacks=[checkpoint_callback]) +trainer.fit(model) +``` + +### Passing additional DVCLive arguments + - Using `experiment` to pass an existing [`Live`] instance. 
```python @@ -152,3 +183,5 @@ dvclive/metrics/train/epoch/metric.tsv [`live`]: /doc/dvclive/live [studio model registry]: /doc/studio/user-guide/model-registry/what-is-a-model-registry +[`ModelCheckpoint`]: + https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.ModelCheckpoint.html From f8ee570acc4f1a96ab1b4efa42aa23462c7d0c3f Mon Sep 17 00:00:00 2001 From: dberenbaum Date: Tue, 25 Jul 2023 17:26:55 -0400 Subject: [PATCH 3/5] drop mention of save_dvc_exp in log_artifact --- content/docs/dvclive/live/log_artifact.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/content/docs/dvclive/live/log_artifact.md b/content/docs/dvclive/live/log_artifact.md index 752e5fa4b4..9e0c91ca35 100644 --- a/content/docs/dvclive/live/log_artifact.md +++ b/content/docs/dvclive/live/log_artifact.md @@ -37,9 +37,8 @@ with Live() as live: ## Description Uses `dvc add` to [track] `path` with DVC, saving it to the DVC -cache and generating a `{path}.dvc` file. When combined with -[`save_dvc_exp=True`](/doc/dvclive#initialize-dvclive), it will ensure that -`{path}.dvc` is included in the experiment. +cache and generating a `{path}.dvc` file that acts as a pointer to +the cached data. 
If `Live` was initialized with `dvcyaml=True` (which is the default) and you include any of the optional metadata fields (`type`, `name`, `desc`, `labels`, From 94df019ae39a26161999bfd21f0b6f1148e9e32c Mon Sep 17 00:00:00 2001 From: dberenbaum Date: Sun, 30 Jul 2023 20:59:59 -0400 Subject: [PATCH 4/5] tweak log_artifact language --- content/docs/dvclive/live/log_artifact.md | 14 ++++++++++---- .../dvclive/ml-frameworks/pytorch-lightning.md | 10 ++++------ 2 files changed, 14 insertions(+), 10 deletions(-) diff --git a/content/docs/dvclive/live/log_artifact.md b/content/docs/dvclive/live/log_artifact.md index 9e0c91ca35..89edcdbadd 100644 --- a/content/docs/dvclive/live/log_artifact.md +++ b/content/docs/dvclive/live/log_artifact.md @@ -36,9 +36,13 @@ with Live() as live: ## Description -Uses `dvc add` to [track] `path` with DVC, saving it to the DVC -cache and generating a `{path}.dvc` file that acts as a pointer to -the cached data. +Log `path`, saving its contents to DVC storage. Also annotate with any included +metadata fields (for example, to be consumed in [Studio model registry] or +automation scenarios). + +If `cache=True` (which is the default), uses `dvc add` to [track] `path` with +DVC, saving it to the DVC cache and generating a `{path}.dvc` file +that acts as a pointer to the cached data. If `Live` was initialized with `dvcyaml=True` (which is the default) and you include any of the optional metadata fields (`type`, `name`, `desc`, `labels`, @@ -46,7 +50,7 @@ include any of the optional metadata fields (`type`, `name`, `desc`, `labels`, [artifact](/doc/user-guide/project-structure/dvcyaml-files#artifacts) and all the metadata passed as arguments to the corresponding `dvc.yaml`. Passing `type="model"` will mark it as a `model` for DVC and will make it appear in -[Studio Model Registry](/doc/studio). +[Studio model registry]. ## Parameters @@ -81,3 +85,5 @@ the metadata passed as arguments to the corresponding `dvc.yaml`. Passing have a supported type. 
[track]: /doc/dvclive/how-it-works#track-large-artifacts-with-dvc +[Studio model registry]: + /doc/studio/user-guide/model-registry/what-is-a-model-registry) diff --git a/content/docs/dvclive/ml-frameworks/pytorch-lightning.md b/content/docs/dvclive/ml-frameworks/pytorch-lightning.md index 9438250be8..af2fa5e08a 100644 --- a/content/docs/dvclive/ml-frameworks/pytorch-lightning.md +++ b/content/docs/dvclive/ml-frameworks/pytorch-lightning.md @@ -84,12 +84,10 @@ checkpointing at all as described in the ### Log model checkpoints -Use `log_model` to save the checkpoints. DVCLive will first delete checkpoints -from previous experiments (since DVC tracks checkpoints per -experiment) and then cache the entire checkpoints directory using -`Live.log_artifact()`. At the end of training, DVCLive will annotate the -[`best_model_path`][`ModelCheckpoint`] with `type=model` and `name=best` for use -in [Studio model registry]. +Use `log_model` to save the checkpoints (it will use `Live.log_artifact()` +internally to save those). At the end of training, DVCLive will annotate the +[`best_model_path`][`ModelCheckpoint`] with name `best` (for example, to be +consumed in [Studio model registry] or automation scenarios). - Save updates to the checkpoints directory at the end of training: From da0bdaf4c804b9de18a250936517593453b2c5fd Mon Sep 17 00:00:00 2001 From: David de la Iglesia Castro Date: Mon, 31 Jul 2023 13:29:20 +0200 Subject: [PATCH 5/5] Update content/docs/dvclive/live/log_artifact.md --- content/docs/dvclive/live/log_artifact.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/docs/dvclive/live/log_artifact.md b/content/docs/dvclive/live/log_artifact.md index 89edcdbadd..6dab250c52 100644 --- a/content/docs/dvclive/live/log_artifact.md +++ b/content/docs/dvclive/live/log_artifact.md @@ -86,4 +86,4 @@ the metadata passed as arguments to the corresponding `dvc.yaml`. 
Passing [track]: /doc/dvclive/how-it-works#track-large-artifacts-with-dvc [Studio model registry]: - /doc/studio/user-guide/model-registry/what-is-a-model-registry) + /doc/studio/user-guide/model-registry/what-is-a-model-registry
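
The `log_model` rules documented in this series (`'all'`, `True`, `False`, and the `save_top_k == -1` exception) can be read as a small decision function. The sketch below is illustrative only: `when_checkpoints_are_logged` is a hypothetical helper that is not part of DVCLive or Lightning, and it encodes nothing beyond what the parameter description in the patch states.

```python
def when_checkpoints_are_logged(log_model, save_top_k=1):
    """Summarize the documented `log_model` behavior (hypothetical helper).

    Returns the set of moments at which checkpoints are logged, using only
    the rules written in the `log_model` parameter description.
    """
    if log_model == "all":
        # Documented: checkpoints are logged during training.
        return {"during_training"}
    if log_model is True:
        # Documented: checkpoints are logged at the end of training ...
        moments = {"end_of_training"}
        if save_top_k == -1:
            # ... except that save_top_k == -1 also logs every checkpoint
            # during training.
            moments.add("during_training")
        return moments
    if log_model is False:
        # Documented default: no checkpoint is logged.
        return set()
    raise ValueError(f"unsupported log_model value: {log_model!r}")


for value, top_k in [("all", 1), (True, 1), (True, -1), (False, 1)]:
    print(value, top_k, sorted(when_checkpoints_are_logged(value, top_k)))
```

Returning a set rather than a single label keeps the `save_top_k == -1` case explicit, since the docs describe it as logging during training in addition to the end-of-training behavior.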