diff --git a/content/docs/dvclive/dvclive-with-dvc.md b/content/docs/dvclive/dvclive-with-dvc.md
new file mode 100644
index 0000000000..a0d5e2575c
--- /dev/null
+++ b/content/docs/dvclive/dvclive-with-dvc.md
@@ -0,0 +1,125 @@
+# Dvclive with DVC
+
+Even though Dvclive does not require DVC, they can integrate in several useful
+ways.
+
+> In this section we will modify the [basic usage example](/doc/dvclive/usage)
+> to see how DVC can cooperate with the Dvclive module.
+
+```python
+# train.py
+
+from keras.datasets import mnist
+from keras.models import Sequential
+from keras.layers.core import Dense, Activation
+from keras.utils import np_utils
+
+def load_data():
+    (x_train, y_train), (x_test, y_test) = mnist.load_data()
+
+    x_train = x_train.reshape(60000, 784)
+    x_test = x_test.reshape(10000, 784)
+    x_train = x_train.astype('float32')
+    x_test = x_test.astype('float32')
+    x_train /= 255
+    x_test /= 255
+    classes = 10
+    y_train = np_utils.to_categorical(y_train, classes)
+    y_test = np_utils.to_categorical(y_test, classes)
+    return (x_train, y_train), (x_test, y_test)
+
+def get_model():
+    model = Sequential()
+    model.add(Dense(512, input_dim=784))
+    model.add(Activation('relu'))
+
+    model.add(Dense(10, input_dim=512))
+
+    model.add(Activation('softmax'))
+
+    model.compile(loss='categorical_crossentropy',
+                  metrics=['accuracy'], optimizer='sgd')
+    return model
+
+
+from keras.callbacks import Callback
+import dvclive
+
+class MetricsCallback(Callback):
+    def on_epoch_end(self, epoch: int, logs: dict = None):
+        logs = logs or {}
+        for metric, value in logs.items():
+            dvclive.log(metric, value)
+        dvclive.next_step()
+
+(x_train, y_train), (x_test, y_test) = load_data()
+model = get_model()
+
+# dvclive.init("training_metrics")  # Implicit with DVC
+model.fit(x_train,
+          y_train,
+          validation_data=(x_test, y_test),
+          batch_size=128,
+          epochs=3,
+          callbacks=[MetricsCallback()])
+```
+
+Note that when using Dvclive in a DVC project, there is no need for manual
+initialization of Dvclive (no `dvclive.init()` call).
+
+Let's use `dvc stage add` to create a stage to wrap this code (don't forget to
+`dvc init` first):
+
+```dvc
+$ dvc stage add -n train --live training_metrics \
+                -d train.py python train.py
+```
+
+`dvc.yaml` will contain a new `train` stage with the Dvclive
+[configuration](/doc/dvclive/usage#initial-configuration) (in the `live` field):
+
+```yaml
+stages:
+  train:
+    cmd: python train.py
+    deps:
+      - train.py
+    live:
+      training_metrics:
+        summary: true
+        html: true
+```
+
+The value passed to `--live` (`training_metrics`) became the directory `path`
+for Dvclive to write logs in. Other supported command options for DVC
+integration:
+
+- `--live-no-summary` - passes `summary=False` to Dvclive.
+- `--live-no-html` - passes `html=False` to Dvclive.
+
+> Note that these are convenience CLI options. You can still use
+> `dvclive.init()` manually, which will override `dvc stage add` flags. Just
+> be careful to match the `--live` value (CLI) and `path` argument (code).
+
+Run the training with `dvc repro`:
+
+```bash
+$ dvc repro train
+```
+
+After that's finished, you should see the following content in the project:
+
+```bash
+$ ls
+dvc.lock  training_metrics       training_metrics.json
+dvc.yaml  training_metrics.html  train.py
+```
+
+If you open `training_metrics.html` in a browser, you'll see a plot for metrics
+logged during the model training!
+
+![](/img/dvclive_report.png)
+
+> Dvclive is capable of creating _checkpoint_ signal files used by
+> [experiments](/doc/user-guide/experiment-management). See this example
+> [repository](https://github.com/iterative/dvc-checkpoints-mnist) to see how.
diff --git a/content/docs/dvclive/index.md b/content/docs/dvclive/index.md
new file mode 100644
index 0000000000..6ddfaff16c
--- /dev/null
+++ b/content/docs/dvclive/index.md
@@ -0,0 +1,18 @@
+# dvclive
+
+[`dvclive`](/doc/dvclive) is an open-source Python library for monitoring the
+progress of metrics during training of machine learning models.
+
+Dvclive integrates seamlessly with [DVC](https://dvc.org/) and the logs it
+produces can be fed to `dvc plots`. However, `dvc` is not needed to work with
+`dvclive` logs, and since they're saved as easily parsable TSV files, you can
+use your preferred visualization method.
+
+We have created Dvclive with two principles in mind:
+
+- **No dependencies.** While you can install optional integrations for various
+  frameworks, the basic `dvclive` installation doesn't have requirements besides
+  [Python](https://www.python.org/).
+- **DVC integration.** `dvc` recognizes when it's being used along with
+  `dvclive`. This enables useful features automatically, like producing model
+  training summaries, among others.
diff --git a/content/docs/dvclive/usage.md b/content/docs/dvclive/usage.md
new file mode 100644
index 0000000000..39b1690161
--- /dev/null
+++ b/content/docs/dvclive/usage.md
@@ -0,0 +1,144 @@
+# Usage Guide
+
+We will use sample [MNIST classification](http://yann.lecun.com/exdb/mnist/)
+training code in order to see how one can introduce Dvclive into the workflow.
+
+> Note that [keras](https://keras.io/about/#installation-amp-compatibility) is
+> required throughout these examples.
+
+```python
+# train.py
+
+from keras.datasets import mnist
+from keras.models import Sequential
+from keras.layers.core import Dense, Activation
+from keras.utils import np_utils
+
+def load_data():
+    (x_train, y_train), (x_test, y_test) = mnist.load_data()
+
+    x_train = x_train.reshape(60000, 784)
+    x_test = x_test.reshape(10000, 784)
+    x_train = x_train.astype('float32')
+    x_test = x_test.astype('float32')
+    x_train /= 255
+    x_test /= 255
+    classes = 10
+    y_train = np_utils.to_categorical(y_train, classes)
+    y_test = np_utils.to_categorical(y_test, classes)
+    return (x_train, y_train), (x_test, y_test)
+
+def get_model():
+    model = Sequential()
+    model.add(Dense(512, input_dim=784))
+    model.add(Activation('relu'))
+
+    model.add(Dense(10, input_dim=512))
+
+    model.add(Activation('softmax'))
+
+    model.compile(loss='categorical_crossentropy',
+                  metrics=['accuracy'], optimizer='sgd')
+    return model
+
+
+(x_train, y_train), (x_test, y_test) = load_data()
+model = get_model()
+
+model.fit(x_train,
+          y_train,
+          validation_data=(x_test, y_test),
+          batch_size=128,
+          epochs=3)
+```
+
+> You may want to run the code manually to verify that the model gets trained.
+
+In this example we are training the `model` for 3 epochs. Let's use `dvclive` to
+log the `accuracy`, `loss`, `val_accuracy` and `val_loss` after
+each epoch, so that we can observe how the training progresses.
+
+In order to do that, we will provide a
+[`Callback`](https://keras.io/api/callbacks/) for the `fit` method call:
+
+```python
+from keras.callbacks import Callback
+import dvclive
+class MetricsCallback(Callback):
+    def on_epoch_end(self, epoch: int, logs: dict = None):
+        logs = logs or {}
+        for metric, value in logs.items():
+            dvclive.log(metric, value)
+        dvclive.next_step()
+```
+
+At the end of each epoch, this callback will iterate over the gathered metrics
+(`logs`) and use the `dvclive.log()` function to record their respective values.
+After that we call `dvclive.next_step()` to signal Dvclive that we are done
+logging for the current iteration.
+
+And in order to make that work, we need to plug it in with this change:
+
+```diff
++ dvclive.init("training_metrics")
+  model.fit(x_train,
+            y_train,
+            validation_data=(x_test, y_test),
+            batch_size=128,
+-           epochs=3)
++           epochs=3,
++           callbacks=[MetricsCallback()])
+```
+
+We call `dvclive.init()` first, which tells Dvclive to write metrics under the
+given directory path (in this case `./training_metrics`).
+
+After running the code, the `training_metrics` directory should be created:
+
+```bash
+$ ls
+training_metrics  training_metrics.json  train.py
+```
+
+The `*.tsv` files inside have names corresponding to the metrics logged during
+training. Note that a `training_metrics.json` file has been created as well.
+It contains information about the latest training step. You can prevent its
+creation by passing `summary=False` to `dvclive.init()` (see all the
+[options](#initial-configuration)).
+
+```bash
+$ ls training_metrics
+accuracy.tsv  loss.tsv  val_accuracy.tsv  val_loss.tsv
+```
+
+Each file contains the metric values logged in each epoch. For example:
+
+```bash
+$ cat training_metrics/accuracy.tsv
+timestamp	step	accuracy
+1614129197192	0	0.7612833380699158
+1614129198031	1	0.8736833333969116
+1614129198848	2	0.8907166719436646
+```
+
+## Initial configuration
+
+These are the arguments accepted by `dvclive.init()`:
+
+- `path` (**required**) - directory where `dvclive` will write TSV log files
+
+- `step` (`0` by default) - the `step` values in log files will start
+  incrementing from this value.
+
+- `resume` (`False`) - if set to `True`, Dvclive will try to read the previous
+  `step` from the `path` dir and start from that point (unless a `step` is
+  passed explicitly). Subsequent `next_step()` calls will increment the step.
+
+- `summary` (`True`) - upon each `next_step()` call, Dvclive will dump a JSON
+  file containing all metrics gathered in the last step. This file uses the
+  following naming: `<path>.json` (`path` being the logging directory passed to
+  `init()`).
+
+- `html` (`True`) - works only when Dvclive is used alongside DVC. If true, upon
+  each `next_step()` call, DVC will prepare a summary of the training currently
+  running, with all metrics logged in `path`.
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 733789ca25..ba0f54fa60 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -477,5 +477,17 @@
         "slug": "cml-with-npm"
       }
     ]
+  },
+  {
+    "label": "Dvclive",
+    "slug": "dvclive",
+    "source": "dvclive/index.md",
+    "children": [
+      "usage",
+      {
+        "label": "Dvclive with DVC",
+        "slug": "dvclive-with-dvc"
+      }
+    ]
   }
 ]
diff --git a/static/img/dvclive_report.png b/static/img/dvclive_report.png
new file mode 100644
index 0000000000..c9c4701d57
Binary files /dev/null and b/static/img/dvclive_report.png differ
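
The docs added above describe the logs as "easily parsable TSV files" that can be used without DVC. A minimal sketch of what that could look like with only the Python standard library follows. The helper names `read_metric` and `read_all_metrics` are hypothetical and not part of `dvclive`, and the sketch assumes every file has the `timestamp`, `step`, `<metric>` column layout shown in the `accuracy.tsv` example.

```python
import csv
from pathlib import Path


def read_metric(tsv_path):
    """Return one Dvclive TSV log as a list of (step, value) pairs."""
    with open(tsv_path, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        # Assumption: besides `timestamp` and `step`, each file has exactly
        # one more column, named after the metric (as in accuracy.tsv).
        metric = next(
            name for name in reader.fieldnames
            if name not in ("timestamp", "step")
        )
        return [(int(row["step"]), float(row[metric])) for row in reader]


def read_all_metrics(log_dir):
    """Map each metric name (the file stem) to its (step, value) pairs."""
    return {
        path.stem: read_metric(path)
        for path in sorted(Path(log_dir).glob("*.tsv"))
    }
```

Calling `read_all_metrics("training_metrics")` after a training run would return one entry per `*.tsv` file, which can then be passed to whatever plotting tool is preferred.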