live: initial docs draft #2227
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# Dvclive with DVC | ||
|
||
Even though Dvclive does not require DVC, they can integrate in several useful | ||
ways. | ||
|
||
> In this section we will modify the [basic usage example](/doc/dvclive/usage) | ||
> to see how DVC can cooperate with the Dvclive module. | ||
|
||
```python | ||
# train.py | ||
|
||
from keras.datasets import mnist | ||
from keras.models import Sequential | ||
from keras.layers.core import Dense, Activation | ||
from keras.utils import np_utils | ||
|
||
def load_data(): | ||
    (x_train, y_train), (x_test, y_test) = mnist.load_data() | ||
|
||
    x_train = x_train.reshape(60000, 784) | ||
    x_test = x_test.reshape(10000, 784) | ||
    x_train = x_train.astype('float32') | ||
    x_test = x_test.astype('float32') | ||
    x_train /= 255 | ||
    x_test /= 255 | ||
    classes = 10 | ||
    y_train = np_utils.to_categorical(y_train, classes) | ||
    y_test = np_utils.to_categorical(y_test, classes) | ||
    return (x_train, y_train), (x_test, y_test) | ||
|
||
def get_model(): | ||
    model = Sequential() | ||
    model.add(Dense(512, input_dim=784)) | ||
    model.add(Activation('relu')) | ||
|
||
    model.add(Dense(10, input_dim=512)) | ||
|
||
    model.add(Activation('softmax')) | ||
|
||
    model.compile(loss='categorical_crossentropy', | ||
                  metrics=['accuracy'], optimizer='sgd') | ||
    return model | ||
|
||
|
||
from keras.callbacks import Callback | ||
import dvclive | ||
|
||
class MetricsCallback(Callback): | ||
    def on_epoch_end(self, epoch: int, logs: dict = None): | ||
        logs = logs or {} | ||
        for metric, value in logs.items(): | ||
            dvclive.log(metric, value) | ||
        dvclive.next_step() | ||
|
||
(x_train, y_train), (x_test, y_test) = load_data() | ||
model = get_model() | ||
|
||
# dvclive.init("training_metrics") # Implicit with DVC | ||
model.fit(x_train, | ||
          y_train, | ||
          validation_data=(x_test, y_test), | ||
          batch_size=128, | ||
          epochs=3, | ||
          callbacks=[MetricsCallback()]) | ||
``` | ||
|
||
Note that when using Dvclive in a DVC project, there is no need for manual | ||
initialization of Dvclive (no `dvclive.init()` call). | ||
|
||
Let's use `dvc stage add` to create a stage to wrap this code (don't forget to | ||
`dvc init` first): | ||
|
||
```dvc | ||
$ dvc stage add -n train --live training_metrics \ | ||
                -d train.py python train.py | ||
``` | ||
|
||
`dvc.yaml` will contain a new `train` stage with the Dvclive | ||
[configuration](/doc/dvclive/usage#initial-configuration) (in the `live` field): | ||
|
||
```yaml | ||
stages: | ||
  train: | ||
    cmd: python train.py | ||
    deps: | ||
      - train.py | ||
    live: | ||
      training_metrics: | ||
        summary: true | ||
        html: true | ||
``` | ||
|
||
The value passed to `--live` (`training_metrics`) becomes the directory `path` | ||
where Dvclive writes its logs. Other supported command options for DVC | ||
integration: | ||
|
||
- `--live-no-summary` - passes `summary=False` to Dvclive. | ||
- `--live-no-html` - passes `html=False` to Dvclive. | ||
Comment on lines +93 to +98
Done in 4042082. |
||
|
||
> Note that these are convenience CLI options. You can still call | ||
> `dvclive.init()` manually, which will override the `dvc stage add` flags. Just | ||
> be careful to match the `--live` value (CLI) and the `path` argument (code). | ||
|
||
Run the training with `dvc repro`: | ||
|
||
```bash | ||
$ dvc repro train | ||
``` | ||
|
||
After that's finished, you should see the following content in the project: | ||
|
||
```bash | ||
$ ls | ||
dvc.lock training_metrics training_metrics.json | ||
dvc.yaml training_metrics.html train.py | ||
``` | ||
|
||
If you open `training_metrics.html` in a browser, you'll see a plot for metrics | ||
logged during the model training! | ||
|
||
![](/img/dvclive_report.png) | ||
|
||
## Further integrations | ||
|
||
Dvclive is capable of creating _checkpoint_ signal files used by | ||
[experiments](/doc/user-guide/experiment-management). This example | ||
[repository](https://github.com/iterative/dvc-checkpoints-mnist) shows how. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# dvclive | ||
|
||
[`dvclive`](/doc/dvclive) is an open-source Python library for monitoring the | ||
progress of metrics during training of machine learning models. | ||
|
||
Dvclive integrates seamlessly with [DVC](https://dvc.org/) and the logs it | ||
produces can be fed to `dvc plots`. However, `dvc` is not needed to work with | ||
`dvclive` logs, and since they're saved as easily parsable TSV files, you can | ||
use your preferred visualization method. | ||
|
||
We have created Dvclive with two principles in mind: | ||
|
||
- **No dependencies.** While you can install optional integrations for various | ||
frameworks, the basic `dvclive` installation doesn't have requirements besides | ||
[Python](https://www.python.org/). | ||
- **DVC integration.** `dvc` recognizes when it's being used along with | ||
`dvclive`. This enables useful features automatically, like producing model | ||
training summaries, among others. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,144 @@ | ||
# Usage | ||
|
||
|
||
We will use sample [MNIST classification](http://yann.lecun.com/exdb/mnist/) | ||
training code in order to see how one can introduce Dvclive into the workflow. | ||
|
||
> Note that [keras](https://keras.io/about/#installation-amp-compatibility) is | ||
> required throughout these examples. | ||
|
||
I should update the example repo I have with this example. Keras really makes for a clean, minimal example.
Yeah, but on the other hand we need to prepare
```python | ||
|
||
# train.py | ||
|
||
|
||
from keras.datasets import mnist | ||
|
||
from keras.models import Sequential | ||
from keras.layers.core import Dense, Activation | ||
from keras.utils import np_utils | ||
|
||
def load_data(): | ||
    (x_train, y_train), (x_test, y_test) = mnist.load_data() | ||
|
||
    x_train = x_train.reshape(60000, 784) | ||
    x_test = x_test.reshape(10000, 784) | ||
    x_train = x_train.astype('float32') | ||
    x_test = x_test.astype('float32') | ||
    x_train /= 255 | ||
    x_test /= 255 | ||
    classes = 10 | ||
    y_train = np_utils.to_categorical(y_train, classes) | ||
    y_test = np_utils.to_categorical(y_test, classes) | ||
    return (x_train, y_train), (x_test, y_test) | ||
|
||
def get_model(): | ||
    model = Sequential() | ||
    model.add(Dense(512, input_dim=784)) | ||
    model.add(Activation('relu')) | ||
|
||
    model.add(Dense(10, input_dim=512)) | ||
|
||
    model.add(Activation('softmax')) | ||
|
||
    model.compile(loss='categorical_crossentropy', | ||
                  metrics=['accuracy'], optimizer='sgd') | ||
    return model | ||
|
||
|
||
(x_train, y_train), (x_test, y_test) = load_data() | ||
model = get_model() | ||
|
||
model.fit(x_train, | ||
          y_train, | ||
          validation_data=(x_test, y_test), | ||
          batch_size=128, | ||
          epochs=3) | ||
``` | ||
|
||
> You may want to run the code manually to verify that the model gets trained. | ||
|
||
In this example we are training the `model` for 3 epochs. Let's use `dvclive` to | ||
log the `accuracy`, `loss`, `val_accuracy` and `val_loss` after each epoch, so | ||
that we can observe how the training progresses. | ||
|
||
In order to do that, we will provide a | ||
[`Callback`](https://keras.io/api/callbacks/) for the `fit` method call: | ||
|
||
```python | ||
from keras.callbacks import Callback | ||
import dvclive | ||
|
||
class MetricsCallback(Callback): | ||
    def on_epoch_end(self, epoch: int, logs: dict = None): | ||
        logs = logs or {} | ||
        for metric, value in logs.items(): | ||
            dvclive.log(metric, value) | ||
        dvclive.next_step() | ||
``` | ||
|
||
At the end of each epoch, this callback will iterate over the gathered metrics | ||
(`logs`) and use the `dvclive.log()` function to record their respective values. | ||
After that we call `dvclive.next_step()` to signal to Dvclive that we are done | ||
logging for the current iteration. | ||
|
||
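The call pattern described above can be tried without installing Keras or Dvclive. The following is only a sketch: `InMemoryLive` is a hypothetical stand-in written for this illustration (it is not part of Dvclive), and the metric values are made up. It mimics nothing more than the `log()` / `next_step()` sequence.

```python
from collections import defaultdict


class InMemoryLive:
    """Hypothetical stand-in that mimics the log()/next_step() call pattern."""

    def __init__(self, step=0):
        self.step = step
        self.history = defaultdict(list)  # metric name -> [(step, value), ...]

    def log(self, name, value):
        # Record the value under the current step.
        self.history[name].append((self.step, value))

    def next_step(self):
        # Signal that logging for the current step is finished.
        self.step += 1


def on_epoch_end(live, logs=None):
    """Same body as MetricsCallback.on_epoch_end, minus the Keras base class."""
    logs = logs or {}
    for metric, value in logs.items():
        live.log(metric, value)
    live.next_step()


live = InMemoryLive()
# Made-up numbers standing in for the `logs` dict Keras passes to the callback.
on_epoch_end(live, {"loss": 0.9, "accuracy": 0.7})
on_epoch_end(live, {"loss": 0.5, "accuracy": 0.8})
print(dict(live.history))
```

Every metric logged before a `next_step()` call shares the same step number, which is why the callback calls `next_step()` once per epoch, after the loop.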
To make the callback work, we need to plug it in with this change: | ||
|
||
```diff | ||
+ dvclive.init("training_metrics") | ||
  model.fit(x_train, | ||
            y_train, | ||
            validation_data=(x_test, y_test), | ||
            batch_size=128, | ||
-           epochs=3) | ||
+           epochs=3, | ||
+           callbacks=[MetricsCallback()]) | ||
``` | ||
|
||
We call `dvclive.init()` first, which tells Dvclive to write metrics under the | ||
given directory path (in this case `./training_metrics`). | ||
|
||
After running the code, the `training_metrics` directory should be created: | ||
|
||
```bash | ||
$ ls | ||
training_metrics training_metrics.json train.py | ||
``` | ||
|
||
The `*.tsv` files inside have names corresponding to the metrics logged during | ||
training. Note that a `training_metrics.json` file has been created as well. | ||
It contains information about the latest training step. You can prevent its | ||
creation by passing `summary=False` to `dvclive.init()` (see all the | ||
[options](#initial-configuration)). | ||
|
||
```bash | ||
$ ls training_metrics | ||
accuracy.tsv loss.tsv val_accuracy.tsv val_loss.tsv | ||
``` | ||
|
||
Each file contains the metric values logged in each epoch. For example: | ||
|
||
```bash | ||
$ cat training_metrics/accuracy.tsv | ||
timestamp step accuracy | ||
1614129197192 0 0.7612833380699158 | ||
1614129198031 1 0.8736833333969116 | ||
1614129198848 2 0.8907166719436646 | ||
``` | ||
|
||
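Because the logs are plain TSV, they can be read with the Python standard library alone. The sketch below assumes the file is tab-separated with the header row shown above (`timestamp`, `step`, and the metric name). `read_metric` is a hypothetical helper name, and the sample string reuses the values from the `cat` output so the snippet runs without a training run.

```python
import csv
import io

# Sample copied from the accuracy.tsv output above. In a real project you would
# pass open("training_metrics/accuracy.tsv") instead of wrapping a string.
SAMPLE_TSV = (
    "timestamp\tstep\taccuracy\n"
    "1614129197192\t0\t0.7612833380699158\n"
    "1614129198031\t1\t0.8736833333969116\n"
    "1614129198848\t2\t0.8907166719436646\n"
)


def read_metric(tsv_file, metric):
    """Return a list of (step, value) pairs from a TSV log (hypothetical helper)."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [(int(row["step"]), float(row[metric])) for row in reader]


history = read_metric(io.StringIO(SAMPLE_TSV), "accuracy")
for step, value in history:
    print(f"step {step}: accuracy={value:.4f}")
```

The resulting list of pairs can be handed to whichever plotting or reporting tool you prefer.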
## Initial configuration | ||
|
||
These are the arguments accepted by `dvclive.init()`: | ||
|
||
- `path` (**required**) - directory where `dvclive` will write TSV log files | ||
|
||
- `step` (`0` by default) - the `step` values in log files will start | ||
incrementing from this value. | ||
|
||
- `resume` (`False`) - if set to `True`, Dvclive will try to read the previous | ||
`step` from the `path` dir and start from that point (unless a `step` is | ||
passed explicitly). Subsequent `next_step()` calls will increment the step. | ||
|
||
- `summary` (`True`) - upon each `next_step()` call, Dvclive will dump a JSON | ||
file containing all metrics gathered in the last step. This file uses the | ||
following naming: `<path>.json` (`path` being the logging directory passed to | ||
`init()`). | ||
|
||
- `html` (`True`) - works only when Dvclive is used alongside DVC. If `True`, upon | ||
each `next_step()` call, DVC will prepare a summary of the training currently | ||
running, with all metrics logged in `path`. |
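
As an illustration of the `<path>.json` naming used by the `summary` option, here is a minimal sketch of reading that file back. `load_summary` is a hypothetical helper, not a Dvclive function, and it makes no assumption about which keys the file contains. The demo writes a placeholder file in a temporary directory, since the real content depends on your training run.

```python
import json
import tempfile
from pathlib import Path


def load_summary(path):
    """Return the parsed `<path>.json` summary, or None if the file is absent."""
    summary_file = Path(str(path).rstrip("/") + ".json")
    if not summary_file.exists():
        return None  # for example, when the summary was disabled
    with summary_file.open() as f:
        return json.load(f)


# Demo in a temporary directory; the JSON content is a placeholder, not real output.
with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "training_metrics"
    missing = load_summary(log_dir)  # no summary file written yet
    Path(str(log_dir) + ".json").write_text(json.dumps({"placeholder_metric": 0.5}))
    found = load_summary(log_dir)

print(missing, found)
```

Returning `None` for a missing file lets calling code treat a disabled summary as a normal case instead of an error.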
@pared can these be on the top of the file for readability?
Also, shouldn't classes (class MetricsCallback) be defined before functions?
It should, I thought that including it just above "execution" part makes it easier to copy-paste for user
Hmm I missed this. I'll send a PR with what I meant and request your review @pared 🙂