Add GS:Visualization and Plots #3050

Merged
merged 69 commits into from
Apr 27, 2022
2b04b63
Added initial paragraphs
iesahin Nov 30, 2021
e865161
added three tentative tasks
iesahin Nov 30, 2021
f6bc44e
added visualization to the sidebar
iesahin Dec 14, 2021
b9013dc
moved visualization below the experiments in the sidebar
iesahin Dec 14, 2021
07086b9
added first two ways to visualize
iesahin Dec 14, 2021
7175d17
a paragraph for dvclive & keras
iesahin Dec 15, 2021
ff00094
changes required in the code for dvclive
iesahin Dec 15, 2021
6771178
simplified the callback
iesahin Dec 15, 2021
77dbdf5
added dvclive info and reformatted
iesahin Dec 15, 2021
42bf550
added install reference to README
iesahin Dec 17, 2021
10e0344
added dvclive image
iesahin Dec 28, 2021
b6c8b3e
linked dvclive image
iesahin Dec 28, 2021
1952a9b
fixed dvc plots command
iesahin Dec 28, 2021
653ffaa
Added confusion image link
iesahin Dec 28, 2021
219a59c
fixed dvc plots command
iesahin Dec 28, 2021
f4df213
added confusion image
iesahin Dec 28, 2021
93fdb93
adding image for confusion matrxi
iesahin Dec 28, 2021
925f472
add confusion image
iesahin Dec 28, 2021
7053759
updated misclassification image
iesahin Jan 5, 2022
b343d34
revised
iesahin Feb 9, 2022
dc6a2ed
added links
iesahin Feb 9, 2022
51b9820
updated the initial sentences
iesahin Feb 14, 2022
7dd9675
fixed vega-lite link
iesahin Feb 14, 2022
07dcd51
fixed dvclive link
iesahin Feb 14, 2022
4f3b0f1
fixed readme link
iesahin Feb 14, 2022
c90918d
comma
iesahin Feb 14, 2022
9a7ab68
removed "now"
iesahin Feb 14, 2022
5ff8661
removed older metrics, plots, vis from the sidebar
iesahin Feb 14, 2022
6b3f660
rephrase
iesahin Feb 14, 2022
8c9f957
Moved the initial text to the end as summary
iesahin Feb 15, 2022
29ceb79
added space for highlighters
iesahin Feb 15, 2022
99b4487
enlarged the confusion image
iesahin Feb 17, 2022
63af2d9
revised after review
iesahin Mar 14, 2022
ca6ae91
revision after review
iesahin Mar 14, 2022
fc49d48
revise after review
iesahin Mar 14, 2022
2eb2e6a
moved the description of image
iesahin Mar 21, 2022
4e6cf30
updated misclassification code link and added description
iesahin Mar 21, 2022
a11b4c5
added section titles
iesahin Mar 21, 2022
3af2c42
Update content/docs/start/visualization.md
jorgeorpinel Apr 11, 2022
b4ee07c
Update content/docs/start/visualization.md
jorgeorpinel Apr 11, 2022
ad9b474
Update content/docs/start/visualization.md
jorgeorpinel Apr 11, 2022
e894d0b
Update content/docs/start/visualization.md
jorgeorpinel Apr 12, 2022
18c45f0
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
4ca9196
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
987105f
added confusion matrix link
iesahin Apr 13, 2022
314791d
plots file
iesahin Apr 13, 2022
f34c59f
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
3b28572
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
a153b70
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
603674a
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
63f9dc0
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
4427ff3
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
1f70192
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
eb1853a
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
79f2bf3
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
4cacc6b
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
80af071
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
debec87
Update content/docs/start/visualization.md
iesahin Apr 13, 2022
fcd0cd5
restyle and minor fix
iesahin Apr 13, 2022
18274b3
fit ... to ... -> use .. in ..
iesahin Apr 13, 2022
2ae5379
deleted the older file
iesahin Apr 13, 2022
b4681fe
Apply suggestions from code review
jorgeorpinel Apr 14, 2022
6f5264d
renamed confusion.png to misclassified.png in the example
iesahin Apr 20, 2022
f26f366
rename one more
iesahin Apr 20, 2022
c1f4961
Update content/docs/start/visualization.md
iesahin Apr 20, 2022
b0ed669
Update content/docs/start/visualization.md
iesahin Apr 27, 2022
7e411e2
Update content/docs/start/visualization.md
iesahin Apr 27, 2022
cf02e81
Revert "removed older metrics, plots, vis from the sidebar"
iesahin Apr 27, 2022
b2a832e
readd metric-params-plots
iesahin Apr 27, 2022
7 changes: 7 additions & 0 deletions content/docs/sidebar.json
@@ -65,6 +65,13 @@
"tutorials": {
"katacoda": "https://katacoda.com/dvc/courses/get-started/experiments"
}
},
{
"label": "Visualization with Plots",
"slug": "visualization",
"tutorials": {
"katacoda": "https://katacoda.com/dvc/courses/get-started/visualization"
}
}
]
},
109 changes: 109 additions & 0 deletions content/docs/start/visualization.md
@@ -0,0 +1,109 @@
---
title: 'Get Started: Visualization with Plots'
---

# Get Started: Visualization with Plots

In this section, we'll add visualization to the [`example-dvc-experiments`][ede]
project (explored [previously](/doc/start/experiments)). If you would like to
try these yourself, please refer to the project's [README] for installation
instructions.

[ede]: https://github.com/iterative/example-dvc-experiments
[readme]:
https://github.com/iterative/example-dvc-experiments/blob/main/README.md

## Creating plots from tabular data

A useful plot to show the classification performance is the [confusion matrix].
In order to produce it, DVC expects a CSV **plots file** in the form:

```csv
actual,predicted
0,0
0,2
...
```

> We added a [loop] comparing the results to generate this file from the
> predictions.

[loop]:
https://github.com/iterative/example-dvc-experiments/blob/main/src/train.py#L123
[confusion matrix]: https://en.wikipedia.org/wiki/Confusion_matrix
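
As a rough illustration of how such a plots file could be produced, here is a minimal sketch that writes paired labels to a CSV with the `actual,predicted` header. The function name `write_confusion_csv` and its arguments are assumptions made for this example; the loop in the example project may differ.

```python
import csv
from pathlib import Path


def write_confusion_csv(actual, predicted, path):
    """Write one `actual,predicted` row per sample (hypothetical helper)."""
    path = Path(path)
    # Create the output directory (e.g. plots/) if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["actual", "predicted"])
        for a, p in zip(actual, predicted):
            writer.writerow([int(a), int(p)])
```

Calling `write_confusion_csv(y_true, y_pred, "plots/confusion.csv")` after prediction would yield a file in the form shown above, where `y_true` and `y_pred` stand for whatever label sequences your training script holds.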

Running the experiment with `dvc exp run` will produce `plots/confusion.csv`.
Use `dvc plots show` to present it as an HTML file, and open it in the browser:

```dvc
$ dvc plots show plots/confusion.csv --template confusion \
-x actual -y predicted
file:///.../example-dvc-experiments/plots/confusion.json.html
```

![confusion matrix](/img/start_visualization_confusion1.png)
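
Conceptually, a confusion matrix counts how often each (actual, predicted) pair occurs. The sketch below tallies those pairs from CSV text in the format above. It only illustrates the idea and is not how DVC or its `confusion` template is implemented; `confusion_counts` is a name invented for this example.

```python
import csv
import io
from collections import Counter


def confusion_counts(csv_text):
    """Count occurrences of each (actual, predicted) label pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Labels stay as strings, exactly as they appear in the file.
    return Counter((row["actual"], row["predicted"]) for row in reader)


if __name__ == "__main__":
    sample = "actual,predicted\n0,0\n0,2\n0,2\n1,1\n"
    for (actual, predicted), count in sorted(confusion_counts(sample).items()):
        print(f"actual={actual} predicted={predicted} count={count}")
```

Off-diagonal pairs (where the two labels differ) are the misclassifications that stand out in the rendered matrix.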

## Displaying user-generated plot images

Let's produce another plot to see misclassified examples from each class. This
procedure collects misclassified examples from the validation data and arranges
them into a _confusion table_ that shows each correct label alongside a
misclassified sample. The code that generates this image is omitted here, but
you can find it in [the example project][misclassified-example-code].

[misclassified-example-code]:
https://github.com/iterative/example-dvc-experiments/blob/48b1e5078c957f71674c00f416290eaa3b20b559/src/util.py#L49

```dvc
$ dvc plots show plots/misclassified.png
```

![Misclassification table](/img/start_visualization_misclassification.png)
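
To give a feel for how such a table might be assembled, here is a minimal sketch that tiles one misclassified example per (actual, predicted) pair into a single grid. It is not the example project's code: `misclassification_grid` is a hypothetical helper, and it uses plain nested lists of pixel values to stay dependency-free, whereas a real script would more likely use array and image libraries and then save the result as a PNG.

```python
def misclassification_grid(images, actual, predicted, num_classes):
    """Tile misclassified images into a num_classes x num_classes grid.

    Row i, column j holds the first image whose true label is i but which
    was predicted as j. Cells without such an image stay filled with zeros.
    """
    height, width = len(images[0]), len(images[0][0])
    grid = [[0] * (num_classes * width) for _ in range(num_classes * height)]
    filled = set()
    for image, a, p in zip(images, actual, predicted):
        # Skip correct predictions and cells that already hold an example.
        if a == p or (a, p) in filled:
            continue
        for r, row in enumerate(image):
            grid[a * height + r][p * width:(p + 1) * width] = list(row)
        filled.add((a, p))
    return grid
```

Once the resulting grid is written to an image file such as `plots/misclassified.png`, it can be shown with the command above.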

## Autogenerating plots from deep learning code

An important concern in deep learning projects is observing at which epoch the
training and validation losses start to diverge. DVC helps in that regard with
its Python integrations to deep learning libraries via [DVCLive].
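
As a small illustration of what "differ" can mean here, the sketch below returns the first epoch at which the validation loss exceeds the training loss by more than a chosen tolerance. The function name, the `tolerance` parameter, and its default value are assumptions for this example, not part of DVC or DVCLive.

```python
def first_divergence_epoch(train_loss, val_loss, tolerance=0.05):
    """Return the first epoch index where val_loss - train_loss > tolerance.

    Returns None if the gap never exceeds the tolerance.
    """
    for epoch, (train, val) in enumerate(zip(train_loss, val_loss)):
        if val - train > tolerance:
            return epoch
    return None
```

In practice this is usually judged by eye from the loss curves, which is what the plots described in this section make possible.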

The example project uses Keras to train a classifier, and we have a DVCLive
callback that visualizes the training and validation loss for each epoch. We
first import the callback from DVCLive.

```python
from dvclive.keras import DvcLiveCallback
```

Then we add this callback to the
[`fit` method](https://keras.io/api/models/model_training_apis/#fit-method)
call.

```python
model.fit(
...
callbacks=[DvcLiveCallback()],
...)
```

With these two changes, the model metrics are automatically logged to
`dvclive.json` and plotted in `training_metrics/index.html`:

![dvclive](/img/start_visualization_dvclive.png)
Contributor

@jorgeorpinel jorgeorpinel Apr 14, 2022

Suggested change
![dvclive](/img/start_visualization_dvclive.png)
![DVCLive plots](/img/start_visualization_dvclive.png)

The image is not readable though. Can the fonts be larger?

And maybe the top part with dvclive.json sample should be a regular code block (under the image) with a few rows of the data instead of just one.

Contributor Author

This is the same in the documentation: https://dvc.org/doc/dvclive/dvclive-with-dvc#html-report

Zooming in to make the axis labels readable makes the whole chart too large. How could we solve this, @daavoo?

Contributor

We can still change the label as suggested for now.

> same in the documentation: https://dvc.org/doc/dvclive/dvclive-with-dvc#html-report

Yes, we have that problem in a few places, but at least those are GIFs where you can see the plots changing, so maybe it matters less there whether the text is readable.

Contributor Author

I think it's better to create a separate issue or deal with this in #3455

Contributor Author

Will be tracked in #3470

Contributor

Yes but still the image caption could be better. It's helpful to have descriptive image captions for SEO and even image searches.


DVCLive has other capabilities, such as saving the model at every epoch or
changing the default output locations mentioned above.
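
To show the general idea behind a per-epoch logging callback, here is a minimal sketch. It is explicitly not DVCLive's implementation or API: `EpochLogger`, its file names, and its output layout are hypothetical. Only the `on_epoch_end(epoch, logs)` shape mirrors the hook that Keras callbacks receive.

```python
import json
from pathlib import Path


class EpochLogger:
    """Hypothetical stand-in for a metrics-logging callback."""

    def __init__(self, summary_path="metrics.json", history_path="history.tsv"):
        self.summary_path = Path(summary_path)
        self.history_path = Path(history_path)
        self._wrote_header = False

    def on_epoch_end(self, epoch, logs=None):
        logs = dict(logs or {})
        names = sorted(logs)
        # Overwrite the summary file with the latest values only.
        self.summary_path.write_text(json.dumps({"epoch": epoch, **logs}, indent=2))
        # Append one row per epoch to the history file, which can be plotted.
        mode = "a" if self._wrote_header else "w"
        with self.history_path.open(mode) as f:
            if not self._wrote_header:
                f.write("\t".join(["epoch", *names]) + "\n")
                self._wrote_header = True
            f.write("\t".join([str(epoch), *(str(logs[n]) for n in names)]) + "\n")
```

The design choice worth noting is the split between a summary file holding only the most recent values and a history file that grows by one row per epoch, since the history is what makes loss curves possible.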

In summary, DVC provides more than one option to use visualization in your
workflow:

- DVC can generate HTML files that include interactive plots from data series
in JSON, YAML, CSV, or TSV format.

- DVC can keep track of image files produced as [plot outputs] from the
training/evaluation scripts.

- [DVCLive] integrations can produce plots automatically during training.

[plot outputs]:
/doc/user-guide/project-structure/pipelines-files#metrics-and-plots-outputs
[dvclive]: /doc/dvclive/dvclive-with-dvc
Binary file added static/img/start_visualization_confusion1.png
Binary file added static/img/start_visualization_dvclive.png