diff --git a/content/docs/command-reference/metrics/index.md b/content/docs/command-reference/metrics/index.md
index 25b9170426..0ffa120024 100644
--- a/content/docs/command-reference/metrics/index.md
+++ b/content/docs/command-reference/metrics/index.md
@@ -15,16 +15,6 @@ positional arguments:
diff Show changes in metrics between commits.
```
-## Types of metrics
-
-DVC has two concepts for metrics, that represent different results of machine
-learning training or data processing:
-
-1. `dvc metrics` represent **scalar numbers** such as AUC, _true positive rate_,
- etc.
-2. `dvc plots` can be used to visualize **data series** such as AUC curves, loss
- functions, confusion matrices, etc.
-
## Description
In order to follow the performance of machine learning experiments, DVC has the
@@ -32,9 +22,9 @@ ability to mark a certain stage outputs as metrics. These metrics
are project-specific floating-point or integer values e.g. AUC, ROC, false
positives, etc.
-This type of metrics files are typically generated by user data processing code,
-and are tracked using the `-m` (`--metrics`) and `-M` (`--metrics-no-cache`)
-options of `dvc stage add`.
+Metrics files are typically generated by user data processing code, and are
+tracked using the `-m` (`--metrics`) and `-M` (`--metrics-no-cache`) options of
+`dvc stage add`.
In contrast to `dvc plots`, these metrics should be stored in hierarchical
files. Unlike its `dvc plots` counterpart, `dvc metrics diff` can report the
diff --git a/content/docs/command-reference/plots/diff.md b/content/docs/command-reference/plots/diff.md
index 856168e214..a1b4845d8f 100644
--- a/content/docs/command-reference/plots/diff.md
+++ b/content/docs/command-reference/plots/diff.md
@@ -1,7 +1,7 @@
# plots diff
-Show multiple versions of [plot metrics](/doc/command-reference/plots) by
-overlaying them in a single image. This allows to compare them easily.
+Show multiple versions of [plots](/doc/command-reference/plots) by overlaying
+them in a single image, which makes them easy to compare.
## Synopsis
@@ -123,11 +123,11 @@ file:///Users/usr/src/dvc_plots/index.html
Compare two specific versions (commit hashes, tags, or branches):
```cli
-$ dvc plots diff HEAD 0135527 --targets logs.csv
+$ dvc plots diff HEAD^ 0135527 --targets logs.csv
file:///Users/usr/src/dvc_plots/index.html
```
-
+
## Example: Confusion matrix
diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md
index 7b20023f6b..f724d873ef 100644
--- a/content/docs/command-reference/plots/index.md
+++ b/content/docs/command-reference/plots/index.md
@@ -1,7 +1,7 @@
# plots
-A set of commands to visualize and compare _plot metrics_:
-[show](/doc/command-reference/plots/show),
+A set of commands to visualize and compare data series or images from ML
+projects: [show](/doc/command-reference/plots/show),
[diff](/doc/command-reference/plots/diff),
[modify](/doc/command-reference/plots/modify) and
[templates](/doc/command-reference/plots/templates).
@@ -13,31 +13,23 @@ usage: dvc plots [-h] [-q | -v] {show,diff,modify,templates} ...
positional arguments:
COMMAND
- show Generate plot from a metrics file.
- diff Plot differences in metrics between commits.
- modify Modify display properties of data-series plots (has no effect on image-type plots).
- templates Write built-in plots templates to a directory (.dvc/plots by default).
+ show Generate plots from target files or from `plots`
+ definitions in `dvc.yaml`.
+ diff Show multiple versions of a plot by overlaying them
+ in a single image.
+ modify Modify display properties of data-series plots
+ defined in stages (has no effect on image plots).
+ templates Write built-in plots templates to a directory
+ (.dvc/plots by default).
```
-## Types of metrics
-
-DVC has two concepts for metrics, that represent different results of machine
-learning training or data processing:
-
-1. `dvc metrics` represent **scalar numbers** such as AUC, _true positive rate_,
- etc.
-2. `dvc plots` can be used to visualize **data series** such as AUC curves, loss
- functions, confusion matrices, etc.
-
## Description
-DVC provides a set of commands to visualize certain metrics of machine learning
-experiments as plots. Usual plot examples are AUC curves, loss functions,
-confusion matrices, among others.
-
-This type of metrics files are created by users, or generated by user data
-processing code, and can be defined in `dvc.yaml` (`plots` field) for tracking
-(optional).
+DVC provides a set of commands to visualize data produced by machine learning
+projects. Typical plots include AUC curves, loss functions, and confusion
+matrices. Plots are a good alternative to `dvc metrics` when working with
+multi-dimensional performance data. They also help you present and compare
+[experiments] effectively.
DVC can work with two types of plots files:
@@ -50,17 +42,18 @@ DVC plots from the [VS Code Extension], which includes a special [Plots
Dashboard] that corresponds to the features in the `dvc plots` commands.
Data-series plots utilize [Vega-Lite](https://vega.github.io/vega-lite/) for
-rendering (declarative JSON grammar for defining graphics). Image-type plots are
-rendered using `<img>` tags directly.
+rendering (declarative JSON grammar for defining graphics). Images are rendered
+using `<img>` tags directly.
[vs code extension]:
https://marketplace.visualstudio.com/items?itemName=Iterative.dvc
[plots dashboard]:
https://github.com/iterative/vscode-dvc/blob/main/extension/resources/walkthrough/plots.md
+[experiments]: /doc/user-guide/experiment-management/experiments-overview
-## Supported file formats
+### Supported file formats
-Image-type plots are included in HTML as-is, without additional processing.
+Images are included in HTML as-is, without additional processing.
> We recommend to track these source image files with DVC instead of Git, to
> prevent the repository from bloating.
@@ -105,7 +98,144 @@ names in the `train` array below:
}
```
-## Plot templates (data series only)
+## Defining plots
+
+In order to create visualizations, users need to provide the data and
+(optionally) configuration that will help customize the plot. DVC provides two
+ways to configure visualizations. Users can mark specific stage
+outputs as plots or define top-level `plots` in `dvc.yaml`.
+
+### Stage plots
+
+When using `dvc stage add`, you can mark particular outputs with
+`--plots/--plots-no-cache` instead of `--outs/--outs-no-cache`. This tells DVC
+that they are intended for visualization.
+
+When you run `dvc plots show/diff`, DVC collects stage plots alongside the
+[top-level plots](#top-level-plots) and displays them according to their
+configuration. Note that if stage plots are also used in some top-level
+definitions, DVC renders the stage plots and every definition that uses them
+separately.
+
+This type of output is handy if you want to visually compare experiment results
+across versions without writing top-level plot definitions in `dvc.yaml`.
+
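+As a rough sketch, a stage whose output is marked as a plot might look like the
+following in `dvc.yaml`. The stage name `train`, the script `train.py`, and the
+`epoch`/`accuracy` field names are assumptions made up for illustration:
+
+```yaml
+# dvc.yaml (hypothetical stage, for illustration only)
+stages:
+  train:
+    cmd: python train.py
+    deps:
+      - train.py
+    plots:
+      - logs.csv:
+          x: epoch
+          y: accuracy
+```
+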
+### Top-level plots
+
+Plots can also be defined in a top-level `plots` key in `dvc.yaml`. Unlike
+[stage plots](#stage-plots), these definitions let you overlay plots from
+different data sources, for example training vs. test results (on the current
+project version). Conversely, you can create multiple plots from a single source
+file. You can also use any plot file in the project, regardless of whether it's
+a stage output. This separates visualization from outputs.
+
+To define a plot, provide the data and, optionally, a configuration for it.
+Plots are defined in the `dvc.yaml` file under the `plots` key.
+
+```yaml
+# dvc.yaml
+stages: ...
+
+plots: ...
+```
+
+Every plot has to have its own ID. Configuration, if provided, should be a
+dictionary.
+
+In the simplest use case, a user can provide the file path as the plot ID and
+not provide configuration at all:
+
+```yaml
+# dvc.yaml
+---
+plots:
+  logs.csv:
+```
+
+In that case the default behavior applies: DVC takes the data from the
+`logs.csv` file and applies the `linear` plot
+[template](/doc/command-reference/plots#plot-templates) to the last column found
+(CSV, TSV files) or the last field found (JSON, YAML).
+
+We can customize the plot by adding appropriate fields to the configuration:
+
+```yaml
+# dvc.yaml
+---
+plots:
+  confusion_matrix:
+    y:
+      confusion_matrix_data.csv: predicted_class
+    x: actual_class
+    template: confusion
+```
+
+Here `confusion_matrix` is the plot ID. It is displayed as the plot title unless
+overridden with the `title` field. The data source is given in the `y` axis
+definition: data comes from `confusion_matrix_data.csv`, with the
+`predicted_class` field on the `y` axis and the `actual_class` field on the `x`
+axis. Note that DVC assumes `actual_class` is also inside
+`confusion_matrix_data.csv`.
+
+We can provide multiple columns/fields from the same file:
+
+```yaml
+# dvc.yaml
+---
+plots:
+  multiple_series:
+    y:
+      logs.csv: [accuracy, loss]
+    x: epoch
+```
+
+In this case, the `accuracy` and `loss` fields are displayed against the
+`epoch` column, all coming from the `logs.csv` file.
+
+We can source the data from multiple files too:
+
+```yaml
+# dvc.yaml
+---
+plots:
+  multiple_files:
+    y:
+      train_logs.csv: accuracy
+      test_logs.csv: accuracy
+    x: epoch
+```
+
+In this case the `accuracy` field from both `train_logs.csv` and
+`test_logs.csv` is plotted against `epoch`. Note that both files must contain an
+`epoch` field.
+
+### Available configuration fields
+
+- `x` - name of the field that the X axis data comes from. An auto-generated
+  _step_ field is used by default. It has to be a string.
+
+- `y` - name of the field that the Y axis data comes from.
+  - Top-level plots: It can be a string, list, or dictionary. If it's a string
+    or list, the plot ID is assumed to be the path to the data source, and the
+    string or list elements are the names of data columns or fields within the
+    source file. If it's a dictionary, its keys are assumed to be paths to data
+    sources. The values have to be either strings or lists, and are treated as
+    column(s)/field(s) within the respective files.
+  - Plot outputs: It is the name of the field that the Y axis data comes from.
+- `x_label` - X axis label. The X field name is the default.
+- `y_label` - Y axis label. If all provided Y entries have the same field name,
+  this name is the default; otherwise the string `y` is used.
+- `title` - Plot title. Defaults:
+  - Top-level plots: `path/to/dvc.yaml::plot_id`
+  - Plot outputs: Path to the file.
+
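+As an illustration, the fields above could be combined in a single top-level
+definition. This is a sketch only: the plot ID, file names, and label text are
+hypothetical.
+
+```yaml
+# dvc.yaml (hypothetical example)
+plots:
+  accuracy_comparison:
+    x: epoch
+    y:
+      train_logs.csv: accuracy
+      test_logs.csv: accuracy
+    x_label: Epoch
+    y_label: Accuracy
+    title: Train vs. test accuracy
+```
+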
+Refer to the [`show` command] documentation for examples.
+
+[`show` command]: /doc/command-reference/plots/show#example-top-level-plots
+
+## Plot templates (data-series only)
DVC uses [Vega-Lite](https://vega.github.io/vega-lite/) JSON specifications to
create plots from user data. A set of built-in _plot templates_ are included.
@@ -133,7 +263,7 @@ DVC has the following built-in plot templates:
- `confusion` - confusion matrix, see
[example](/doc/command-reference/plots#example-confusion-matrix)
-[custom template]: https://dvc.org/doc/command-reference/plots/templates
+[custom templates]: https://dvc.org/doc/command-reference/plots/templates
- `confusion_normalized` - confusion matrix with values normalized to <0, 1>
range
@@ -187,7 +317,7 @@ important fields that DVC adds to the plot data:
Refer to [`templates`](/doc/command-reference/plots/templates) command for more
information on how to prepare your own template from pre-defined ones.
-## HTML templates
+## Custom HTML templates
It's possible to supply an HTML file to `dvc plots show` and `dvc plots diff` by
using the the `--html-template` option. This allows you to customize the
@@ -209,54 +339,60 @@ this feature to render DVC plots without an Internet connection, below.
- `-v`, `--verbose` - displays detailed tracing information.
-## Example: Tabular data
-
-We'll use tabular metrics file `logs.csv` for this example:
+## Example: Offline HTML Template
-```
-epoch,accuracy,loss,val_accuracy,val_loss
-0,0.9418667,0.19958884770199656,0.9679,0.10217399864746257
-1,0.9763333,0.07896138601688048,0.9768,0.07310650711813942
-2,0.98375,0.05241111190887168,0.9788,0.06665669009438716
-3,0.98801666,0.03681169906261687,0.9781,0.06697812260198989
-4,0.99111664,0.027362171787042946,0.978,0.07385754839298315
-5,0.9932333,0.02069501801203781,0.9771,0.08009233058886166
-6,0.9945,0.017702101902437668,0.9803,0.07830339228538505
-7,0.9954,0.01396906608727198,0.9802,0.07247738889862157
-```
+The plots generated by `dvc plots` use Vega-Lite JavaScript libraries, and by
+default these load [online resources](https://vega.github.io/vega/usage/#embed).
+There may be times when you need to produce plots without Internet access, or
+want to customize the plots output with extra content, like banners or extra
+text. DVC lets you replace the HTML file that contains the final plots.
-Let's plot the last column (default behavior):
+Download the Vega-Lite libraries into the directory where you'll produce the
+`dvc plots`:
```dvc
-$ dvc plots show logs.csv
-file:///Users/usr/src/dvc_plots/index.html
+$ wget https://cdn.jsdelivr.net/npm/vega@5.20.2 -O my_vega.js
+$ wget https://cdn.jsdelivr.net/npm/vega-lite@5.1.0 -O my_vega_lite.js
+$ wget https://cdn.jsdelivr.net/npm/vega-embed@6.18.2 -O my_vega_embed.js
```
-
+Create the following HTML file and save it in `.dvc/plots/mypage.html`:
-Difference in this metric between the current project version and the previous
-commit:
+```html
+
+