Commit

Merge pull request #1848 from iterative/jorge
metrics(plots): regular updates & terminology
jorgeorpinel authored Oct 11, 2020
2 parents c4867ea + cb5e270 commit f7f0e2b
Showing 9 changed files with 57 additions and 59 deletions.
33 changes: 14 additions & 19 deletions content/docs/command-reference/metrics/diff.md
@@ -1,6 +1,6 @@
# metrics diff

Show changes in [metrics](/doc/command-reference/metrics) between commits in the
Compare [metrics](/doc/command-reference/metrics) between two commits in the
<abbr>DVC repository</abbr>, or between a commit and the <abbr>workspace</abbr>.

## Synopsis
@@ -21,28 +21,23 @@ positional arguments:
## Description

This command provides a quick way to compare metrics among experiments in the
repository history. It requires that Git is being used to version the metrics.
repository history. All metrics defined in `dvc.yaml` are used by default. The
comparison shown by this command includes the new value, and the numeric
difference (delta) with the previous value (rounded to 5 digits precision).

> This kind of metrics can be defined with the `-m` (`--metrics`) and `-M`
> (`--metrics-no-cache`) options of `dvc run`.
Run without arguments, this command compares metrics currently present in the
<abbr>workspace</abbr> (uncommitted changes) with the latest committed version.

The differences shown by this command include the new value, and numeric
difference (delta) from the previous value of metrics (rounded to 5 digits
precision). They're calculated between two commits (hash, branch, tag, or any
[Git revision](https://git-scm.com/docs/revisions)) for all metrics in the
<abbr>project</abbr>, found by examining all of the `dvc.yaml` and `.dvc` files
in both versions.
`a_rev` and `b_rev` are Git commit hashes, tag, or branch names. If none are
specified, `dvc metrics diff` compares metrics currently present in the
<abbr>workspace</abbr> (uncommitted changes) with the latest committed versions
(required). A single specified revision results in comparing the workspace and
that version.

Another way to display metrics is the `dvc metrics show` command, which just
lists all the current metrics without comparisons.
lists all the current metrics, without comparisons.
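
The comparison described above (new value plus a numeric delta against the
previous value) can be sketched in a few lines of Python. This is only an
illustration, not DVC's implementation: the `metrics_delta` helper and the
sample values are hypothetical, and rounding to 5 decimal places is one
possible reading of "5 digits precision".

```python
import json


def metrics_delta(old, new, precision=5):
    """Return {name: (new_value, delta)} for every metric found in `new`.

    Hypothetical helper for illustration; the delta is None when either
    side is missing or not numeric.
    """
    result = {}
    for name, value in new.items():
        previous = old.get(name)
        if isinstance(value, (int, float)) and isinstance(previous, (int, float)):
            # Assumption: "5 digits precision" means 5 decimal places.
            delta = round(value - previous, precision)
        else:
            delta = None
        result[name] = (value, delta)
    return result


# Made-up contents of a metrics file in two versions (committed vs. workspace).
old = json.loads('{"AUC": 0.763981, "TP": 527}')
new = json.loads('{"AUC": 0.801807, "TP": 531}')

for name, (value, delta) in metrics_delta(old, new).items():
    print(name, value, delta)
```

In this sketch, a metric that exists only in the newer version gets no delta
rather than being compared against zero.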

## Options

- `--targets <paths>` - limit command scope to these metric files. Using `-R`,
directories to search metric files in can also be given. When specifying
- `--targets <paths>` - limit command scope to these metrics files. Using `-R`,
directories to search metrics files in can also be given. When specifying
arguments for `--targets` before `revisions`, you should use `--` after this
option's arguments, e.g.:

@@ -56,7 +51,7 @@ lists all the current metrics without comparisons.
$ dvc metrics diff HEAD v1 --targets t1.json t2.json
```

- `-R`, `--recursive` - determines the metric files to use by searching each
- `-R`, `--recursive` - determines the metrics files to use by searching each
target directory and its subdirectories for DVC-files to inspect. If there are
no directories among the `targets`, this option is ignored.

@@ -124,7 +119,7 @@ metrics.json TP 531 4

## Example: compare metrics among specific versions

Metric files committed with Git can be compared by referencing the commits (any
Metrics files committed with Git can be compared by referencing the commits (any
two [revisions](https://git-scm.com/docs/revisions)):

```dvc
17 changes: 9 additions & 8 deletions content/docs/command-reference/metrics/index.md
@@ -32,8 +32,9 @@ ability to mark a certain stage <abbr>outputs</abbr> as metrics. These metrics
are project-specific floating-point or integer values e.g. AUC, ROC, false
positives, etc.

This kind of metrics can be defined with the `-m` (`--metrics`) and `-M`
(`--metrics-no-cache`) options of `dvc run`.
This type of metrics files are typically generated by user data processing code,
and are tracked using the `-m` (`--metrics`) and `-M` (`--metrics-no-cache`)
options of `dvc run`.

In contrast to `dvc plots`, these metrics should be stored in hierarchical
files. Unlike its `dvc plots` counterpart, `dvc metrics diff` can report the
@@ -46,7 +47,7 @@ $ dvc metrics diff
summary.json AUC 0.801807 0.037826
```

`dvc metrics` subcommands by default use the metric files specified in
`dvc metrics` subcommands by default use the metrics files specified in
`dvc.yaml` (if any), for example `summary.json` below:

```yaml
@@ -63,7 +64,7 @@ stages:
```
> `cache: false` above specifies that `summary.json` is not tracked or
> <abbr>cached</abbr> by DVC (`-M` option of `dvc run`). These metric files are
> <abbr>cached</abbr> by DVC (`-M` option of `dvc run`). These metrics files are
> normally committed with Git instead. See `dvc.yaml` for more information on
> the file format above.

@@ -113,12 +114,12 @@ $ dvc run -n evaluate -d code/evaluate.py -M eval.json \
python code/evaluate.py
```

> `-M` (`--metrics-no-cache`) tells DVC to mark `eval.json` as a metric file,
> `-M` (`--metrics-no-cache`) tells DVC to mark `eval.json` as a metrics file,
> without tracking it directly (You can track it with Git). See `dvc run` for
> more info.

Now let's print metric values that we are tracking in this <abbr>project</abbr>,
using `dvc metrics show`:
Now let's print metrics values that we are tracking in this
<abbr>project</abbr>, using `dvc metrics show`:

```dvc
$ dvc metrics show
@@ -128,7 +129,7 @@ $ dvc metrics show
TP: 516
```

When there are metric file changes (before committing them with Git), the
When there are metrics file changes (before committing them with Git), the
`dvc metrics diff` command shows the difference between metrics values:

```dvc
17 changes: 7 additions & 10 deletions content/docs/command-reference/metrics/show.md
@@ -9,8 +9,8 @@ usage: dvc metrics show [-h] [-q | -v] [-a] [-T] [--all-commits] [-R]
[--show-json] [targets [targets ...]]
positional arguments:
targets Limit command scope to these metric files.
Using -R, directories to search metric files
targets Limit command scope to these metrics files.
Using -R, directories to search metrics files
in can also be given.
```

@@ -19,21 +19,18 @@ positional arguments:
Finds and prints all metrics in the <abbr>project</abbr> by examining all of its
[DVC-files](/doc/user-guide/dvc-files-and-directories).

> This kind of metrics can be defined with the `-m` (`--metrics`) and `-M`
> (`--metrics-no-cache`) options of `dvc run`.
If `targets` are provided, it will show those specific metric files instead.
If `targets` are provided, it will show those specific metrics files instead.
With the `-a` or `-T` options, this command shows the different metrics values
across all Git branches or tags, respectively. With the `-R` option, some of the
target can even be directories, so that DVC recursively shows all metric files
target can even be directories, so that DVC recursively shows all metrics files
inside.

An alternative way to display metrics is the `dvc metrics diff` command, which
compares them with a previous version.

## Options

- `-a`, `--all-branches` - print metric file contents in all Git branches
- `-a`, `--all-branches` - print metrics file contents in all Git branches
instead of just those present in the current workspace. It can be used to
compare different experiments. Note that this can be combined with `-T` below,
for example using the `-aT` flag.
@@ -49,7 +46,7 @@ compares them with a previous version.
- `--show-json` - prints the command's output in easily parsable JSON format,
instead of a human-readable table.

- `-R`, `--recursive` - determines the metric files to show by searching each
- `-R`, `--recursive` - determines the metrics files to show by searching each
target directory and its subdirectories for DVC-files to inspect. If there are
no directories among the `targets`, this option is ignored.

@@ -122,4 +119,4 @@ increase_bow:
The
[Compare Experiments](/doc/tutorials/get-started/experiments#compare-experiments)
chapter of our _Get Started_ covers the `-a` option to collect and print a
metric file value across all Git branches.
metrics file value across all Git branches.
19 changes: 11 additions & 8 deletions content/docs/command-reference/plots/diff.md
@@ -1,7 +1,7 @@
# plots diff

Show multiple versions of [plot metrics](/doc/command-reference/plots) by
plotting them in a single image. This allows to easily compare them.
overlaying them in a single plot. This allows to compare them easily.

## Synopsis

@@ -24,7 +24,7 @@ experiments in the <abbr>repository</abbr> history, by plotting multiple
versions of the metrics. All plots defined in `dvc.yaml` are used by default.

> Note that unlike `dvc metrics diff`, this command does not calculate numeric
> differences between metric file values.
> differences between metrics file values.
`revisions` are Git commit hashes, tag, or branch names. If none are specified,
`dvc plots diff` compares targets currently present in the
@@ -33,19 +33,22 @@ versions (required). A single specified revision results in comparing the
workspace and that version.

Note that any number of `revisions` can be provided, and the resulting plot
shows all of them in a single output.
shows all of them in a single image.

The plot style can be customized with
[plot templates](/doc/command-reference/plots#plot-templates), using the
`--template` option. To learn more about metric file formats and templates
`--template` option. To learn more about metrics file formats and templates
please see `dvc plots`.

> Note that the default behavior of this command can be modified per metrics
> file with `dvc plots modify`.
Another way to display plots is the `dvc plots show` command, which just lists
all the current plots, without comparisons.

## Options

- `--targets <path>` - specific metric files to visualize. These must be listed
- `--targets <path>` - specific metrics files to visualize. These must be listed
in a [`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file) file
(see the `--plots` option of `dvc run`). When specifying arguments for
`--targets` before `revisions`, you should use `--` after this option's
@@ -74,11 +77,11 @@ please see `dvc plots`.
auto-generated `index` field is used by default. See
[Custom templates](/doc/command-reference/plots#custom-templates) for more
information on this `index` field. Column names or numbers are expected for
tabular metric files.
tabular metrics files.

- `-y <field>` - field name from which the Y axis data comes from. The last
field found in the `--targets` is used by default. Column names or numbers are
expected for tabular metric files.
expected for tabular metrics files.

- `--x-label <text>` - X axis label. The X field name is the default.

@@ -143,7 +146,7 @@ cat,turtle

The predefined confusion matrix
[template](/doc/command-reference/plots#plot-templates) (in
`.dvc/plots/confusion.json`) shows how metric comparisons can be faceted by
`.dvc/plots/confusion.json`) shows how metrics comparisons can be faceted by
separate plots. It can be enabled with `-t` (`--template`):

```dvc
14 changes: 8 additions & 6 deletions content/docs/command-reference/plots/index.md
@@ -33,9 +33,11 @@ DVC provides a set of commands to visualize metrics of machine learning
experiments. Usual plot examples are AUC curves, loss functions, confusion
matrices, among others.

This kind of metric files are created by users, or generated by user data
processing code. `dvc plots` subcommands can work with metric files committed to
a Git repo history, data files controlled by DVC, or any other file in system.
This type of metrics files are created by users, or generated by user data
processing code, and get defined with the `-p` (`--plots`) and
`--plots-no-cache` options of `dvc run`. `dvc plots` subcommands can work with
plots files committed to a Git repo history, data files controlled by DVC, or
any other file in system.

DVC generates plots as HTML files that can be opened with a web browser. These
HTML files use [Vega-Lite](https://vega.github.io/vega-lite/). Vega is a
@@ -118,7 +120,7 @@ You can create a custom template from scratch, or modify an existing one from

💡 Note that custom templates can be safely added to the template directory.

All metric files given to `dvc plots show` and `dvc plots diff` as input are
All metrics files given to `dvc plots show` and `dvc plots diff` as input are
combined together into a single data array for injection into a template file.
There are two important fields that DVC adds to the plot data:

@@ -129,12 +131,12 @@ There are two important fields that DVC adds to the plot data:
distinguish between different versions when using the `dvc plots diff`
command.

Note that in the case of CSV/TSV metric files, column names from the table
Note that in the case of CSV/TSV metrics files, column names from the table
header (first row) are equivalent to field names.
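
As a rough illustration of the combination step described above, the following
Python sketch turns CSV metrics files from several versions into one data
array, tagging each record with `index` and `rev`. It is not DVC's code: the
helper names, the sample data, and the choice of a zero-based row number per
file for `index` are all assumptions made for the example.

```python
import csv
import io


def rows_with_rev(csv_text, rev):
    """Parse CSV text into records keyed by the header row's column names.

    Hypothetical helper: `index` is assumed to be the zero-based row
    number, and `rev` labels the version the rows came from.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row, index=i, rev=rev) for i, row in enumerate(reader)]


def combine(versions):
    """Merge the records of every version into a single list."""
    data = []
    for rev, csv_text in versions.items():
        data.extend(rows_with_rev(csv_text, rev))
    return data


# Made-up contents of one tabular metrics file in two versions.
versions = {
    "HEAD": "epoch,loss\n0,0.9\n1,0.5\n",
    "workspace": "epoch,loss\n0,0.8\n",
}
for record in combine(versions):
    print(record)
```

Because each record carries its own `rev`, a template can tell the versions
apart when they are drawn in the same plot.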

### DVC template anchors

- `<DVC_METRIC_DATA>` (**required**) - the plot data from any kind of metric
- `<DVC_METRIC_DATA>` (**required**) - the plot data from any kind of metrics
files is converted to a single JSON array internally, and injected instead of
this anchor. Two additional fields will be added: `index` and `rev` (explained
above).
2 changes: 1 addition & 1 deletion content/docs/command-reference/plots/modify.md
@@ -12,7 +12,7 @@ usage: dvc plots modify [-h] [-q | -v] [-t <name_or_path>] [-x <field>]
target
positional arguments:
target Metric file to set properties to
target Metrics file to set properties to
```

## Description
10 changes: 5 additions & 5 deletions content/docs/command-reference/plots/show.md
@@ -12,7 +12,7 @@ usage: dvc plots show [-h] [-q | -v] [-t <name_or_path>] [-x <field>]
[targets [targets ...]]
positional arguments:
targets Metric files to visualize.
targets Metrics files to visualize.
Shows all plots by default.
```

@@ -22,12 +22,12 @@ This command provides a quick way to visualize metrics such as loss functions,
AUC curves, confusion matrices, etc. All plots defined in `dvc.yaml` are used by
default.

Optionally, specific metric file `targets` to show are accepted. These must be
Optionally, specific metrics file `targets` to show are accepted. These must be
listed in a `dvc.yaml` file (see the `--plots` option of `dvc run`).

The plot style can be customized with
[plot templates](/doc/command-reference/plots#plot-templates), using the
`--template` option. To learn more about metric file formats and templates
`--template` option. To learn more about metrics file formats and templates
please see `dvc plots`.

> Note that the default behavior of this command can be modified per metrics
@@ -48,11 +48,11 @@ please see `dvc plots`.
auto-generated `index` field is used by default. See
[Custom templates](/doc/command-reference/plots#custom-templates) for more
information on this `index` field. Column names or numbers are expected for
tabular metric files.
tabular metrics files.

- `-y <field>` - field name from which the Y axis data comes from. The last
field found in the `targets` is used by default. Column names or numbers are
expected for tabular metric files.
expected for tabular metrics files.

- `--x-label <text>` - X axis label. The X field name is the default.

2 changes: 1 addition & 1 deletion content/docs/command-reference/repro.md
@@ -119,7 +119,7 @@ up-to-date and only execute the final stage.
data repeatedly when running multiple experiments.

- `-m`, `--metrics` - show metrics after reproduction. The target pipelines must
have at least one metric file defined either with the `dvc metrics` command,
have at least one metrics file defined either with the `dvc metrics` command,
or by the `-M` or `-m` options of the `dvc run` command.

- `--dry` - only print the commands that would be executed without actually
@@ -366,7 +366,7 @@ pipelines, and try to apply it here. Don't hesitate to join our
[community](/chat) and ask any questions!

Another detail we only brushed upon here is the way we captured the
`metrics.csv` metric file with the `-M` option of `dvc run`. Marking this
`metrics.csv` metrics file with the `-M` option of `dvc run`. Marking this
<abbr>output</abbr> as a metric enables us to compare its values across Git tags
or branches (for example, representing different experiments). See `dvc metrics`
and
