From 37f4e90272f1d3e775a497d27794e3ae9f04402e Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 11 Jun 2020 14:27:12 -0500 Subject: [PATCH] Regular updates & plots 1.0 update (#1382) * cmd ref: add note that move creates dirs * cmd ref: improve structure of add ref desc. * grammar: add some commas * term: checksum -> hash value in dvcignore guide * style: lower case bullet text * cmd ref: remove some redundancy in metrics index * cmd ref: update plots refs synopsis and descriptions per iterative/dvc/issues/3924 et al. * Add plots modify cmd * typo: CSV->csv * term: working tree -> workspace per iterative/dvc/pull/3914 * cmd ref: couple improvements to add ref per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-422235749 and https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-422237494 * Update config/prismjs/dvc-commands.js * cmd ref: update plots modify description * cmd ref: add plots modify to nav, with a few more improvements * cmd ref: plots --show-json -> --show-vega per https://github.com/iterative/dvc/pull/3891#issuecomment-638251223 * rename x-lab to x-label * cmd ref: review descriptions of plots index, show, and diff * cmd ref: review and update old plots cmds options per https://github.com/iterative/dvc/pull/3948 et al. * cmd ref: a couple more option updates per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-424070145 * cmd ref: emphasize add works with any large file/dir per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-423970876 * cmd ref: updae plots modify top half (definition, description) per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-423722291et al. * cmd ref: improve all plot cmd option descriptions * Update content/docs/command-reference/plots/modify.md * cmd ref: review examples (mainly images) in plots modify per https://github.com/iterative/dvc.org/pull/1382#discussion_r434968322 et al. 
* cmd ref: rephrase info about how data arrays are injected to plot templates per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-425713344 * cmd ref: update info on how targets for for plots show/diff per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-425713399 * cmd ref: double check all plots examples per https://github.com/iterative/dvc.org/pull/1382#issuecomment-639989366 * cmd ref: remove info about plots show --select * cmd ref: update add desc per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-425768295 * cmd ref: re-explain dvc add for dirs per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-425768492 * cmd ref: improve description about targets in plots diff per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-425768658 * cmd ref: make emoji note in plots index per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-425769215 * cmd ref: remove ineffective CSV code block highlighting from plots refs per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-425769562 * get started: improve intro in index * glossary: remove external deps entry (no need) * cmd ref: add info about column indexing for headerless tables per https://github.com/iterative/dvc.org/pull/1382#discussion_r436301164 * cmd ref: update template metavar for plots subcommands per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-426682354 * cmd ref: mention YAML is supported for plots per https://github.com/iterative/dvc.org/pull/1382#discussion_r437424901 * cmd ref: rename template metavar again in plots per https://github.com/iterative/dvc.org/pull/1382#discussion_r437287927 * cmd ref: clarify plots modify --no-csv-header per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427441825 * cmd ref: add note about plots modify in show and diff * cmd ref: update all plots options again * cmd ref: more updates to plots et al. 
per Ivan's review * cmd ref: multiple plots diff --targets allowed * cmd ref: update note about default metrics in index per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-425768545 * cmd ref: emphasize add --recursive is rarely needed per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427497337 * cmd ref: plots diff: update revisions arg desc per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427498405 * cmd ref: mention column names and numbers in plots {cmd} -x and -y per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427521887 * cmd ref: emphasize that metrics diff is not a real diff per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427524008 * cmd ref: simplify note on plots targets per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427580350 * cmd ref: how to id columns in plots modify --no-csv-header per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427580726 * cmd ref: add default target behavior to plots show and diff rel: https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427585210 * cmd ref: rename plots option --no-header per iterative/dvc/pull/4001 * cmd ref: term: prop->property (plots) per https://github.com/iterative/dvc.org/pull/1382#discussion_r437784465 * cmd ref: more details on metrics index per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427727137 and https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427728168 * cmd ref: more details on plots index per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427730625 and https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427731044 * cmd ref: note about display props in plots modify per https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427732411 and https://github.com/iterative/dvc.org/pull/1382#pullrequestreview-427732048 Co-authored-by: Dmitry Petrov --- 
config/prismjs/dvc-commands.js | 1 + content/docs/command-reference/add.md | 66 +++--- .../docs/command-reference/metrics/index.md | 51 ++--- .../docs/command-reference/metrics/show.md | 4 +- content/docs/command-reference/move.md | 5 +- content/docs/command-reference/plots/diff.md | 120 ++++++----- content/docs/command-reference/plots/index.md | 92 ++++---- .../docs/command-reference/plots/modify.md | 133 ++++++++++++ content/docs/command-reference/plots/show.md | 199 +++++++++--------- content/docs/command-reference/remote/add.md | 4 +- .../docs/command-reference/remote/modify.md | 2 +- content/docs/command-reference/status.md | 2 +- content/docs/sidebar.json | 4 + content/docs/tutorials/deep/sharing-data.md | 3 +- content/docs/tutorials/get-started/index.md | 8 +- content/docs/tutorials/pipelines.md | 2 +- .../basic-concepts/external-dependency.md | 9 - content/docs/user-guide/contributing/blog.md | 26 +-- content/docs/user-guide/dvc-file-format.md | 5 +- content/docs/user-guide/dvcignore.md | 2 +- .../user-guide/large-dataset-optimization.md | 2 +- static/img/plots_mod_acc.svg | 1 + static/img/plots_mod_acc_titles.svg | 1 + static/img/plots_mod_loss.svg | 1 + static/img/plots_show.svg | 2 +- static/img/plots_show_field.svg | 2 +- static/img/plots_show_json.svg | 2 +- static/img/plots_show_json_field.svg | 2 +- 28 files changed, 442 insertions(+), 309 deletions(-) create mode 100644 content/docs/command-reference/plots/modify.md delete mode 100644 content/docs/user-guide/basic-concepts/external-dependency.md create mode 100644 static/img/plots_mod_acc.svg create mode 100644 static/img/plots_mod_acc_titles.svg create mode 100644 static/img/plots_mod_loss.svg diff --git a/config/prismjs/dvc-commands.js b/config/prismjs/dvc-commands.js index 3a9d6f14a8..26c298ede2 100644 --- a/config/prismjs/dvc-commands.js +++ b/config/prismjs/dvc-commands.js @@ -22,6 +22,7 @@ module.exports = [ 'pull', 'pkg', 'plots show', + 'plots modify', 'plots diff', 'plots', 'pipeline show', 
diff --git a/content/docs/command-reference/add.md b/content/docs/command-reference/add.md index 8bb1192636..c9b321f0e2 100644 --- a/content/docs/command-reference/add.md +++ b/content/docs/command-reference/add.md @@ -16,23 +16,27 @@ positional arguments: ## Description The `dvc add` command is analogous to `git add`, in that it makes DVC aware of -the target data, as a first step to version it. It creates a +the target data, in order to start versioning it. It creates a [`.dvc` file](/doc/user-guide/dvc-file-format) to track the added data. -The `targets` are files or directories to add with this command, that are turned -into data artifacts of the project. By default, these -are committed to the cache (use the `--no-commit` option to avoid -this, and `dvc commit` to finish the process when needed). +This command can be used to +[version control](/doc/use-cases/versioning-data-and-model-files) large files, +models, dataset directories, etc. that are too big for Git. -Note that [external data](/doc/user-guide/managing-external-data) (targets -outside the workspace) is supported. +The `targets` are the files or directories to add, which are turned into +data artifacts of the project. These are stored in the +cache by default (use the `--no-commit` option to avoid this, and +`dvc commit` to finish the process when needed). + +> See also `dvc run` for more advanced ways to version intermediate and final +> results (like ML models). Under the hood, a few actions are taken for each file (or directory) in `targets`: 1. Calculate the file hash. -2. Move the file contents to the cache directory (by default in `.dvc/cache`), - using the file hash to form the cached file names. (See +2. Move the file contents to the cache (by default in `.dvc/cache`), using the + file hash to form the cached file names. (See [Structure of cache directory](/doc/user-guide/dvc-files-and-directories#structure-of-cache-directory) for more details.) 3. 
Attempt to replace the file with a link to the cached data (more details @@ -59,34 +63,34 @@ files that can be tracked with Git. See To avoid adding files inside a directory accidentally, you can add the corresponding [patterns](/doc/user-guide/dvcignore) in a `.dvcignore` file. -By default DVC tries to use reflinks (see +By default, DVC tries to use reflinks (see [File link types](/doc/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache) to avoid copying any file contents and to optimize `.dvc` file operations for large files. DVC also supports other link types for use on file systems without `reflink` support, but they have to be specified manually. Refer to the `cache.type` config option in `dvc config cache` for more information. -A `dvc add` target can be an individual file or a directory. There are two ways -to work with directory hierarchies with `dvc add`: - -1. With `dvc add --recursive`, the hierarchy is traversed and every file is - added individually as described above. This means every file has its own - `.dvc` file, and a corresponding cached file is created (unless the - `--no-commit` option is used). -2. When not using `--recursive` a `.dvc` file is created for the top of the - directory (with default name `dirname.dvc`). Every file in the hierarchy is - added to the cache (unless the `--no-commit` option is used), but DVC does - not produce individual `.dvc` files for each file in the directory tree. - Instead, the single `.dvc` file references a special JSON file in the cache - (with `.dir` extension), that in turn points to the files added from the - hierarchy. - -`dvc add` is typically used to version control raw data or initial datasets from -which data processing [pipelines](/doc/command-reference/pipeline) are built, -but it can be used to track any large file or directory. We recommend using -`dvc run` to version control intermediate and final results (like ML models). 
-This way you bring data provenance and make your project -[reproducible](/doc/command-reference/repro). +### Tracking directories + +A `dvc add` target can be an individual file or a directory. In the latter case, +a [`.dvc` file](/doc/user-guide/dvc-file-format) is created for the top of the +directory (with default name `.dvc`). + +Every file in the hierarchy is added to the cache (unless the `--no-commit` +option is used), but DVC does not produce individual `.dvc` files for each file +in the directory tree. Instead, the single `.dvc` file references a special JSON +file in the cache (with `.dir` extension), that in turn points to the added +files. + +Note that DVC commands that use tracked files support granular targeting of +files, even when the directory is added as a whole. Examples: `dvc push`, +`dvc pull`, `dvc get`, `dvc import`, etc. + +As a rarely needed alternative, the `--recursive` option causes every file in +the hierarchy to be added individually. A corresponding `.dvc` file will be +generated for each file in he same location. This may be helpful to save time +adding several data files grouped in a structural directory, but it's +undesirable for data directories with a large number of files. ## Options diff --git a/content/docs/command-reference/metrics/index.md b/content/docs/command-reference/metrics/index.md index 60961fadbe..3624a8d245 100644 --- a/content/docs/command-reference/metrics/index.md +++ b/content/docs/command-reference/metrics/index.md @@ -46,6 +46,28 @@ $ dvc metrics diff summary.json AUC 0.801807 0.037826 ``` +`dvc metrics` subcommands by default use the metric files specified in +`dvc.yaml` (if any), for example `summary.json` below: + +```yaml +stages: + train: + cmd: python train.py + deps: + - users.csv + outs: + - model.pkl + metrics: + - summary.json: + cache: false +``` + +> `cache: false` above specifies that `summary.json` is not tracked or +> cached by DVC (`-M` option of `dvc run`). 
These metric files are +> normally committed with Git instead. See +> [`dvc.yaml`](/doc/user-guide/dvc-file-format) for more information on the file +> format above. + ### Supported file formats Metrics can be organized as tree hierarchies in JSON or YAML files. DVC @@ -69,35 +91,6 @@ DVC itself does not ascribe any specific meaning for these numbers. Usually they are produced by the model training or model evaluation code and serve as a way to compare and pick the best performing experiment. -### Default metric files - -`dvc metrics` subcommands use all metric files that are specified in `dvc.yaml` -by default. There's no need to specify metric file names to see these metrics. -Metric files can be added to `dvc.yaml` with the `--metrics` (`-m`) or -`--metrics-no-cache` (`-M`) options of `dvc run`, or manually to the `metrics` -section of a stage in `dvc.yaml`: - -```yaml -stages: - train: - cmd: python train.py - deps: - - users.csv - params: - - epochs - - dropout - - lr - outs: - - model.pkl - metrics: - - summary.json: - cache: false -``` - -`cache: false` above specifies that `summary.json is not a data file: it will -not be cached by DVC. Metric files are normally committed with Git -instead. - ## Options - `-h`, `--help` - prints the usage/help message, and exit. 
diff --git a/content/docs/command-reference/metrics/show.md b/content/docs/command-reference/metrics/show.md index af3a5e2bea..898d2995c8 100644 --- a/content/docs/command-reference/metrics/show.md +++ b/content/docs/command-reference/metrics/show.md @@ -79,7 +79,7 @@ history use `--all-commits` option: ```dvc $ dvc metrics show --all-commits -working tree: +workspace: eval.json: AUC: 0.66729 error: 0.16982 @@ -100,7 +100,7 @@ Metrics from different branches can be shown by `--all-branches` (`-a`) option: ```dvc $ dvc metrics show -a -working tree: +workspace: eval.json: AUC: 0.66729 error: 0.16982 diff --git a/content/docs/command-reference/move.md b/content/docs/command-reference/move.md index 32e6b6439d..56ce8918ae 100644 --- a/content/docs/command-reference/move.md +++ b/content/docs/command-reference/move.md @@ -20,8 +20,9 @@ positional arguments: `dvc move` is useful when a `src` file or directory has previously been added to the project with `dvc add`, creating a [`.dvc` file](/doc/user-guide/dvc-file-format) (with `src` as a dependency). -`dvc move` behaves like `mv src dst`, moving `src` to the given `dst` path, but -it also renames and updates the corresponding `.dvc` file appropriately. +`dvc move` behaves similar to `mv src dst`, moving `src` to the given `dst` +path, but it also renames and updates the corresponding `.dvc` file +appropriately. > Note that `src` may be a copy or a > [link](/doc/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache) diff --git a/content/docs/command-reference/plots/diff.md b/content/docs/command-reference/plots/diff.md index 381e054f30..eda3d28d04 100644 --- a/content/docs/command-reference/plots/diff.md +++ b/content/docs/command-reference/plots/diff.md @@ -1,81 +1,86 @@ # plots diff Show multiple versions of [plot metrics](/doc/command-reference/plots) by -plotting them in a single image. +plotting them in a single image. This allows to easily compare them. 
## Synopsis ```usage -usage: dvc plots diff [-h] [-q | -v] [-t [TEMPLATE]] [-d [DATAFILE]] [-f FILE] - [-s SELECT] [-x X] [-y Y] [--stdout] [--no-csv-header] - [--no-html] [--title TITLE] [--xlab XLAB] [--ylab YLAB] - [revisions [revisions ...]] +usage: dvc plots diff [-h] [-q | -v] [--targets [ [ ...]]] + [-t ] [-x ] [-y ] + [--no-header] [--title ] + [--x-label ] [--y-label ] [-o ] + [--show-vega] + [revisions [revisions ...]] positional arguments: - revisions Git commits to plot from + revisions Git commits to find metrics to compare ``` ## Description -This command visualize difference between metrics among experiments in the -repository history. Requires that Git is being used to version the metrics -files. +This command is a way to visualize the "difference" between metrics among +experiments in the repository history, by plotting multiple +versions of the metrics. All plots defined in `dvc.yaml` are used by default. -The metrics file needs to be specified through `-d`/`--datafile` option. Also, a -plot can be customized with -[plot templates](/doc/command-reference/plots#plot-templates) using the -`--template` option. To learn more about the file formats and templates please -see `dvc plots`. +> Note that unlike `dvc metrics diff`, this command does not calculate numeric +> differences between metric file values. `revisions` are Git commit hashes, tag, or branch names. If none are specified, -`dvc plots diff` compares metrics currently present in the -workspace (uncommitted changes) with the latest committed version. -A single specified revision results in plotting the difference in metrics -between the workspace and that version. +`dvc plots diff` compares targets currently present in the +workspace (uncommitted changes) with their latest committed +versions (required). A single specified revision results in comparing the +workspace and that version. 
-In contrast to commands such as `git diff`, `dvc metrics diff` and -`dvc params diff`, **any number of `revisions` can be provided**, and the -resulting plot shows all of them in a single output. +Note that any number of `revisions` can be provided, and the resulting plot +shows all of them in a single output. -This command can work with metric files that are committed to a repository -history, data files controlled by DVC, or any other file in the workspace. In -the case of DVC-tracked `datafile`, the `revisions` are used to find the -corresponding [DVC-files](/doc/user-guide/dvc-file-format). +The plot style can be customized with +[plot templates](/doc/command-reference/plots#plot-templates), using the +`--template` option. To learn more about metric file formats and templates +please see `dvc plots`. + +> Note that the default behavior of this command can be modified per metrics +> file with `dvc plots modify`. ## Options -- `-d [DATAFILE], --datafile [DATAFILE]` - metrics file to visualize. +- `--targets ` - specific metric files to visualize. These must be listed + in a [`dvc.yaml`](/doc/user-guide/dvc-file-format) file (see the `--plots` + option of `dvc run`). + +- `-o , --out ` - name of the generated file. By default, the output + file name is equal to the input filename with a `.html` file extension (or + `.json` when using `--show-vega`). -- `-t [TEMPLATE], --template [TEMPLATE]` - +- `-t , --template ` - [plot template](/doc/command-reference/plots#plot-templates) to be injected with data. The default template is `.dvc/plots/default.json`. See more details in `dvc plots`. -- `-f FILE, --file FILE` - name of the generated file. By default, the output - file name is equal to the input filename with additional `.html` suffix or - `.json` suffix for `--no-html` mode. - -- `--no-html` - do not wrap output Vega specification (JSON) with HTML. - -- `-x X` - field name for X axis. An auto-generated `index` field is used by - default. 
+- `-x ` - field name from which the X axis data comes from. An + auto-generated `index` field is used by default. See + [Custom templates](/doc/command-reference/plots#custom-templates) for more + information on this `index` field. Column names or numbers are expected for + tabular metric files. -- `-y Y` - field name for Y axis. The last column or field found in the - `datafile` is used by default. +- `-y ` - field name from which the Y axis data comes from. The last + field found in the `--targets` is used by default. Column names or numbers are + expected for tabular metric files. -- `-s SELECT, --select SELECT` - select which fields or JSONPath to store in the - metrics file [metadata](https://vega.github.io/vega/docs/data/). The - auto-generated, zero-based `index` column is always included. +- `--x-label ` - X axis label. The X field name is the default. -- `--xlab XLAB` - X axis title. The X field name is the default title. +- `--y-label ` - Y axis label. The Y field name is the default. -- `--ylab YLAB` - Y axis title. The Y field name is the default title. +- `--title ` - plot title. -- `--title TITLE` - plot title. +- `--show-vega` - produce a + [Vega specification](https://vega.github.io/vega/docs/specification/) file + instead of HTML. See `dvc plots` for more info. -- `-o, --stdout` - print plot content to stdout. - -- `--no-csv-header` - provided CSV or TSV datafile does not have a header. +- `--no-header` - lets DVC know that CSV or TSV `--targets` do not have a + header. A 0-based numeric index can be used to identify each column instead of + names. - `-h`, `--help` - prints the usage/help message, and exit. @@ -86,21 +91,22 @@ corresponding [DVC-files](/doc/user-guide/dvc-file-format). 
## Examples -To visualize the difference between uncommitted changes of a metrics file and -the last commit: +To compare uncommitted changes of a metrics file and its last committed version: ```dvc -$ dvc plots diff -d logs.csv +$ dvc plots diff --targets logs.csv --x-label x file:///Users/dmitry/src/plots/logs.html ``` ![](/img/plots_auc.svg) -The difference between two versions (commit hashes, tags, or branches can be -provided): +> Note that we renamed the X axis label with option `--x-label x`. + +Compare two specific versions (commit hashes, tags, or branches can be provided, +for example): ```dvc -$ dvc plots diff -d logs.csv HEAD 0135527 +$ dvc plots diff --targets logs.csv HEAD 0135527 file:///Users/usr/src/plots/logs.csv.html ``` @@ -110,7 +116,7 @@ file:///Users/usr/src/plots/logs.csv.html We'll use tabular metrics file `classes.csv` for this example: -```csv +``` predicted,actual cat,cat cat,cat @@ -124,13 +130,13 @@ cat,turtle ... ``` -A predefined confusion matrix +The predefined confusion matrix [template](/doc/command-reference/plots#plot-templates) (in -`.dvc/plots/confusion.json`) shows how metric differences can be faceted by -separate plots: +`.dvc/plots/confusion.json`) shows how metric comparisons can be faceted by +separate plots. It can be enabled with `-t` (`--template`): ```dvc -$ dvc plots diff -t confusion -x predicted -d classes.csv +$ dvc plots diff -t confusion --targets classes.csv -x predicted file:///Users/usr/src/test/plot_old/classes.csv.html ``` diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md index 8367c568eb..092bd8dcd9 100644 --- a/content/docs/command-reference/plots/index.md +++ b/content/docs/command-reference/plots/index.md @@ -1,20 +1,20 @@ # plots -Contains commands to visualize _plot metrics_ in structured files (JSON, CSV, or -TSV): [show](/doc/command-reference/plots/show), -[diff](/doc/command-reference/plots/diff). 
+A set of commands to visualize and compare _plot metrics_ in structured files +(JSON, YAML, CSV, or TSV): [show](/doc/command-reference/plots/show), +[diff](/doc/command-reference/plots/diff), and +[modify](/doc/command-reference/plots/modify). ## Synopsis ```usage -usage: dvc plots [-h] [-q | -v] {show,diff} ... +usage: dvc plots [-h] [-q | -v] {show,diff,modify} ... positional arguments: COMMAND - show Generate a plot image file from a metrics file. - diff Plot differences in metrics between commits in the - DVC repository, or between the last commit and the - workspace. + show Generate plot from a metrics file. + diff Plot differences in metrics between commits. + modify Modify plot properties associated with a target file. ``` ## Types of metrics @@ -48,13 +48,13 @@ differences between the metrics in different experiments. ### Supported file formats -Continuous metrics can be organized as data series in JSON, CSV, or TSV files. +Plot metrics can be organized as data series in JSON, YAML, CSV, or TSV files. DVC expects to see an array (or multiple arrays) of objects (usually _float numbers_) in the file. -In tabular file formats such as CSV and TSV, each column (or field) is an array. -`dvc plots show` can generate visuals for a specified column or a set of -columns. Like `AUC` column: +In tabular file formats such as CSV and TSV, each column is an array. +`dvc plots` subcommands can produce plots for a specified column or a set of +them. For example, `epoch`, `AUC`, and `loss` are the column names below: ``` epoch, AUC, loss @@ -64,10 +64,12 @@ epoch, AUC, loss 37, 0.92302, 0.0299015 ``` -In hierarchical file formats such as JSON, an array of JSON objects is expected. -`dvc plots show` command can generate visuals for a specified field name or a -set of fields from the array's object. 
Like `val_loss` field in the `train` -array in this example: +In hierarchical file formats (JSON or YAML), an array of consistent objects is +expected: every object should have the same structure. + +`dvc plots` subcommands can produce plots for a specified field or a set of +them, from the array's objects. For example, `val_loss` is one of the field +names in the `train` array below: ``` { @@ -85,11 +87,11 @@ array in this example: ## Plot templates -DVC gives users the ability to change the -[Vega JSON schema](https://github.com/vega/schema), and generate plots in the -format that best fits the their needs. This doesn't make DVC -projects dependent on user visualization code, programming language, or -specific environments, keeping DVC agnostic. +Users have the ability to change the way plots are displayed by modifying the +[Vega specification](https://vega.github.io/vega/docs/specification/), thus +generating plots in the style that best fits the their needs. This keeps +DVC projects programming language agnostic, as it's independent +from user display configuration and visualization code. Built-in _plot templates_ are stored in the `.dvc/plots/` directory. The default one is called `default.json`. It can be changed with the `--template` (`-t`) @@ -99,13 +101,16 @@ can specify only the base name e.g. `--template scatter`. ### Custom templates -Plot template files are just JSON specifications with predefined DVC anchors -that help DVC to inject user's data properly. You can create a custom template -from scratch or modify an existing one from `.dvc/plots/`. Custom templates can -be added to the template directory. +Plot template files are +[Vega specification](https://vega.github.io/vega/docs/specification/) files that +use predefined DVC anchors as placeholders for DVC to inject the plot values. +You can create a custom template from scratch, or modify an existing one from +`.dvc/plots/`. 
+ +💡 Note that custom templates can be safely added to the template directory. -All JSON files given to `dvc plots show` and `dvc plots diff` as input are -combined together into a single data array for the injection to a template file. +All metric files given to `dvc plots show` and `dvc plots diff` as input are +combined together into a single data array for injection into a template file. There are two important fields that DVC adds to the plot data: - `index` - self-incrementing, zero-based counter for the data rows/values. In @@ -115,29 +120,30 @@ There are two important fields that DVC adds to the plot data: distinguish between different versions when using the `dvc plots diff` command. -DVC applies the same logic to all CSV/TSV files, but first transforms the data -into JSON. DVC uses column names from a header for JSON conversion into fields. +Note that in the case of CSV/TSV metric files, column names from the table +header (first row) are equivalent to field names. -DVC template anchors: +#### DVC template anchors -- `` - plotting command input data from either CSV or JSON - files is converted to JSON array and injected instead of this anchor. Two +- `` - the plot data from any kind of metric files is converted + to a single JSON array internally, and injected instead of this anchor. Two additional fields will be added: `index` and `rev` (explained above). -- `` - a title for the plot, that can be defined by `--title` - option. +- `` - a title for the plot, that can be defined with the + `--title` option of the `dvc plot` subcommands. -- `` - a field name for Y axis of the plot. It can be defined by - `-y` option of the commands. The default field is the last field found in the - input file: the last column in CSV file or the last field in the JSON array - object. +- `` - field name of the data for the X axis. It can be defined + with the `-x` option of the `dvc plot` subcommands. The auto-generated `index` + field (explained above) is the default. 
-- `` - a field name for Y axes. It can be defined by `-x` option. - `index` is the default field for X. +- `` - field name of the data for the Y axis. It can be defined + with the `-y` option of the `dvc plot` subcommands. The default is the last + one found in the metrics file: the last column for CSV/TSV, or the last field + for JSON/YAML. -- `` - a displayed field label for Y. +- `` - field name to display as the X axis label -- `` - a displayed field label for X. +- `` - field name to display as the X axis label ## Options @@ -195,7 +201,7 @@ file:///Users/usr/src/plots/logs.html We'll use `classes.csv` for this example: -```csv +``` actual,predicted cat,cat cat,cat diff --git a/content/docs/command-reference/plots/modify.md b/content/docs/command-reference/plots/modify.md new file mode 100644 index 0000000000..ac9c24101a --- /dev/null +++ b/content/docs/command-reference/plots/modify.md @@ -0,0 +1,133 @@ +# plots modify + +Modify display properties of [plot metrics](/doc/command-reference/plots) files. + +## Synopsis + +```usage +usage: dvc plots modify [-h] [-q | -v] [-t ] [-x ] + [-y ] [--no-header] [--title ] + [--x-label ] [--y-label ] + [--unset [ [ ...]]] + target + +positional arguments: + target Metric file to set properties to +``` + +## Description + +It might be not convenient for users or automation systems to specify all the +_display properties_ (such as `y-label`, `template`, `title`, etc.) each time +plots are generated with `dvc plot show` or `dvc plot diff`. This command sets +(or unsets) default display properties for a specific metrics file. + +The path to the metrics file `target` is required. It must be listed in a +[`dvc.yaml`](/doc/user-guide/dvc-file-format) file (see the `--plots` option of +`dvc run`). `dvc plots modify` adds the display properties to `dvc.yaml`. + +Property names are passed as [options](#options) to this command (prefixed with +`--`). 
These are based on the full +[Vega specification](https://vega.github.io/vega/docs/specification/). + +## Options + +- `-t , --template ` - set a default + [plot template](/doc/command-reference/plots#plot-templates). + +- `-x ` - set a default field or column name (or number) from which the X + axis data comes. + +- `-y ` - set a default field or column name (or number) from which the Y + axis data comes. + +- `--x-label ` - set a default title for the X axis. + +- `--y-label ` - set a default title for the Y axis. + +- `--title ` - set a default plot title. + +- `--unset [ [ ...]]` - unset one or more display + properties. Use the property name(s) without `--` in the argument sent to this + option. + +- `--no-header` - lets DVC know that the `target` CSV or TSV does not have a + header. A 0-based numeric index can be used to identify each column instead of + names. + +- `-h`, `--help` - prints the usage/help message, and exits. + +- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no + problems arise, otherwise 1. + +- `-v`, `--verbose` - displays detailed tracing information. + +## Examples + +In this example, DVC plots the last column of the CSV file (_loss_) by default, +while _accuracy_ is the metric expected on the Y axis: + +``` +epoch,accuracy,loss +0,0.9403833150863647,0.2019129991531372 +1,0.9733833074569702,0.08973673731088638 +2,0.9815833568572998,0.06529958546161652 +3,0.9861999750137329,0.04984375461935997 +4,0.9882333278656006,0.041892342269420624 +``` + +```dvc +$ dvc plots show logs.csv +file:///Users/usr/src/myclassifier/logs.html +``` + +![](/img/plots_mod_loss.svg) + +Changing the Y axis to _accuracy_: + +```dvc +$ dvc plots modify logs.csv -y accuracy +$ dvc plots show logs.csv +file:///Users/usr/src/myclassifier/logs.html +``` + +![](/img/plots_mod_acc.svg) + +Note that a new field _y_ was added to the `dvc.yaml` file for the plot.
Please do not +forget to commit the change in Git if the modification needs to be preserved. + +```yaml +- logs.csv: + cache: false + y: accuracy +``` + +Changing the plot `title` and `x-label`: + +```dvc +$ dvc plots modify logs.csv --title Accuracy -x epoch --x-label Epoch +$ dvc plots show logs.csv +file:///Users/usr/src/myclassifier/logs.html +``` + +![](/img/plots_mod_acc_titles.svg) + +Two new fields were added to `dvc.yaml`: `x-label` and `title`: + +```yaml +plots: + - logs.csv: + cache: false + y: accuracy + x_label: Epoch + title: Accuracy +``` + +## Example: Template change + +The `dvc run --plots file.csv ...` command assigns the default template, which +often needs to be changed. A simple command changes the template: + +```dvc +$ dvc plots modify classes.csv --template confusion +``` diff --git a/content/docs/command-reference/plots/show.md b/content/docs/command-reference/plots/show.md index 57fa9abf02..2875380811 100644 --- a/content/docs/command-reference/plots/show.md +++ b/content/docs/command-reference/plots/show.md @@ -1,58 +1,72 @@ # plots show -Generate a plot image from from a [plot metrics](/doc/command-reference/plots) -file. +Generate a [plot](/doc/command-reference/plots) from a metrics file. ## Synopsis ```usage -usage: dvc plots show [-h] [-q | -v] [-t [TEMPLATE]] [-f FILE] - [-s SELECT] [-x X] [-y Y] [--stdout] - [--no-csv-header] [--no-html] [--title TITLE] - [--xlab XLAB] [--ylab YLAB] [datafile] +usage: dvc plots show [-h] [-q | -v] [-t ] [-x ] + [-y ] [--no-header] [--title ] + [--x-label ] [--y-label ] [-o ] + [--show-vega] + [targets [targets ...]] positional arguments: - datafile Metrics file to visualize + targets Metric files to visualize. + Shows all plots by default. ``` ## Description This command provides a quick way to visualize metrics such as loss functions, -AUC curves, confusion matrices, etc. Please see `dvc plots` for information on -the supported data formats and other relevant details about DVC plots.
+AUC curves, confusion matrices, etc. All plots defined in `dvc.yaml` are used by +default. + +Optionally, specific metric files can be given as `targets`. These must be +listed in a [`dvc.yaml`](/doc/user-guide/dvc-file-format) file (see the +`--plots` option of `dvc run`). + +The plot style can be customized with +[plot templates](/doc/command-reference/plots#plot-templates), using the +`--template` option. To learn more about metric file formats and templates +please see `dvc plots`. + +> Note that the default behavior of this command can be modified per metrics +> file with `dvc plots modify`. ## Options -- `-t [TEMPLATE], --template [TEMPLATE]` - +- `-o , --out ` - name of the generated file. By default, the output + file name is equal to the input filename with a `.html` file extension (or + `.json` when using `--show-vega`). + +- `-t , --template ` - [plot template](/doc/command-reference/plots#plot-templates) to be injected with data. The default template is `.dvc/plots/default.json`. See more details in `dvc plots`. -- `-f FILE, --file FILE` - name of the generated file. By default, the output - file name is equal to the input filename with additional `.html` suffix or - `.json` suffix for `--no-html` mode. - -- `--no-html` - do not wrap output Vega specification (JSON) with HTML. - -- `-x X` - field name for X axis. An auto-generated `index` field is used by - default. +- `-x ` - field name from which the X axis data comes. An + auto-generated `index` field is used by default. See + [Custom templates](/doc/command-reference/plots#custom-templates) for more + information on this `index` field. Column names or numbers are expected for + tabular metric files. -- `-y Y` - field name for Y axis. The last column or field found in the - `datafile` is used by default. +- `-y ` - field name from which the Y axis data comes. The last + field found in the `targets` is used by default. Column names or numbers are + expected for tabular metric files.
-- `-s SELECT, --select SELECT` - select which fields or JSONPath to store in the - metrics file [metadata](https://vega.github.io/vega/docs/data/). The - auto-generated, zero-based `index` column is always included. +- `--x-label ` - X axis label. The X field name is the default. -- `--xlab XLAB` - X axis title. The X field name is the default title. +- `--y-label ` - Y axis label. The Y field name is the default. -- `--ylab YLAB` - Y axis title. The Y field name is the default title. +- `--title ` - plot title. -- `--title TITLE` - plot title. +- `--show-vega` - produce a + [Vega specification](https://vega.github.io/vega/docs/specification/) file + instead of HTML. See `dvc plots` for more info. -- `-o, --stdout` - print plot content to stdout. - -- `--no-csv-header` - provided CSV or TSV datafile does not have a header. +- `--no-header` - lets DVC know that CSV or TSV `targets` do not have a header. + A 0-based numeric index can be used to identify each column instead of names. - `-h`, `--help` - prints the usage/help message, and exit. @@ -61,11 +75,50 @@ the supported data formats and other relevant details about DVC plots. - `-v`, `--verbose` - displays detailed tracing information. -## Examples +## Example: Hierarchical data + +We'll use the hierarchical metrics file `train.json` for this example: + +```json +{ + "train": [ + { "accuracy": 0.96658, "loss": 0.10757 }, + { "accuracy": 0.97641, "loss": 0.07324 }, + { "accuracy": 0.87707, "loss": 0.08136 }, + { "accuracy": 0.87402, "loss": 0.09026 }, + { "accuracy": 0.8795, "loss": 0.0764 }, + { "accuracy": 0.88038, "loss": 0.07608 }, + { "accuracy": 0.89872, "loss": 0.08455 } + ] +} +``` + +DVC identifies and plots JSON objects from the first JSON array found in the +file (`train`): + +```dvc +$ dvc plots show train.json +file:///Users/usr/src/plots/train.json.html +``` + +![](/img/plots_show_json.svg) + +> Note that only the last field name (`loss`) is used for the plot by default.
+ +Use the `-y` option to change the field to plot: + +```dvc +$ dvc plots show -y accuracy train.json +file:///Users/usr/src/plots/train.json.html +``` + +![](/img/plots_show_json_field.svg) + +## Example: Tabular data We'll use tabular metrics file `logs.csv` for these examples: -```csv +``` epoch,accuracy,loss,val_accuracy,val_loss 0,0.9418667,0.19958884770199656,0.9679,0.10217399864746257 1,0.9763333,0.07896138601688048,0.9768,0.07310650711813942 @@ -89,50 +142,33 @@ file:///Users/usr/src/plots/logs.csv.html Use the `-y` option to change the column to plot: ```dvc -$ dvc plots show -y loss logs.csv +$ dvc plots show logs.csv -y loss file:///Users/usr/src/plots/logs.csv.html ``` ![](/img/plots_show_field.svg) -### Plot file size - -Note that by default, all the columns (or fields) are embedded in the plot file -metadata. You can select a subset of the columns using the `--select` option, -which can help reduce the file size: - -```dvc -$ ls -lh /Users/usr/src/plots/logs.csv.html --rw-r--r-- 1 usr grp 2.8K ... /Users/usr/src/plot/logs.csv.html - -$ dvc plots show -y loss --select loss logs.csv -file:///Users/usr/src/plots/logs.csv.html - -$ ls -lh /Users/usr/src/plots/logs.csv.html --rw-r--r-- 1 usr grp 1.8K ... /Users/usr/src/plots/logs.csv.html -``` - ### Headerless tables -A tabular data file without headers can be plotted with `--no-csv-header` -option. A field or column can be specified with `--select` by it's numeric -position (starting with `0`): +A tabular data file without headers can be plotted with the `--no-header` option.
A +column can be specified with `-y` by its numeric position (starting with `0`): ```dvc -$ dvc plots show --no-csv-header --select 2 logs.csv +$ dvc plots show --no-header logs.csv -y 2 file:///Users/usr/src/plots/logs.csv.html ``` -### Vega specification +## Example: Vega specification file In many automation scenarios (like CI/CD for ML), it is convenient to have the -[Vega-Lite](https://vega.github.io/vega-lite/) specification instead of the -entire HTML plot file. For example to generating another image format like PNG -or JPEG, or to include differently into a web app. The `--no-html` option -prevents wrapping the plot in HTML. Note that the resulting file is JSON: +[Vega specification](https://vega.github.io/vega/docs/specification/) file +instead of a rendered HTML plot file. For example, to generate another image +format like PNG or JPEG, or to include it differently in a web/mobile app. The +`--show-vega` option prevents wrapping this plot spec in HTML. Note that the +resulting file is JSON: ```dvc -$ dvc plots show --select accuracy --no-html logs.csv +$ dvc plots show --show-vega logs.csv -y accuracy file:///Users/usr/src/plots/logs.csv.json ``` @@ -143,50 +179,5 @@ file:///Users/usr/src/plots/logs.csv.json "values": [ { "accuracy": "0.9418667", - "index": 0, - "rev": "workspace" - }, - { - "accuracy": "0.9763333", - "index": 1, - "rev": "workspace" - }, ...
``` - -## Example: Hierarchical data (JSON) - -We'll use tabular metrics file `train.json` for this example: - -```json -{ - "train": [ - { "accuracy": 0.96658, "loss": 0.10757 }, - { "accuracy": 0.97641, "loss": 0.07324 }, - { "accuracy": 0.87707, "loss": 0.08136 }, - { "accuracy": 0.87402, "loss": 0.09026 }, - { "accuracy": 0.8795, "loss": 0.0764 }, - { "accuracy": 0.88038, "loss": 0.07608 }, - { "accuracy": 0.89872, "loss": 0.08455 } - ] -} -``` - -DVC identifies and plots JSON objects from the first JSON array found in the -file: - -```dvc -$ dvc plots show train.json -file:///Users/usr/src/plots/train.json.html -``` - -![](/img/plots_show_json.svg) - -Same as with tabular data, use the `-y` option to change the field to plot: - -```dvc -$ dvc plots show -y accuracy train.json -file:///Users/usr/src/plots/logs.json.html -``` - -![](/img/plots_show_json_field.svg) diff --git a/content/docs/command-reference/remote/add.md b/content/docs/command-reference/remote/add.md index e08a65c669..dcc39b5e09 100644 --- a/content/docs/command-reference/remote/add.md +++ b/content/docs/command-reference/remote/add.md @@ -93,7 +93,7 @@ The following are the types of remote storage (protocols) supported: $ dvc remote add -d myremote s3://bucket/path ``` -By default DVC expects your AWS CLI is already +By default, DVC expects your AWS CLI is already [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html). DVC will be using default AWS credentials file to access S3. To override some of these settings, use the parameters described in `dvc remote modify`. @@ -237,7 +237,7 @@ modified. $ dvc remote add -d myremote gs://bucket/path ``` -By default DVC expects your GCP CLI is already +By default, DVC expects your GCP CLI is already [configured](https://cloud.google.com/sdk/docs/authorizing). DVC will be using default GCP key file to access Google Cloud Storage. To override some of these settings, use the parameters described in `dvc remote modify`. 
diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md index 7dd5235792..60493cd499 100644 --- a/content/docs/command-reference/remote/modify.md +++ b/content/docs/command-reference/remote/modify.md @@ -78,7 +78,7 @@ The following are the customizable types of remote storage (protocols): ### Click for Amazon S3 -By default DVC expects your AWS CLI is already +By default, DVC expects your AWS CLI is already [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html). DVC will be using default AWS credentials file to access S3. To override some of these settings, you could use the following options: diff --git a/content/docs/command-reference/status.md b/content/docs/command-reference/status.md index 50a1ce7b21..9c1a270adb 100644 --- a/content/docs/command-reference/status.md +++ b/content/docs/command-reference/status.md @@ -78,7 +78,7 @@ describing the changes (described below). corresponding file hash saved in a DVC-file yet. - _modified_: An output or dependency is found in the workspace, but the corresponding file hash the DVC-file is not up to date. - - _deleted_: The output or dependency is references in a DVC-file, but does + - _deleted_: The output or dependency is referenced in a DVC-file, but does not exist in the workspace. 
- _not in cache_: An output exists in workspace and the corresponding file hash in the DVC-file is up to date, but there is no corresponding diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json index 62f5071126..38ebaa26c8 100644 --- a/content/docs/sidebar.json +++ b/content/docs/sidebar.json @@ -299,6 +299,10 @@ { "label": "plots diff", "slug": "diff" + }, + { + "label": "plots modify", + "slug": "modify" } ] }, diff --git a/content/docs/tutorials/deep/sharing-data.md b/content/docs/tutorials/deep/sharing-data.md index 199dbe59f7..19b297bbc0 100644 --- a/content/docs/tutorials/deep/sharing-data.md +++ b/content/docs/tutorials/deep/sharing-data.md @@ -38,8 +38,7 @@ $ dvc push ``` The command does not push all cached files, but only the ones currently -references in the workspace (in the _working tree_ of the Git -repo). +referenced in the workspace. For example, in this tutorial 16 data files were created and only 9 will be pushed because the rest of the data files belong to different branches like diff --git a/content/docs/tutorials/get-started/index.md b/content/docs/tutorials/get-started/index.md index f887f9c4b1..585e30085b 100644 --- a/content/docs/tutorials/get-started/index.md +++ b/content/docs/tutorials/get-started/index.md @@ -1,9 +1,9 @@ # Get Started with DVC! -Data Version Control is a data version control, data pipelining, and experiment -management command-line tool built on top of existing engineering toolsets and -practices, particularly Git. In this guide we will show the basic features of -DVC step by step. +DVC is a **data version control**, pipeline management, and experiment +management tool that brings existing engineering toolsets and practices to data +science and machine learning. In this guide we will introduce the basic features +and concepts of DVC step by step. 
## Initialize diff --git a/content/docs/tutorials/pipelines.md b/content/docs/tutorials/pipelines.md index 5283c3ea2c..8594dc3af2 100644 --- a/content/docs/tutorials/pipelines.md +++ b/content/docs/tutorials/pipelines.md @@ -368,7 +368,7 @@ Once that's done, check the AUC metric again for an improvement: ```dvc $ dvc metrics show -a -working tree: +workspace: auc.metric: AUC: 0.648462 master: auc.metric: AUC: 0.587951 diff --git a/content/docs/user-guide/basic-concepts/external-dependency.md b/content/docs/user-guide/basic-concepts/external-dependency.md deleted file mode 100644 index 731824592c..0000000000 --- a/content/docs/user-guide/basic-concepts/external-dependency.md +++ /dev/null @@ -1,9 +0,0 @@ ---- -name: 'External Dependency' -match: ['external dependency', 'external dependencies'] ---- - -A DVC-file dependency with origin in an external source, for example HTTP, SSH, -Amazon S3, Google Cloud Storage remote locations, or even other DVC -repositories. See -[External Dependencies](/doc/user-guide/external-dependencies). diff --git a/content/docs/user-guide/contributing/blog.md b/content/docs/user-guide/contributing/blog.md index 6c1e3ebf07..b6af282e19 100644 --- a/content/docs/user-guide/contributing/blog.md +++ b/content/docs/user-guide/contributing/blog.md @@ -42,23 +42,23 @@ tags: ``` -- `title` - **Required.** Title of the post. -- `date` - **Required.** Publication date in the `YYYY-MM-DD` format. Will be +- `title` (**required**) - title of the post. +- `date` (**required**) - publication date in the `YYYY-MM-DD` format. Will be used to sort posts and in RSS. -- `description` - **Required.** Short description to show in the feed. -- `descriptionLong` - Optional long description to show before the image on the - post page. If not set, `description` will be used instead. Supports basic +- `description` (**required**) - short description to show in the feed. +- `descriptionLong` (optional) - long description to show before the image on + the post page. 
If not set, `description` will be used instead. Supports basic Markdown markup. -- `picture` - Optional cover image, relative to `static/uploads/images` -- `pictureComment` - Optional cover image comment. Supports basic Markdown +- `picture` (optional) - cover image, relative to `static/uploads/images` +- `pictureComment` (optional) - cover image comment. Supports basic Markdown markup. -- `author` - **Required.** The name of the file in `content/authors` +- `author` (**required**) - base name of the file in `content/authors` representing this post's author. See [Adding authors](/doc/user-guide/contributing/blog#adding-authors) to add a new author. -- `commentsUrl` - Optional link to the [DVC forum](https://discuss.dvc.org) +- `commentsUrl` (optional) - link to the [DVC forum](https://discuss.dvc.org) topic. It will contain comments for the post. -- `tags` - Optional list of tags. +- `tags` (optional) - list of tags. ## Content guidelines @@ -137,7 +137,7 @@ avatar: avatar.jpeg link: https://www.twitter.com/johndoe ``` -- `name` – **Required.** Author's name. -- `avatar` - **Required.** Path to the author's avatar, relative to +- `name` (**required**) – author's name. +- `avatar` (**required**) - path to the author's avatar, relative to `static/uploads/avatars` (1024x1024 is recommended). -- `link` - Optional location that the author's name will link to. +- `link` (optional) - location that the author's name will link to. 
diff --git a/content/docs/user-guide/dvc-file-format.md b/content/docs/user-guide/dvc-file-format.md index a4b13ffe03..d1036dfa95 100644 --- a/content/docs/user-guide/dvc-file-format.md +++ b/content/docs/user-guide/dvc-file-format.md @@ -59,8 +59,9 @@ A dependency entry consists of a these possible fields: - `path`: Path to the dependency, relative to the `wdir` path (always present) - `md5`: MD5 hash for the dependency (most [stages](/doc/command-reference/run)) -- `etag`: Strong ETag response header (only HTTP external - dependencies created with `dvc import-url`) +- `etag`: Strong ETag response header (only HTTP + [external dependencies](/doc/user-guide/external-dependencies) created with + `dvc import-url`) - `params`: If this is a [parameter dependency](/doc/command-reference/params) file, contains a list of the parameter names and their current values. - `repo`: This entry is only for external dependencies created with diff --git a/content/docs/user-guide/dvcignore.md b/content/docs/user-guide/dvcignore.md index 783a8ab48c..22f8e0c1bb 100644 --- a/content/docs/user-guide/dvcignore.md +++ b/content/docs/user-guide/dvcignore.md @@ -80,7 +80,7 @@ $ tree .dvc/cache └── c3d3797971f12c7f5e1d106dd5cee2 ``` -Only the checksums of a directory (`data/`) and one files have been +Only the hash values of a directory (`data/`) and one file have been cached. This means that `dvc add` ignored one of the files (`data1`). 
diff --git a/content/docs/user-guide/large-dataset-optimization.md b/content/docs/user-guide/large-dataset-optimization.md index 4c64803ce6..40b1b57076 100644 --- a/content/docs/user-guide/large-dataset-optimization.md +++ b/content/docs/user-guide/large-dataset-optimization.md @@ -105,7 +105,7 @@ efficiency: ## Configuring DVC cache file link type -By default DVC tries to use reflinks for the cache if available on +By default, DVC tries to use reflinks for the cache if available on your system, however this is not the most common case at this time, so it falls back to the copying strategy. If you wish to enable hard or soft links, you can configure DVC like this: diff --git a/static/img/plots_mod_acc.svg b/static/img/plots_mod_acc.svg new file mode 100644 index 0000000000..b882b81959 --- /dev/null +++ b/static/img/plots_mod_acc.svg @@ -0,0 +1 @@ +[SVG plot: X axis `index` (0 to 4); Y axis `accuracy` (0.94 to 0.99); legend `rev`: workspace] \ No newline at end of file diff --git a/static/img/plots_mod_acc_titles.svg b/static/img/plots_mod_acc_titles.svg new file mode 100644 index 0000000000..00172f9d07 --- /dev/null +++ b/static/img/plots_mod_acc_titles.svg @@ -0,0 +1 @@ +[SVG plot: X axis `Epoch` (0 to 4); Y axis `accuracy` (0.94 to 0.99); legend `rev`: workspace; title `Accuracy`] \ No newline at end of file diff --git a/static/img/plots_mod_loss.svg b/static/img/plots_mod_loss.svg new file mode 100644 index 0000000000..a76f662fad --- /dev/null +++ b/static/img/plots_mod_loss.svg @@ -0,0 +1 @@ +[SVG plot: X axis `index` (0 to 4); Y axis `loss` (0.05 to 0.20); legend `rev`: workspace] \ No newline at end of file diff --git a/static/img/plots_show.svg b/static/img/plots_show.svg index eae88eb33e..2e49efef9c 100644 --- a/static/img/plots_show.svg +++ b/static/img/plots_show.svg @@ -1 +1 @@ -[SVG plot: X axis `x` (0 to 7); Y axis `val_loss` (0.00 to 0.10); legend `rev`: workspace; title `logs.csv`] \ No newline at end of file +[SVG plot: X axis `index` (0 to 7); Y axis `val_loss` (0.07 to 0.10); legend `rev`: workspace] \ No newline at end of file diff --git a/static/img/plots_show_field.svg b/static/img/plots_show_field.svg index d4830836a9..ddaabf710b 100644 ---
a/static/img/plots_show_field.svg +++ b/static/img/plots_show_field.svg @@ -1 +1 @@ -[SVG plot: X axis `x` (0 to 7); Y axis `loss` (0.00 to 0.20); legend `rev`: workspace; title `logs.csv`] \ No newline at end of file +[SVG plot: X axis `index` (0 to 7); Y axis `loss` (0.00 to 0.20); legend `rev`: workspace] \ No newline at end of file diff --git a/static/img/plots_show_json.svg b/static/img/plots_show_json.svg index 444df28a17..0329ff8057 100644 --- a/static/img/plots_show_json.svg +++ b/static/img/plots_show_json.svg @@ -1 +1 @@ -[SVG plot: X axis `index` (0 to 6); Y axis `loss` (0.00 to 0.10); legend `rev`: workspace] \ No newline at end of file +[SVG plot: X axis `index` (0 to 6); Y axis `loss` (0.08 to 0.11); legend `rev`: workspace] \ No newline at end of file diff --git a/static/img/plots_show_json_field.svg b/static/img/plots_show_json_field.svg index 0c907a7bd5..e2743f3d79 100644 --- a/static/img/plots_show_json_field.svg +++ b/static/img/plots_show_json_field.svg @@ -1 +1 @@ -[SVG plot: X axis `index` (0 to 6); Y axis `accuracy` (0.0 to 1.0); legend `rev`: workspace] \ No newline at end of file +[SVG plot: X axis `index` (0 to 6); Y axis `accuracy` (0.88 to 0.98); legend `rev`: workspace] \ No newline at end of file