Misc. updates (2.0ish) #2062

Merged on Jan 5, 2021 (41 commits). Showing changes from 32 commits.
1cf6519
Merge branch 'master' into jorge
jorgeorpinel Dec 21, 2020
8dd77a9
config: standardize sample variable/option values
jorgeorpinel Dec 22, 2020
df113ec
Merge branch 'master' into jorge
jorgeorpinel Dec 22, 2020
4dd5322
cmd: misc updates to repro, gc, run
jorgeorpinel Dec 22, 2020
fa30e85
Merge branch 'jorge' of github.com:iterative/dvc.org into jorge
jorgeorpinel Dec 22, 2020
27b3007
Merge branch 'master' into jorge
jorgeorpinel Dec 23, 2020
2e370de
Merge branch 'jorge' of github.com:iterative/dvc.org into jorge
jorgeorpinel Dec 23, 2020
ec5cb83
Merge branch 'master' into jorge
jorgeorpinel Dec 27, 2020
e7297b5
blog: manually format a md string in frontmatter
jorgeorpinel Dec 27, 2020
523af6b
guide: multi cmd in main dvc.yaml example
jorgeorpinel Dec 27, 2020
442d325
guide: copy edit in dvc.yaml
jorgeorpinel Dec 27, 2020
3e5435a
cmd: remove term "self-incrementing"
jorgeorpinel Dec 27, 2020
95965ea
guide: re-instate deleted changes
jorgeorpinel Dec 27, 2020
7a59869
cmd: review params refs
jorgeorpinel Dec 27, 2020
a7f35ef
blog: fix broken frontmatter
jorgeorpinel Dec 27, 2020
194e1e4
cmd: update params def. explanation in diff like it is in index
jorgeorpinel Dec 27, 2020
4e1cb47
cmd: absorb a params diff note into a p
jorgeorpinel Dec 27, 2020
8ede12f
cmd: more improvements and reord to params docs
jorgeorpinel Dec 28, 2020
0930fa4
guide: expand on dvc.yaml params field (multiple params files)
jorgeorpinel Dec 28, 2020
bf0f04c
guide: remove emoji we never use from recs
jorgeorpinel Dec 28, 2020
59fe782
Merge branch 'master' into jorge
jorgeorpinel Dec 29, 2020
082d52d
blog: remove prettier-ignore
jorgeorpinel Dec 29, 2020
7165723
cmd: differentiate between params and param deps in params refs
jorgeorpinel Dec 29, 2020
917b0ab
cmd: start with example in params index
jorgeorpinel Dec 29, 2020
7dc0596
cmd: move params value info to its section
jorgeorpinel Dec 29, 2020
3cdd408
cmd: review --targets arg descs.
jorgeorpinel Dec 29, 2020
64bf6b7
cmd: params diff --targets don't expand anything
jorgeorpinel Dec 29, 2020
dc38b81
Merge branch 'master' into jorge
jorgeorpinel Jan 2, 2021
919f6fd
cmd: remove 3rd mention of repro in params Desc
jorgeorpinel Jan 2, 2021
0579261
cmd: update params diff --targets
jorgeorpinel Jan 2, 2021
5d38ea0
cmd: update metrics diff --targets
jorgeorpinel Jan 2, 2021
edf7565
cmd: std. --targets option accross refs
jorgeorpinel Jan 2, 2021
a163a35
cmd: simplify params Desc and fix Examples
jorgeorpinel Jan 2, 2021
25834cc
cmd: make params intro sample realistic
jorgeorpinel Jan 2, 2021
64b31de
cmd: clarify default behavior of params diff
jorgeorpinel Jan 2, 2021
cbd6d62
cmd: clarify about params/metrics/plots diff --tagets
jorgeorpinel Jan 2, 2021
53ed2d2
cmd: typo
jorgeorpinel Jan 2, 2021
d8221e0
Merge branch 'master' into jorge
jorgeorpinel Jan 3, 2021
19c846a
cmd: note that metrics/params diff work in any Git repo
jorgeorpinel Jan 3, 2021
ff5665d
cmd: clarify more about default params used by diff
jorgeorpinel Jan 3, 2021
3e7d40f
cmd: final details on params index
jorgeorpinel Jan 3, 2021
8 changes: 4 additions & 4 deletions content/blog/2020-04-06-april-20-dvc-heartbeat.md
@@ -14,10 +14,10 @@ descriptionLong: |
projects by our users and big ideas about best practices in ML and data
science.
picture: 2020-04-06/april_header.png
pictureComment:
A view from [Barrancas del
Cobre](https://en.wikipedia.org/wiki/Copper_Canyon), shot by Jorge Orpinel
Pérez. Jorge has mastered the art of working on DVC remotely.
pictureComment: |
A view from
[Barrancas del Cobre](https://en.wikipedia.org/wiki/Copper_Canyon), shot by
Jorge Orpinel Pérez. Jorge has mastered the art of working on DVC remotely.
author: elle_obrien
commentsUrl: https://discuss.dvc.org/t/april-20-heartbeat/347
tags:
13 changes: 4 additions & 9 deletions content/docs/command-reference/diff.md
@@ -44,18 +44,13 @@ for example when `dvc init` was used with the `--no-scm` option.

## Options

- `--targets <paths>` - limit command scope to these paths. When specifying
arguments for `--targets` before `a_rev`/`b_rev`, you should use `--` after
this option's arguments, e.g.:
- `--targets <paths>` - specific DVC-tracked files to compare.

```dvc
$ dvc diff --targets t1.json t2.yaml -- HEAD v1
```

Alternatively, you can also run the above statement as:
When specifying arguments for `--targets` before `a_rev`/`b_rev`, you should
use `--` after this option's arguments (POSIX terminals), e.g.:

```dvc
$ dvc diff HEAD v1 --targets t1.json t2.json
$ dvc diff --targets t1.json t2.yaml -- HEAD v1
```

- `--show-json` - prints the command's output in easily parsable JSON format,
16 changes: 6 additions & 10 deletions content/docs/command-reference/metrics/diff.md
@@ -36,19 +36,15 @@ lists all the current metrics, without comparisons.

## Options

- `--targets <paths>` - limit command scope to these metrics files. Using `-R`,
directories to search metrics files in can also be given. When specifying
arguments for `--targets` before `revisions`, you should use `--` after this
option's arguments, e.g.:
- `--targets <paths>` - specific metrics files to compare. It accepts `paths` to
any valid metrics file, regardless of whether it's used by DVC. Using `-R`,
directories to search metrics files in can also be given.

```dvc
$ dvc metrics diff --targets t1.json t2.yaml -- HEAD v1
```

Alternatively, you can also run the above statement as:
When specifying arguments for `--targets` before `revisions`, you should use
`--` after this option's arguments (POSIX terminals), e.g.:

```dvc
$ dvc metrics diff HEAD v1 --targets t1.json t2.json
$ dvc metrics diff --targets t1.json t2.yaml -- HEAD v1
```

- `-R`, `--recursive` - determines the metrics files to use by searching each
38 changes: 16 additions & 22 deletions content/docs/command-reference/params/diff.md
@@ -1,13 +1,13 @@
# params diff

Show changes in [parameter dependencies](/doc/command-reference/params) between
commits in the <abbr>DVC repository</abbr>, or between a commit and the
Show changes in [parameters](/doc/command-reference/params) between commits in
the <abbr>DVC repository</abbr>, or between a commit and the
<abbr>workspace</abbr>.

## Synopsis

```usage
usage: dvc params diff [-h] [-q | -v] [--targets [<path> [<path> ...]]]
usage: dvc params diff [-h] [-q | -v] [--targets [<paths> [<paths> ...]]]
[--all] [--show-json] [--show-md] [--no-path]
[a_rev] [b_rev]

@@ -19,35 +19,29 @@ positional arguments:

## Description

This command provides a quick way to compare parameter values among experiments
in the repository history. Requires that Git is being used to version the
project params.
Provides a quick way to compare parameter values among experiments in the
repository history. Requires that Git is being used to version the project
params.

> Parameter dependencies are defined with the `-p` option in `dvc run`. See also
> `dvc params`.
> Parameter dependencies are defined in the `params` field of `dvc.yaml` (e.g.
> with the the `-p` (`--params`) option of `dvc run`).

Without arguments, this command compares parameters currently present in the
<abbr>workspace</abbr> (uncommitted changes) with the latest committed version.

Supported parameter _value_ types are: string, integer, float, and arrays. DVC
itself does not ascribe any specific meaning for these values.

❗ By default it only shows parameters that were changed.
This includes everything in the default parameters file (`params.yaml`) as well
as all `params` found in [DVC files](/doc/user-guide/dvc-files). Only params
that have changes are listed.

## Options

- `--targets <paths>` - limit command scope to these params files. When
specifying arguments for `--targets` before `revisions`, you should use `--`
after this option's arguments, e.g.:
- `--targets <paths>` - specific params files to compare. It accepts `paths` to
any valid parameters file, regardless of whether it's used by DVC.

```dvc
$ dvc params diff --targets m1.json m2.yaml -- HEAD v1
```

Alternatively, you can also run the above statement as:
When specifying arguments for `--targets` before `a_rev`/`b_rev`, you should
use `--` after this option's arguments (POSIX terminals), e.g.:

```dvc
$ dvc params diff HEAD v1 --targets m1.json m2.json
$ dvc params diff --targets m1.json m2.yaml -- HEAD v1
```

- `--all` - prints all parameters including not changed.
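
The need for `--` after `--targets` can be illustrated with a minimal Python sketch. It assumes an `argparse`-style parser shaped like the synopsis above (an option taking zero or more paths, followed by optional positional revisions); the parser below is a hypothetical stand-in, not DVC's actual code.

```python
import argparse

# Hypothetical stand-in for the CLI parser; not DVC's actual code.
parser = argparse.ArgumentParser(prog="dvc params diff")
parser.add_argument("--targets", nargs="*", metavar="<paths>")
parser.add_argument("a_rev", nargs="?")
parser.add_argument("b_rev", nargs="?")

# Without `--`, the greedy option also swallows the revisions as paths.
greedy = parser.parse_args(["--targets", "m1.json", "m2.yaml", "HEAD", "v1"])

# With `--`, the option's argument list ends and the rest is positional.
split = parser.parse_args(
    ["--targets", "m1.json", "m2.yaml", "--", "HEAD", "v1"]
)

print(greedy.targets, greedy.a_rev, greedy.b_rev)
print(split.targets, split.a_rev, split.b_rev)
```

In this sketch only the second call assigns `HEAD` and `v1` to the revisions; in the first, they end up in the list of target paths.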
93 changes: 57 additions & 36 deletions content/docs/command-reference/params/index.md
@@ -18,44 +18,64 @@ positional arguments:

In order to track parameters and hyperparameters associated to machine learning
experiments in <abbr>DVC projects</abbr>, DVC provides a different type of
dependencies: _parameters_. Parameters are defined using the the `-p`
(`--params`) option of `dvc run`, using simple names like `epochs`,
dependencies: _parameters_. They usually have simple names like `epochs`,
`learning-rate`, `batch_size`, etc.

In contrast to a regular <abbr>dependency</abbr>, a parameter is not a file (or
directory). Instead, it consists of a _parameter name_ (or key) to find inside a
YAML 1.2, JSON, TOML, or [Python](#examples-python-parameters-file) _parameters
file_. Multiple parameter dependencies can be specified from one or more
parameters files.
To start tracking parameters, list them under the `params` field of `dvc.yaml`
stages (manually or with the the `-p`/`--params` option of `dvc run`). For
example:

The default parameters file name is `params.yaml`. Parameters should be
organized as a tree hierarchy inside, as DVC will locate param names by their
tree path. Parameters files have to be manually written, or generated, and these
can be versioned directly with Git.
```yaml
stages:
  mystage:
    cmd: ./myscript.sh
    params:
      # Review comment (Member): we use epochs, learning-rate, batch_size
      # above, let's do the same in the example?
      # Reply (Contributor Author): Yes, that is way better 😅. Done!
      - foo
      - bar.baz
      - myparams.toml:
          - qux
```

Supported parameter _value_ types are: string, integer, float, and arrays. DVC
itself does not ascribe any specific meaning for these values. They are
user-defined, and serve as a way to generalize and parametrize an machine
learning algorithms or data processing code.
> By default, parameters are read from `params.yaml`. Other params files can be
> listed too, with sub-lists of the params found in them (as shown above).
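
As a rough illustration of how such a `params` list could be read, the Python sketch below groups entries by the file they refer to. The function name `group_params` and the grouping logic are assumptions made for this example; they are not taken from DVC's code.

```python
DEFAULT_PARAMS_FILE = "params.yaml"


def group_params(entries):
    """Group a dvc.yaml-style `params` list by parameters file.

    Plain strings are looked up in the default file; single-key mappings
    name another file together with the params to read from it.
    (Illustrative only; not DVC's implementation.)
    """
    grouped = {}
    for entry in entries:
        if isinstance(entry, str):
            grouped.setdefault(DEFAULT_PARAMS_FILE, []).append(entry)
        else:
            for path, names in entry.items():
                grouped.setdefault(path, []).extend(names or [])
    return grouped


# The list from the example stage, as a YAML loader would return it.
entries = ["foo", "bar.baz", {"myparams.toml": ["qux"]}]
grouped = group_params(entries)
print(grouped)
```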

DVC saves the param names and their latest values in the `dvc.yaml` file. These
values will be compared to the ones in the params files to determine if the
stage is invalidated upon pipeline [reproduction](/doc/command-reference/repro).
In contrast to a regular <abbr>dependency</abbr>, a parameter is not a file or

Review comment (Member): a parameter dependency ...
Reply (Contributor Author): Updated.

directory. Instead, it consists of a _parameter name_ (or key) in a
[parameters file](#parameters-file), where the _parameter value_ should be
found. This allows you to define [stage](/doc/command-reference/run)
dependencies more granularly: changes to other parts of the params file will not
affect the stage. Parameter dependencies also prevent situations where several
stages share a regular dependency (e.g. a config file), and any change in it
invalidates all these stages, causing unnecessary re-executions upon
`dvc repro`.

> Note that DVC does not pass the parameter values to stage commands. The
> associated command executed by `dvc run` or `dvc repro` will have to open and
> parse the parameters file by itself, and use the params specified with `-p`.
The `dvc params diff` command is available to show parameter changes, displaying
their current and previous [values](#parameter-values).

### Parameters files

Review comment (Member):

This content should be simple enough to avoid additional structure, to my mind. Keep it simpler, remove repetitions, and move from a high-level explanation (an example) to the details and/or advanced cases (custom file name).

Also keep in mind that dvc params diff can actually operate on any file now, even one not registered in dvc.yaml.

Reply (Contributor Author, @jorgeorpinel, Dec 31, 2020):

> content should be simple enough to avoid additional structure

Yes, that would be ideal, although I'm not sure that sections hurt in this instance (there are several relevant aspects of params, so maybe sections are appropriate for a reference doc).

But 1. all of this content was already there (in fact this PR already makes the text shorter compared to https://dvc.org/doc/command-reference/params), and 2. again, since we will move a lot of this info to basic concepts soonish, should we let it be for now?

> keep in mind, that dvc params diff can actually operate on any file now

Good point. It will also operate on parameters inserted into dvc.yaml from vars soon... Will address this ⌛ UPDATE: See #2062 (review) below.

Reply (Member, @shcheklein, Dec 31, 2020):

I guess I don't see any value in those sections; they have a lot of repetition with the previous introductory section. They don't have any motivation behind them (it's not clear why they exist or how they connect to the other text).

> Will address this ⌛

Is it addressed already? UPDATE: See #2062 (review) below.

Reply (Contributor Author, @jorgeorpinel, Jan 3, 2021):

> sections, they have a lot of repetition

Maybe in past iterations? I'm not seeing the repetition right now (just 2 mentions of repro).

> motivation behind them (not clear why do they exist... how do they connect

They are the references for parameters files and for parameter values, which are not detailed in the Description, but they are linked as part of the explanation.

Reply (Member):

I see a few concerns (overall it comes down to complicating things and making them less clear):

  • The Parameters files section repeats the note, to my mind (it expands a bit, but it's hard to justify both).
  • It's hard to justify a whole section for single-sentence content as well.
  • Parameter values, starting from "DVC saves ...", has some general content about params, not only values.
  • If we deal with that ^^, we'll have one-paragraph content again.
  • Most important: both sections appear w/o any motivation, w/o any reference to them. See the last comment: "They don't have any motivation behind them (not clear why do they exist, how do they connect to the other text)." When I read it, I don't understand this structure or why it exists.

Reply (Contributor Author):

Any repetition I should definitely address (⌛); it's kind of a separate issue (it could be there with or without sections).

> hard to justify the whole section for a single sentence
> DVC saves ... has some general content

Good points. OK, I'll remove the headers ⌛

> both sections appear w/o any motivation, w/o any reference to them

Not sure what you mean by w/o ref., since they're linked to from the main Desc., but indeed there's no intro paragraph giving them some motivation. That's because these are reference docs, so we don't have a story in each one. Sections can be pretty useful in these kinds of docs, so I'm not sure why we're trying to avoid them in general (your specific points did convince me this time though). For example, what about https://dvc.org/doc/command-reference/metrics#supported-file-formats? We have many of those.

Reply (Contributor Author):

OK, I removed the H3s and addressed the other specific feedback.

I also just realized that the Examples had some problems, so I threw that in. PTAL

The parameters concept helps to define [stage](/doc/command-reference/run)
dependencies more granularly. A particular parameter or set of parameters will
be required for the stage invalidation (see `dvc status` and `dvc repro`).
Changes to other parts of the dependency file will not affect the stage. This
prevents situations where several stages share a (configuration) file as a
common dependency, and any change in this dependency invalidates all these
stages and causes their reproduction unnecessarily.
The default params file name is `params.yaml`, but any other YAML 1.2, JSON,
TOML, or [Python](#examples-python-parameters-file) files can be used
additionally. These files are typically written manually (or they can be
generated) and they can be versioned directly with Git.

`dvc params diff` is available to show changes in parameters, displaying the
param names as well as their current and previous values.
### Parameter values

Param values should be organized in tree-like hierarchies (dictionaries) inside
param files (see [Examples](#examples)). DVC will interpret param names as the
tree path to find those values.
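
The idea of a param name acting as a tree path can be sketched in a few lines of Python. The helper `resolve_param` and the sample values are hypothetical, chosen only to illustrate the lookup; DVC's own resolution logic may differ.

```python
def resolve_param(params: dict, name: str):
    """Follow a dot-separated param name down a nested dictionary."""
    value = params
    for key in name.split("."):
        value = value[key]  # raises KeyError if the path does not exist
    return value


# Hypothetical contents of a params file after loading it.
params = {"train": {"epochs": 70, "layers": 9}, "seed": 20170428}

epochs = resolve_param(params, "train.epochs")
train_group = resolve_param(params, "train")
print(epochs, train_group)
```

Note that a name such as `train` resolves to the whole group, which matches the idea of referencing an entire set of parameters at once.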

Supported types are: string, integer, float, and arrays. Note that DVC does not
ascribe any specific meaning to these values.

DVC saves parameter names and values in the project's
[DVC files](/doc/user-guide/dvc-files) in order to track them over time. They

Review comment (Member): in this case we can be specific? dvc.lock?
Reply (Contributor Author): Agree. Specified.

will be compared to the latest params files to determine if the stage is
outdated upon `dvc repro` (or `dvc status`).
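
A simplified way to picture this comparison is shown below, assuming the saved and current values are both available as flat name-to-value mappings. The function `changed_params` and the sample values are hypothetical; this is a sketch of the idea, not how DVC implements it.

```python
def changed_params(saved: dict, current: dict) -> dict:
    """Map each tracked name to (saved, current) when the two differ."""
    missing = object()  # sentinel: distinguishes "absent" from None
    changes = {}
    for name, old in saved.items():
        new = current.get(name, missing)
        if new is missing or new != old:
            changes[name] = (old, None if new is missing else new)
    return changes


# Values recorded at the last run vs. values now in the params file.
saved = {"train.epochs": 70, "train.layers": 9}
current = {"train.epochs": 100, "train.layers": 9, "seed": 20170428}

changes = changed_params(saved, current)
stage_outdated = bool(changes)
print(changes, stage_outdated)
```

Only tracked names are checked, so an untracked key like `seed` in this example would not mark the stage as outdated.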

> Note that DVC does not pass the parameter values to stage commands. The
> commands executed by DVC will have to load and parse the parameters file by
> itself.

## Options

@@ -95,8 +115,8 @@ $ dvc run -n train -d users.csv -o model.pkl \
> Note that we could use the same parameter addressing with JSON, TOML, or
> Python parameters files.

The `train.py` script will have some code to parse the needed parameters. For
example:
The `train.py` script will have some code to parse and load the needed
parameters. For example:

```py
import yaml
@@ -109,11 +129,12 @@ epochs = params['train']['epochs']
layers = params['train']['layers']
```

You can find that each parameter and it's value were saved to `dvc.yaml`. These
values will be compared to the ones in the parameters files whenever `dvc repro`
is used, to determine if dependency to the params file is invalidated:
You can find that each parameter and their values were saved to `dvc.yaml` and
`dvc.lock`. These are compared to the values in the params files when
`dvc repro` is used to determine if the parameter dependency has changed.

```yaml
# dvc.yaml
stages:
train:
cmd: python train.py
@@ -127,7 +148,7 @@ stages:
```

Alternatively, the entire group of parameters `train` can be referenced, instead
of specifying each of the group parameters separately:
of specifying each of the params separately:

```dvc
$ dvc run -n train -d users.csv -o model.pkl \
28 changes: 12 additions & 16 deletions content/docs/command-reference/plots/diff.md
@@ -6,7 +6,8 @@ overlaying them in a single image. This allows to compare them easily.
## Synopsis

```usage
usage: dvc plots diff [-h] [-q | -v] [--targets [<path> [<path> ...]]]
usage: dvc plots diff [-h] [-q | -v]
[--targets [<paths> [<paths> ...]]]
[-t <name_or_path>] [-x <field>] [-y <field>]
[--no-header] [--title <text>]
[--x-label <text>] [--y-label <text>] [-o <path>]
@@ -24,7 +25,7 @@ This command is a way to visualize the "difference" between
versions of the <abbr>repository</abbr>, by overlaying them in a single plot.

> Note that unlike `dvc metrics diff`, this command does not calculate numeric
> differences between metrics file values.
> differences between plots file values.

`revisions` are Git commit hashes, tag, or branch names. If none are specified,
`dvc plots diff` compares plots currently present in the <abbr>workspace</abbr>
@@ -34,14 +35,13 @@ revision results in comparing the workspace and that version.
💡 Note that any number of `revisions` can be provided (the resulting plot shows
all of them in a single image).

All plots defined in `dvc.yaml` are used by default, but specific plots files
can be specified with the `--targets` option (note that targets don't
necessarily have to be defined in `dvc.yaml`).
All plots defined in `dvc.yaml` are used by default, but specific files can be
specified with the `--targets` option (any valid plots file is accepted).

The plot style can be customized with
[plot templates](/doc/command-reference/plots#plot-templates), using the
`--template` option. To learn more about metrics file formats and templates
please see `dvc plots`.
`--template` option. To learn more about plots files and templates please see
`dvc plots`.

> Note that the default behavior of this command can be modified per metrics
> file with `dvc plots modify`.
@@ -51,18 +51,14 @@ all the current plots, without comparisons.

## Options

- `--targets <path>` - specific metrics files to visualize. When specifying
arguments for `--targets` before `revisions`, you should use `--` after this
option's arguments, e.g.:
- `--targets <paths>` - specific plots files to visualize. It accepts `paths` to
any valid plots file, regardless of whether it's used by DVC.

```dvc
$ dvc plots diff --targets t1.json t2.csv -- HEAD v1 v2
```

Alternatively, you can also run the above statement as:
When specifying arguments for `--targets` before `revisions`, you should use
`--` after this option's arguments, e.g.:

```dvc
$ dvc plots diff HEAD v1 v2 --targets t1.json t2.csv
$ dvc plots diff --targets t1.json t2.csv -- HEAD v1 v2
```

- `-o <path>, --out <path>` - name of the generated file. By default, the output
4 changes: 2 additions & 2 deletions content/docs/command-reference/plots/index.md
@@ -125,8 +125,8 @@ All metrics files given to `dvc plots show` and `dvc plots diff` as input are
combined together into a single data array for injection into a template file.
There are two important fields that DVC adds to the plot data:

- `index` - self-incrementing, zero-based counter for the data rows/values. In
many cases it corresponds to a machine learning training epoch or step number.
- `index` - zero-based counter for the data rows/values. In many cases it
corresponds to a machine learning training epoch or step number.

- `rev` - Git commit hash, tag, or branch of the metrics file. This helps
distinguish between different versions when using the `dvc plots diff`
3 changes: 1 addition & 2 deletions content/docs/user-guide/contributing/docs.md
@@ -202,5 +202,4 @@ We also use "emoji" symbols sparingly for visibility on certain notes. Mainly:
and "Note that..." notes)
- 💡 Useful notes and tips, often related to external tools and integrations

> Some other emojis currently in use here and there: ⚡✅🙏🐛⭐❗ (among
> others).
> Some other emojis currently in use here and there: ⚡✅🙏🐛⭐ (among others).
21 changes: 14 additions & 7 deletions content/docs/user-guide/dvc-files.md
@@ -143,9 +143,12 @@ stages:
metrics:
- performance.json
training:
desc: Training stage description
cmd: python train.py
desc: Train model with Python
cmd:
- pip install -r requirements.txt
- python train.py --out ${model_file}
deps:
- requirements.txt
- train.py
- features
outs:
@@ -163,15 +166,19 @@ stages:
by the user with the `--name` (`-n`) option of `dvc run`. Each stage can contain
the following fields:

- `cmd` (always present): Executable command defined in this stage
- `cmd` (always present): One or more commands executed by the stage (may
contain either a single value, or a list). Commands are executed sequentially
until all are finished or until one of them fails (see `dvc repro` for
details).
- `wdir`: Working directory for the stage command to run in (relative to the
file's location). If this field is not present explicitly, it defaults to `.`
(the file's location).
- `deps`: List of <abbr>dependency</abbr> file or directory paths of this stage
(relative to `wdir` which defaults to the file's location). See
[Dependency entries](#dependency-entries) above for more details.
- `params`: List of <abbr>parameter</abbr> dependency keys (field names) that
are read from a YAML, JSON, TOML, or Python file (`params.yaml` by default)
- `params`: List of <abbr>parameter</abbr> dependency keys (field names) to
track in `params.yaml`. The list may also contain other YAML, JSON, TOML, or
Python file names, with a sub-list of the param names to track in them.

Review comment on lines -173 to +181 (Contributor Author, @jorgeorpinel, Dec 29, 2020): Note that I also expanded on this dvc.yaml field since we're encouraging users to avoid dvc run now (so we should better explain all the possible ways to write this manually or generate it).

- `outs`: List of <abbr>output</abbr> file or directory paths of this stage
(relative to `wdir` which defaults to the file's location). See
[Output entries](#output-entries) above for more details.
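
The behavior described for a multi-command `cmd` field (run in order, stop at the first failure) can be sketched in Python as follows. The function `run_stage` is a hypothetical illustration built on the standard `subprocess` module; it does not reproduce how DVC actually executes stages.

```python
import subprocess


def run_stage(cmds):
    """Run shell commands in order, stopping at the first failure.

    Accepts a single command string or a list of them, mirroring the two
    forms of the `cmd` field. Returns how many commands succeeded.
    (Illustrative only; not DVC's implementation.)
    """
    if isinstance(cmds, str):
        cmds = [cmds]
    succeeded = 0
    for cmd in cmds:
        result = subprocess.run(cmd, shell=True)
        if result.returncode != 0:
            break  # later commands are skipped once one fails
        succeeded += 1
    return succeeded


# The third command never runs because the second exits with an error.
count = run_stage(["exit 0", "exit 3", "exit 0"])
print(count)
```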
@@ -207,8 +214,8 @@ For every `dvc.yaml` file, a matching `dvc.lock` (YAML) file usually exists.
It's created or updated by DVC commands such as `dvc run` and `dvc repro`.
`dvc.lock` describes the latest pipeline state. It has several purposes:

- Tracking of intermediate and final results of a pipeline — similar to
[`.dvc` files](#dvc-files).
- Tracking of intermediate and final <abbr>outputs</abbr> of a pipeline —
similar to [`.dvc` files](#dvc-files).
- Allow DVC to detect when stage definitions, or their dependencies have
changed. Such conditions invalidate stages, requiring their reproduction (see
`dvc status`, `dvc repro`).