diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md
index 96d7851b81..d103dcd247 100644
--- a/content/docs/command-reference/exp/run.md
+++ b/content/docs/command-reference/exp/run.md
@@ -224,7 +224,7 @@ train_config.json train.weight_decay - 0.001
Note that `exp run --set-param` (`-S`) doesn't update your `dvc.yaml`. When
appending or removing parameters, make sure to update the
-[`params` section](https://dvc.org/doc/user-guide/project-structure/dvcyaml-files#parameter-dependencies)
+[`params` section](https://dvc.org/doc/user-guide/project-structure/dvcyaml-files#parameters)
of your `dvc.yaml` accordingly.
diff --git a/content/docs/command-reference/params/diff.md b/content/docs/command-reference/params/diff.md
index 5f647f46a7..da14820bef 100644
--- a/content/docs/command-reference/params/diff.md
+++ b/content/docs/command-reference/params/diff.md
@@ -1,8 +1,7 @@
# params diff
-Show changes in [parameters](/doc/command-reference/params) between commits in
-the DVC repository, or between a commit and the
-workspace.
+Show changes in `dvc params` between commits in the DVC repository,
+or between a commit and the workspace.
> Requires that Git is being used to version the project.
@@ -21,12 +20,15 @@ positional arguments:
## Description
-Provides a quick way to compare parameter values among experiments in the
+Provides a quick way to compare parameters among experiments in the
repository history. The differences shown by this command include the old and
new param values, along with the param name.
-> Parameter dependencies are defined in the `params` field of `dvc.yaml` (e.g.
-> with the the `-p` (`--params`) option of `dvc stage add`).
+
+
+Parameters are defined in the `params` field of `dvc.yaml`. See `dvc params`.
+
+
Without arguments, `dvc params diff` compares parameters currently present in
the workspace (uncommitted changes) with the latest committed
diff --git a/content/docs/command-reference/params/index.md b/content/docs/command-reference/params/index.md
index adb041326b..b86eb66c0c 100644
--- a/content/docs/command-reference/params/index.md
+++ b/content/docs/command-reference/params/index.md
@@ -1,6 +1,6 @@
# params
-Contains a command to show changes in parameters:
+Contains a command to show changes in parameters:
[diff](/doc/command-reference/params/diff).
## Synopsis
@@ -16,62 +16,69 @@ positional arguments:
## Description
-In order to track parameters and hyperparameters associated to machine learning
-experiments in DVC projects, DVC provides a different type of
-dependencies: _parameters_. They usually have simple names like `epochs`,
-`learning-rate`, `batch_size`, etc.
+Parameters can be any values used inside your code to influence the results
+(e.g. machine learning [hyperparameters]). DVC can track these as key/value
+pairs from structured YAML 1.2, JSON, TOML 1.0,
+[or Python](#examples-python-parameters-file) files (`params.yaml` by default).
+Params usually have simple names like `epochs`, `learning-rate`, `batch_size`,
+etc. Example:
-To start tracking parameters, list them under the `params` field of `dvc.yaml`
-stages (manually or with the the `-p`/`--params` option of `dvc stage add`). For
-example:
+```yaml
+epochs: 900
+tuning:
+  learning-rate: 0.945
+  max_depth: 7
+paths:
+  labels: 'materials/labels'
+  truth: 'materials/ground'
+```
+
+To start tracking parameters, list their names under the `params` field of
+`dvc.yaml` (manually or with the `-p`/`--params` option of `dvc stage add`).
+For example:
```yaml
stages:
learn:
- cmd: ./deep.py
+ cmd: python deep.py # reads params.yaml internally
params:
- - epochs # track specific parameter (from params.yaml)
- - tuning.learning-rate
- - myparams.toml: # track specific params from custom file
- - batch_size
- - config.json: # track all parameters in this file
+ - epochs # specific param from params.yaml
+ - tuning.learning-rate # nested param from params.yaml
+ - paths # entire group from params.yaml
+ - myparams.toml:
+ - batch_size # param from custom file
+ - config.json: # all params in this file
```
-In contrast to a regular dependency, a parameter dependency is not
-a file or directory. Instead, it consists of a _parameter name_ (or key) in a
-_parameters file_, where the _parameter value_ should be found. This allows you
-to define [stage](/doc/command-reference/run) dependencies more granularly:
-changes to other parts of the params file will not affect the stage. Parameter
-dependencies also prevent situations where several stages share a regular
-dependency (e.g. a config file), and any change in it invalidates all of them
-(see `dvc status`), causing unnecessary re-executions upon `dvc repro`.
-
-The default **parameters file** name is `params.yaml`, but any other YAML 1.2,
-JSON, TOML 1.0, or [Python](#examples-python-parameters-file) files can be used
-additionally (listed under `params:` as shown in the sample above). These files
-are typically written manually (or they can be generated) and they can be
-versioned directly with Git.
-
-**Parameter values** should be organized in tree-like hierarchies (dictionaries)
-inside params files (see [Examples](#examples)). DVC will interpret param names
-as the tree path to find those values. Supported types are: string, integer,
-float, boolean, and arrays (groups of params). Note that DVC does not ascribe
-any specific meaning to these values.
+
-DVC saves parameter names and values to `dvc.lock` in order to track them over
-time. They will be compared to the latest params files to determine if the stage
-is outdated upon `dvc repro` (or `dvc status`).
+See [more details] about this syntax.
+
+
-> Note that DVC does not pass the parameter values to stage commands. The
-> commands executed by DVC will have to load and parse the parameters file by
-> itself.
+Multiple stages of a pipeline can [use the same params file] as a
+dependency, but only certain values will affect each
+stage.
+
+Parameters can also be used for [templating] `dvc.yaml` itself (see also **Dict
+Unpacking**), which means you can pass them to your [stage commands] as
+command-line arguments. You can also load them in Python code with
+`dvc.api.params_show()`.
The `dvc params diff` command is available to show parameter changes, displaying
their current and previous values.
-💡 Parameters can also be used for
-[templating](/doc/user-guide/project-structure/dvcyaml-files#templating)
-`dvc.yaml` itself.
+DVC saves parameter names and values to `dvc.lock` in order to track them over
+time. They will be compared to the latest params files to determine if the stage
+is outdated upon `dvc repro` (or `dvc status`).
+
+[hyperparameters]:
+ /doc/user-guide/experiment-management/running-experiments#tuning-hyperparameters
+[use the same params file]:
+ /doc/user-guide/data-pipelines/defining-pipelines#parameter-dependencies
+[more details]: /doc/user-guide/project-structure/dvcyaml-files#parameters
+[templating]: /doc/user-guide/project-structure/dvcyaml-files#templating
+[stage commands]: /doc/user-guide/project-structure/dvcyaml-files#stage-commands
## Options
@@ -98,9 +105,9 @@ process:
bow: 15000
```
-Using `dvc stage add`, define a [stage](/doc/command-reference/run) that depends
-on params `lr`, `layers`, and `epochs` from the params file above. Full paths
-should be used to specify `layers` and `epochs` from the `train` group:
+Using `dvc stage add`, define a stage that depends on params `lr`,
+`layers`, and `epochs` from the params file above. Full paths should be used to
+specify `layers` and `epochs` from the `train` group:
```cli
$ dvc stage add -n train -d train.py -d users.csv -o model.pkl \
@@ -112,7 +119,7 @@ $ dvc stage add -n train -d train.py -d users.csv -o model.pkl \
> Python parameters files.
The `train.py` script will have some code to parse and load the needed
-parameters. For example, you can use `dvc.api.params_show()`:
+parameters. You can use `dvc.api.params_show()` for this:
```py
import dvc.api
@@ -197,9 +204,13 @@ previous version, which is why all `Old` values are `—`.
## Examples: Python parameters file
-> ⚠️ Note that complex expressions (unsupported by
-> [ast.literal_eval](https://docs.python.org/3/library/ast.html#ast.literal_eval))
-> won't be parsed as DVC parameters.
+
+
+Note that complex expressions (unsupported by
+[ast.literal_eval](https://docs.python.org/3/library/ast.html#ast.literal_eval))
+won't be parsed as DVC parameters.
+
+
Consider this Python parameters file named `params.py`:
@@ -237,8 +248,8 @@ class TestConfig:
METRICS = ['metric']
```
-The following [stage](/doc/command-reference/run) depends on params `BOOL`,
-`INT`, as well as `TrainConfig`'s `EPOCHS` and `layers`:
+The following stage depends on params `BOOL`, `INT`, as well as
+`TrainConfig`'s `EPOCHS` and `layers`:
```cli
$ dvc stage add -n train -d train.py -d users.csv -o model.pkl \
diff --git a/content/docs/user-guide/basic-concepts/parameter.md b/content/docs/user-guide/basic-concepts/parameter.md
index bd1da45ce5..59ff11c65c 100644
--- a/content/docs/user-guide/basic-concepts/parameter.md
+++ b/content/docs/user-guide/basic-concepts/parameter.md
@@ -1,9 +1,10 @@
---
-name: 'Parameter Dependency'
-match: [parameter, parameters, param, params, hyperparameter, hyperparameters]
+name: 'Parameters'
+match: [parameter, parameters]
tooltip: >-
- Pipeline stages (defined in `dvc.yaml`) can depend on specific values inside
- an arbitrary YAML, JSON, TOML, or Python file (`params.yaml` by default).
- Stages are invalid (considered outdated) when any of their parameter values
- change. See [`dvc params`](/doc/command-reference/params).
+ Hyperparameters or other config values used by your code, loaded from a
+ structured file (`params.yaml` by default). They can be tracked as granular
+ dependencies for stages of DVC pipelines (defined in `dvc.yaml`). DVC can also
+ compare them among machine learning experiments (useful for optimization). See
+ `dvc params`.
---
diff --git a/content/docs/user-guide/experiment-management/index.md b/content/docs/user-guide/experiment-management/index.md
index 01c90c47a3..d8aedc99e1 100644
--- a/content/docs/user-guide/experiment-management/index.md
+++ b/content/docs/user-guide/experiment-management/index.md
@@ -6,8 +6,8 @@ of the development of data features, hyperspace exploration, deep learning
optimization, etc.
Some of DVC's base features already help you codify and analyze experiments.
-[Parameters](/doc/command-reference/params) are simple values in a formatted
-text file which you can tweak and use in your code. On the other end,
+[Parameters](/doc/command-reference/params) are values in a structured text
+file, which you can tweak and use in your code. On the other end,
[metrics](/doc/command-reference/metrics) (and
[plots](/doc/command-reference/plots)) let you define, visualize, and compare
quantitative measures of your results.
diff --git a/content/docs/user-guide/experiment-management/running-experiments.md b/content/docs/user-guide/experiment-management/running-experiments.md
index 8208cffc1d..a3d810c484 100644
--- a/content/docs/user-guide/experiment-management/running-experiments.md
+++ b/content/docs/user-guide/experiment-management/running-experiments.md
@@ -20,8 +20,8 @@ experiment(s). These files codify _pipelines_ that specify one or more
### Running the pipeline(s)
-You can run the experiment pipeline using `dvc exp run`. It uses `./dvc.yaml`
-(in the current directory) by default.
+You can run the experiment pipelines using `dvc exp run`. It uses
+`./dvc.yaml` (in the current directory) by default.
```dvc
$ dvc exp run
@@ -45,20 +45,20 @@ once.
> 📖 `dvc exp run` is an experiment-specific alternative to `dvc repro`.
[reproduction targets]: /doc/command-reference/repro#options
-[dependency graph]:
- /doc/user-guide/data-pipelines/defining-pipelines#directed-acyclic-graph
+[dependency graph]: /doc/user-guide/data-pipelines/defining-pipelines
## Tuning (hyper)parameters
-Parameters are the values that modify the behavior of coded processes -- in this
-case producing different experiment results. Machine learning experimentation
-often involves defining and searching hyperparameter spaces to improve the
-resulting model metrics.
+Parameters are any values used inside your code to tune modeling attributes, or
+that affect experiment results in any other way. For example, a [random forest
+classifier] may require a _maximum depth_ value. Machine learning
+experimentation often involves defining and searching hyperparameter spaces to
+improve the resulting model metrics.
-In DVC project source code, parameters should be read from _params
-files_ (`params.yaml` by default) and defined in `dvc.yaml`. When a tracked
-param value has changed, `dvc exp run` invalidates any stages that depend on it,
-and reproduces them.
+Your source code should read params from structured [parameters files]
+(`params.yaml` by default). Define them with the `params` field of `dvc.yaml`
+for DVC to track them. When a param value has changed, `dvc exp run` invalidates
+any stages that depend on it, and reproduces them.
> 📖 See `dvc params` for more details.
@@ -80,6 +80,11 @@ $ dvc exp run -S learning_rate=0.001 -S units=128 # set multiple params
...
```
+[random forest classifier]:
+ https://medium.com/all-things-ai/in-depth-parameter-tuning-for-random-forest-d67bb7e920d
+[parameters files]:
+ /doc/user-guide/project-structure/dvcyaml-files#parameters-files
+
## Experiment results
The results of the last `dvc exp run` can be seen in the workspace.
diff --git a/content/docs/user-guide/pipelines/defining-pipelines.md b/content/docs/user-guide/pipelines/defining-pipelines.md
index 003f87b978..c7495287e4 100644
--- a/content/docs/user-guide/pipelines/defining-pipelines.md
+++ b/content/docs/user-guide/pipelines/defining-pipelines.md
@@ -186,10 +186,10 @@ changed for the purpose of stage invalidation.
## Parameter dependencies
A more granular type of dependency is the parameter (`params` field of
-`dvc.yaml`), or _hyperparameters_ in machine learning. These represent simple
-values used inside your code to tune data processing, or that affect stage
-execution in any other way. For example, training a [Neural Network] usually
-requires _batch size_ and _epoch_ values.
+`dvc.yaml`), or _hyperparameters_ in machine learning. These are any values used
+inside your code to tune data processing, or that affect stage execution in any
+other way. For example, training a [Neural Network] usually requires _batch
+size_ and _epoch_ values.
Instead of hard-coding param values, your code can read them from a structured
file (e.g. YAML format). DVC can track any key/value pair in a supported
@@ -228,7 +228,8 @@ Use `dvc params diff` to compare parameters across project versions.
Stage outputs are files (or directories) written by pipelines, for
example machine learning models, intermediate artifacts, as well as data [plots]
and performance [metrics]. These files are cached by DVC
-automatically, and tracked with the help of `dvc.lock` files.
+automatically, and tracked with the help of `dvc.lock` files (or `.dvc` files,
+see `dvc add`).
Outputs can be dependencies of subsequent stages (as explained earlier). So when
they change, DVC may need to reproduce downstream stages as well (handled
diff --git a/content/docs/user-guide/project-structure/dvcyaml-files.md b/content/docs/user-guide/project-structure/dvcyaml-files.md
index 474889a34e..c404afe814 100644
--- a/content/docs/user-guide/project-structure/dvcyaml-files.md
+++ b/content/docs/user-guide/project-structure/dvcyaml-files.md
@@ -87,13 +87,20 @@ $ dvc stage add -n a_stage "./a_script.sh > /dev/null 2>&1"
$ dvc exp init './another_script.sh $MYENVVAR'
```
-### Parameter dependencies
+
+
+See also [Templating](#templating) (and **Dict Unpacking**) for useful ways to
+parametrize `cmd` strings.
+
+
+
+### Parameters
-[Parameters](/doc/command-reference/params) are a special type of stage
-dependency. They consist of a list of params to track in one of these formats:
+Parameters are simple key/value pairs consumed by the `command`
+code from a structured [parameters file](#parameters-files). They are defined
+per-stage in the `params` field of `dvc.yaml` and should contain one of these:
-1. A param key/value pair that can be found in `params.yaml` (default params
- file);
+1. A param name that can be found in `params.yaml` (default params file);
2. A dictionary named by the file path to a custom params file, and with a list
of param key/value pairs to find in it;
3. An empty set (give no value or use `null`) named by the file path to a params
@@ -101,8 +108,7 @@ dependency. They consist of a list of params to track in one of these formats:
-Note that file paths used must be to valid YAML, JSON, TOML, or Python
-parameters file.
+Dot-separated param names become tree paths to locate values in the params file.
@@ -114,7 +120,7 @@ stages:
- raw.txt
params:
- threshold # track specific param (from params.yaml)
- - passes
+ - nn.batch_size
- myparams.yaml: # track specific params from custom file
- epochs
- config.json: # track all parameters in this file
@@ -122,8 +128,31 @@ stages:
- clean.txt
```
-This allows several stages to depend on values of a shared structured file
-(which can be versioned directly with Git). See also `dvc params diff`.
+
+
+Params are a more granular type of stage dependency: multiple `stages` can use
+the same params file, but only certain values will affect their state (see
+`dvc status`).
+
+
+
+#### Parameters files
+
+The supported params file formats are YAML 1.2, JSON, TOML 1.0, [and Python].
+[Parameter](#parameters) key/value pairs should be organized in tree-like
+hierarchies inside. Supported value types are: string, integer, float, boolean,
+and arrays (groups of params).
+
+These files are typically written manually (or generated) and they can be
+versioned directly with Git along with other workspace files.
+
+[and python]: /doc/command-reference/params#examples-python-parameters-file
+
+
+
+See also `dvc params diff` to compare params across project versions.
+
+
### Metrics and Plots outputs
@@ -173,7 +202,8 @@ models:
```
Those values can be used anywhere in `dvc.yaml` with the `${}` _substitution
-expression_:
+expression_, for example to pass parameters as command-line arguments to a
+[stage command](#stage-commands):
```yaml