From a795da5562093255301adfd760bd8edd58b5e54b Mon Sep 17 00:00:00 2001
From: David de la Iglesia Castro
Date: Tue, 5 Jul 2022 10:33:45 +0200
Subject: [PATCH] templating: Add `dict unpacking` section.

config: Add `parsing` section.

Per https://github.com/iterative/dvc/pull/7907
---
 content/docs/command-reference/config.md      | 71 +++++++++++++++++++
 .../project-structure/pipelines-files.md      | 67 ++++++++++++++---
 2 files changed, 130 insertions(+), 8 deletions(-)

diff --git a/content/docs/command-reference/config.md b/content/docs/command-reference/config.md
index ffe0af27ed..938f5abd3c 100644
--- a/content/docs/command-reference/config.md
+++ b/content/docs/command-reference/config.md
@@ -247,6 +247,77 @@ experiments or projects use a similar structure.
 
 - `exp.live` - path to your [DVCLive](/doc/dvclive) output logs.
 
+### parsing
+
+- `parsing.bool` - Controls the templating syntax for boolean values when used
+  in
+  [dict unpacking](/doc/user-guide/project-structure/pipelines-files#dict-unpacking).
+
+  Valid values are `"store_true"` (default) and `"boolean_optional"`, named
+  after
+  [Python argparse actions](https://docs.python.org/3/library/argparse.html#action).
+
+  Given the following `params.yaml`:
+
+  ```yaml
+  dict:
+    bool-true: true
+    bool-false: false
+  ```
+
+  And corresponding `dvc.yaml`:
+
+  ```yaml
+  stages:
+    foo:
+      cmd: python foo.py ${dict}
+  ```
+
+  When using `store_true`, `cmd` will be:
+
+  ```shell
+  python foo.py --bool-true
+  ```
+
+  Whereas when using `boolean_optional`, `cmd` will be:
+
+  ```shell
+  python foo.py --bool-true --no-bool-false
+  ```
+
+- `parsing.list` - Controls the templating syntax for list values when used in
+  [dict unpacking](/doc/user-guide/project-structure/pipelines-files#dict-unpacking).
+
+  Valid values are `"nargs"` (default) and `"append"`, named after
+  [Python argparse actions](https://docs.python.org/3/library/argparse.html#action).
+
+  Given the following `params.yaml`:
+
+  ```yaml
+  dict:
+    list: [1, 2, 'foo']
+  ```
+
+  And corresponding `dvc.yaml`:
+
+  ```yaml
+  stages:
+    foo:
+      cmd: python foo.py ${dict}
+  ```
+
+  When using `nargs`, `cmd` will be:
+
+  ```shell
+  python foo.py --list 1 2 'foo'
+  ```
+
+  Whereas when using `append`, `cmd` will be:
+
+  ```shell
+  python foo.py --list 1 --list 2 --list 'foo'
+  ```
+
 ### plots
 
 - `plots.html_template` - sets a
diff --git a/content/docs/user-guide/project-structure/pipelines-files.md b/content/docs/user-guide/project-structure/pipelines-files.md
index 4b077ccf02..6b53736d11 100644
--- a/content/docs/user-guide/project-structure/pipelines-files.md
+++ b/content/docs/user-guide/project-structure/pipelines-files.md
@@ -6,8 +6,11 @@ individual [stages](/doc/command-reference/run) in one or more `dvc.yaml` files
 (forming a _dependency graph_, see `dvc dag`). Refer to
 [Get Started: Data Pipelines](/doc/start/data-pipelines).
 
-> Note that a helper command, `dvc stage`, is available to create and list
-> stages.
+
+
+A helper command, `dvc stage`, is available to create and list stages.
+
+
 
 `dvc.yaml` files can be versioned with Git.
 
@@ -15,8 +18,12 @@ These files use the [YAML 1.2](https://yaml.org/) file format, and a
 human-friendly schema explained below. We encourage you to get familiar with it
 so you may modify, write, or generate stages and pipelines on your own.
 
-> Note that we use [GNU/Linux](https://www.gnu.org/software/software.html) in
-> most of our examples.
+
+
+We use [GNU/Linux](https://www.gnu.org/software/software.html) in most of our
+examples.
+
+
 
 ## Stages
 
@@ -123,9 +130,6 @@ in the YAML structure itself. These sources can be
 [parameters files](/doc/command-reference/params), or `vars` defined in
 `dvc.yaml` instead.
 
-> Note that this parameterization feature is only supported via manual editing
-> of `dvc.yaml` and incompatible with `dvc run`.
-
 Let's say we have `params.yaml` (default params file) with the following
 contents:
 
@@ -156,6 +160,49 @@ stages:
 DVC will track simple param values (numbers, strings, etc.) used in `${}` (they
 will be listed by `dvc params diff`).
 
+### Dict Unpacking
+
+Only inside the `cmd` entries, you can also reference a dictionary inside `${}`
+and DVC will _unpack_ it. For example, given the following `params.yaml`:
+
+```yaml
+dict:
+  foo: foo
+  bar: 2
+  bool: true
+  nested:
+    foo: bar
+  list: [1, 2, 'foo']
+```
+
+You can reference `dict` in the `cmd` section of a `dvc.yaml`:
+
+```yaml
+stages:
+  train:
+    cmd: python train.py ${dict}
+```
+
+And DVC will _unpack_ the values inside `dict`, creating the following `cmd`
+call:
+
+```shell
+python train.py --foo 'foo' --bar 2 --bool --nested.foo 'bar' --list 1 2 'foo'
+```
+
+This can be useful to avoid writing every argument passed to the `cmd` or having
+to modify the `dvc.yaml` when adding or removing arguments.
+
+
+
+The [parsing](/doc/command-reference/config#parsing) section of `dvc config` can
+be used to customize the syntax used for some ambiguous types like booleans and
+lists.
+
+
+
+### Vars
+
 Alternatively, values for substitution can be listed as top-level `vars` like
 this:
 
@@ -172,7 +219,11 @@ stages:
     cmd: python train.py --thresh ${models.us.threshold}
 ```
 
-> Note that values from `vars` are not tracked like parameters.
+
+
+Values from `vars` are not tracked like parameters.
+
+
 
 To load additional params files, list them in the top `vars`, in the desired
 order, e.g.:
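
As a rough illustration of the dict unpacking rules and the two `parsing` options documented in this patch, the Python sketch below reproduces the `cmd` strings shown in the examples. It is not DVC's implementation: the `unpack` function, its `bool_mode` and `list_mode` parameters, and the `_fmt` helper are hypothetical names chosen for this sketch, and it only covers the cases the patch shows (quoted strings, numbers, booleans, nested dicts, and flat lists).

```python
def _fmt(value):
    # Assumption: strings are single-quoted, as in the patch's examples;
    # other scalars are rendered with str().
    return f"'{value}'" if isinstance(value, str) else str(value)


def unpack(params, bool_mode="store_true", list_mode="nargs", prefix=""):
    """Return a list of command-line fragments for a (possibly nested) dict."""
    parts = []
    for key, value in params.items():
        name = f"{prefix}{key}"
        if isinstance(value, bool):  # must be checked before other scalars
            if value:
                parts.append(f"--{name}")
            elif bool_mode == "boolean_optional":
                parts.append(f"--no-{name}")
            # with "store_true", false values are simply omitted
        elif isinstance(value, dict):
            # nested keys are joined with a dot, e.g. --nested.foo
            parts.extend(unpack(value, bool_mode, list_mode, prefix=f"{name}."))
        elif isinstance(value, list):
            items = [_fmt(item) for item in value]
            if list_mode == "append":
                parts.extend(f"--{name} {item}" for item in items)
            else:  # "nargs": one flag followed by all items
                parts.append(" ".join([f"--{name}", *items]))
        else:
            parts.append(f"--{name} {_fmt(value)}")
    return parts


params = {
    "foo": "foo",
    "bar": 2,
    "bool": True,
    "nested": {"foo": "bar"},
    "list": [1, 2, "foo"],
}
print("python train.py " + " ".join(unpack(params)))
# python train.py --foo 'foo' --bar 2 --bool --nested.foo 'bar' --list 1 2 'foo'
```

Calling `unpack` with `bool_mode="boolean_optional"` or `list_mode="append"` yields the alternative forms shown in the `parsing` section of the `config.md` hunk.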