Restyle add example with python parameters file #1833

Closed (2 commits)
80 changes: 76 additions & 4 deletions content/docs/command-reference/params/index.md
@@ -24,8 +24,8 @@ dependencies: _parameters_. Parameters are defined using the the `-p`

In contrast to a regular <abbr>dependency</abbr>, a parameter is not a file (or
directory). Instead, it consists of a _parameter name_ (or key) to find inside a
-YAML, JSON, or TOML _parameters file_. Multiple parameter dependencies can be
-specified from one or more parameters files.
+YAML, JSON, TOML, or Python _parameters file_. Multiple parameter dependencies
+can be specified from one or more parameters files.

The default parameters file name is `params.yaml`. Parameters should be
organized as a tree hierarchy inside, as DVC will locate param names by their
@@ -91,8 +91,8 @@ $ dvc run -n train -d users.csv -o model.pkl \
python train.py
```

-> Note that we could use the same parameter addressing with JSON or TOML
-> parameters files.
+> Note that we could use the same parameter addressing with JSON, TOML, or
+> Python parameters files.

The `train.py` script will have some code to parse the needed parameters. For
example:
@@ -143,6 +143,77 @@ $ dvc run -n train -d logs/ -o users.csv \
python train.py
```

## Examples: Python parameters file

Consider this parameters file in Python format, named `params.py`:

```python
IS_BOOL: bool = True
CONST = 5


class TrainConfig:
    EPOCHS = 70

    def __init__(self):
        self.layers = 9


class TestConfig:
    TEST_DIR = "path"
    METRICS = ["metric"]
```

The following [stage](/doc/command-reference/run) depends on params `IS_BOOL`,
`CONST`, as well as `TrainConfig`'s `EPOCHS` and `layers`:

```dvc
$ dvc run -n train -d users.csv -o model.pkl \
          -p params.py:IS_BOOL,CONST,TrainConfig.EPOCHS,TrainConfig.layers \
          python train.py
```
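
Inside `train.py`, one way to read these values is to load `params.py` as a regular Python module. The following is only a sketch under that assumption, not something DVC requires; `load_params` and `read_train_params` are hypothetical helper names. Note that `layers` is an instance attribute, so `TrainConfig` has to be instantiated before it can be read.

```python
import importlib.util


def load_params(path="params.py"):
    """Import a Python parameters file from an explicit path (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location("params", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file, defining its constants and classes
    return module


def read_train_params(path="params.py"):
    """Collect the four values the stage depends on into a plain dict."""
    params = load_params(path)
    return {
        "IS_BOOL": params.IS_BOOL,
        "CONST": params.CONST,
        "EPOCHS": params.TrainConfig.EPOCHS,    # class attribute
        "layers": params.TrainConfig().layers,  # instance attribute set in __init__
    }
```

If `train.py` sits next to `params.py`, a plain `from params import CONST, IS_BOOL, TrainConfig` would do the same job with less code; loading by path is just more explicit about which file is used.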

Resulting `dvc.yaml` and `dvc.lock` files (notice the `params` list):

```yaml
stages:
  train:
    cmd: python train.py
    deps:
      - users.csv
    params:
      - IS_BOOL
      - CONST
      - TrainConfig.EPOCHS
      - TrainConfig.layers
    outs:
      - model.pkl
```

```yaml
train:
  cmd: python train.py
  deps:
    - path: users.csv
      md5: 23be4307b23dcd740763d5fc67993f11
  params:
    CONST: 5
    IS_BOOL: true
    TrainConfig.EPOCHS: 70
    TrainConfig.layers: 9
  outs:
    - path: model.pkl
      md5: 1c06b4756f08203cc496e4061b1e7d67
```
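
To make the dotted names in the `params` section more concrete, here is a minimal sketch of how an address such as `TrainConfig.layers` could be resolved against the Python file above. It is not DVC's implementation: it assumes the file is parsed statically with the standard `ast` module rather than executed, it only understands literal values, and `collect_params` and `lookup` are hypothetical names.

```python
import ast

# Same content as the params.py example, inlined so the sketch is self-contained.
PARAMS_SOURCE = '''
IS_BOOL: bool = True
CONST = 5


class TrainConfig:
    EPOCHS = 70

    def __init__(self):
        self.layers = 9


class TestConfig:
    TEST_DIR = "path"
    METRICS = ["metric"]
'''


def _assignments(body):
    """Yield (name, value) for literal assignments found directly in `body`."""
    for node in body:
        if isinstance(node, ast.Assign):
            targets, value = node.targets, node.value
        elif isinstance(node, ast.AnnAssign) and node.value is not None:
            targets, value = [node.target], node.value
        else:
            continue
        for target in targets:
            if isinstance(target, ast.Name):
                name = target.id  # e.g. CONST = 5
            elif (isinstance(target, ast.Attribute)
                  and isinstance(target.value, ast.Name)
                  and target.value.id == "self"):
                name = target.attr  # e.g. self.layers = 9
            else:
                continue
            try:
                yield name, ast.literal_eval(value)
            except ValueError:
                continue  # skip values that are not plain literals


def collect_params(source):
    """Build a nested dict: top-level names, plus one sub-dict per class."""
    tree = ast.parse(source)
    params = dict(_assignments(tree.body))
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            group = dict(_assignments(node.body))
            for item in node.body:
                if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                    group.update(_assignments(item.body))
            params[node.name] = group
    return params


def lookup(params, address):
    """Resolve a dotted address such as 'TrainConfig.layers'."""
    value = params
    for key in address.split("."):
        value = value[key]
    return value
```

With these helpers, `lookup(collect_params(PARAMS_SOURCE), "TrainConfig.layers")` evaluates to `9`, matching the value recorded in the lock file, and an address without a dot, such as `CONST`, resolves to the top-level value.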

Alternatively, the entire `TestConfig` group can be referenced, instead of the
parameters in it:

```dvc
$ dvc run -n train -d users.csv -o model.pkl \
          -p params.py:IS_BOOL,CONST,TestConfig \
          python train.py
```

## Examples: Print all parameters

Following the previous example, we can use `dvc params diff` to list all of the
@@ -161,3 +232,4 @@ params.yaml train.layers None 9
This command shows the difference in parameters between the workspace and the
last committed version of the `params.yaml` file. In our example there's no
previous version, which is why all `Old` values are `None`.
4 changes: 2 additions & 2 deletions content/docs/command-reference/run.md
@@ -114,8 +114,8 @@ Relevant notes:

[parameters](/doc/command-reference/params) (`-p`/`--params` option) are a
special type of key/value dependencies. Multiple parameter dependencies can be
-specified from within one or more YAML, JSON or TOML parameters files (e.g.
-`params.yaml`). This allows tracking experimental hyperparameters easily.
+specified from within one or more YAML, JSON, TOML, or Python parameters files
+(e.g. `params.yaml`). This allows tracking experimental hyperparameters easily.

Special types of output files, [metrics](/doc/command-reference/metrics) (`-m`
and `-M` options) and [plots](/doc/command-reference/plots) (`--plots` and
2 changes: 1 addition & 1 deletion content/docs/start/experiments.md
@@ -100,7 +100,7 @@ parameters.
It's pretty common for data science pipelines to include configuration files
that define adjustable parameters to train a model, do pre-processing, etc. DVC
provides a mechanism for stages to depend on the values of specific sections of
-such a config file (YAML, JSON and TOML formats are supported).
+such a config file (YAML, JSON, TOML, and Python formats are supported).

Luckily, we should already have a stage with
[parameters](/doc/command-reference/params) in `dvc.yaml`:
4 changes: 2 additions & 2 deletions content/docs/user-guide/basic-concepts/parameter.md
@@ -4,5 +4,5 @@ match: [parameter, parameters, param, params, hyperparameter, hyperparameters]
---

Pipeline stages (defined in `dvc.yaml`) can depend on specific values inside an
-arbitrary YAML, JSON, or TOML file (`params.yaml` by default). Stages are
-invalidated when any of their parameter values change. See `dvc params`.
+arbitrary YAML, JSON, TOML, or Python file (`params.yaml` by default). Stages
+are invalidated when any of their parameter values change. See `dvc params`.
2 changes: 1 addition & 1 deletion content/docs/user-guide/dvc-files-and-directories.md
@@ -150,7 +150,7 @@ the possible following fields:
- `deps`: List of <abbr>dependency</abbr> file or directory paths of this stage
(relative to `wdir` which defaults to the file's location)
- `params`: List of <abbr>parameter</abbr> dependency keys (field names) that
-  are read from a YAML, JSON, or TOML file (`params.yaml` by default).
+  are read from a YAML, JSON, TOML, or Python file (`params.yaml` by default).
- `outs`: List of <abbr>output</abbr> file or directory paths of this stage
(relative to `wdir` which defaults to the file's location), and optionally,
whether or not this file or directory is <abbr>cached</abbr> (`true` by