docs: replace dvc pipeline with dvc dag #1383
Changes from all commits
5a8ba59
28cc03f
05d40b2
1fb7f4a
a9718a7
8f46437
3cbeb07
a74f4a1
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,108 @@ | ||
# dag | ||
|
||
Show [stages](/doc/command-reference/run) in a pipeline that lead to the | ||
specified stage. By default it lists | ||
[DVC-files](/doc/user-guide/dvc-files-and-directories). | ||
|
||
## Synopsis | ||
|
||
```usage | ||
usage: dvc dag [-h] [-q | -v] [--dot] [--full] [target] | ||
|
||
positional arguments: | ||
target Stage or output to show pipeline for (optional) | ||
Finds all stages in the workspace by default. | ||
``` | ||
|
||
## Description | ||
|
||
A data pipeline, in general, is a series of data processing | ||
[stages](/doc/command-reference/run) (for example console commands that take an | ||
input and produce an <abbr>output</abbr>). A pipeline may produce intermediate | ||
data, and has a final result. Machine learning (ML) pipelines typically start | ||
with large raw datasets, include intermediate featurization and training stages, | ||
and produce a final model, as well as accuracy | ||
[metrics](/doc/command-reference/metrics). | ||
|
||
In DVC, pipeline stages and commands, their data I/O, interdependencies, and | ||
results (intermediate or final) are specified with `dvc add` and `dvc run`, | ||
among other commands. This allows DVC to restore one or more pipelines of stages | ||
interconnected by their dependencies and outputs later. (See `dvc repro`.) | ||
|
||
> DVC builds a dependency graph | ||
> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this. | ||
|
||
`dvc dag` displays the stages of a pipeline up to the target stage. If `target` | ||
is omitted, it will show the full project DAG. | ||
Comment on lines +35 to +36
I think we should start the description with this paragraph though, since it's now a specific command after all 🙂
Never mind, the short intro is enough. That one needs updating though... Will do. |
||
|
||
## Options | ||
|
||
- `--dot` - show DAG in | ||
[DOT](<https://en.wikipedia.org/wiki/DOT_(graph_description_language)>) | ||
format. It can be passed to third-party visualization utilities. | ||
|
||
- `--full` - show full DAG that the `target` belongs to, instead of showing the | ||
part that consists only of the target ancestors. | ||
|
||
- `-h`, `--help` - prints the usage/help message, and exits. | ||
|
||
- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no | ||
problems arise, otherwise 1. | ||
|
||
- `-v`, `--verbose` - displays detailed tracing information. | ||
|
||
## Paging the output | ||
|
||
This command's output is automatically piped to | ||
[Less](<https://en.wikipedia.org/wiki/Less_(Unix)>), if available in the | ||
terminal. (The exact command used is `less --chop-long-lines --clear-screen`.) | ||
If `less` is not available (e.g. on Windows), the output is simply printed out. | ||
|
||
> It's also possible to | ||
> [enable Less paging on Windows](/doc/user-guide/running-dvc-on-windows#enabling-paging-with-less). | ||
|
||
### Providing a custom pager | ||
|
||
It's possible to override the default pager via the `DVC_PAGER` environment | ||
variable. For example, the following command will replace the default pager with | ||
[`more`](<https://en.wikipedia.org/wiki/More_(command)>), for a single run: | ||
|
||
```dvc | ||
$ DVC_PAGER=more dvc dag | ||
``` | ||
|
||
For a persistent change, define `DVC_PAGER` in the shell configuration. For | ||
example in Bash, we could add the following line to `~/.bashrc`: | ||
|
||
```bash | ||
export DVC_PAGER=more | ||
``` | ||
|
||
## Examples | ||
|
||
Visualize the DVC pipeline: | ||
|
||
```dvc | ||
$ dvc dag | ||
+---------+ | ||
| prepare | | ||
+---------+ | ||
* | ||
* | ||
* | ||
+-----------+ | ||
| featurize | | ||
+-----------+ | ||
** ** | ||
** * | ||
* ** | ||
+-------+ * | ||
| train | ** | ||
+-------+ * | ||
** ** | ||
** ** | ||
* * | ||
+----------+ | ||
| evaluate | | ||
+----------+ | ||
``` |
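
To make the difference between the default view and `--full` concrete, here is a minimal Python sketch of the two graph queries described above: the ancestors of a target versus the whole weakly connected component it belongs to. This is not DVC's implementation. The edge list is transcribed by hand from the ASCII example, on the assumption that stages drawn higher up are upstream, and the function names `ancestors` and `component` are hypothetical.

```python
from collections import deque

# Edges read off the ASCII example: (upstream stage, downstream stage).
# Assumption: stages drawn higher in the diagram are upstream.
EDGES = [
    ("prepare", "featurize"),
    ("featurize", "train"),
    ("featurize", "evaluate"),
    ("train", "evaluate"),
]


def ancestors(edges, target):
    """Return the target plus every stage it depends on, directly or not."""
    parents = {}
    for up, down in edges:
        parents.setdefault(down, set()).add(up)
    seen = {target}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def component(edges, target):
    """Return every stage connected to the target, ignoring edge direction."""
    neighbours = {}
    for up, down in edges:
        neighbours.setdefault(up, set()).add(down)
        neighbours.setdefault(down, set()).add(up)
    seen = {target}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for other in neighbours.get(node, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


# Roughly what the default view would cover for a `train` target.
print(sorted(ancestors(EDGES, "train")))
# Roughly what `--full` would cover for the same target.
print(sorted(component(EDGES, "train")))
```

Under these assumptions, the first query leaves out `evaluate` because it is downstream of `train`, while the second includes it because it sits in the same connected part of the graph.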
This file was deleted.
This file was deleted.
This file was deleted.
Yes, 2 DAGs together is 1 DAG too, just with weakly connected components :)
Pipeline is a DAG in which all stages are somehow connected with each other, so it is not quite that. We could call it pipelineS, but `dvc add foo` is not quite a pipeline strictly speaking.
Dependency graph sounds like the most correct term, but DAG is a synonym, hence the https://dagshub.com/ :) So we could and probably should use them interchangeably.
This doc is adapted from the old `dvc pipeline show` so it does suffer from some legacy sentence structure :(