Add targets/stages argument to dvc exp show

  • Enhancement Proposal PR: (leave this empty)
  • Contributors: dberenbaum, (add your Github handle)

Summary

The dvc exp show command shows a comparison of the results of different experiments along with the parameters and metrics values for each experiment. The command's output currently includes:

  • All values that are defined in params.yaml.
  • All values in other referenced parameters files in any DVC stage.
  • All values in any referenced metrics files in any DVC stage.

This proposal allows users to see only parameters and metrics relevant to specified pipeline stages.

See iterative/dvc#5451 for more background.

Motivation

dvc exp show is the main view by which users may compare experiments, and it's becoming useful for other purposes, like comparing multiple commits (see https://discord.com/channels/485586884165107732/563406153334128681/822464119906369606).

It's also likely the busiest DVC output:

┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━
┃ Experiment     ┃ Created      ┃     auc ┃ prepare.split ┃ prepare.seed ┃ featurize.max_features ┃ featurize.ngrams ┃ train.seed ┃ train.n_estimators ┃ topic_modeling.seed ┃ top
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━
│ workspace      │ -            │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ 20170428            │ 10
│ topic_modeling │ 04:41 PM     │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ 20170428            │ 10
│ master         │ Feb 17, 2021 │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                   │ -
│ └── exp-56d11  │ 04:40 PM     │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ 20170428            │ 10
│ 263732a        │ Feb 11, 2021 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ -          │ 100                │ -                   │ -
│ b297969        │ Feb 11, 2021 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ -          │ 100                │ -                   │ -
│ 4edd930        │ Feb 11, 2021 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ -          │ 100                │ -                   │ -
│ 72ed9cd        │ Nov 15, 2020 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 50                 │ -                   │ -
│ ├── exp-47dfa  │ Feb 17, 2021 │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                   │ -
│ └── exp-44136  │ Feb 17, 2021 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 50                 │ -                   │ -
│ f8e9d93        │ Nov 15, 2020 │ 0.54175 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 50                 │ -                   │ -

This table comes from a simplified example and is already cut off before its last columns, so the view can become cluttered quickly. Eliminating irrelevant information is a priority.

There are several ways in which the included parameters might be irrelevant:

  • Parameters files may include values that aren't tracked by DVC at all (for example, debug level).
  • The parameters may be used in a different pipeline branch (for example, the topic_modeling stage is in a separate branch of the pipeline from the evaluate stage, which produces the relevant experiment metrics).
  • The parameters may be used in a downstream pipeline branch (for example, the post-processing combine stage parameters aren't relevant to experiment evaluation).
                +-------------------+
                | data/data.xml.dvc |
                +-------------------+
                          *
                          *
                          *
                     +---------+
                     | prepare |
                     +---------+
                          *
                          *
                          *
                    +-----------+
                    | featurize |
                  **+-----------+**
              ****       *         ****
          ****           *             ***
        **              *                 ***
+-------+               *                    **
| train |             **                      *
+-------+            *                        *
         **        **                         *
           **    **                           *
             *  *                             *
        +----------+                  +----------------+
        | evaluate |                  | topic_modeling |
        +----------+                  +----------------+
                   ***             ****
                      *         ***
                       **     **
                     +---------+
                     | combine |
                     +---------+

There is no apparent reason for users to want untracked parameters in dvc exp show. Unlike dvc params diff and dvc metrics diff, dvc exp show does not apply to arbitrary files. Likewise, there is no apparent reason for users to want parameters or metrics from irrelevant stages of their pipelines. Users could keep untracked parameters in other files if they don't want them shown in dvc exp show, and they could manually exclude parameters or metrics from irrelevant stages, but DVC already has enough information to determine that these should not be shown.

Detailed design

dvc exp show could take stages as (positional?) arguments, similar to dvc exp run. The help message could be similar:

positional arguments:
  stages    Stages for which to print experiment info. 'dvc.yaml'
            by default.
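
As a rough illustration of the argument described above, the following sketch uses Python's argparse to declare an optional list of positional stages. The `build_parser` helper and the parser setup are assumptions for illustration only, not DVC's actual CLI code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mimicking the proposed `dvc exp show [stages ...]`."""
    parser = argparse.ArgumentParser(prog="dvc exp show")
    parser.add_argument(
        "stages",
        nargs="*",  # zero or more stage names may be passed
        default=[],  # an empty list means "no stage filter"
        help="Stages for which to print experiment info. 'dvc.yaml' by default.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.stages)
```

Using `nargs="*"` keeps the argument optional, so an invocation without stages still parses and can fall back to showing every stage.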

If no stages are passed, parameters and metrics defined in any stage would be shown. This differs from current behavior only because untracked parameters would be excluded.
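
To illustrate the default case, the sketch below keeps only the parameter keys that some stage declares and drops the rest. The sample data, the `flatten` and `tracked_params` helpers, and the rule that a tracked key also covers everything nested beneath it are assumptions made for this example; DVC's real collection logic may differ.

```python
from typing import Any, Dict, Set


def flatten(data: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dictionaries into dotted keys, e.g. {'a': {'b': 1}} -> {'a.b': 1}."""
    flat: Dict[str, Any] = {}
    for key, value in data.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def tracked_params(params: Dict[str, Any], tracked: Set[str]) -> Dict[str, Any]:
    """Keep flattened params that equal a tracked key or sit beneath one."""
    return {
        name: value
        for name, value in flatten(params).items()
        if any(name == key or name.startswith(key + ".") for key in tracked)
    }


if __name__ == "__main__":
    # Hypothetical params.yaml contents; debug_level is not used by any stage.
    params = {
        "debug_level": "info",
        "prepare": {"split": 0.2, "seed": 20170428},
        "train": {"seed": 20170428, "n_estimators": 10},
    }
    # Hypothetical set of keys declared by the pipeline's stages.
    tracked = {"prepare.split", "train"}
    print(tracked_params(params, tracked))
```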

If stages are passed, only parameters and metrics defined in those stages, or in stages upstream of them in the pipeline, would be shown.

Using the above examples, dvc exp show evaluate would be equivalent to using the --exclude-params and --exclude-metrics options to exclude all parameters and metrics for the topic_modeling and combine stages:

┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━┓
┃ Experiment     ┃ Created      ┃     auc ┃ prepare.split ┃ prepare.seed ┃ featurize.max_features ┃ featurize.ngrams ┃ train.seed ┃ train.n_estimators ┃ train.random_state ┃ … ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━┩
│ workspace      │ -            │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ topic_modeling │ Mar 18, 2021 │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ ├── exp-e2784  │ 12:15 PM     │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ └── exp-44136  │ 10:29 AM     │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ master         │ Feb 17, 2021 │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ └── exp-56d11  │ Mar 18, 2021 │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ 263732a        │ Feb 11, 2021 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ -          │ 100                │ 20170428           │ - │
│ b297969        │ Feb 11, 2021 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ -          │ 100                │ 20170428           │ - │
│ 4edd930        │ Feb 11, 2021 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ -          │ 100                │ 20170428           │ - │
│ 72ed9cd        │ Nov 15, 2020 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 50                 │ -                  │ - │
│ ├── exp-47dfa  │ Feb 17, 2021 │ 0.51625 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ └── exp-44136  │ Feb 17, 2021 │  0.5674 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 50                 │ -                  │ - │
│ f8e9d93        │ Nov 15, 2020 │ 0.54175 │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 50                 │ -                  │ - │
│ 377c988        │ Nov 15, 2020 │ 0.54175 │ 0.2           │ 20170428     │ 500                    │ 1                │ 20170428   │ 50                 │ -                  │ - │
│ 27d4e7c        │ Nov 15, 2020 │       - │ 0.2           │ 20170428     │ 500                    │ 1                │ 20170428   │ 50                 │ -                  │ - │
│ 8bf2091        │ Nov 15, 2020 │       - │ 0.2           │ 20170428     │ 500                    │ 1                │ 20170428   │ 50                 │ -                  │ - │
│ ff9e2fa        │ Nov 15, 2020 │       - │ 0.2           │ 20170428     │ 500                    │ 1                │ 20170428   │ 50                 │ -                  │ - │
│ 11e1705        │ Nov 15, 2020 │       - │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ a82585e        │ Nov 15, 2020 │       - │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ c778a78        │ Nov 15, 2020 │       - │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ 551082e        │ Nov 15, 2020 │       - │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
│ 15bef96        │ Nov 15, 2020 │       - │ 0.2           │ 20170428     │ 1500                   │ 2                │ 20170428   │ 10                 │ -                  │ - │
└────────────────┴──────────────┴─────────┴───────────────┴──────────────┴────────────────────────┴──────────────────┴────────────┴────────────────────┴────────────────────┴───┘

This table includes parameters and metrics from the evaluate stage and all stages upstream of it. It's possible to also have a --single-item argument to exclude upstream changes, but this is out of scope for now.
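
The selection rule described above (the target stage plus everything upstream of it) can be sketched as a graph traversal. This is a minimal sketch under stated assumptions: the `DEPS` and `COLUMNS` dictionaries are hand-written stand-ins for the example pipeline, and `upstream_stages` and `visible_columns` are hypothetical helpers, not DVC functions.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical dependency map for the example pipeline:
# each stage maps to the stages it directly depends on.
DEPS: Dict[str, List[str]] = {
    "prepare": [],
    "featurize": ["prepare"],
    "train": ["featurize"],
    "evaluate": ["train", "featurize"],
    "topic_modeling": ["featurize"],
    "combine": ["evaluate", "topic_modeling"],
}

# Hypothetical mapping of stages to the params/metrics columns they define.
# The combine stage's columns are not listed in the example table, so it is left empty.
COLUMNS: Dict[str, List[str]] = {
    "prepare": ["prepare.split", "prepare.seed"],
    "featurize": ["featurize.max_features", "featurize.ngrams"],
    "train": ["train.seed", "train.n_estimators"],
    "evaluate": ["auc"],
    "topic_modeling": ["topic_modeling.seed"],
    "combine": [],
}


def upstream_stages(targets: Iterable[str]) -> Set[str]:
    """Return the targets plus every stage they transitively depend on."""
    seen: Set[str] = set()
    stack = list(targets)
    while stack:
        stage = stack.pop()
        if stage in seen:
            continue
        seen.add(stage)
        stack.extend(DEPS[stage])
    return seen


def visible_columns(targets: Iterable[str]) -> List[str]:
    """Columns to show: every stage when no targets are given, else the upstream set."""
    targets = list(targets)
    stages = upstream_stages(targets) if targets else set(DEPS)
    # Iterate in DEPS order so the column order stays stable.
    return [col for stage in DEPS if stage in stages for col in COLUMNS[stage]]


if __name__ == "__main__":
    print(visible_columns(["evaluate"]))
```

With these stand-in mappings, targeting evaluate leaves out topic_modeling.seed, because topic_modeling sits on a separate branch, while the prepare, featurize, and train columns remain.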

How We Teach This

See Detailed design for a proposed help message.

This is consistent with other commands, and it's only one additional argument to an existing command, so it likely doesn't require much teaching. It might be necessary to explicitly add that parameters and metrics from upstream stages will be shown.

Drawbacks

This proposal may require changes to how DVC collects and tracks parameters and metrics. Implementation could break existing functionality, and the new behavior would be inconsistent with the current behavior of the dvc params, dvc metrics, and dvc exp diff commands.

Further, this proposal adds to an already crowded set of dvc exp show options, and there is no single way to show parameters and metrics that's going to work for every use case. For example, this proposal does nothing to filter out parameters for upstream stages, which may frequently be irrelevant if only the stages near the end of the pipeline change during experimentation.

Alternatives

There are many alternative methods to refine the output of dvc exp show:

  • Current functionality: exp show already has options to include/exclude any column, so the flexibility to get the same output as suggested here already exists.
  • Wildcards: Expand the include/exclude options to accept wildcard/glob/regex arguments, giving more flexibility to exclude an entire file or a section of a file, for example --exclude-params=params.json:local_config*.
  • Save configuration options for dvc exp show so they can be reused and modified without typing out all options at the command line each time.
  • CSV: Add a tsv/csv output format option in dvc exp show to enable users to manipulate the table with other tools.
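
The wildcard alternative listed above could be approximated with glob matching on column names. The following is a minimal sketch using Python's standard fnmatch module; the `exclude_columns` helper and the sample column names are hypothetical and do not reflect how DVC parses its options.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def exclude_columns(columns: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Drop every column whose name matches at least one glob pattern."""
    patterns = list(patterns)
    return [
        col
        for col in columns
        if not any(fnmatchcase(col, pat) for pat in patterns)
    ]


if __name__ == "__main__":
    # Hypothetical column names in "file:key" form.
    columns = [
        "params.json:local_config.debug",
        "params.json:train.seed",
        "auc",
    ]
    print(exclude_columns(columns, ["params.json:local_config*"]))
```

`fnmatchcase` is used instead of `fnmatch` so that matching is case-sensitive on every platform.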

This proposal does not preclude any of these options. The current output will continue to be available except for untracked parameters. The other alternatives may still be implemented.

Each of the alternatives provides flexibility to modify the dvc exp show output, which is important but not the goal of this proposal. Instead, the proposed feature uses the information DVC already has to make smart decisions about what to show, saving users from making modifications that would always be desirable.

Unresolved questions

  • What about consistency with the dvc params, dvc metrics, and dvc exp diff commands? Should these also be modified or left as is? As mentioned above, dvc params and dvc metrics in particular may be used on arbitrary files outside of DVC projects.
  • Should experiments themselves be excluded based on what stages were run? Using the example above, if a user ran both dvc exp run evaluate and dvc exp run topic_modeling experiments, should all of these experiments show when running dvc exp show evaluate?