exp show: checkpoint experiment summary #6194

dberenbaum · 2021-06-17T18:41:06Z

In dvc exp show, the last checkpoint for an experiment also contains the experiment name:

┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Experiment              ┃ Created      ┃ State ┃ dvclive.json:step ┃ dvclive.json:acc ┃ mylive.json:step ┃ mylive.json:acc ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ workspace               │ -            │ -     │ 3                 │ 0.9727           │ -                │ -               │
│ live                    │ Apr 19, 2021 │ -     │ -                 │ -                │ -                │ -               │
│ │ ╓ c1e9c94 [exp-f34fc] │ 10:45 AM     │ -     │ 3                 │ 0.9727           │ -                │ -               │
│ │ ╟ b97efe5             │ 10:45 AM     │ -     │ 2                 │ 0.9663           │ -                │ -               │
│ │ ╟ bc6accc             │ 10:45 AM     │ -     │ 1                 │ 0.9713           │ -                │ -               │
│ ├─╨ 796e9eb             │ 10:44 AM     │ -     │ 0                 │ 0.9538           │ -                │ -               │
│ │ ╓ 9197d13 [exp-c6a3d] │ 10:43 AM     │ -     │ 1                 │ 0.9133           │ -                │ -               │
│ ├─╨ 31df343             │ 10:43 AM     │ -     │ 0                 │ 0.8523           │ -                │ -               │
│ └── 7ada7e5             │ May 18, 2021 │ Queue │ -                 │ -                │ -                │ -               │
└─────────────────────────┴──────────────┴───────┴───────────────────┴──────────────────┴──────────────────┴─────────────────┘

It would be useful to have the experiment on a separate row to keep the experiment and individual checkpoints distinct.

The table combines info that's really about the experiment with info that's about the individual checkpoint. For example, parameters should be static for all checkpoints in an experiment, right?

Also, it opens up options to show things like summary metrics for an experiment, or to collapse the checkpoints if the user just wants a summary of each experiment.
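A minimal sketch of what a separate experiment row could carry, using the checkpoint values from the table above. The `Checkpoint` class and the `summarize` and `render` helpers are hypothetical names for illustration only and are not part of DVC; the choice of summary fields (last and best accuracy) is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    rev: str
    step: int
    acc: float


# Checkpoints copied from the table above, newest first.
experiments = {
    "exp-f34fc": [
        Checkpoint("c1e9c94", 3, 0.9727),
        Checkpoint("b97efe5", 2, 0.9663),
        Checkpoint("bc6accc", 1, 0.9713),
        Checkpoint("796e9eb", 0, 0.9538),
    ],
    "exp-c6a3d": [
        Checkpoint("9197d13", 1, 0.9133),
        Checkpoint("31df343", 0, 0.8523),
    ],
}


def summarize(name, checkpoints):
    """Build one experiment-level row from its checkpoints (hypothetical fields)."""
    latest = max(checkpoints, key=lambda c: c.step)
    best = max(checkpoints, key=lambda c: c.acc)
    return {
        "experiment": name,
        "checkpoints": len(checkpoints),
        "last_acc": latest.acc,
        "best_acc": best.acc,
    }


def render(experiments, collapse=False):
    """Return text rows: one summary row per experiment, then its checkpoints
    unless collapse is True."""
    rows = []
    for name, checkpoints in experiments.items():
        s = summarize(name, checkpoints)
        rows.append(
            f"{name}  checkpoints={s['checkpoints']} "
            f"last_acc={s['last_acc']} best_acc={s['best_acc']}"
        )
        if not collapse:
            for c in checkpoints:
                rows.append(f"  {c.rev}  step={c.step} acc={c.acc}")
    return rows


print("\n".join(render(experiments, collapse=True)))
```

With `collapse=True` only the per-experiment rows are returned; with the default, each experiment row is followed by its individual checkpoints.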

dberenbaum · 2021-06-17T18:43:18Z

Related (#6174 (comment)):

   The Run status shows next to the already completed checkpoint.
This makes sense to me because when you are resuming/continuing a checkpoint run, you are starting from the last completed checkpoint (rather than starting from the queued row that only has params and no metrics/outputs).

To me it's like if I queue an experiment based on master, I wouldn't expect the master row to show Run when I run it.

pmrowla · 2021-06-21T11:05:50Z

> For example, parameters should be static for all checkpoints in an experiment, right?

This is entirely dependent on what the user's stage code is doing. We just capture the state of the user's workspace; in theory, the user can write to any file they want before we capture that state, including params files.

And as we've discussed before, with the way the exp feature actually works, in theory users could leverage DVC to generate commits within an experiment ref that are not actually checkpoint experiments at all (i.e. just use them as DVC-managed Git branches). We would still want exp show to display whatever is in each of those commits, but could not make any assumptions about how they are actually supposed to be grouped or summarized.

dberenbaum · 2021-06-21T13:32:09Z

> This is entirely dependent on what the user's stage code is doing. We just capture the state of the user's workspace; in theory, the user can write to any file they want before we capture that state, including params files.

Won't any git-tracked or dvc-tracked changes generate a new experiment ref?

> users could leverage DVC to generate commits within an experiment ref that are not actually checkpoint experiments at all (i.e. just use them as DVC-managed Git branches)

Good point, but is there any mechanism other than checkpoints where users could generate multiple commits to the same experiment? Obviously, it's possible users could manually mess with the experiments, but this seems like it would be an unexpected use.

pmrowla · 2021-06-22T00:29:42Z

Not currently, but there have been discussions about potentially allowing users to use the make_checkpoint signaling without checkpoint: true outs in their pipeline. So any stage could generate multiple commits in an experiment regardless of whether or not it was really a checkpoint experiment.

dberenbaum · 2021-06-22T01:01:17Z

For what purpose? It seems like holding dependencies fixed is the defining feature of an experiment. If parameters or dependencies change within an experiment, what is the meaning of an experiment? Would this ever be desirable?

pmrowla · 2021-06-22T01:34:18Z

Being able to track any kind of intermediate state during a stage or pipeline seems like it would be useful. It doesn't have to be restricted to intermediate checkpoints or circular deps.

dberenbaum · 2021-06-22T15:18:27Z

Even in checkpoints that are breaking down a multi-part stage without any kind of circular dependency, why would parameters change? Also, this scenario probably breaks all kinds of things in dvc exp show since it might not be an ML experiment with parameters or metrics at all.

A hyperparameter search script is one scenario where parameters might get changed within a stage, but my personal preference would be for each result to be tracked as a separate experiment, although I'm not sure how that would work. Otherwise, each checkpoint would include the history of all previous checkpoints even though they would be independent events in that scenario. Also, making them modified/independent experiments seems more consistent with current behavior where any parameter modifications generate a modified experiment.
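One way to picture the "separate experiment per result" preference is to split a run's checkpoints wherever the parameters change. The sketch below is only an illustration under that assumption: `split_by_params`, the dictionary layout, and the revision strings are all hypothetical, and nothing here reflects how DVC stores or groups experiments.

```python
from itertools import groupby


def params_key(checkpoint):
    """Hashable, order-independent view of a checkpoint's params."""
    return tuple(sorted(checkpoint["params"].items()))


def split_by_params(checkpoints):
    """Group consecutive checkpoints that share identical params.

    Each returned group would correspond to one independent experiment
    in the hyperparameter-search scenario described above.
    """
    return [list(group) for _, group in groupby(checkpoints, key=params_key)]


# Hypothetical checkpoints in the order they were produced.
checkpoints = [
    {"rev": "rev-1", "params": {"lr": 0.1}, "acc": 0.85},
    {"rev": "rev-2", "params": {"lr": 0.1}, "acc": 0.91},
    {"rev": "rev-3", "params": {"lr": 0.01}, "acc": 0.93},
]

for group in split_by_params(checkpoints):
    print([c["rev"] for c in group], group[0]["params"])
```

Because `groupby` only merges adjacent items, a later return to earlier parameter values would start a new group rather than rejoin the old one, which matches treating each change as a new, independent experiment.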

dberenbaum · 2021-07-14T13:50:01Z

cc @daavoo

dberenbaum mentioned this issue Jun 17, 2021

exp show: display running/queued state for experiments #6174

Merged

2 tasks

dberenbaum added the discussion label (requires active participation to reach a conclusion) Jun 22, 2021

dberenbaum self-assigned this Jun 22, 2021

dberenbaum added the A: experiments (related to dvc exp) and diff/show (related to the diff/show feature) labels Jun 22, 2021

dberenbaum mentioned this issue Jun 22, 2021

logger: extend with log_param iterative/dvclive#100

Closed

dberenbaum mentioned this issue Nov 16, 2021

exp show: Add option to filter out checkpoints #6988

Closed

dberenbaum added the p3-nice-to-have label (it should be done this or next sprint) Feb 17, 2023

daavoo mentioned this issue Mar 29, 2023

exp: Drop checkpoints #9271

Merged

dberenbaum closed this as not planned (won't fix, can't repro, duplicate, stale) May 11, 2023

skshetry closed this as completed in #9271 May 23, 2023
