-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
live: Revisit output names and structure. #322
Conversation
src/dvclive/data/plot.py
Outdated
@@ -6,7 +6,7 @@ | |||
|
|||
class Plot(Data): | |||
suffixes = [".json"] | |||
subfolder = "plots" | |||
subfolder = "data_series" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is the only part I don't like. It doesn't really match the meaning of VSCode extension as their data series
is our metrics
and this folder can contain things like confusion matrix, so it doesn't really apply
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sklearn
seems too implementation-focused, but it's mentioned in the docs, and it's more descriptive than data_series
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
json
is another that's too implementation-focused but at least descriptive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe something like builtin
or presets
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shcheklein @jorgeorpinel
Any suggestions regarding alternatives to the data_series
name (and/or the rest of the outputs)?
data_series
is meant to store pre-defined plots based on scikit-learn visualizations .
The current state is:
Method | Output |
---|---|
live.log | dvclive/plots/metrics & dvclive/metrics.json |
live.log_image | dvclive/plots/images |
live.log_param | dvclive/params.yaml |
live.log_plot | dvclive/plots/data_series |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(btw, by saying "I don't like it" I don't mean that we should not go with this - I'm sharing my thoughts. If we don't come with something better we should proceed with this)
With that in mind, I think live.log_sklearn makes sense since the inputs closely follow the underlying sklearn methods. The output could then be plots/sklearn.
hmm, that's interesting ... maybe I missed something, but why sklearn
specifically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From https://dvc.org/doc/dvclive/api-reference/live/log_plot and the link at the top (which may need to be updated because it doesn't link to the expected methods), the method arguments come from sklearn. Most sklearn evaluation methods take similar arguments like metrics.method(y_true, y_score)
.
If we start to log any other plot types that take different arguments, we are likely to need a separate method.
For users who are familiar, it's the easiest way to explain the inputs, and sklearn is used for evaluation even when it's not used for modeling (the top result for pytorch roc curve is https://discuss.pytorch.org/t/how-to-plot-roc-curve-using-pytorch-model/122033).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From https://dvc.org/doc/dvclive/api-reference/live/log_plot and the link at the top ...
Got it, thanks! I forgot (not the first time :) ) that interface there was coupled with the sklearn types of plots. Yes, it makes more sense to rename into sklearn_plot
. On the other hand, we already do things like this:
https://github.com/iterative/example-get-started/blob/main/src/evaluate.py#L49-L59
should it be taken care of by the some kind of `log_plot("") ?
feels like even if rename the existing one to log_sklearn_plot
or something, we'll need the second log_plot
pretty soon (now?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the method arguments come from sklearn. Most sklearn evaluation methods take similar arguments like
Not only that. We use sklearn
internally to compute the results to be plotted.
If we start to log any other plot types that take different arguments, we are likely to need a separate method.
Indeed, we would need to define a different interface for plots beyond the predefined sklearn
set.
What wandb
does is provide secondary methods that are passed as arguments to wandb.log
:
data = [[x, y] for (x, y) in zip(x_values, y_values)]
table = wandb.Table(data=data, columns = ["x", "y"])
wandb.log({"my_custom_plot_id" : wandb.plot.line(table, "x", "y",
title="Custom Y vs X Line Plot")})
should it be taken care of by the some kind of `log_plot("") ?
It could already use log_sklearn_plot("precision_recall", ...)
but afaik (per the comment above) it was kept to showcase an alternative approach of manually generating a plot file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like we can move forward with log_sklearn_plot
?
should it be taken care of by the some kind of `log_plot("") ?
It could already use
log_sklearn_plot("precision_recall", ...)
but afaik (per the comment above) it was kept to showcase an alternative approach of manually generating a plot file.
Even if we don't need it for the example get started repo, it would probably make sense after #311 to introduce a more generic method that logs any dvc plot (log_custom_plot
?). Should we discuss in #271?
Codecov ReportBase: 94.85% // Head: 94.92% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## 1.0 #322 +/- ##
==========================================
+ Coverage 94.85% 94.92% +0.06%
==========================================
Files 36 36
Lines 1770 1774 +4
Branches 165 165
==========================================
+ Hits 1679 1684 +5
+ Misses 66 65 -1
Partials 25 25
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
948b76a
to
f607e4e
Compare
f607e4e
to
e670f38
Compare
Applied #246 (comment) Closes #246
src/dvclive/data/sklearn.py
Outdated
@@ -4,7 +4,7 @@ | |||
from .base import Data | |||
|
|||
|
|||
class Plot(Data): | |||
class SKLearn(Data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we name it SKLeanrPlot (sklearn alone can be misleading?)
* live: Revisit output names and structure. Applied #246 (comment) Closes #246 * Rename `log_plot` to `log_sklearn_plot`. * Rename `plot` -> `sklearn` * Rename `sklearn` -> `sklearn_plot`
* live: Revisit output names and structure. Applied #246 (comment) Closes #246 * Rename `log_plot` to `log_sklearn_plot`. * Rename `plot` -> `sklearn` * Rename `sklearn` -> `sklearn_plot`
* live: Revisit output names and structure. Applied #246 (comment) Closes #246 * Rename `log_plot` to `log_sklearn_plot`. * Rename `plot` -> `sklearn` * Rename `sklearn` -> `sklearn_plot`
* live: Revisit output names and structure. Applied #246 (comment) Closes #246 * Rename `log_plot` to `log_sklearn_plot`. * Rename `plot` -> `sklearn` * Rename `sklearn` -> `sklearn_plot`
* initial refactor * Mention API Reference * Fix link * Add redirect * Update content/docs/dvclive/how-to/checkpoints.md Co-authored-by: Dave Berenbaum <[email protected]> * Remove how-to folder * Add outputs content * Reorder sidebar * Add tabs with popular frameworks * Remove outdated `dvclive with DVC` * clean * Add params * Clarify value without DVC * Update outputs to current status * Update content/docs/dvclive/get-started.md Co-authored-by: Dave Berenbaum <[email protected]> * Reduce wording * Add Python API tab * More details * Updates from review * Capitalize pages * Mention studio for sharing experiments * New section for Studio * Add sample output * Move ml-frameworks to api-refeference * Drop resume page * Apply suggestions from code review Co-authored-by: Dave Berenbaum <[email protected]> * Restyled by prettier (#4050) Co-authored-by: Restyled.io <[email protected]> * Suggestions from code review * caps * Add sections * Add example output * Add admon * Share experiment -> Share Results * Update output structure. Per iterative/dvclive#322 * Revisit output names * path -> dir * Add make_report call * Rename log -> log_metric * Update log_sklearn_plot. (#4074) Per iterative/dvclive#339 * Apply suggestions from code review Co-authored-by: Jorge Orpinel <[email protected]> * Replace dvc with cli * Remove tree * Update output dir in frameworks * grammarly * Revisit frameworks naming. * Remplaze `path` with `dir` * Rename `outputs` to `how-it-works` * Rename `Output Folder Structure` to `How it Works` * Rename Scalars to Metrics * Simplify `Live` Attributes description * Renamed folder to directory * Revisit `step` definition * Fix tree * Remove cat * Replace vscode plots with data series * Updates with new summary behavior and step property * Fix links * Consistent framework tabs * Update content/docs/dvclive/api-reference/live/log_metric.md * start: dvclive 1.0 updates to exp viz page * Update content/docs/start/experiment-management/visualization.md Co-authored-by: David de la Iglesia Castro <[email protected]> * Revisit dvclive usage outside doc/dvclive * Focus on context manager usage. * hint about next_step * Update content/docs/dvclive/api-reference/live/next_step.md Co-authored-by: Dave Berenbaum <[email protected]> * Link for `Live.metrics_file` * Restyled by prettier (#4097) Co-authored-by: Restyled.io <[email protected]> * Updates per iterative/example-repos-dev#143 Co-authored-by: Dave Berenbaum <[email protected]> Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com> Co-authored-by: Restyled.io <[email protected]> Co-authored-by: Jorge Orpinel <[email protected]> Co-authored-by: rogermparent <[email protected]>
Applied #246 (comment)
Closes #246
Note this targets branch
1.0
dvclive/plots/metrics
&dvclive/metrics.json
dvclive/plots/images
dvclive/params.yaml
dvclive/plots/sklearn
Example:
Output: