log_plot(): add custom line plot #271

Closed
gcaria opened this issue Aug 10, 2022 · 6 comments · Fixed by #543

Comments

@gcaria

gcaria commented Aug 10, 2022

During my evaluation stage, I plot a simple line plot where the x values are predefined, and the y values are calculated from truth and predicted values, and are informative of the model performance.

I'd love to see how the line plot changes from experiment to experiment, which means that I'd like to use log_plot for doing this.

Would it be possible to implement such a feature? It seems to me that log_plot offers advanced features (ROC etc) but not a simple one like this.

@daavoo daavoo transferred this issue from iterative/dvc Aug 10, 2022
@dberenbaum
Collaborator

> During my evaluation stage, I plot a simple line plot where the x values are predefined, and the y values are calculated from truth and predicted values, and are informative of the model performance.

Could you clarify more what kind of plot you would like or give an example?

@gcaria
Author

gcaria commented Aug 11, 2022

Although I have a specific line plot in mind, I've tried to be as generic as possible because this is not really relevant for what I'm proposing.

In simple terms, the user would just need to specify two arrays (in a JSON file, I guess), one for the x and one for the y coordinates of the points in a line plot. In my specific case the x values would always be the same, but they could change too; I don't expect this to be a requirement.

I haven't used the available options for log_plot() because they don't apply to my case, but it seems they all have to perform some kind of computation on the JSON data to get the x and y values that actually go into the plot. What I'm proposing is a simpler logging option, where the user provides x and y directly (computed in whatever way she wants).
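
To make the proposal concrete, here is a minimal sketch of the data side only, using nothing but the standard library. The helper names `to_datapoints` and `write_datapoints`, the `"x"`/`"y"` keys, and the file name are all hypothetical and chosen for illustration; this is not an existing dvclive API.

```python
import json
import tempfile
from pathlib import Path


def to_datapoints(x, y):
    """Pair up x and y values as a list of {"x": ..., "y": ...} records."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [{"x": xi, "y": yi} for xi, yi in zip(x, y)]


def write_datapoints(path, x, y):
    """Write the paired records to a JSON file and return them."""
    datapoints = to_datapoints(x, y)
    Path(path).write_text(json.dumps(datapoints, indent=2))
    return datapoints


# Hypothetical usage: x is predefined, y is computed during evaluation.
x = [0, 1, 2]
y = [0.1, 0.2, 0.3]
out_file = Path(tempfile.mkdtemp()) / "line_plot.json"
write_datapoints(out_file, x, y)
```

A file like this, with one record per point, is the kind of input a line plot could be drawn from without any further computation.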

@daavoo
Contributor

daavoo commented Aug 11, 2022

We would need to change the current API of log_plot a little bit.

We currently use name (first arg) to select the plot template to use (https://dvc.org/doc/dvclive/api-reference/live/log_plot#supported-plots):

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
live.log_plot("calibration", y_true, y_score)

calibration above indicates both the name of the output plot and the template to use.

If we add support for this, I assume it would make sense to support arbitrary names.
So, we would need to do 2 things:

  • Introduce support for a linear template
  • Decouple name and introduce a new template arg (💣 breaking change)

x = [0, 1, 2]
y = [0.1, 0.2, 0.3]
live.log_plot(x, y, name="foo", template="linear")

@gcaria
Author

gcaria commented Aug 11, 2022

My two cents on the API, admitting that I haven't seen the DVC code yet, but just trying to minimize the changes, more importantly the breaking ones:

How about just adding a new name option, linear, and then an optional title argument which, when provided, sets the title of the plot/figure? That would apply to all existing templates, maybe as a last step to make life easy and uniform.

If title is not provided, then for name='linear' the title would just be e.g. Linear or Line plot.

@dberenbaum
Collaborator

Related: #322 (comment).

Based on that discussion, it probably makes more sense to introduce a separate method here like log_custom_plot since the current one is so sklearn-focused. The arguments can mostly follow the available configuration fields for dvc plots.

It should handle tabular (dataframe/array/tensor) or hierarchical (dict) input data (although output format could all be JSON if it's easier). Or saving the data could be separate from the plotting like in wandb. This seems a little less straightforward to me but would better support flexible plots where you want to make multiple plots from the same data source or combined data sources.

This might be a stretch for 1.0, but it would be nice to have since it sort of completes the dvc integration of being able to log any kind of dvc output from dvclive.

Like the existing log_plot method, we can start with support for no-step scenarios only, but there's some related discussion about how to support multi-step scenarios in #82.
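
As a rough illustration of the tabular-versus-hierarchical point above, the sketch below normalizes two common plain-Python shapes (a dict of columns and a list of row dicts) into one list of records that could be dumped as JSON. The function name `to_records` is hypothetical, and dataframes, arrays, and tensors are deliberately left out; this is an assumption-laden sketch, not how dvclive handles its inputs.

```python
def to_records(data):
    """Normalize column-oriented or row-oriented data to a list of dicts.

    Accepts either {"col": [v0, v1, ...], ...} or [{"col": v0, ...}, ...].
    """
    if isinstance(data, dict):
        columns = list(data)
        lengths = {len(data[c]) for c in columns}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        n_rows = lengths.pop() if lengths else 0
        return [{c: data[c][i] for c in columns} for i in range(n_rows)]
    # Already row-oriented: copy each row so the caller's data is not shared.
    return [dict(row) for row in data]


# Both inputs describe the same two points.
from_columns = to_records({"foo": [1, 3], "bar": [2, 4]})
from_rows = to_records([{"foo": 1, "bar": 2}, {"foo": 3, "bar": 4}])
```

Keeping a single record-based shape on disk would let the plotting configuration (which fields map to x and y) stay separate from how the data was originally held in memory.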

@daavoo
Contributor

daavoo commented Oct 7, 2022

> is so sklearn-focused

An additional argument towards a separate command is that, in the current sklearn plots, the inputs don't match what gets saved in the plot: (y_true, y_pred) gets transformed to some (x, y) depending on each plot.
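
To illustrate that mismatch, here is a small pure-Python sketch of how labels and scores can be turned into ROC points. The function `roc_points` and the `"fpr"`/`"tpr"` keys are hypothetical, written for this comment only; it is a simplified stand-in and not the code that sklearn or dvclive actually runs.

```python
def roc_points(y_true, y_score):
    """Turn binary labels and scores into (fpr, tpr) records for a ROC curve."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    # Start at the origin, then lower the threshold one distinct score at a time.
    points = [{"fpr": 0.0, "tpr": 0.0}]
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t != 1)
        points.append({"fpr": fp / negatives, "tpr": tp / positives})
    return points


# Same inputs as the calibration example earlier in the thread.
points = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The caller passes four labels and four scores, but what ends up in the plot is five (fpr, tpr) pairs, which is exactly why a custom plot method that stores the inputs as given would behave differently from the sklearn-style ones.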

@dberenbaum dberenbaum mentioned this issue Nov 1, 2022
13 tasks
@daavoo daavoo self-assigned this Apr 18, 2023
@daavoo daavoo added this to DVC Apr 18, 2023
@daavoo daavoo moved this to Todo in DVC Apr 18, 2023
@daavoo daavoo moved this from Todo to Review In Progress in DVC Apr 24, 2023
daavoo added a commit that referenced this issue Apr 25, 2023
Create DVC plots from datapoints (list of dictionaries) and plot config.

Closes #271
Closes #453

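```
from dvclive import Live

datapoints = [{"foo": 1, "bar": 2}, {"foo": 3, "bar": 4}]
with Live() as live:
    # No template argument: plot "bar" against "foo".
    live.log_plot("foo_default", datapoints, x="foo", y="bar")
    # Same datapoints, rendered with the scatter template.
    live.log_plot(
        "foo_scatter",
        datapoints,
        x="foo",
        y="bar",
        template="scatter",
    )
```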
daavoo added a commit that referenced this issue Apr 27, 2023
@github-project-automation github-project-automation bot moved this from Review In Progress to Done in DVC Apr 27, 2023