Live plots without checkpoints #1256

Closed
daavoo opened this issue Jan 27, 2022 · 10 comments
Labels
A: experiments Area: experiments table webview and everything related A: plots Area: plots webview, side panel and everything related

Comments

@daavoo
Contributor

daavoo commented Jan 27, 2022

Even if checkpoints are not enabled in dvc.yaml, DVCLive still generates plots that are updated at the end of each training iteration (a.k.a. step), and the HTML report generated by DVC gets updated.

However, without checkpoints the vscode-dvc extension does not update the live plots because the exp show output does not reflect these updates.

Simple project to test:

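dvc.yaml:

stages:
  test-dvclive:
    cmd: python test_dvclive.py
    live:
      dvclive:
        cache: false
        summary: true
        html: true

test_dvclive.py:

import random
import time

from dvclive import Live


live = Live()

# Log two noisy metrics per step; the sleep leaves time to watch the plots update.
for i in range(10):
    live.log("foo", i + random.random())
    live.log("bar", i + random.random())
    time.sleep(5)
    # Close the current step so DVCLive writes out the updated plots data.
    live.next_step()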

Run dvc repro or dvc exp run and open dvclive_dvc_plots/index.html.

@daavoo
Contributor Author

daavoo commented Jan 27, 2022

vscode-dvc could rely on watching the output directory where the DVCLive plots are being generated. It should be exposed by DVC in the DVCLIVE_PATH env var.
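
A minimal sketch of that idea, assuming DVC really does export DVCLIVE_PATH for the running stage (suggested above, not verified here). The helper name `dvclive_files` and the fallback directory `dvclive` are hypothetical and only serve the illustration.

```python
import os
from pathlib import Path


def dvclive_files(default_dir="dvclive"):
    """Return the files currently under the DVCLive output directory.

    Assumption: DVC exports DVCLIVE_PATH for the running stage. The
    fallback directory name is hypothetical.
    """
    root = Path(os.environ.get("DVCLIVE_PATH", default_dir))
    if not root.is_dir():
        return []
    # Sorted so callers can compare successive listings directly.
    return sorted(p for p in root.rglob("*") if p.is_file())
```

A watcher could call this periodically, or use the returned paths to register file system watchers.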

@daavoo daavoo added A: experiments Area: experiments table webview and everything related A: plots Area: plots webview, side panel and everything related labels Jan 27, 2022
@mattseddon
Member

mattseddon commented Jan 28, 2022

@daavoo we already take the output of plots diff and parse it to get all of the paths that are being updated. We then watch these paths and, on update, call plots diff with the necessary revisions. Does this cover the above case?

I.e., are the plots provided in plots diff by default?
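
The watch-then-refresh flow described above can be sketched roughly as follows. This is a standard-library polling stand-in, not the extension's implementation (which relies on file system watchers); the function names `snapshot`, `changed_paths` and `poll_once` are hypothetical, and the `on_update` callback is where a real client would re-run plots diff.

```python
import os


def snapshot(paths):
    """Map each existing path to its last-modified time in nanoseconds."""
    return {p: os.stat(p).st_mtime_ns for p in paths if os.path.exists(p)}


def changed_paths(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        p for p in before.keys() | after.keys() if before.get(p) != after.get(p)
    )


def poll_once(paths, before, on_update):
    """Take a new snapshot and call on_update with the changed paths, if any."""
    after = snapshot(paths)
    changed = changed_paths(before, after)
    if changed:
        # Hypothetical hook: a real client would refresh its plots data here.
        on_update(changed)
    return after
```

Calling `poll_once` in a loop, feeding each returned snapshot into the next call, triggers a refresh only when a tracked file actually changes.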

@daavoo
Contributor Author

daavoo commented Jan 28, 2022

> on update we will call plots diff with the necessary revisions. Does this cover the above case?
>
> I.e are the plots provided in plots diff by default.

The DVCLive plots are provided by plots diff but I'm not really sure how it would work in this scenario.

DVCLive will be updating the content of the plots data while the stage is being run so, without checkpoints, there won't be new revisions being generated.

@mattseddon
Member

> DVCLive will be updating the content of the plots data while the stage is being run so, without checkpoints, there won't be new revisions being generated.

Unless I'm misunderstanding something we should be able to call plots diff (again) with the current revision whenever the file being logged to changes. I would expect the "new" data would come through in the cli output.

If that's the case I'll just need to put in some rules to differentiate between checkpoint and non-checkpoint projects... I think.

@daavoo
Contributor Author

daavoo commented Jan 28, 2022

> Unless I'm misunderstanding something we should be able to call plots diff (again) with the current revision whenever the file being logged to changes.

Sorry I was the one misunderstanding 🙏

> I would expect the "new" data would come through in the cli output.

Yes, the new data would be available in the workspace revision.

@mattseddon
Member

mattseddon commented Jan 31, 2022

For the record:

For queued experiments that are being run inside a temporary directory, we will need to watch files in those temporary directories as opposed to the ones located in the workspace. E.g., ./.dvc/tmp/exps/tmpn5gwajyf/plots/heatmap.png instead of ./plots/heatmap.png.

The good news is that we already have this scenario covered because we set up our watchers using the following code:

      createFileSystemWatcher(
        join(this.dvcRoot, '**', `{${files.join(',')}}`),
        path => {
          if (!path) {
            return
          }
          this.managedUpdate()
        }
      )

The '**' handles the fact that DVC is working from a copy of the current workspace, as long as the temporary directory is also inside the current workspace (i.e., under .dvc/tmp).
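
To illustrate why one pattern covers both locations, here is a small Python approximation (not the extension's TypeScript code). It relies on `PurePosixPath.match` anchoring a relative pattern at the right-hand end of the path, which mimics the effect of the leading '**'. The function name `is_watched` is hypothetical.

```python
from pathlib import PurePosixPath


def is_watched(path, files):
    """Return True if path ends with any of the tracked relative file paths.

    A relative pattern is matched from the right, so the same entry matches
    the workspace copy and a copy nested under a temporary directory.
    """
    candidate = PurePosixPath(path)
    return any(candidate.match(f) for f in files)


tracked = ["plots/heatmap.png"]
workspace_hit = is_watched("./plots/heatmap.png", tracked)
temp_dir_hit = is_watched("./.dvc/tmp/exps/tmpn5gwajyf/plots/heatmap.png", tracked)
```

Both `workspace_hit` and `temp_dir_hit` are true here, while an unrelated file such as `plots/other.png` would not match.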

edit: The plumbing works but it looks like plots diff doesn't return the correct data for the revision. Still have some work to do for the workspace record as well.

@pared

pared commented Jan 31, 2022

plots, similar to params and metrics, is not designed to work for experiments. Those commands aim to handle the repository that we are currently in; experiments, if run with --tmp, are technically in a different repository. AFAIR, one can get training info from dvc exp. If I am right, we might be able to get the plots data via dvc exp, but that would probably require implementing the partial rendering we have been discussing with @mattseddon.

@mattseddon
Member

Thanks, @pared.

I've tried to consolidate all of the information surrounding plots and what I think we need into #1274. We can continue the discussion there and then (hopefully) spin off a proposal into Notion.

@mattseddon
Member

Part one (capturing plots for experiments running in the workspace) is done. I will park part two until we have more information on what the new mechanism for queuing/running experiments in a temp workspace looks like.

@mattseddon mattseddon removed their assignment Feb 6, 2022
@daavoo
Contributor Author

daavoo commented Mar 3, 2022

I believe this can be closed (as far as I tested, the functionality described in the issue works with the current release)

3 participants