Better support plots functionality #1274
Relates to #1256 |
Some further info on trying to "live update" plots when running a non-checkpoint experiment in the workspace. Running a 10 epoch experiment against
The output remains the same until the experiment is complete. I am not even sure where this experiment is being run because nothing is coming through any of the watchers. This would be another reason to have the plots data come through the
edit: obviously this does not work because the project does not have a logger (e.g. dvclive) set up. |
If I understand, this issue is limited to a scenario where:
Is that correct? |
Yep, I think that is the only scenario that we could be getting updates but we currently can't. The table shows all of the permutations that I can think of at the moment. ✅ = can get updated ❌ = not currently possible.
Can you think of any scenarios that I've missed? I realised that the whole queuing system is experimental at the moment and will be getting worked on soon. Let's talk about this as I think it would be very beneficial to nail down which are the most important scenarios.
** Tested by adding `DvcLiveCallback` to the example-dvc-experiments train script.
Screen.Recording.2022-02-02.at.9.48.33.am.mov |
Notes from our meeting:
This is the highest priority at the moment, so let's focus on this for now.
This is the scenario in the table above for "queue + no checkpoints w/ logger." Since this isn't as high priority, we don't need to make any decision on it yet, but it's unclear to me if this is a blocker for initial release since it seems like a more advanced scenario. @daavoo might have thoughts on both the importance and possible implementation for this. |
Some initial thoughts on live plots for queued/temp experiments without checkpoints. I don't think |
We do already get the updates from the temp directories coming through. We even call for a specific revision relating to the running experiment (e.g
The bit of plumbing that is missing is the mapping of the temp directory to the revision. Once the running experiment finishes all of the data shows up in the workspace and the plots are updated "in bulk". LMK if that doesn't make sense. |
The temp dir info should be in @pmrowla How stable is this for finding the temp dir where an experiment is running? The VS Code team wants to run |
For me, I could not tell about
So, live tracking of locally queued experiments is a scenario I haven't really explored in practice. |
@dberenbaum we can read that file, parse the JSON, and use the information to `cd` as a temporary patch, but (seeing as that is relevant information) I would expect it to come through in the |
This should not be considered stable right now, and the directory/file structure will probably continue to change in the near future, especially while the queueing work is ongoing. But eventually, the idea is that yes, we will have some kind of serialized information where consumers can look up status info for what is running and where it's being run. So in theory, at that point the vscode extension could get the live plots data from the temp dir instead of needing it all to be fetched/collected by DVC into the main repo. |
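To make the idea of "serialized information where consumers can look up status info" concrete, here is a rough sketch of what a consumer might do. Everything about the layout is an assumption for illustration only: the one-JSON-file-per-running-experiment convention and the `rev` and `root_dir` keys are hypothetical and are not DVC's actual (and currently unstable) format.

```python
import json
from pathlib import Path
from typing import Dict


def map_revisions_to_temp_dirs(info_dir: Path) -> Dict[str, Path]:
    """Map each running revision to the directory it is running in.

    HYPOTHETICAL layout: one JSON file per running experiment inside
    `info_dir`, each holding "rev" and "root_dir" keys. DVC's real
    structure is unstable and may look nothing like this.
    """
    mapping: Dict[str, Path] = {}
    for info_file in sorted(info_dir.glob("*.json")):
        try:
            info = json.loads(info_file.read_text())
        except (OSError, json.JSONDecodeError):
            # The file may be mid-write or already removed; skip it.
            continue
        if not isinstance(info, dict):
            continue
        rev = info.get("rev")
        root_dir = info.get("root_dir")
        if rev and root_dir:
            mapping[rev] = Path(root_dir)
    return mapping
```

A consumer such as the extension could call this on every watcher event and read live plots data for a revision from the directory it maps to, rather than waiting for the data to land in the workspace.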
Do you expect to have a way to find the location where the experiment is running, or do you expect |
I would be happy with the location as a short term solution. I am unsure as to what the long term solution should be. I agree that having plots data in the
Can I ask what would be expected for plots in terms of remote execution? My expectation would be that I could see live updates for multiple experiments running in the cloud. With that in mind maybe we would want to add a |
👍 That's a good question, and your proposal makes sense. I haven't put much thought into this yet. There's no expectation in any DVC proposals so far that users could see live plots updates for non-checkpoint experiments running in the cloud, and I would say we have much more basic problems to solve first for remote execution 😁 . In dvclive, there are discussions about how to provide regular notifications/updates: iterative/dvclive#90, which may be enough for users who want to keep tabs on remote experiments.
👍 Queuing local experiments is more of a prerequisite for remote execution than a fully realized feature right now, but a typical workflow for me would have been to log in to a large cloud instance/cluster and run multiple experiments there in parallel. I think plenty of users are running dvc inside cloud instances rather than on their laptops, so "local" execution may not be limited to laptop scenarios. However, I'm not sure how well the DVC VS Code extension would work in that remote-ssh scenario 🤔 . |
This is something that VS Code does well: https://code.visualstudio.com/docs/remote/ssh. We should be able to piggyback on that behaviour 👍🏻 . |
Original use case
Give the user the ability to view any plot(s) from the current workspace, i.e. any plot generated between `HEAD` & `workspace`, including all experiments and checkpoints.
Plots current state
`exp show` data is used to gather all of the revisions in the current workspace, i.e. most recent commit + all experiments & checkpoints.
`plots diff <REVISIONS> --show-json -o .dvc/tmp/plots`.
`exp show` data (e.g. after a commit is made).
`.dvc/tmp/plots` folder is removed from disk.
Limitations (in order of priority):
`plots diff` returns revision data baked into the template, which we then have to manually split as per 3. This is a hack at best and we will not be able to rely on it when the extension starts accepting more than a handful of predefined templates.
`plots diff`: duplicate revisions not returned dvc#7265
** When an experiment has been queued and is then running under `"executor": "temp"`, the appropriate "live" data is available under `.dvc/tmp/exps/<TEMP_DIR_NAME>/path/to/file` as opposed to `path/to/file`. Until the experiment has completed, `plots diff` will return the data for the parent revision. These two videos demonstrate what is shown in the extension when a repo (without checkpoints) that has "live" plots has an experiment running that was queued:
`cc9db9e` (running) matches `b137fa8`:
Screen.Recording.2022-01-31.at.5.22.41.pm.mov
`cc9db9e` completes and the final data is copied into the workspace:
Screen.Recording.2022-01-31.at.5.23.03.pm.mov
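The path difference described in the last limitation can be sketched as follows. This is only an illustration of the mapping stated above: the function name and its parameters are made up for the example, and finding the actual `<TEMP_DIR_NAME>` for a running experiment is the missing piece this issue is about.

```python
from pathlib import Path
from typing import Optional


def live_data_path(
    repo_root: Path,
    relative_file: str,
    executor: str,
    temp_dir_name: Optional[str] = None,
) -> Path:
    """Return where the "live" data for a plot file can be read from.

    For a queued experiment running under "executor": "temp", the data
    lives under .dvc/tmp/exps/<TEMP_DIR_NAME>/ instead of the workspace.
    `temp_dir_name` must be supplied by the caller (hypothetical input).
    """
    if executor == "temp":
        if not temp_dir_name:
            raise ValueError("temp_dir_name is required for the temp executor")
        return repo_root / ".dvc" / "tmp" / "exps" / temp_dir_name / relative_file
    # Workspace runs write directly to the tracked location.
    return repo_root / relative_file
```

For example, `live_data_path(Path("/repo"), "path/to/file", "temp", "tmpabc")` points inside `.dvc/tmp/exps/tmpabc/`, while any other executor value resolves to `path/to/file` in the workspace.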
Proposed solution:
I quickly talked to @pawel on this and this was the provisional idea that we came up with:
Add an extra flag to `plots show` that provides only the half-baked templates with a path to insert the data into the template (saves us scanning for an anchor).
Have `exp show` return the plots data for each revision that it sends. This would greatly simplify the code on our end but also seems like the logical way to deal with the situation of an experiment running outside of the current workspace.
cc @efiop @dberenbaum
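To illustrate the first proposal, here is a minimal sketch contrasting the two ways of filling a template. It assumes a Vega-Lite style template held as a dict; the anchor string, the `["data", "values"]` path, and both function names are assumptions for the example, not the output of any existing DVC flag.

```python
import copy
from typing import Any, Dict, List, Sequence

# Assumed placeholder string marking where the data should go.
ANCHOR = "<DVC_METRIC_DATA>"


def fill_by_anchor(node: Any, datapoints: List[Dict[str, Any]]) -> Any:
    """Walk the whole template and swap the anchor for the data."""
    if node == ANCHOR:
        return datapoints
    if isinstance(node, dict):
        return {key: fill_by_anchor(value, datapoints) for key, value in node.items()}
    if isinstance(node, list):
        return [fill_by_anchor(value, datapoints) for value in node]
    return node


def fill_by_path(
    template: Dict[str, Any],
    path: Sequence[str],
    datapoints: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Insert the data at a known (non-empty) path, with no scanning."""
    filled = copy.deepcopy(template)  # leave the half-baked template intact
    target = filled
    for key in path[:-1]:
        target = target[key]
    target[path[-1]] = datapoints
    return filled
```

With a path supplied alongside the template, the extension could keep one half-baked template per plot and call `fill_by_path` once per revision, instead of walking every template looking for the anchor.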