Sharing experiments #3077

dberenbaum · 2023-01-09T21:19:46Z

Related to #2855, the extension can make it easier to share experiments. Let's discuss what's needed here.

My initial thoughts on what's needed:

1. Show a comparison of all or a subset of experiments like what you see in the table and plots views, except that it's not stuck on your local machine.
2. Merge or otherwise move forward with an experiment that you think is a keeper.

For 1, I think it makes sense to use Studio since it already has all this functionality. The extension can upload the params, metrics, and plots to Studio like dvclive is doing for live metrics (except for the "live" part). After selecting any number of experiments, there could be an option to post to Studio. The only user friction should be having a Studio token.

For 2, I think there are a lot of ways to do it in DVC already, so it's probably not as critical, but maybe VS Code can make it smoother. With one click, the extension could create a branch with the same name as the experiment, push that to GitHub, and show the URL to create a PR (like the git cli message Create a pull request on GitHub by visiting...). Regardless of the decided UX, it might be better to choose one and not overwhelm the user with options/choices here.
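A rough sketch of the one-click flow described for point 2, assuming a GitHub remote. `dvc exp branch` and `git push` are existing commands; the helper names `branch_and_push_commands` and `github_pr_url` are made up for illustration, and the URL pattern is the one GitHub prints after pushing a new branch.

```python
import re


def branch_and_push_commands(exp_name, remote="origin"):
    """Commands that turn an experiment into a branch of the same name and push it."""
    return [
        ["dvc", "exp", "branch", exp_name, exp_name],
        ["git", "push", "--set-upstream", remote, exp_name],
    ]


def github_pr_url(remote_url, branch):
    """Build the "create a pull request" URL for a GitHub remote.

    Handles the two common remote forms:
    git@github.com:owner/repo.git and https://github.com/owner/repo(.git)
    """
    match = re.match(
        r"^(?:git@github\.com:|https://github\.com/)([^/]+)/(.+?)(?:\.git)?/?$",
        remote_url.strip(),
    )
    if match is None:
        raise ValueError(f"not a recognised GitHub remote: {remote_url!r}")
    owner, repo = match.groups()
    return f"https://github.com/{owner}/{repo}/pull/new/{branch}"
```

The extension would run the two commands in order and then surface the URL, so the user only has to pick the experiment.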

Related Share experiment fails if no remote specified in repo #2700
Related Persisting experiments from table: share as a branch and commit and share #2855
Use a single done event


daavoo · 2023-01-14T08:04:30Z

With the current endpoint for live metrics, an existing experiment could be shared with 3 REST API calls:

start:

json={
    "type": "start",
    "repo_url": "STUDIO_REPO_URL",
    "baseline_sha": "BASELINE_SHA",
    "name": "EXP_NAME",
    "client": "vscode",  # I think `client` is just ignored by studio
},
headers={
    "Authorization": "token STUDIO_TOKEN",
    "Content-type": "application/json",
}

data:

Include here metrics, params, and plots (only linear plots are accepted by the API).

The API was designed for sending incremental updates of the plots on each step, but it would still work if the full data is sent and step is set to the latest:

json={
    "type": "data",
    "repo_url": "STUDIO_REPO_URL",
    "baseline_sha": "BASELINE_SHA",
    "name": "EXP_NAME",
    "step": 2,  
    "metrics": {"metrics.json": {"data": {"step": 2, "foo": 3}}},
    "params": {"params.yaml": {"fooparam": 1}},
    "plots": {"plots/foo.tsv": {"data": 
        [{"step": 0, "foo": 1.0}, {"step": 1, "foo": 2.0}, {"step": 2, "foo": 3.0}]}
    },
    "client": "vscode",
},
headers={
    "Authorization": "token STUDIO_TOKEN",
    "Content-type": "application/json",
},

done:

json={
    "type": "done",
    "repo_url": "STUDIO_REPO_URL",
    "baseline_sha": "BASELINE_SHA",
    "name": "EXP_NAME",
    "client": "vscode",
},
headers={
    "Authorization": "token STUDIO_TOKEN",
    "Content-type": "application/json",
}
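Put together, the three calls could look roughly like the sketch below. The payload fields and headers are the ones shown above; the endpoint URL is an assumption (it is not stated in this thread), and `build_events` / `post_event` are illustrative names rather than part of any existing client.

```python
import json
import urllib.request

# Assumption: the live metrics endpoint; check dvc-studio-client for the real value.
STUDIO_LIVE_URL = "https://studio.iterative.ai/api/live"


def build_events(repo_url, baseline_sha, name, step, metrics, params, plots, client="vscode"):
    """Return the start, data and done payloads for one finished experiment."""
    base = {
        "repo_url": repo_url,
        "baseline_sha": baseline_sha,
        "name": name,
        "client": client,
    }
    return [
        {**base, "type": "start"},
        # Full data in a single event, with step set to the latest step.
        {**base, "type": "data", "step": step, "metrics": metrics, "params": params, "plots": plots},
        {**base, "type": "done"},
    ]


def post_event(event, token, url=STUDIO_LIVE_URL):
    """POST one event as JSON; raises urllib.error.HTTPError on a 4xx/5xx response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Authorization": f"token {token}", "Content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Sharing an experiment is then `for event in build_events(...): post_event(event, token)`, stopping at the first failure.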

daavoo · 2023-01-14T08:05:29Z

Schema is defined in https://github.com/iterative/dvc-studio-client/blob/main/src/dvc_studio_client/schema.py

mattseddon · 2023-02-02T21:45:59Z

@daavoo how/where does a user get the STUDIO_TOKEN?

daavoo · 2023-02-03T09:22:09Z

> @daavoo how/where does a user get the STUDIO_TOKEN?

From their profile in Studio UI: https://dvc.org/doc/studio/user-guide/projects-and-experiments/live-metrics-and-plots#set-up-an-access-token

daavoo · 2023-02-03T09:25:17Z

@mattseddon To clarify, STUDIO_REPO_URL is not the URL that you see in Studio UI and the format described in the current docs is outdated per https://github.com/iterative/studio/issues/4801

In the Python client, we try to set STUDIO_REPO_URL automatically from: git ls-remote --get-url
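For reference, a minimal sketch of that lookup, assuming `git` is on the PATH. The function name and the injectable `run` parameter are illustrative only and are not taken from dvc-studio-client.

```python
import subprocess


def guess_studio_repo_url(cwd=None, run=subprocess.check_output):
    """Return what `git ls-remote --get-url` reports, or None if git cannot be run."""
    try:
        output = run(["git", "ls-remote", "--get-url"], cwd=cwd, text=True)
    except (OSError, subprocess.CalledProcessError):
        # git is missing, or cwd is not inside a Git repository.
        return None
    url = output.strip()
    return url or None
```

The extension would need an equivalent of this before it can fill in `repo_url` in the payloads.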

mattseddon · 2023-02-06T01:49:35Z

Sharing experiments from the extension to Studio

I can see from the docs that all that is needed to start live metrics to Studio is for the user to invoke exp run like this:

STUDIO_TOKEN=**** dvc exp run

@daavoo @dberenbaum what are the current plans for dvc-studio-client + DVC? I have some ideas/questions.

Authentication:

Is there any plan to have the DVC config support the STUDIO_TOKEN environment variable? This way users can simply save their token as an entry in a Git ignored .dvc/config.local and they won't have to bother with it again.

If the use of a token is supported in this way we could then add a CLI command which either:

- prompts for the user's username/password for Studio and then fetches a token and saves it into their local config (or creates a new one if it doesn't exist).
- or does the same thing but authenticates them through their browser.

Sharing experiments

Is there any plan to add functionality into exp push which will also push a completed experiment to Studio? Again if the DVC config supports a Studio token entry maybe this can be done by default and/or flag(s) can be added to make it happen.

The extension would be able to leverage the above functionality to effectively auth with Studio and push experiments without doing any chaining of commands/running custom code.

WDYT?

Note: If DVC starts supporting a STUDIO_TOKEN config value we would need to add some flag(s) to exp run so that not all jobs are sent to Studio by default.

The obvious alternative to the above is for me to recreate the parts of dvc-studio-client mentioned by @daavoo here. Ideally, I don't think we should be supporting multi-language implementations of the same code. I would still have to build the auth flow and I think it should be replaced pretty quickly. IMO this feels like it would be a wasted effort. It would probably be better for someone to point me in the right direction(s) in the DVC codebase so that I can contribute there.

@dberenbaum it could be a good idea for us to have a call to discuss this before the next cross-team meeting, WDYT? I can be flexible to fit in with your TZ.

daavoo · 2023-02-06T14:11:39Z

> Is there any plan to have the DVC config support the STUDIO_TOKEN environment variable? This way users can simply save their token as an entry in a Git ignored .dvc/config.local and they won't have to bother with it again.

I don't have a strong opinion, but my feeling is that there are already a lot of existing tools/ways to handle environment variables, and users might already have a preferred one for frequently used variables.

mattseddon · 2023-02-06T21:40:26Z

Ok, to get started I will build the capability within the extension and use a new VS Code config entry (dvc.studioToken) to store the required token. I'll post regular updates here to let everyone know where I'm up to. If anyone feels this is the wrong way to go then please LMK.

dberenbaum · 2023-02-06T22:17:08Z

I need to follow up here with my thoughts/plans so far. I'll try to write something thorough by tomorrow.

mattseddon · 2023-02-07T01:39:40Z

I've thrown together a quick prototype for a very interim auth solution at #3235.

dberenbaum · 2023-02-07T17:41:46Z

@mattseddon That looks really good as a starting point, although I think we do want to save the token in DVC as you suggested. I put a full proposal into https://github.com/iterative/studio/issues/5050. I'd suggest we discuss general product-facing questions there but maybe keep this or another issue open to discuss details that are only interesting to VS Code. WDYT?

mattseddon · 2023-02-13T01:14:29Z

Demo of basic auth flow (it is rough):

Screen.Recording.2023-02-13.at.11.54.51.am.mov

I think this will be (more or less) good enough for a one-time action once I've ironed it out but we can iterate over time.

As discussed previously the token will move back into DVC somewhere. It would be good to expose an endpoint in Studio that validates the token without having to send any data other than the token itself and a command in DVC that checks whether or not Studio is correctly "connected". This would mean the extension would know exactly when and when not to show any details regarding "Connect to Studio". We could also avoid issues created by users getting "stuck" not having a valid token and not being able to update it.

dberenbaum · 2023-02-14T16:58:54Z

@shcheklein Could Studio have a redirect so that one link would take you to either the token (if you are logged in) or the sign in page (if not)?

@mattseddon Can the connect screen provide a place to enter the token instead of having to take you back to the settings? Otherwise, LGTM as a first step.

mattseddon · 2023-02-14T19:26:39Z

updated demo:

Screen.Recording.2023-02-14.at.1.28.14.pm.mov

dberenbaum · 2023-02-14T20:37:02Z

Sorry @mattseddon, I missed the first time that you enter the token in the command palette. What's the difference in the updated demo? Regardless, I think it looks like a good enough start for now and we can refine later.

mattseddon · 2023-02-14T21:23:00Z

> Sorry @mattseddon, I missed the first time that you enter the token in the command palette. What's the difference in the updated demo?

We are now saving the token in VS Code's SecretStorage and the add/remove commands are exposed outside of the "welcome screen".

> Regardless, I think it looks like a good enough start for now and we can refine later.

I am now going to knock out "Share to Studio" as quickly as possible.

mattseddon · 2023-02-15T01:15:29Z

With the token in place sharing live metrics from the extension to Studio is seamless:

Screen.Recording.2023-02-15.at.11.58.58.am.mov

Do we want to add this as an option when the user has a token? "Run and Share", something like that? TBH I am not sure what value this adds to the local experience outside of allowing users to "work in the open". If all team members sent all experiments to Studio then everyone in that team would know exactly what experiments are being run and by who. Seems outside of the normal data science workflow but towards a best practice and better collaboration.

For the first iteration of this process, I am going to recreate parts of dvc-studio-client inside the extension. I do think that we should provide the option in exp push to push directly to Studio. Is this something that we are interested in? Giving users the ability to retroactively share experiment results from the CLI? If it is, then maybe diverting my effort to contributing that functionality inside DVC would be the best use of my time. WDYT?

mattseddon · 2023-02-15T02:11:05Z

Also found/ran into https://github.com/iterative/studio/issues/5009.

I think I could easily get bogged down here. For the time being/the first prototype, I will not send plot information.

Note: Sharing plot data outside of the happy path is definitely more tricky. E.g., if a user changes a template/plot type locally for an experiment and then shares it with Studio, what happens? Could we limit the types of plots sent to Studio to a few different basic plot types, do we have to send the contents of the dvc.yaml/templates to Studio with each experiment... 😢?

shcheklein · 2023-02-25T21:23:54Z

I think we need a clear way to enable / disable sharing the experiments as people run them (live sharing). As we discussed:

- It can be a toggle in the side panel
- It can be a toggle in the settings panel
- It can be a toggle in the experiments table itself / or under the table

But it should be visible, clear. I don't think that action in the command palette is enough for this.

When we first collect the token we should probably show this toggle (and enable by default?), we should also introduce a section on the Settings page that we already have with the token and with this toggle.

In the DVCLive snippet we should show a way to enable sharing via code.

dberenbaum · 2023-02-27T18:34:39Z

@shcheklein What's the user scenario you have in mind? I can imagine it could be useful if I have a long-running experiment and I or others need to check on it after I have closed my laptop, but I think that would be more of a niche scenario compared to something like training in CI where I have no other way to check on it easily. I want to make sure I understand what the goal is and whether it's driven by a particular user scenario or by a desire to show the feature.

dberenbaum · 2023-02-27T18:37:55Z

Despite what I wrote above, I agree it makes sense as a toggle more than an action, since it does not need to be specific to each experiment. It's probably more of a general workflow preference.

shcheklein · 2023-02-27T20:53:45Z

Yes, this is primarily to expose the feature. But it is also practical: I might run an experiment on a remote machine via SSH, or in Codespaces, and still want to share it so that other people can track the progress. Or, let's say, to compare it with something else that I have only in Studio, etc.

Since it's low-hanging fruit, I don't see any major concerns with enabling this, and we can get more insights from usage afterwards.

mattseddon · 2023-03-03T05:19:43Z

There are a couple of updates at #3387 & #3379.

Next steps (next week):

1. Once Share New Experiments Live is enabled start the queue with the required environment variables to share live results from queued experiments directly to Studio (need both STUDIO_TOKEN and STUDIO_REPO_URL).
2. Split into two options (Share New Workspace Experiments Live & Share New Queued Experiments Live). This is more for visibility than anything else.
3. Expose Open Studio Settings in the command palette.
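As an illustration of the first step, here is a minimal sketch of building the environment for the queue worker process. `studio_env` is a hypothetical helper, and removing inherited Studio variables when sharing is off is a design choice of this sketch, not something decided in this thread.

```python
import os


def studio_env(share_live, token, repo_url, base_env=None):
    """Return a copy of the environment with Studio variables set or removed."""
    env = dict(os.environ if base_env is None else base_env)
    if share_live and token and repo_url:
        # Both variables are needed for queued experiments to reach Studio.
        env["STUDIO_TOKEN"] = token
        env["STUDIO_REPO_URL"] = repo_url
    else:
        # Sharing is off or incomplete: make sure nothing is sent by accident.
        env.pop("STUDIO_TOKEN", None)
        env.pop("STUDIO_REPO_URL", None)
    return env
```

The result can be passed as `env=` when spawning `dvc queue start` or `dvc exp run`, for example `subprocess.run(["dvc", "queue", "start"], env=studio_env(True, token, repo_url))`.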

dberenbaum · 2023-03-03T14:32:19Z

> 2. Split into two options (Share New Workspace Experiments Live & Share New Queued Experiments Live). This is more for visibility than anything else.

Sorry, I'm not following what you mean by "visibility" here or what this part is for. Otherwise, all makes sense to me, thanks!

mattseddon · 2023-03-03T19:55:53Z

> Sorry, I'm not following what you mean by "visibility" here or what this part is for. Otherwise, all makes sense to me, thanks!

The current dvc.studio.shareExperimentsLive option will become dvc.studio.shareWorkspaceExperimentsLive & dvc.studio.shareQueuedExperimentsLive and there will be two checkboxes on the settings page instead of one. Users will be able to send none, one or both types. Does that make sense?

dberenbaum · 2023-03-03T20:04:37Z

I guess I was wondering more why we want to have two separate checkboxes?

mattseddon · 2023-03-03T21:59:34Z

If you don't think it is necessary to give that level of control and/or that it won't provide value then I won't do the work 🙏🏻.

dberenbaum · 2023-03-03T22:11:51Z

Up to @shcheklein. I just didn't see the motivation to have that granularity of control over live sharing.

shcheklein · 2023-03-03T22:19:10Z

Yep, I also don't see the need for this for now. We can keep it simpler.

omesser · 2023-03-06T13:24:05Z

I agree that it's best to make this a simple user-facing feature of "live sharing experiments" (for everything). Users will probably have the control they need by toggling this on and off while running queued/workspace experiments. If this is used and more granularity is requested, we can always "complicate" this in the future 😄

daavoo · 2023-03-07T18:45:51Z

For the record @mattseddon, with the latest Studio release you should now be able to send only the done event.

mattseddon · 2023-03-08T04:12:23Z

#3422 will close this as all of the discussion/scoping is on the Studio side right now.

shcheklein added the discussion, 📦 product, and A: experiments labels Jan 10, 2023

dberenbaum mentioned this issue Jan 10, 2023

exp push: minimize errors from pushing to dvc remote iterative/dvc#8678

Closed

3 tasks

mattseddon self-assigned this Jan 31, 2023

shcheklein added the story label Jan 31, 2023

shcheklein mentioned this issue Jan 31, 2023

Improve create branch experience #2701

Closed

2 tasks

mattseddon mentioned this issue Feb 7, 2023

Enable saving of Studio access token #3235

Merged

dberenbaum mentioned this issue Feb 23, 2023

exp push: studio integration iterative/dvc#9074

Closed

mattseddon mentioned this issue Mar 3, 2023

Add live share experiment to Studio #3387

Merged

This was referenced Mar 6, 2023

Have queue workers respect dvc.studio.shareExperimentsLive #3398

Merged

Expose Open Studio Settings in the command palette #3399

Merged

Switch add Studio access token to update when Studio is connected #3400

Merged

mattseddon mentioned this issue Mar 8, 2023

Send single event to share experiment to Studio #3422

Merged

mattseddon closed this as completed in #3422 Mar 8, 2023
