start: integrate (updated) params/metrics/plots to Experiments Trail #2925
Comments
I've added some discussion points and a tentative plan for updates to the Experiment trail for params/metrics/plots. Could you review? @shcheklein @dberenbaum @jorgeorpinel |
Looks good to me. We can probably skip the
Yup, I think we could probably skip adding the stage or address it briefly within the initial experiment setup.
It's probably too deep currently IMO. One way to simplify would be to start with an image file rather than a data file as the output, but it doesn't show off as much of what DVC can do, so I'm not sure. Probably easiest to start writing and then get feedback. |
Before we jump into coding (writing), could we outline here some specifics:
Also, what about If we are removing metrics and plots from the |
I have added a document for discussion to #2961. I believe it's better to discuss against a document that gives concrete context. @shcheklein |
Agree with decreasing the story continuation from pipelines, but it's OK if the code samples do continue it, I think? You need to be aware of stage definitions (dvc.yaml) for params and metrics/plots usage. Agree to avoid stage definitions in this page (and migrate from
Why though? Could it cover both basic params/metrics as a stand-alone feature and then move onto the more advanced (albeit probably more useful) exp-based use? Agree to use
Probably focus on metrics and just show that you can plot data-series metrics + mention image support.
DVCLive makes sense to use here but probably best to avoid explaining anything about checkpoints here (link or assume if needed). |
@iesahin I see some points that you added to the existing document, but they do not answer the questions I had (at least I can't figure this out; maybe I'm missing something). I would start with some very basics: where do we write this, how is it connected to the other parts, what do we cover. |
I believe that even if we discuss these here, we'll revise them after we have a concrete document. Writing the summary and the storyline is not that different from writing the document itself. #2961 is about iterating on these. The notes in the document are meant to specify these, actually.
We can cover it here.
Params/metrics/plots are more related to experimentation than data management. Moving the topic to exp. trail makes sense to me.
Data visualization is covered in the plots topic to some extent. The current documentation is not that deep and doesn't touch images as plots, but these may be considered advanced. In how much detail should we cover visualization? What's the extent of the topic in your mind? @shcheklein I think visualization is closer to experimentation than model management. In the latter we'll touch these models/plots/metrics as artifacts and won't cover their content. At least this is what I understand from model management. |
I think code samples must continue on top of the experiments, not pipelines, and they may need an overhaul. I don't intend to do unnecessary updates, but if we base p/m/p on top of the experiments, most of the samples may need to be updated. |
The decision in #2496 was to create three trails.
Params/metrics/plots look closer to experiments in this. We can link/refer to pipelines in it, but these are mostly related to experimentation. As we progress towards Experiments are not an "advanced feature", at least we aim to make it as beginner-friendly as possible. The reason behind most "content-pruning" in the exp. trail was this, I believe. |
This is also ~what I have in mind, and possibly @dberenbaum will agree to this. WDYT @shcheklein ? |
This is also a good point to keep in mind. Thank you @jorgeorpinel |
What I have in mind as a structure in Get Started / Experiments Trail is something like this:
Each of these will be a standalone document that can be linked from other trails. So, when we want to discuss parameters in pipelines trail, we'll link to the parameters here. These can be sections on a single page or we can have |
I'll create a sample of the separate structure in #2961. We can discuss after it. |
I've updated #2961 and split the current docs into separate documents. You can see the proposed structure in the deployment: https://dvc-org-iesahin-gs-metr-bvmgk2.herokuapp.com/doc/start/experiments |
My initial take: this is way too many sections. Even if every single one of them is one page long, this structure is probably suboptimal. What can we merge? At some point we had an iteration of the existing Get Started exactly like this, with 7-8 sections (you can even see this by the tag names in the repo). After quite a lot of discussion and feedback we grouped them the way they are now. E.g.
intro: rename to be something similar to the other sections (Running)
plots + visualization: one section, they are related
checkpoints and dvclive: merge into "live metrics" or something? |
We have a user's guide section titled "Running Experiments", so I thought it might be confusing to have an identical GS section. So, your take is merging params and metrics into the current https://dvc.org/doc/start/experiments, and creating a visualization page covering plots, and a dvclive page with checkpoints. I think that's a good idea, but checkpoints already seem like a bit too much. What about
Basically we'll remove |
We will likely have some kind of release or push to specifically address deep learning scenarios in the next few months. That could include a get started page where dvclive and checkpoints are addressed, with more of a focus on a user problem than a set of features. |
@iesahin so, could you summarize the new structure/make a screenshot pls? |
Except that if you want to introduce how to set them up that would be best in the pipelines trail I think. Why not have them in both e.g. definitions in Pipelines (link from exps), meaningful usage in Experiments (link from pipes) ?
Was that also decided in the spike? I had the impression each trail would be a single page, like https://dvc.org/doc/start/experiments now.
Feels like it could be a single (long due to tables and images) GS page. But so are we reworking the whole GS/Experiments trail again right after it was merged? I'm a bit confused, sorry 😅 |
p.s. I do think https://dvc.org/doc/start/experiments is missing that H2: it's the first topic after the intro and video, but not mentioned in the right-hand ToC: |
I have migrated the proposed draft to Notion. We can discuss the content & scope in https://www.notion.so/iterative/Experiments-Trail-Nov-21-0e2a492ba968405dbc0870adcaea3cc0 |
Hi. What is |
It was meant to specify the content being written. I'm deleting that. |
We need to update https://dvc.org/doc/start/metrics-parameters-plots after the Experiments Trail.
Related #2479
Related #2496
Related #2574
Points to discuss
dvc run will be rewritten in dvc stage add and dvc exp run. Or, as the train stage is already there, we can just skip these "stage" discussions and move to params and metrics, plots within experiments.
Tentative Plan
model.dense_units parameter to Experiments Trail in a later section to show how to add params
Decisions