Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

start: use get-started-experiments for GS:Experiments tutorial #2497

Closed
wants to merge 8 commits into from

Conversation

iesahin
Copy link
Contributor

@iesahin iesahin commented May 22, 2021

This one is the branched version of #2491

Closes #2479

It doesn't contain any content changes but updates the commands with the new project in github.com/iterative/get-started-experiments .

The tables are currently code blocks but they will be updated with the screenshots instead.

@iesahin iesahin self-assigned this May 22, 2021
@shcheklein shcheklein temporarily deployed to dvc-org-iesahin-revise--9xkhvv May 22, 2021 19:19 Inactive
@shcheklein
Copy link
Member

There are some conflicts, also could we now close #2491 ?

@shcheklein
Copy link
Member

@jorgeorpinel please note, that we probably should be reviewing this - it's already a branched version.

@shcheklein shcheklein temporarily deployed to dvc-org-iesahin-revise--9xkhvv May 26, 2021 14:27 Inactive
@iesahin iesahin force-pushed the iesahin-revise-gs-experiments branch from 47c9cb7 to ed386eb Compare May 26, 2021 14:58
@shcheklein shcheklein temporarily deployed to dvc-org-iesahin-revise--9xkhvv May 26, 2021 14:59 Inactive
@shcheklein shcheklein temporarily deployed to dvc-org-iesahin-revise--9xkhvv May 26, 2021 15:18 Inactive
@shcheklein shcheklein temporarily deployed to dvc-org-iesahin-revise--9xkhvv May 26, 2021 15:35 Inactive
@iesahin iesahin changed the title [WIP] start: use get-started-experiments for GS:Experiments tutorial start: use get-started-experiments for GS:Experiments tutorial May 27, 2021
@iesahin
Copy link
Contributor Author

iesahin commented May 28, 2021

I think it may look better to use a terminal with the identical background-color when taking the screenshots. Is there a predefined color scheme for code blocks?

(And also the fonts.)

Could you point me the exact color/font styles for the code blocks? @julieg18 @rogermparent

Screen Shot 2021-05-28 at 11 15 03

@iesahin
Copy link
Contributor Author

iesahin commented May 28, 2021

I think the font is SF Mono and the color scheme is similar to Solarized Dark. This is what I get using these:

Screen Shot 2021-05-28 at 11 31 06

@iesahin
Copy link
Contributor Author

iesahin commented May 28, 2021

And this is after modifying the background to #40354d and foreground to #ccc

Screen Shot 2021-05-28 at 11 36 52

I'll use this one to redo the experiments and retake the screenshots

@shcheklein shcheklein temporarily deployed to dvc-org-iesahin-revise--9xkhvv May 28, 2021 10:23 Inactive
@iesahin
Copy link
Contributor Author

iesahin commented May 28, 2021

Replaced all screenshots to have a similar background, but font seems to be reverted to older. Need to do this once more I think.

Screen Shot 2021-05-28 at 13 24 48

@iesahin
Copy link
Contributor Author

iesahin commented May 28, 2021

I installed SF Mono Bold as well and it looks iTerm uses the bold font when it finds one.

Bold font and regular font, please let me know if it's better to replace. (It takes some time to do the experiments.)

Screen Shot 2021-05-28 at 13 31 40

Screen Shot 2021-05-28 at 13 30 36

@shcheklein
Copy link
Member

@iesahin Looks nice! from this screenshots I like the one that is not bold. No strong opinion here :)

@@ -192,29 +271,26 @@ Storage, HTTP, HDFS, etc.). The Git remote is often a central Git server

</details>

Experiments that have not been made persistent will not be stored or shared
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^^^ a few lines about - fix code block, lines should start with $

@iesahin
Copy link
Contributor Author

iesahin commented May 31, 2021

@iesahin Looks nice! from this screenshots I like the one that is not bold. No strong opinion here :)

I had to locate away from my Mac on the weekend. I'm taking the screenshots with Gnome Terminal but it doesn't look as good.

exp-ss-63714

This is probably about horizontal spacing config in iTerm. I used 0.95 or 0.90, but Gnome Terminal accepts only >1. Let's keep the current screenshots for the time being and I'd update theem in a later iteration.

@shcheklein shcheklein temporarily deployed to dvc-org-iesahin-revise--9xkhvv May 31, 2021 19:52 Inactive
Copy link
Contributor

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, and most of my comments are minor grammatical suggestions.

I have two broad questions:

  1. Is the new baseline experiment necessary? I think the way you introduce this is better than the current approach, especially because it introduces dvc exp show earlier, but it doesn't seem well integrated with the other sections now (creates lots of duplication). I'd suggest either leaving this out for a separate PR or else making bigger changes to remove content from the existing doc.
  2. Can we reduce the repo complexity? The repo looks great, but if we now have the flexibility of having a repo specifically for this doc, I'd prefer we keep it as simple as possible because it makes the example commands, tables, etc. easier to read, and it's easier for user to look through the contents of the repo.

content/docs/start/experiments.md Outdated Show resolved Hide resolved
content/docs/start/experiments.md Outdated Show resolved Hide resolved
content/docs/start/experiments.md Outdated Show resolved Hide resolved
content/docs/start/experiments.md Outdated Show resolved Hide resolved
content/docs/start/experiments.md Outdated Show resolved Hide resolved
content/docs/start/experiments.md Outdated Show resolved Hide resolved
content/docs/start/experiments.md Outdated Show resolved Hide resolved
content/docs/start/experiments.md Show resolved Hide resolved
content/docs/start/experiments.md Show resolved Hide resolved
content/docs/start/experiments.md Show resolved Hide resolved
@iesahin
Copy link
Contributor Author

iesahin commented Jun 2, 2021

  1. Is the new baseline experiment necessary? I think the way you introduce this is better than the current approach, especially because it introduces dvc exp show earlier, but it doesn't seem well integrated with the other sections now (creates lots of duplication). I'd suggest either leaving this out for a separate PR or else making bigger changes to remove content from the existing doc.

I feel introducing dvc exp run at first with -S is too much. IMHO we need to tell that dvc exp run is a replacement to dvc repro, but doing this with --set-param is a bit distracting. I can remove that section but it looks like we're delving too early into parameters, before telling exp run in general.

You may be right about repeating exp run and exp show, over and over, but we are telling experimentation here, and experimenting usually involves such repeats with some minor changes.

  1. Can we reduce the repo complexity? The repo looks great, but if we now have the flexibility of having a repo specifically for this doc, I'd prefer we keep it as simple as possible because it makes the example commands, tables, etc. easier to read, and it's easier for user to look through the contents of the repo.

At first I wanted to set up many parameters to make a wiggle room for us and for the user. I think it's now safe to remove most of the parameters from dvc.yaml and have a clear view. But I also would like to show --include-params and --include-metrics, as in a realistic project these will most likely be required.

The code still will read the params from params.yaml, but they won't be considered as a dependency. Is this OK?

Copy link
Contributor

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel introducing dvc exp run at first with -S is too much. IMHO we need to tell that dvc exp run is a replacement to dvc repro, but doing this with --set-param is a bit distracting. I can remove that section but it looks like we're delving too early into parameters, before telling exp run in general.

You may be right about repeating exp run and exp show, over and over, but we are telling experimentation here, and experimenting usually involves such repeats with some minor changes.

I put a couple more comments on sections where I think the duplication doesn't make sense.

The code still will read the params from params.yaml, but they won't be considered as a dependency. Is this OK?

You mean you will change dvc.yaml but leave the rest of the repo the same?

content/docs/start/experiments.md Show resolved Hide resolved
content/docs/start/experiments.md Show resolved Hide resolved
@iesahin
Copy link
Contributor Author

iesahin commented Jun 3, 2021

Why remove the reference to metrics-parameters-plots? The rationale for including it is that users may otherwise be unfamiliar with how to specify parameters and metrics, which is a prerequisite for a lot of what is below.

Thank you. I began before that link was there and probably lost it in a rebase. Of course it should be there.

@shcheklein shcheklein temporarily deployed to dvc-org-iesahin-revise--9xkhvv June 3, 2021 14:56 Inactive
content/docs/start/experiments.md Show resolved Hide resolved
content/docs/start/experiments.md Show resolved Hide resolved
content/docs/start/experiments.md Show resolved Hide resolved
Comment on lines +45 to +50
```

Then you can `dvc pull` to get the dataset and run the commands in this
document. For detailed information on parameters and the project structure
please refer to the
[project repository](https://github.com/iterative/get-started-experiments)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
```
Then you can `dvc pull` to get the dataset and run the commands in this
document. For detailed information on parameters and the project structure
please refer to the
[project repository](https://github.com/iterative/get-started-experiments)
$ dvc pull
```
> See `dvc pull` for more details.

Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review based on the intro: I think we need to simplify the explanations, avoid low-level details and too many references to the example repo. It looks a bit like this doc is being rewritten to showcase the example repo but it should be the other way around: docs come first, make up command output if needed, and later update the example repo to match (enough). I don't even think it has to match 100%

Sorry if some of this overlaps with feedback from @dberenbaum. There's lots of unresolved comments, which may mean a) need to mark them resolved or b) the changes are too much. Maybe some are out of scope?

Some specific feedback and suggestions 👇

content/docs/start/experiments.md Show resolved Hide resolved
Comment on lines +59 to +73
In order to run a baseline experiment with the default parameters defined in
`params.yaml`:

```dvc
$ dvc exp run
'data/fashion-mnist/raw.dvc' didn't change, skipping
Stage 'prepare' didn't change, skipping
Stage 'preprocess' didn't change, skipping
Stage 'train' didn't change, skipping
Stage 'evaluate' didn't change, skipping
```

This resembles `dvc repro`. However, when using `dvc repro` we need to update
`params.yaml` manually, run the pipeline, if the results are worth it commit
them to DVC and Git. `dvc exp` automates this process through its subcommands.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing substantial seems to happen here? I'd either skip dvc pull in the preparation <details> and have some result here, or just don't add all this. Probably the latter (in agreement with #2497 (review)), as it seems out of scope for this PR. Also, GS != e2e tutorial. Can't cover everything

Suggested change
In order to run a baseline experiment with the default parameters defined in
`params.yaml`:
```dvc
$ dvc exp run
'data/fashion-mnist/raw.dvc' didn't change, skipping
Stage 'prepare' didn't change, skipping
Stage 'preprocess' didn't change, skipping
Stage 'train' didn't change, skipping
Stage 'evaluate' didn't change, skipping
```
This resembles `dvc repro`. However, when using `dvc repro` we need to update
`params.yaml` manually, run the pipeline, if the results are worth it commit
them to DVC and Git. `dvc exp` automates this process through its subcommands.

That said, some of the last paragraph, summarized as a single sentence, could potentially be preserved in the previous paragraph, or in the one after the code block below. Or maybe as a note (md block quote).

Comment on lines +78 to +79
$ dvc exp show --include-metrics categorical_accuracy \
--include-params model.name,model.cnn.conv_units
Copy link
Contributor

@jorgeorpinel jorgeorpinel Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make the commands and tables easier to read

Yeah. I'd just use plain exp show here and modify the sample output manually as needed. Later update the example repo. Again, docs writing should drive this process I think.

But actually, I don't think exp show would display anything here? No experiments have been run. @iesahin this change in the story also seems out of scope TBH (from line 75 to 86).

content/docs/start/experiments.md Show resolved Hide resolved
Co-authored-by: Dave Berenbaum <[email protected]>
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-iesahin-revise--9xkhvv June 4, 2021 09:21 Inactive
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-iesahin-revise--9xkhvv June 4, 2021 09:26 Inactive
@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Jun 4, 2021

Logistics. Please:

  • Mark resolved comments if possible.
  • git pull (I committed c4790b8).
  • Resolve conflicts with master.

Thanks

@shcheklein
Copy link
Member

shcheklein commented Jun 4, 2021

My take:

  • this is WIP (as designated by @iesahin ) this is not actually, sorry
  • let's not merge and do any significant changes until we have some clarify on the start: independent trails #2496 - it might turn out that it's good to have a section that introduces experiments on top of the existing old school "example-get-started"

@iesahin
Copy link
Contributor Author

iesahin commented Jun 8, 2021

  • it might turn out that it's good to have a section that introduces experiments on top of the existing old school "example-get-started"

I think we will want to show some significant experiment results in an experimentation document. With example-get-started, it might be a fruitless effort but we can keep this as is for the time being.

@shcheklein
Copy link
Member

I think we will want to show some significant experiment results in an experimentation document. With example-get-started, it might be a fruitless effort but we can keep this as is for the time being.

yes, but the first priority should be to show the possibility of this thing. We might btw rewrite the text a bit and when we do bigrams we can them first with dvc exp multiple times - this way we won't need a separate section on top of the example-get-started

@iesahin
Copy link
Contributor Author

iesahin commented Jul 8, 2021

We can close this in favor of #2574 now.

@iesahin iesahin closed this Jul 8, 2021
@iesahin iesahin deleted the iesahin-revise-gs-experiments branch July 14, 2021 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

start: A new GS:Experiments document with get-started-experiments
5 participants