Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates the GS:Experiments document with the new GS-Experiments project #2491

Closed

Conversation

iesahin
Copy link
Contributor

@iesahin iesahin commented May 19, 2021

This doesn't contain many content changes except removing the warning that dvc exp push doesn't work without dvc exp apply.

Will be rebasing shortly.

Closes #2479

@iesahin iesahin requested a review from jorgeorpinel May 19, 2021 11:11
@iesahin iesahin self-assigned this May 19, 2021
@shcheklein
Copy link
Member

It has +200 -100 - it means it got bigger - any reason for this?

The intention behind get started to keep it as simple and as short is possible. We should fight for lines here :)


> You can run these commands in the container we built for this tutorial. It has
> all the code and data required to run these examples.
> `docker run -it dvcorg/doc-start:experiments`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we designate pages that have Docker similar to Katacoda (may be make it more visible?) So that we don't repeat the same note over and over again?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to be clear - it can be a note at the end, it would be great to unify though, put docker link into into frontametter and let engine generate it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we designate pages that have Docker similar to Katacoda (may be make it more visible?) So that we don't repeat the same note over and over again?

Yes, I prefer that too, and initially I put docker to the front matter, but I later I changed back to include in the content. I mentioned here to keep the PR isolated to this page. Katacoda links are specified in sidebar.json. We perhaps need an update for the sidebar code to show the links to docker.

I'm removing this completely for now.

```dvc
$ dvc exp show --include-metrics categorical_accuracy,precision,recall \
--include-params model.name,model.cnn.conv_units
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use screenshots here? tables are not colored, the probably do not fit well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer screenshots too because copying these from an SSH terminal was an engineering problem. :) iTerm doesn't copy/paste correctly, script doesn't work either. I used redirection + rsync to get the output.

I agree it looks better with colors.

I'll update https://github.com/iterative/static then, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably it's better to use the repo itself like we do with other images here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some discussions around keeping PR size minimal. I prefer to put them into this repository as well.

│ ├── exp-98a96 │ 0.60405 │ 0.9608 │ 100 │ 64 │
│ └── exp-ad5b1 │ 0.56191 │ 0.93345 │ 50 │ 2 │
└───────────────┴──────────┴─────────┴────────────┴─────────────────┘
$ dvc exp show --no-timestamp \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those look a bit too complicated? can we simplify the project to keep only params that we actively show here? - just an idea

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worked on this to find a good param-metric pair to present a more-or-less successful experiment. But at a certain level, increasing the number of units decreases the classification accuracy. Then I figured out, maybe it's rather acceptable to show an unsuccessful experiment. :)

@shcheklein
Copy link
Member

@iesahin please make it from a branch (not a fork, so that it is deployed into a preview env automatically)

@iesahin
Copy link
Contributor Author

iesahin commented May 20, 2021

@iesahin please make it from a branch (not a fork, so that it is deployed into a preview env automatically)

I wasn't aware that the forks have that limitation. I used the fork instead of putting iesahin/ to the branch. Let me pull and update. Sorry.

@iesahin
Copy link
Contributor Author

iesahin commented May 20, 2021

It has +200 -100 - it means it got bigger - any reason for this?

The intention behind get started to keep it as simple and as short is possible. We should fight for lines here :)

There is a saying here that goes like showing the death and aggreing on malaria, that probably has cognates in every culture. :)

Most of the increase is about the tables and larger set of queued commands. I didn't add any content. I'll do that later.

@shcheklein

Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a few comments on the intro.

@iesahin please use branches directly on this repo in the future (you have write access) so they get deployed automatically and so you have the latest version. You're currently working on a relatively old one (pre-@casperdcl's edits in #2359) so please pull upstream master. (If the conflicts are too much, feel free to restart the PR but in that case please migrate any pending feedback.)

I have to agree that the growth in this doc seems unjustified by the PR's purpose (replacing the example repo) at first sight. Let's try to summarize: Leave out non-essential details and references; truncate repetitive terminal output samples (if possible even filter out exp show table cols to minimize horizontal scrolling).

p.s. I'd request @dberenbaum's review here since he wrote this doc recently. I'm also happy to keep helping but it's probably best to do one reviewer at a time on this one. Thanks!

Comment on lines +22 to +24
[ML pipelines](/doc/start/data-pipelines) and compare the changes. In this
section we'll use a [new Get Started project][dvcgs] to illustrate
experimentation features in DVC 2.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll use a new Get Started project

Is there a better phrase (meaningful to a new reader) ?

The [dvcgs] link seems to be missing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not even sure [fmnist]: ... type links mid-content work on our engine, did you run a local dev env? Please check the contrib guide if you haven't. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference links are standard markdown. I use them to keep the content within 80 columns and not breaking the URL midway. Although MD engines are smart enough to glue a link like

[dvc add](https://dvc.org/doc/command
-reference/add)

some editors may not be. Also opening the browser, finding the exact link, and copy-paste to the document is usually a distraction so I put all links at the end. Sometimes I forget, but they are easier to spot than missing links in inline style.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this links might break or not supported by the link checker. I would double check this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this links might break or not supported by the link checker

I thought I knew how the link checker works but just now I tried to test this in #2534 and that check is skipped. Tried to look into the setup but wasn't able to figure it out quickly. I think @casperdcl wrote that right? Can you advise? Thanks 🙂

Copy link
Contributor

@casperdcl casperdcl Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote the original (bash) link checker but since then it was moved to (a TS/JS action) https://github.com/iterative/link-check so I'm not fully up-to-date on its internals yet.

CC @rogermparent

[can we support this?](https://example.com/split
-line-link)

and [this]?

[this]: https://example.com

Copy link
Contributor

@rogermparent rogermparent Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while since I've been able to work on the link check action, the link parsing is basically a glorified regexer that goes over added/modified lines of each changed file, looking for different patterns depending on the extension. It's very possible I overlooked mid-link newlines in the MD regex parser and it never came up.

It could be beneficial to just strip newlines before searching for links, so all patterns after don't need to be concerned. Theoretically it could accept some links that'd be broken in real MD parsers, but that's better than not recognizing valid ones.

The other styles I'm pretty certain I've never considered and didn't come up, but can very likely be added to the parsing logic for MD files as extra regex checks.

A more complete solution may be to use a full markdown parser like remark, but I'm not sure how well that works on out-of-context diffs and would require quite a bit more work than the other solutions other than more complete parity with MD parsers. Now that I think about it, using remark and grabbing links from the output may be a viable solution with a similar amount of work where the alternative is adding a new regex for every link type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and [this]?

[this]: https:://example.com

By support I suppose we mean that it'll check both this and https:://example.com, right? Or it can be some MD linter that check the first part of it.

Copy link
Contributor

@casperdcl casperdcl Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both this and https://example.com, right?

exactly

can be some MD linter that check the first part

the current ones (prettier/eslint) didn't find this 😢 (I'd started complaining because I'd noticed a broken/missing [reference] in an early version of this PR)

Copy link
Contributor

@jorgeorpinel jorgeorpinel Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so I take it for now the link checker doesn't support reference links (nor URLS in 2 lines but Prettier wouldn't allow that anyway). Should we open an issue about that?

content/docs/start/experiments.md Show resolved Hide resolved
Comment on lines +36 to +40
In order to run a baseline experiment with the default parameters defined in
`params.yaml`:

```dvc
dvc exp run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess an important preparation step is missing since the project in this page switches completely. Maybe a <details> section with git clone ... and a 1-2 sentence explanation of the example project?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repository has these instructions but I wrote the cloning / venv / dvc pull to the document as well.

Copy link
Contributor

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it looks like a lot of additions and changes. Not only do we want to keep the tutorial short, but I also think some changes seem out of scope of this PR. I'd prefer to have this PR be as close as possible to a drop-in replacement of the new project without any editorial changes, and then have a separate PR with those editorial changes.

@iesahin
Copy link
Contributor Author

iesahin commented May 26, 2021

Checked the reviews here and updated the document for #2497. Closing this in favor of #2497.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

start: A new GS:Experiments document with get-started-experiments
6 participants