Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

live: initial docs draft #2227

Merged
merged 31 commits into from
Mar 9, 2021
Merged

live: initial docs draft #2227

merged 31 commits into from
Mar 9, 2021

Conversation

pared
Copy link
Contributor

@pared pared commented Feb 22, 2021

You may disregard these recommendations if you used the Edit on GitHub button from dvc.org to improve a doc in place.

❗ Please read the guidelines in the Contributing to the Documentation list if you make any substantial changes to the documentation or JS engine.

🐛 Please make sure to mention Fix #issue (if applicable) in the description of the PR. This causes GitHub to close it automatically when the PR is merged.

Please choose to allow us to edit your branch when creating the PR.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

Matches iterative/dvc/pull/4902

@shcheklein shcheklein had a problem deploying to dvc-org-live-docs-l8ir7gutynb3 February 22, 2021 16:37 Failure
@shcheklein shcheklein temporarily deployed to dvc-org-live-docs-l8ir7gutynb3 February 22, 2021 16:42 Inactive
@pared pared mentioned this pull request Feb 22, 2021
@shcheklein shcheklein temporarily deployed to dvc-org-live-docs-ab0swizph76a February 24, 2021 01:19 Inactive
@shcheklein shcheklein temporarily deployed to dvc-org-live-docs-rtsrgkphj5qf February 24, 2021 01:19 Inactive
@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Mar 1, 2021

should also come up with a good name - "ML Logger - Dvclive"
if the pitch is a lightweight standalone logger with no dependencies, including DVC in the name seems confusing.

Like @dberenbaum mentioned, we did discuss this in some Slack thread. My take was that "LogML" or "TraceML" were the best options.

@dberenbaum
Copy link
Contributor

@pared Love that you added the new section for dvc usage!

I have an example repo in https://github.com/iterative/dvc-checkpoints-mnist that shows how to integrate dvclive with a checkpoints experiment. Maybe it's worth linking that here as an example of how it can be used in place of make_checkpoint with some additional benefits? Also, maybe we should include something similar in the checkpoints docs recommending people use dvclive to capture checkpoints?

Thoughts @pared @pmrowla @jorgeorpinel?

Copy link
Contributor

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few comments, but this is a great start! I'll reiterate that it's better to have some documentation than none, so don't take any of my comments as blockers.

Comment on lines 18 to 19
training summary during training. When used in DVC project user does not have
to call `dvclive.init` method.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When used in DVC project user does not have
to call dvclive.init method.

I'd probably leave this out since it's not that much of an incentive. I would either just leave the first sentence or add in some other benefit, like "When used in a DVC project, it can log metrics and create checkpoints during training that are tracked by DVC."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing, I guess the proper place for that is usage-with-dvc section


The training code (`train.py` file):

```python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should update the example repo I have with this example. Keras really makes for a clean, minimal example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but on the other hand we need to prepare Callback, while in torch everything nicely goes into the training loop

Comment on lines 60 to 61
In order to do that, we will need to provide proper
[`Callback`](https://keras.io/api/callbacks/) for `fit` method:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice, but would it be easier for people getting started to see a manual training loop first rather than a callback? This doesn't need to be a blocker, but maybe something to consider as an enhancement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if we link your torch example, it will be fine.
As i read manual loop tutorial for keras, it seems to me that we will occlude the main point of the tutorial with step by step execution (pushing data through model, calling optimizer, updating gradients). While in this case we have everything hidden under fit call. Callbacks are not that obvious concept, though I think its easier to understand it than reading through whole training loop.

Still, that makes me think that maybe we need not one example, but more? MNIST classifier targeted for DL frameworks users, and something different for, for example, classic ML practitioners? Maybe something with Iris dataset?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we will need multiple examples. I'm still not sure myself how dvclive would work for "classic" ML or whether that's a useful scenario to consider. For now, let's keep what you have, and if you think there's a natural way to link to the dvc-checkpoints-mnist example, we could have that as an additional use case. Otherwise, we can find a way to add that or a similar example in the future.

Let's use the code prepared in previous example and try to make it work with
dvc. Training file `train.py` content:

```python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned that we might want to link to https://github.com/iterative/dvc-checkpoints-mnist, but alternatively you could incorporate checkpoints in this example. That might be too much to dive right into, but I think most use cases for dvclive with dvc will include checkpoints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then maybe we need one more section? We integrated the project with DVC and dvclive, lets make some experiments?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes to introduce checkpoints would be to write the model out in the callback, and to add the model output to dvc with stage add --checkpoint model .... I don't see checkpoints as necessarily being tied to experiments. It adds versioning of the model output that aligns with the metrics dvclive is tracking.

However, as you mentioned below, let's think about how we can introduce multiple examples to cover all of this. I think the example you have is a great starting point.

Now, lets use dvc to run the project:

```dvc
$ dvc run -n train --live training_metrics -d train.py python train.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to use stage add && repro/exp run here? @jorgeorpinel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I will introduce that.

Comment on lines 110 to 125
In `dvc.yaml` there is new stage defined, containing information about the
`dvclive` outputs:

```bash
$ cat dvc.yaml

stages:
train:
cmd: python train.py
deps:
- train.py
live:
training_metrics:
summary: true
html: true
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using stage add above also gives you a reason to move this section up and separate it from the run/repro part and explanation of the dvclive outputs.

content/docs/dvclive/index.md Outdated Show resolved Hide resolved
Copy link
Contributor

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the changes around stage add are implemented, I think this should be ready to merge.

Let's use the code prepared in previous example and try to make it work with
dvc. Training file `train.py` content:

```python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes to introduce checkpoints would be to write the model out in the callback, and to add the model output to dvc with stage add --checkpoint model .... I don't see checkpoints as necessarily being tied to experiments. It adds versioning of the model output that aligns with the metrics dvclive is tracking.

However, as you mentioned below, let's think about how we can introduce multiple examples to cover all of this. I think the example you have is a great starting point.

Comment on lines 60 to 61
In order to do that, we will need to provide proper
[`Callback`](https://keras.io/api/callbacks/) for `fit` method:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we will need multiple examples. I'm still not sure myself how dvclive would work for "classic" ML or whether that's a useful scenario to consider. For now, let's keep what you have, and if you think there's a natural way to link to the dvc-checkpoints-mnist example, we could have that as an additional use case. Otherwise, we can find a way to add that or a similar example in the future.

@dberenbaum dberenbaum requested a review from jorgeorpinel March 3, 2021 16:20
Comment on lines 3 to 5
[dvclive](https://cml.dev) is an open-source python library for monitoring
[dvclive](https://cml.dev) is an open-source python library for monitoring the
progress of metrics during training of machine learning projects.

This comment was marked as resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

during training of machine learning models

Is this too specific? What's the broader use case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so? Are you asking whether there are non-ML use cases (not likely), or is there something else you think might be too specific?

Copy link
Contributor

@jorgeorpinel jorgeorpinel Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you asking whether there are non-ML use cases (not likely)

Non model-training use cases. E.g. feature extraction or any other stage

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I think it's safe to say that dvclive is a narrower product built specifically for model training.

Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the index @pared ☝️. That plus changing sidebar.json + Dave's review should be good enough as a first iteration to merge I think.

@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-live-docs-b9xyay1xiojb March 9, 2021 08:39 Inactive
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-live-docs-b9xyay1xiojb March 9, 2021 08:42 Inactive
Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dvclive + DVC copy edits (will commit):

content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
content/docs/dvclive/dvclive-with-dvc.md Outdated Show resolved Hide resolved
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-live-docs-b9xyay1xiojb March 9, 2021 09:13 Inactive
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-live-docs-b9xyay1xiojb March 9, 2021 09:14 Inactive
@jorgeorpinel
Copy link
Contributor

I have a few minor questions/notes starting at #2227 (review) (cc @pared please scroll through all the resolved comments) but otherwise LGTM — feel free to merge.

Comment on lines +93 to +98
The value passed to `--live` (`training_metrics`) became the directory `path`
for Dvclive to write logs in. Other supported command options for DVC
integration:

- `--live-no-summary` - passes `summary=False` to Dvclive.
- `--live-no-html` - passes `html=False` to Dvclive.
Copy link
Contributor

@jorgeorpinel jorgeorpinel Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 4042082.

@shcheklein shcheklein temporarily deployed to dvc-org-live-docs-b9xyay1xiojb March 9, 2021 10:09 Inactive
@pared pared mentioned this pull request Mar 9, 2021
2 tasks
@shcheklein shcheklein temporarily deployed to dvc-org-live-docs-b9xyay1xiojb March 9, 2021 10:30 Inactive
@pared
Copy link
Contributor Author

pared commented Mar 9, 2021

@jorgeorpinel
I made note out of Further integrations as per your comment

Also I did a issue to fix up missing entries for stages and stage related commands

I am not authorized to merge

@jorgeorpinel jorgeorpinel merged commit 5e456a2 into master Mar 9, 2021
@jorgeorpinel jorgeorpinel deleted the live_docs branch March 10, 2021 19:15
jorgeorpinel added a commit that referenced this pull request Mar 28, 2021
jorgeorpinel added a commit that referenced this pull request Mar 28, 2021
jorgeorpinel added a commit that referenced this pull request Mar 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants