Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dvc: Enable model loading/saving based on checkpoint: true #191

Closed
daavoo opened this issue Nov 12, 2021 · 6 comments
Closed

dvc: Enable model loading/saving based on checkpoint: true #191

daavoo opened this issue Nov 12, 2021 · 6 comments
Labels
A: dvc DVC integration A: frameworks Area: ML Framework integration discussion requires active participation to reach a conclusion

Comments

@daavoo
Copy link
Contributor

daavoo commented Nov 12, 2021

Given a DVC stage where checkpoints are enabled:

stages:
  train:
    cmd: src/train.py
    deps:
    - data
    - src
    outs:
    - model.pth:
        checkpoint: true
    live:
      dvclive:
        summary: true
        html: true

Users need to also remember enabling checkpoints on the stage code.

Maybe we could treat the model_file argument of the integrations similar to how we treat the path argument:

A) If checkpoint: true is set in DVC and model_file is None in DVCLive: we use the value of the output as model_file .
B) If checkpoint: true is set in DVC and model_file is set in DVCLive but does not match: raise ConfigMismatchError.
C) If checkpoint: false, just use whichever value has model_file in DVCLive.

@daavoo daavoo added A: frameworks Area: ML Framework integration discussion requires active participation to reach a conclusion labels Nov 12, 2021
@daavoo
Copy link
Contributor Author

daavoo commented Nov 12, 2021

@pared @dberenbaum

@dberenbaum
Copy link
Collaborator

A) If checkpoint: true is set in DVC and model_file is None in DVCLive: we use the value of the output as model_file .

What do you mean by "we use the value of the output as model_file"? Does that mean infer model.pth here?

@daavoo
Copy link
Contributor Author

daavoo commented Nov 12, 2021

A) If checkpoint: true is set in DVC and model_file is None in DVCLive: we use the value of the output as model_file .

What do you mean by "we use the value of the output as model_file"? Does that mean infer model.pth here?

Yes. Make DVC store the output path in an env var, like it does for DVCLIVE_PATH

@pared
Copy link
Contributor

pared commented Nov 15, 2021

@daavoo That would work only in integrations, right?

@daavoo
Copy link
Contributor Author

daavoo commented Nov 15, 2021

@daavoo That would work only in integrations, right?

With the current implementation, yes.

@dberenbaum
Copy link
Collaborator

It makes sense to me.

It still seems slightly awkward to have to define the path in dvc.yaml. It would be great if this wasn't needed at all, and the model file was part of the live output by default, but that may be a separate issue.

@daavoo daavoo added the A: dvc DVC integration label Sep 24, 2022
@daavoo daavoo closed this as not planned Won't fix, can't repro, duplicate, stale Apr 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: dvc DVC integration A: frameworks Area: ML Framework integration discussion requires active participation to reach a conclusion
Projects
None yet
Development

No branches or pull requests

3 participants