Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

frameworks: Use log_artifact #465

Closed
daavoo opened this issue Feb 15, 2023 · 7 comments · Fixed by #500
Closed

frameworks: Use log_artifact #465

daavoo opened this issue Feb 15, 2023 · 7 comments · Fixed by #500
Assignees
Labels
A: frameworks Area: ML Framework integration A: log_artifact Area: `live.log_artifact` feature request p1-important Include in the next sprint

Comments

@daavoo
Copy link
Contributor

daavoo commented Feb 15, 2023

Once #378 is done, it would be nice to automatically use it under the hood in the frameworks where we currently support saving the model, i.e.:

if self.model_file:
if self.save_weights_only:
self.model.save_weights(self.model_file)
else:
self.model.save(self.model_file)

To:

        if self.model_file:
            if self.save_weights_only:
                self.model.save_weights(self.model_file)
            else:
                self.model.save(self.model_file)
            self.live.log_artifact(self.model_file)
@daavoo daavoo added the A: frameworks Area: ML Framework integration label Feb 15, 2023
@daavoo daavoo changed the title frameworks: Use log_artifact. frameworks: Use log_artifact Feb 15, 2023
@daavoo daavoo added A: log_artifact Area: `live.log_artifact` feature request labels Feb 15, 2023
@dberenbaum
Copy link
Collaborator

Even better would be if DVCLive can inspect if other callbacks are being used to save the model and use log_artifact to track those paths.

@dberenbaum
Copy link
Collaborator

See also #472: if we have a log_model method, obviously let's use that here instead of the generic log_artifact.

@dberenbaum dberenbaum added the p1-important Include in the next sprint label Mar 6, 2023
@dberenbaum dberenbaum added this to DVC Mar 9, 2023
@github-project-automation github-project-automation bot moved this to Backlog in DVC Mar 9, 2023
@dberenbaum dberenbaum moved this from Backlog to Todo in DVC Mar 9, 2023
@dberenbaum
Copy link
Collaborator

@daavoo Can you take this as a follow-up to #464?

@daavoo
Copy link
Contributor Author

daavoo commented Mar 9, 2023

@daavoo Can you take this as a follow-up to #464?

Yep! was already planning on it 🚀
I was also counting on including the type=model meta field as it doesn't really break anything to include it

@dberenbaum
Copy link
Collaborator

dberenbaum commented Mar 9, 2023

I was also counting on including the type=model meta field as it doesn't really break anything to include it

Thanks! First need to settle the discussion on how to handle type=model from iterative/mlem.ai#323 (comment) / https://iterativeai.slack.com/archives/CB41NAL8H/p1678197700155189.

@daavoo daavoo moved this from Todo to In Progress in DVC Mar 10, 2023
@daavoo daavoo linked a pull request Mar 14, 2023 that will close this issue
@github-project-automation github-project-automation bot moved this from In Progress to Done in DVC Mar 20, 2023
@SH2282000
Copy link

SH2282000 commented Aug 4, 2023

If the file is part of the outputs of a stage, then the self.live.log_artifact(self.model_file) will cause the following warning to be thrown:

WARNING:dvclive:Failed to dvc add model.h5: cannot update 'model.h5': not a data source

Therefore, what should be the best practice in terms of staging? Does the model.h5 should be stored as a standalone file, or how the model_file can be used without throwing warnings while still being part of the outs:

stages:
  train:
    [...]
    outs:
      - model.h5
      - [...]

Solutions found ()

In those solution, a model.h5.dvc is added (this is not what I want):

  • Adding the model.h5 via dvc add as a standalone (generating the model.h5.dvc).
  • Not saving the model (will automatically call the dvc add).

In this other solution, the file is being saved only after the training (this is not what I want either):

  • Using keras model.save() function only at the end of the training (losing the ability to save the model for each epoch).

@daavoo
Copy link
Contributor Author

daavoo commented Aug 7, 2023

r how the model_file can be used without throwing warnings while still being part of the outs:

Hi @SH2282000 , the dvc.yaml you shared is correct and the warnings should not appear in the latest version of DVCLive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: frameworks Area: ML Framework integration A: log_artifact Area: `live.log_artifact` feature request p1-important Include in the next sprint
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants