log_artifact #378
Comments
@tapadipti @dmpetrov @aguschin I mentioned this in the conversation on the model registry. I think this workflow should be supported. It's a way for people to dump their models to Studio, or to DVC first and then connect them in Studio, etc. I think this is the basic approach that MLflow, W&B, etc. are taking, and it makes sense to me:
Related: #305 (but I would not make this about MLEM; I think it's an optional advanced feature there for now).
🤔 top-level outputs so that they can be added to the |
So a dependency for this would be that the user has dvc set up, right? Should it also be possible to For registering new versions of an existing artifact, we'll need an option to provide an existing artifact identifier (and possibly a new version number).
It could also be a @shcheklein some thoughts about this simplifying the MR experience:
The users would have to leave the Studio UI and edit their ML training code to start registering models, and they could not register models after they have been created. Currently there's a dependency on GTO; with this there would be a dependency on DVCLive. They would also still need to add the Git repo in Studio (assuming we don't provide a Studio server endpoint to log artifacts). So it probably won't be simple.
Are you suggesting that we provide a Studio server endpoint for all users to log their artifacts to, and that Studio maintain a database of all artifacts instead of users saving them to their Git repos? (This is how the live metrics are saved.)
Yes, it should be an available alternative for this.
@shcheklein @tapadipti I was thinking the starting point would still be to rely on DVC and Git, and to save the artifact locally by starting to track it with DVC. Some follow-up steps could be:
What do you think?
@daavoo I'm fine with starting by saving in .dvc files, especially since the artifacts themselves aren't likely to be inside the dvclive directory, and it should be easier than adding a new dvc.yaml section and a dvclive-based dvc.lock.
Lowering to p2 since I think we still need to better scope what's needed here, and it doesn't break any current workflows.
The |
This will also be needed at some point to simplify the model registry onboarding experience (although we've not scoped it out yet).
Okay, let's add it back if it's needed there, with the limited scope you mention for now. If we do this, I'd also like to address iterative/dvc#8986 to improve the transition to pipelines.
* Early draft of log_artifact
* Implement log_artifact method for live

  fix: #378
* tests: Create `mocked_dvc_repo` and `dvc_repo` fixtures.
* log_artifact: Changes from review.
  - Don't use `commit`
  - Don't raise error
  - Add `test_log_artifact`
* pylint fixes
* mypy fixes
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

---------

Co-authored-by: David de la Iglesia Castro <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Similar to https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.log_artifact, dvclive can have a `log_artifact` method, which can do `dvc add` to track an artifact.
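To make the proposal concrete, here is a minimal sketch of what "a method that does `dvc add`" could look like. It is not the dvclive implementation: the standalone function, its `run` parameter, and the choice to skip a missing path silently are all assumptions made for illustration. The only real interface it relies on is the `dvc add <path>` CLI command.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Optional


def log_artifact(
    path: str,
    run: Callable[..., object] = subprocess.run,  # hypothetical injection point, handy for tests
) -> Optional[List[str]]:
    """Illustrative helper: start tracking `path` with DVC via `dvc add`.

    Returns the command that was executed, or None if the path does not exist.
    """
    artifact = Path(path)
    if not artifact.exists():
        # Assumption: a missing artifact is skipped instead of raising.
        return None

    # `dvc add <path>` writes a `<path>.dvc` file that Git can version.
    cmd = ["dvc", "add", str(artifact)]
    run(cmd, check=True)
    return cmd
```

Inside a repository where both Git and DVC are initialized, `log_artifact("model.pkl")` would run `dvc add model.pkl`; committing the resulting `.dvc` file to Git is left to the user in this sketch.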