Add page about Studio REST API #4681
Looks good @aguschin! Do we have a story to add it to the model registry docs and anywhere else it's needed for higher-level stories? Where do we plan to explain why it's needed (esp. compared to `dvc get`) and how to use it (model deployment workflow)?
content/docs/studio/rest-api.md
Outdated
When your model is annotated in a non-root `dvc.yaml` file (typical for a
monorepo), the model name is constructed from two parts separated by a colon:
`path/to/dvc/yaml:model_name`. For example, take a look at this
[model from example-get-started-experiments repo](https://studio.iterative.ai/user/aguschin/models/VtQdva13kMSPsN_N8004aQ==/pool-segmentation/v1.0.1).
The full name that you need to use in the API is `results/train:pool-segmentation`.
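A minimal sketch of the naming rule described above, for illustration only. The helper names `full_model_name` and `split_model_name` are hypothetical and are not part of DVC or Studio, and treating a root-level `dvc.yaml` as yielding the bare model name is an assumption.

```python
def full_model_name(model_name: str, dvc_yaml_dir: str = "") -> str:
    """Build the name to pass to the API for a model annotated in dvc.yaml.

    dvc_yaml_dir is the directory containing dvc.yaml, relative to the
    repository root (hypothetical parameter, for illustration only).
    """
    directory = dvc_yaml_dir.strip("/")
    if directory in ("", "."):
        # Assumption: a model annotated in the root dvc.yaml uses its bare name.
        return model_name
    # Non-root dvc.yaml: two parts separated by a colon.
    return f"{directory}:{model_name}"


def split_model_name(full_name: str) -> tuple[str, str]:
    """Split a full name back into (directory, model_name)."""
    directory, _, name = full_name.rpartition(":")
    return directory, name


print(full_model_name("pool-segmentation", "results/train"))
```

With the example from the text, the directory `results/train` and the model `pool-segmentation` combine into `results/train:pool-segmentation`.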
Added an example for monorepo. @dberenbaum wdyt?
Is it required to include `results/train` in this case?
Edit: okay, I think I know that the answer is yes, since artifact names are specific to the `dvc.yaml` file, right? If there are no conflicts, can we omit `results/train`?
Yes. I think we can omit that. I believe we need to implement it on the backend side.
Reopening per #4809 (comment)
How hard would it be to just return the HTTP URLs in this case to have a simple working example, @amritghimire?
Sorry, I didn't understand. Do we want to add support for http/https URLs?
Yes, if it's easy enough. How much effort do you estimate it would take?
It should probably take about 2 days of effort.
@dberenbaum Can you create an issue for this?
Waiting for https://github.com/iterative/studio/issues/7127 and https://github.com/iterative/studio/issues/7383 before merging
Per #4809 (comment), I added references and reworded the `dvc artifacts` CLI/API refs. @shcheklein PTAL 🙏.
Studio [access token] and a Studio project configured with your [remote storage
credentials]. It does not require the client to have those credentials. If you
do not have a valid Studio token, or the artifact is not tracked in the model
registry, DVC will fall back to downloading the artifact from the project's
minor: either way, it is downloaded from the remote. I'm not sure it will be clear to users what is going on here.
How about "DVC will fall back to using the project's default DVC remote config to access remote storage"?
Okay, maybe I'm missing something then.
Does it go into `dvc get` mode here? Cloning the repo, etc.? Does it then require access to GH, etc.?
Even to take a step back: can you run `dvc artifacts get` outside a repository, when the only thing that is set is an env var with a Studio token?
Should it be like this?
`dvc artifacts get`:
- no need for a repo
- no GH access needed
- no AWS/GCP/Azure access needed
- no need to know the path, Git hashes, or other low-level details
- unfortunately, DVC still needs to be installed (heavy, etc.)
- can be run as simply as: `dvc artifacts get model_name`
If there is no Studio token, or the artifact is not tracked in the model registry (what happens if there are no credentials in Studio, btw?): what exactly does it fall back to? Could you clarify, please?
- extracts the path to an artifact from the current repo / current revision? (which means it needs a repo, for example)
- fails if it runs outside the repo
- etc ...
> does it go into `dvc get` mode here? cloning the repo, etc ... it requires access to GH then, etc?
Yes.
> even to take a step back ... can you run `dvc artifacts get` outside a repository? when the only thing that is set is an env var with a studio token?
Yes.
There is no need to be in a repo (same is true for `dvc get`) regardless of the "mode" `dvc artifacts get` uses.
Another alternative to explain here is to link to `dvc get` and say something like "DVC will fall back to its typical method to get files (see `dvc get`)."
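To make the two "modes" discussed above concrete, here is a rough sketch of the decision as described in the draft text and confirmed in this thread. It is not DVC's actual implementation; `DownloadMode` and `choose_download_mode` are hypothetical names used only for illustration.

```python
from enum import Enum


class DownloadMode(Enum):
    # Studio resolves the artifact; the client needs no storage credentials.
    STUDIO = "studio"
    # Fallback: behaves like `dvc get` (repo access and remote config needed).
    DVC_GET = "dvc-get"


def choose_download_mode(has_studio_token: bool, tracked_in_registry: bool) -> DownloadMode:
    """Pick the mode based on the two conditions named in the draft docs."""
    if has_studio_token and tracked_in_registry:
        return DownloadMode.STUDIO
    # No valid token, or the artifact is not tracked in the model registry.
    return DownloadMode.DVC_GET


print(choose_download_mode(has_studio_token=False, tracked_in_registry=True))
```

The point of the sketch is that either missing condition alone is enough to trigger the fallback, which is where the access requirements asked about above come back into play.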
If there is no token set, how does it resolve model name into a repo + a path? Or does it expect those to be provided?
Should we open a follow-up issue in studio to discuss adding it to the rest api?
why rest API though? would we need it first to be implemented there? ... yes, overall, let's do a follow up on this
I think I'm misunderstanding then. Do you mean if I'm already inside a repo, we should default to that repo?
No, I mean that regardless of whether I'm inside or outside a repo, I assume that in a lot of cases the model name is unique across many repos, so there is no need to clarify further if Studio has a single model with that name. In some cases, specifying its subdir should be enough (the repo is not needed). In some cases, the repo is indeed needed.
> Do you mean if I'm already inside a repo, we should default to that repo?
It can be, but for simplicity I would make it independent of anything local.
TODO: