
Feature request: Provide a metadata endpoint with custom metadata #319

Closed
epa095 opened this issue Nov 27, 2018 · 9 comments
Labels: area/language wrapper, enhancement, External API, lifecycle/stale, Microservice API, Python

Comments

@epa095 commented Nov 27, 2018

It would be useful for us (and probably others) to be able to provide a dictionary of metadata for a model, reachable through a separate endpoint from predict, e.g. /metadata (a sketch of a possible response follows the list below). Typical metadata would be the feature names of the model, their units of measure, model parameters, creation time, etc. Today it is possible to return tags as part of the return value from predict, but this means that one

  • has to be careful not to expose too much information, since it is returned with every response
  • cannot discover the metadata without knowing at least the signature of the model (the number of parameters) beforehand, since a successful call to predict is needed first.
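
To make the request concrete, here is a rough sketch of what a /metadata response could contain; every field name is illustrative and not part of any existing Seldon API.

```python
# Hypothetical shape of a GET /metadata response; the keys below are made up
# for illustration and do not correspond to an existing Seldon schema.
example_metadata = {
    "name": "my-model",
    "created": "2018-11-27T10:00:00Z",
    "model_parameters": {"n_estimators": 100, "max_depth": 5},
    "features": [
        {"name": "temperature", "unit": "celsius"},
        {"name": "pressure", "unit": "bar"},
    ],
}
```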
@ukclivecox (Contributor)

This makes sense. The current Python wrappers already let you run a separate custom Flask app, for which you have full control over the endpoints.

However, something built into the default endpoints is probably worthwhile. We are rearranging the Python wrappers to make them pip- and conda-installable, but a PR for this would be welcome after that.
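
For reference, the "separate custom Flask app" workaround mentioned above could look roughly like the sketch below; the port, endpoint name, and payload are illustrative choices, not a Seldon convention.

```python
# Minimal sketch of a standalone Flask app served alongside the model,
# exposing model metadata on its own port. All values are illustrative.
from flask import Flask, jsonify

app = Flask(__name__)

MODEL_METADATA = {
    "features": ["sepal_length", "sepal_width", "petal_length", "petal_width"],
    "created": "2018-11-27",
}

@app.route("/metadata", methods=["GET"])
def metadata():
    return jsonify(MODEL_METADATA)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)
```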

@epa095 (Author) commented Nov 27, 2018

The problem is actually not getting the data out of the Python wrapper. In addition to the custom Flask app solution you mention, one can hijack model_microservice.get_rest_microservice and add a new endpoint to the same Flask app the rest of the program uses (and hence the same port), roughly as sketched below. The problem is that the Java routers don't seem to forward requests to this endpoint.
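
The hijack looks roughly like the sketch below. The module layout and the exact signature of get_rest_microservice differ between seldon-core versions, and the model class and metadata attribute are made up, so treat this as illustrative only.

```python
# Sketch of the workaround described above: reuse the Flask app built by the
# wrapper and register an extra /metadata route on it before serving.
from flask import jsonify
from model_microservice import get_rest_microservice  # name as referenced above; location varies by version

from MyModel import MyModel  # hypothetical user model class with a predict() method

user_model = MyModel()
app = get_rest_microservice(user_model)

@app.route("/metadata", methods=["GET"])
def metadata():
    # Expose whatever metadata the model object carries; the attribute name is made up.
    return jsonify(getattr(user_model, "metadata", {}))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

As noted above, this only helps inside the wrapper itself; the surrounding routing layer still has to know about the extra endpoint for it to be reachable from outside.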

@ukclivecox (Contributor)

Ah, ok. Yes, for /metadata to be a standard part of the API we would need a bigger change: adding it to the protobuf/OpenAPI specs and implementing it across the wrappers, the API gateway, and the Service Orchestrator in the same way as predict and feedback.

@janvdvegt

I would also be very interested in this. There are many other things we would want to ask our models beyond predictions: metadata about the model, for example, but also things like different output formats for predictions, or the uncertainty of a particular prediction.

@judahrand

Is this still being considered, and would work on it be accepted? I have a similar need to determine what a model expects before querying it. I would be happy to open a PR if someone can point me in the right direction.

@ryandawsonuk (Contributor)

We're interested in model metadata stores such as modeldb and the in-progress (quite early days at the moment) https://github.com/kubeflow/metadata

It would be very interesting to get feedback on whether the community would prefer metadata to be obtained from the model itself or from a metadata store.

@axsaucedo (Contributor)

This would be a very interesting addition; however, I think it is important to understand a few use cases for the metadata first.

The reason this matters is that the level of granularity, and the way the metadata is provided, will depend on the use case.

For example, one use case for metadata management in production is Apache Atlas, which lets you manage the metadata of all the resources you have in production and build taxonomies and hierarchical relationships over them.

This would be different from metadata that is accessible on a per-graph basis, which would probably be used less for model governance and more for application-specific interactions with the model itself.

We have considered the use of metadata, for example, in our Black Box Model explainers use case: to build an explainer, it needs access to model metadata such as the classes of the categorical features and the limits of the numerical features.
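
As a rough illustration of the kind of per-model metadata such an explainer might need (the keys below are hypothetical, not an existing Seldon schema):

```python
# Hypothetical metadata an explainer could consume: the classes of categorical
# features and the value ranges of numerical features. Keys are illustrative.
explainer_metadata = {
    "categorical_features": {
        "country": ["NO", "SE", "DK"],
        "device": ["mobile", "desktop"],
    },
    "numerical_features": {
        "age": {"min": 0, "max": 120},
        "income": {"min": 0.0, "max": 1e7},
    },
}
```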

Metadata is certainly an open problem, and in order to start tackling it we need to scope it down, so we need a clear understanding of use cases that are important enough to prioritise this piece of work.

@Jude188 I would suggest that before jumping into opening a PR, the first step would be to provide a design proposal for how this would look beyond the extension of the Flask API, as well as an overview of the use cases it would enable (the explainer piece above is one example).

At this point I would think that metadata management in the Apache Atlas sense, which could open the door to ML model governance via taxonomies, has more potential in the immediate term for enterprise-scale use of Seldon than the more granular per-model counterpart, but that is why I think this needs to be fleshed out further before we (or someone from the community) can jump on it.

It is worth pointing out something that @ryandawsonuk and @cliveseldon mentioned to me: we are currently also involved in several side projects that tackle this issue, namely 1) the Kubeflow metadata project and 2) the Kubeflow MLSpec project. These are only indirectly relevant, but still important to consider.

@seldondev (Collaborator)

Issues go stale after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale

@seldondev added the lifecycle/stale label Apr 17, 2020
@ukclivecox (Contributor)

We have ongoing metadata work, so please follow those issues: #1671
