
Support dynamic graph execution #1419

Closed
groszewn opened this issue Feb 7, 2020 — with Board Genius Sync · 3 comments

Comments

Contributor

groszewn commented Feb 7, 2020

Currently, the inference graph must be statically defined within the SeldonDeployment. We have multiple models that would be reused across multiple different inference graphs, leading to an increase in resource usage (since each graph spins up its own underlying model pods). This also means we need to deploy a new inference graph to our cluster for any slightly modified graph that our consumers may need.

It would be great to be able to define the inference graph dynamically at request-time, as opposed to deploy-time, to reduce both resource usage and the number of production deployments required. Some sort of model registry within the cluster could potentially be a way to discover what model services are available for use.
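To make the idea concrete, here is a hypothetical request payload in which the graph is supplied alongside the data at request time. The field names mirror the SeldonDeployment graph shape, but this is a sketch only; no such request API exists in Seldon today, and the model names are invented:

```python
# Hypothetical request payload: the inference graph is supplied at
# request time instead of being fixed in a SeldonDeployment.
# Field names mirror the SeldonDeployment `graph` shape but are
# illustrative only.
payload = {
    "data": {"ndarray": [[1.0, 2.0]]},
    "graph": {
        "name": "feature-transformer",   # assumed registered model service
        "type": "TRANSFORMER",
        "children": [
            {"name": "classifier-a", "type": "MODEL", "children": []},
        ],
    },
}

def referenced_models(node):
    """Collect every model service the request-time graph refers to."""
    names = [node["name"]]
    for child in node.get("children", []):
        names.extend(referenced_models(child))
    return names

print(referenced_models(payload["graph"]))
# ['feature-transformer', 'classifier-a']
```

An orchestration service could walk the graph this way to validate that every referenced model service is actually deployed before routing the request.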

@groszewn added the `triage` (Needs to be triaged and prioritised accordingly) label Feb 7, 2020
@ukclivecox
Contributor

Sounds interesting. Can you expand a bit on how you see this working?

Would this allow models to be running in multiple shared graphs?

@groszewn
Contributor Author

groszewn commented Feb 9, 2020

I feel like this would be more of an additional service that leverages SeldonDeployments as opposed to an extension of the current executor. The mlgraph repo pretty accurately captures how I'm thinking about this.

I'm envisioning a registry of all deployed SeldonDeployments and their corresponding schema (depending on the outcome of #1420) that the orchestration service would be aware of and leverage. Users would pass in the required inputs and defined inference graph to be executed (likely reusing the graph structure in the SeldonDeployment specification).

For statically defined graphs, a dependency map would need to track enough information (runtime arguments, environment variables, resource requests/limits, mounted volumes, etc.) to ensure the exact same model service is truly intended to be shared across graphs, and that would become unwieldy pretty quickly. I would likely see the option for models to run in multiple shared graphs as a feature of runtime-defined graphs only.
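A minimal sketch of the proposed orchestration service, under stated assumptions: a registry maps model-service names to callables (stand-ins for HTTP calls to already-deployed SeldonDeployments), and a request-time graph is executed by chaining each node's output into its children. All names here are hypothetical; this is not an existing Seldon component:

```python
# Registry of deployed model services. In a real system each entry
# would be an HTTP endpoint discovered from the cluster; here plain
# callables stand in for the network calls.
REGISTRY = {
    "scaler": lambda x: [v / 10.0 for v in x],   # pretend transformer pod
    "classifier": lambda x: {"score": sum(x)},   # pretend model pod
}

def execute(graph, payload):
    """Run payload through this node, then sequentially through its children."""
    out = REGISTRY[graph["name"]](payload)
    for child in graph.get("children", []):
        out = execute(child, out)
    return out

graph = {"name": "scaler", "children": [{"name": "classifier", "children": []}]}
print(execute(graph, [10.0, 20.0]))
# {'score': 3.0}
```

Because the graph arrives with the request, two callers can route through the same `scaler` pod with entirely different downstream models, which is exactly the sharing that a static dependency map struggles to guarantee.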

@ukclivecox removed the `triage` label Feb 20, 2020
@ukclivecox added this to the 1.2 milestone Feb 20, 2020
@ukclivecox removed this from the 1.2 milestone Apr 23, 2020
@axsaucedo changed the title from "Support dynamic graph execution" to "OSS-30: Support dynamic graph execution" Apr 26, 2021
@axsaucedo changed the title from "OSS-30: Support dynamic graph execution" to "Support dynamic graph execution" Apr 28, 2021
@ukclivecox
Contributor

This can be solved via Tempo

Labels: none
Projects: none
Development: no branches or pull requests
2 participants