I'd like to serve multiple kinds of predictions from one model. Currently I create multiple model files from the same model and run multiple containers, each serving one of those files. With this approach, the memory footprint is a multiple of what serving a single model would need.
One idea I had is to pass metadata in the request so I can call different methods inside the model's predict method, as in the sketch below. However, an endpoint that returns different kinds of outputs feels awkward and makes the inference graphs hard to maintain.
Could someone share their ideas or a known best practice for this kind of case?
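To make the metadata idea concrete, here is a minimal sketch, assuming the Seldon Python wrapper passes request metadata into predict (a meta argument); the model path, task names, and predict_proba call are hypothetical and just stand in for the different output types:

```python
# Sketch of the metadata-dispatch idea: one loaded model, several output types,
# selected per request via the request metadata. Names and paths are hypothetical.
import joblib


class MultiOutputModel:
    def __init__(self):
        # Load the (large) model once so all prediction types share its memory.
        self.model = joblib.load("/mnt/models/model.joblib")

    def predict(self, X, features_names=None, meta=None):
        # Route to a different output based on request metadata.
        task = (meta or {}).get("task", "predict")
        if task == "predict":
            return self.model.predict(X)
        if task == "proba":
            return self.model.predict_proba(X)
        raise ValueError(f"Unknown task: {task}")
```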
It's much harder to manage SeldonDeployments if the underlying resources are shared. We could implement some form of sharing for model servers that allow it, but I would expect that to apply mainly when two separate SeldonDeployments happen to use the same model, which seems to be your use case.
You could create 3 SeldonDeployments for your use case: one that holds the core model and two that act as proxy servers to the core model, one for each external use case (rough sketch below). Would this fit your need?
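If it helps, a rough sketch of what one of the two proxy models could look like, assuming the core model is exposed by its own SeldonDeployment and reachable at an in-cluster URL; the service name, URL, and post-processing step below are assumptions, not a confirmed setup:

```python
# One of the "proxy" SeldonDeployment models: it holds no model weights itself,
# forwards the request to the core deployment's internal service, then applies
# its own use-case-specific post-processing.
# The service URL below is made up; adjust it to your deployment and namespace.
import requests

CORE_URL = "http://core-model-default.seldon.svc.cluster.local:8000/api/v1.0/predictions"


class ProxyModelA:
    def predict(self, X, features_names=None, meta=None):
        # Forward the request to the core model using the Seldon JSON payload format.
        payload = {"data": {"ndarray": X.tolist()}}
        resp = requests.post(CORE_URL, json=payload, timeout=10)
        resp.raise_for_status()
        result = resp.json()["data"]["ndarray"]
        # Use-case-specific post-processing would go here.
        return result
```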