
Serialization of pre-processing pipeline for CI/CD #1713

Closed
jhagege opened this issue Apr 20, 2020 · 4 comments

jhagege commented Apr 20, 2020

Hi, thanks for the great library.
I noticed in your examples you serialize the preprocessing pipeline.
Does it assume that the pip dependencies of the preprocessing classes need to be pinned to the exact same versions?
I'm trying to work out how to package the inference workflow inside a single Dockerfile as part of a CI/CD pipeline.
How can I guarantee that I have a self-contained Docker image with the exact correct dependencies and the serialized pre-processing pipeline?

Thanks for any insights.
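
(A minimal sketch of the kind of self-contained image described above, assuming the preprocessing pipeline is serialized with joblib and that the exact dependency versions were captured at training time with pip freeze; all file names are illustrative.)

```dockerfile
# Illustrative sketch: a self-contained inference image.
# requirements.txt is the `pip freeze` output from the training run, so the
# image installs the exact versions the serialized pipeline was built with.
FROM python:3.7-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Artifacts produced by the training run.
COPY preprocessor.joblib model.joblib ./
COPY Model.py .

CMD ["python", "Model.py"]
```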

@ukclivecox (Contributor)

Hi @jhagege
Can you give some more details of the examples you are referring to and what you mean by "pre-processing pipeline"?

jhagege (Author) commented Apr 20, 2020

@cliveseldon, thanks for your quick answer.

I was referring to the screenshot from the outlier_combiner example.

I find the pattern elegant and I'm wondering how to take it one step further.
Each model created is defined by three "artifacts":

  • Source code (commit id)
  • Preprocessor pipeline (transforms)
  • Environment (Python packages, for example in a requirements.txt with pinned dependency versions).

I'd like to configure a CI pipeline to package all of those into some kind of "uber-artifact" per trained model, so that it can provide an integrated environment for inference.

Thanks for any insights.
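
(One possible shape for that kind of CI step, sketched under the assumption that the preprocessor and model are scikit-learn objects serialized with joblib and that git and pip are available in the CI environment; the emit_artifact helper and file names are hypothetical.)

```python
# Hypothetical CI helper: bundle the three artifacts listed above into one
# directory that can later be copied into a Docker image.
import json
import subprocess
from pathlib import Path

import joblib


def emit_artifact(preprocessor, model, out_dir="artifact"):
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # 1. Source code: record the commit id of the training code.
    commit = subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()
    (out / "build_info.json").write_text(json.dumps({"commit": commit}))

    # 2. Preprocessor pipeline and trained model, serialized together.
    joblib.dump(preprocessor, out / "preprocessor.joblib")
    joblib.dump(model, out / "model.joblib")

    # 3. Environment: freeze the exact package versions used for training.
    freeze = subprocess.check_output(["pip", "freeze"], text=True)
    (out / "requirements.txt").write_text(freeze)
```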

@RafalSkolasinski (Contributor)

We are not really concentrating on training.

It seems that the best approach is to have a solid, reproducible process for preparing artefacts / trained models (Kubeflow, DVC, Pachyderm, ...) and then package these into a Docker image that you can deploy with Seldon.
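
(For the packaging step, one way to wire the serialized preprocessor into such an image is a Seldon Core Python component that loads it at start-up; a minimal sketch, assuming a joblib-serialized scikit-learn transformer baked into the image, with an illustrative file name.)

```python
# Transformer.py - Seldon Core Python component applying the serialized
# preprocessing pipeline to incoming requests.
import joblib


class Transformer:
    def __init__(self):
        # preprocessor.joblib is copied into the image at build time.
        self.pipeline = joblib.load("preprocessor.joblib")

    def transform_input(self, X, features_names=None):
        # Apply the same transforms that were fitted during training.
        return self.pipeline.transform(X)
```

The class is then wrapped into a Docker image with Seldon's s2i images or Python wrapper and referenced from a SeldonDeployment.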

Check out our recent addition of model metadata, https://docs.seldon.io/projects/seldon-core/en/latest/reference/apis/metadata.html, which allows you to link a deployed model back to its training source.
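
(As a sketch of what that metadata can carry, assuming the init_metadata hook described in those docs; all field values below are illustrative and would be injected by the CI job that built the image.)

```python
class Model:
    def predict(self, X, features_names=None):
        # Real inference logic would go here; identity is a placeholder.
        return X

    def init_metadata(self):
        # Illustrative values linking the deployed model back to its training run.
        return {
            "name": "my-model",
            "versions": ["git-abc1234"],
            "platform": "seldon",
            "custom": {"training_commit": "abc1234", "dataset": "train-2020-04"},
        }
```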

jhagege (Author) commented Jul 16, 2020

Thanks much, I'll review.
