- Hosts the model as a RESTful API service that allows users to submit new data and receive predictions from the model.
- Model training is not important here; it can be any simple example model or a pretrained one from the web.
- Pipeline components/steps:
  - Preprocessor - prepares the data
  - ModelScorer - performs the inference task by calling the model API
  - Postprocessor - saves inference results to a CSV file
- The pipeline should be modular and easily reusable for additional model APIs or data sources (see the interface sketch after this list)
- Dockerization and Deployment:
- The entire inference pipeline should be packaged into Docker image(s) so that it can potentially be used by different orchestration systems
- Describe how the dockerized pipeline could be integrated into different orchestration systems (Airflow, Kubeflow, SageMaker, Vertex AI).
- Implement (or describe) a monitoring and observability strategy:
- System to track the performance of the inference pipeline over time
- Set up rules to notify you of potential issues or performance degradation
- Monitor and analyze data to identify patterns and trends in pipeline behavior
- Testing
- Documentation
- Code Quality
- Fine-tune the selected model to improve its predictive performance.
- Evaluate the fine-tuned model's performance and compare it to the original model.
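
To illustrate the modularity requirement, here is a minimal sketch of what the pipeline/step interface could look like (the class and method names are illustrative, not the actual implementation):

```python
# Illustrative sketch of a modular pipeline interface (not the actual implementation).
from abc import ABC, abstractmethod
from typing import Any


class PipelineStep(ABC):
    """One reusable step of the batch inference pipeline."""

    @abstractmethod
    def run(self, data: Any) -> Any:
        ...


class Pipeline:
    """Chains steps, so a new model API or data source only needs new step classes."""

    def __init__(self, steps: list[PipelineStep]) -> None:
        self.steps = steps

    def run(self, data: Any = None) -> Any:
        for step in self.steps:
            data = step.run(data)
        return data
```

With this shape, pointing the pipeline at a new model API or data source only means swapping or adding step implementations.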
To create this Python dev environment from scratch, run:
conda env create
To check conda envs:
conda env list && conda list
To update the env during development:
conda env update -n mlops-dev --prune
To recreate this env:
conda activate base &&
conda env remove -n mlops-dev &&
conda env create &&
conda env list &&
conda activate mlops-dev &&
conda list
Go to the ml-service directory to prepare and test a Model artifact.
Build the ml-service image with the ModelServer that will serve a model through the REST API:
docker buildx build -t ml-service --progress plain -f ml-service.Dockerfile .
To verify the image:
docker image ls ml-service
To rebuild the image from scratch:
docker buildx build -t ml-service --progress plain --no-cache --pull -f ml-service.Dockerfile .
To run the containerized Inference Service:
docker run -it -p 8080:8080 ml-service
Once the service starts, you can open the /metrics endpoint in your browser at http://localhost:8080/metrics and observe how the endpoint behaves.
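
The snippet below is not the actual ModelServer code, just a sketch of how a Python service can expose both a prediction endpoint and Prometheus-style /metrics (here with FastAPI and prometheus_client; the /predict path and payload shape are assumptions):

```python
# Sketch of a model server exposing /predict and Prometheus /metrics (illustrative only).
from fastapi import FastAPI, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest
from pydantic import BaseModel

app = FastAPI()
PREDICTIONS_TOTAL = Counter("predictions_total", "Number of predictions served")


class PredictRequest(BaseModel):
    instances: list[list[float]]


@app.post("/predict")
def predict(request: PredictRequest) -> dict:
    # A real model.predict(...) call would go here; a constant answer keeps the sketch runnable.
    predictions = [0 for _ in request.instances]
    PREDICTIONS_TOTAL.inc(len(predictions))
    return {"predictions": predictions}


@app.get("/metrics")
def metrics() -> Response:
    # Expose counters/histograms in the Prometheus text format.
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
```

Inside the container such an app would typically be served on port 8080, e.g. with uvicorn.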
To test the REST API, run the simple script (a cURL replacement on Windows):
python try_ml_service.py
The output should be:
<Response [200]>
{"predictions":[8,9,8]}
Go to the ml-pipelines directory to build and test the pipeline components.
Build the ml-pipelines image with the components that will be used in the batch prediction pipeline:
docker buildx build -t ml-pipelines --progress plain -f ml-pipelines.Dockerfile .
To verify the image:
docker image ls ml-pipelines
You can also run it to see if it works as expected:
docker run -it ml-pipelines
To rebuild the image from scratch:
docker buildx build -t ml-pipelines --progress plain --no-cache --pull -f ml-pipelines.Dockerfile .
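
Once the ml-pipelines image exists, it can be plugged into an orchestrator. As a sketch of the Airflow option, a DAG that runs each step as its own container via the DockerOperator (assumes a recent Airflow 2.x with the apache-airflow-providers-docker package; the per-step commands and script names are hypothetical):

```python
# Hypothetical Airflow DAG wiring the ml-pipelines image into an orchestrator.
from datetime import datetime

from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator

with DAG(
    dag_id="batch_inference",
    start_date=datetime(2024, 1, 1),
    schedule=None,  # trigger manually, or set a cron expression
    catchup=False,
) as dag:
    preprocess = DockerOperator(
        task_id="preprocess",
        image="ml-pipelines",
        command="python preprocessor.py",  # hypothetical per-step command
    )
    score = DockerOperator(
        task_id="score",
        image="ml-pipelines",
        command="python model_scorer.py",
    )
    postprocess = DockerOperator(
        task_id="postprocess",
        image="ml-pipelines",
        command="python postprocessor.py",
    )
    preprocess >> score >> postprocess
```

Kubeflow Pipelines, SageMaker Pipelines, and Vertex AI Pipelines can reference the same image in their container-based step/component definitions in a similar way.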
To run the pipeline, first you need to start the Inference Service:
docker run -it -p 8080:8080 ml-service
Once the service is ready, run the batch pipeline:
python batch_inference_pipeline.py
The output should look like this:
(mlops-dev) ..\mlops-case-study>python batch_inference_pipeline.py
Starting the pipeline...
Pipeline: PreProcessor step starting...
Pipeline: PreProcessor step DONE!
Pipeline: ModelScore step starting...
Pipeline: ModelScore step DONE!
Pipeline: PostProcessor step starting...
Pipeline: PostProcessor step DONE!
Pipeline DONE!
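
For reference, a hypothetical sketch of what batch_inference_pipeline.py does (the real step classes live in ml-pipelines; the /predict path, payload shape, and output file name are assumptions):

```python
# Hypothetical sketch of batch_inference_pipeline.py (not the repo's actual code).
import csv

import requests

SERVICE_URL = "http://localhost:8080/predict"  # assumed endpoint path


def preprocess() -> list[list[float]]:
    print("Pipeline: PreProcessor step starting...")
    rows = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # stand-in for reading a real data source
    print("Pipeline: PreProcessor step DONE!")
    return rows


def score(rows: list[list[float]]) -> list[int]:
    print("Pipeline: ModelScore step starting...")
    response = requests.post(SERVICE_URL, json={"instances": rows}, timeout=30)
    response.raise_for_status()
    predictions = response.json()["predictions"]
    print("Pipeline: ModelScore step DONE!")
    return predictions


def postprocess(predictions: list[int]) -> None:
    print("Pipeline: PostProcessor step starting...")
    with open("predictions.csv", "w", newline="") as f:
        csv.writer(f).writerows([[p] for p in predictions])
    print("Pipeline: PostProcessor step DONE!")


if __name__ == "__main__":
    print("Starting the pipeline...")
    postprocess(score(preprocess()))
    print("Pipeline DONE!")
```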
- Logging, emitting metrics, proper app instrumentation and monitoring -> to be discussed
- Testing - showed a proper structure -> details to be discussed
- Code Quality
- focused on a good repo layout and code structure
- DRY improvements still needed (lack of time)
- details to be discussed
- Documentation
- mostly skipped (lack of time); included only some docstrings -> to be discussed
- I recommend performing HPO with the Optuna framework (see the sketch below)
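
A minimal Optuna HPO sketch (the dataset, model, and search space below are placeholders for whatever model the service actually wraps):

```python
# Minimal Optuna HPO sketch -- dataset, model, and search space are placeholders.
import optuna
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)


def objective(trial: optuna.Trial) -> float:
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "max_depth": trial.suggest_int("max_depth", 3, 20),
    }
    model = RandomForestClassifier(**params, random_state=42)
    return cross_val_score(model, X, y, cv=3).mean()


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params, study.best_value)
```

Comparing study.best_value against the baseline model's cross-validated score then covers the "evaluate and compare" part of the task.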