kiosk-tf-serving
uses TensorFlow Serving to serve deep learning models over gRPC and REST APIs. A configuration file can be automatically created using python write_config_file.py
to allow any model found in a (AWS or GCS) storage bucket to be served.
TensorFlow serving will host all versions of all models in the bucket via RPC and REST APIs.
This repository is part of the DeepCell Kiosk. More information about the Kiosk project is available through Read the Docs and our FAQ page.
Build the docker image by running
docker build --pull -t $(whoami)/kiosk-tf-serving -f docker/Dockerfile.server .
Run the docker image by running
# write the configuration files for a given bucket
python write_config_model.py --storage-bucket=$STORAGE_BUCKET
# mount the config files and run the image
docker run --gpus=1 -it \
-v $PWD:/config \
-e PORT=8500 \
-e REST_API_PORT=8501 \
-p 8500:8500 \
-p 8501:8501 \
$(whoami)/kiosk-tf-serving:latest
The kiosk-tf-serving
can be configured using environmental variables in a .env
file.
Name | Description | Default Value |
---|---|---|
STORAGE_BUCKET |
REQUIRED: Cloud storage bucket address (e.g. "gs://bucket-name" ). |
"" |
PORT |
Port to listen on for gRPC API. | 8500 |
REST_API_PORT |
Port to listen on for HTTP/REST API. | 8501 |
REST_API_TIMEOUT |
Timeout in ms for HTTP/REST API calls. | 30000 |
GRPC_CHANNEL_ARGS |
Optional channel args for the gRPC API. | "" |
MODEL_PREFIX |
Prefix of model directory in the cloud storage bucket. | "/models" |
MODEL_CONFIG_FILE |
Path of the model configuration file written by write_config_file.py . |
"/kiosk/tf-serving/models.conf" |
ENABLE_BATCHING |
Whether to enable batching in TensorFlow Serving. | true |
MAX_BATCH_SIZE |
Maximum number of items in a batch. | 1 |
MAX_ENQUEUED_BATCHES |
Number of jobs to keep in queue to be processed. Jobs may take a long time if this value is too high. | 128 |
BATCH_TIMEOUT_MICROS |
The maximum amount of time in ms to wait before executing a batch. | 0 |
BATCHING_CONFIG_FILE |
Path of the batching configuration file created by write_config_file.py . |
"/kiosk/tf-serving/batching_config.txt" |
PROMETHEUS_MONITORING_ENABLED |
If true , a monitoring configuration file is written. |
true |
PROMETHEUS_MONITORING_PATH |
Prometheus scraping endpoint used if PROMETHEUS_MONITORING_ENABLED . |
"/monitoring/prometheus/metrics" |
MONITORING_CONFIG_FILE |
Path of the monitoring configuration file created by write_config_file.py . |
"/kiosk/tf-serving/monitoring_config.txt" |
TF_CPP_MIN_LOG_LEVEL |
The log level of TensorFlow Serving. | 0 |
We welcome contributions to the kiosk-console and its associated projects. If you are interested, please refer to our Developer Documentation, Code of Conduct and Contributing Guidelines.
This software is license under a modified Apache-2.0 license. See LICENSE for full details.
Copyright © 2018-2022 The Van Valen Lab at the California Institute of Technology (Caltech), with support from the Paul Allen Family Foundation, Google, & National Institutes of Health (NIH) under Grant U24CA224309-01. All rights reserved.