Please check out branch `v2` for converting new models.
Please check out branch `v3` for converting models to TensorRT for the fastest inference.
Clone this repo and the ONNX model weights from Hugging Face:

```bash
# clone this repo
git clone https://github.com/kamalkraj/stable-diffusion-tritonserver.git
cd stable-diffusion-tritonserver

# clone the model repo from Hugging Face (requires git-lfs)
git lfs install
git clone https://huggingface.co/kamalkraj/stable-diffusion-v1-4-onnx
```
Extract the model weights, then return to the repo root:

```bash
cd stable-diffusion-v1-4-onnx
tar -xvzf models.tar.gz
cd ..
```
Build the Triton Server Docker image (run from the repo root):

```bash
docker build -t tritonserver .
```
Start the server, mounting the extracted model repository:

```bash
docker run -it --rm --gpus all -p8000:8000 -p8001:8001 -p8002:8002 --shm-size 16384m \
    -v $PWD/stable-diffusion-v1-4-onnx/models:/models tritonserver \
    tritonserver --model-repository /models/
```
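Before sending inference requests, you can confirm the server is up via Triton's standard KServe v2 health endpoint (`GET /v2/health/ready` on port 8000). A minimal stdlib-only check might look like this (the URL assumes the default port mapping above):

```python
import urllib.request
import urllib.error


def server_ready(url: str = "http://localhost:8000/v2/health/ready",
                 timeout: float = 2.0) -> bool:
    """Return True if the Triton server reports ready on its v2 health endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Triton returns HTTP 200 when the server and models are ready
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # connection refused / timeout -> server not (yet) ready
        return False
```

This returns `False` rather than raising while the container is still loading the models, so it can be polled in a startup loop.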
Install the Triton HTTP client and run the notebook for inference:

```bash
pip install "tritonclient[http]"
```
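Under the hood, `tritonclient` speaks Triton's KServe v2 HTTP protocol: an inference request is a JSON body POSTed to `/v2/models/<model_name>/infer`. As a rough sketch of what the client sends (the input name `TEXT` here is an assumption — check each model's `config.pbtxt` under `models/` for the real names and shapes):

```python
import json


def build_infer_request(prompt: str, input_name: str = "TEXT") -> str:
    """Build a KServe v2 REST inference request body for a single text prompt.

    The input name is hypothetical -- consult the model's config.pbtxt.
    """
    request = {
        "inputs": [
            {
                "name": input_name,
                "shape": [1],
                "datatype": "BYTES",  # string inputs are transported as BYTES
                "data": [prompt],
            }
        ]
    }
    return json.dumps(request)


body = build_infer_request("an astronaut riding a horse")
```

In practice the notebook uses `tritonclient.http.InferenceServerClient`, which builds these payloads for you; the sketch is only to show what travels over the wire.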
- ONNX conversion script from [harishanand95/diffusers](https://github.com/harishanand95/diffusers)