Torchserve changes image bytes compared to using local inference #2054
Comments
Hi, this might be a long shot, but the JPEG standard is loose enough that two compliant decoders can produce different images at the pixel level. Otherwise, did you check that the libjpeg versions are equivalent between docker and local? And did you check whether opening the image/bytestream outside of TorchServe but inside the docker container gives the same result/image as locally? If it does not, that takes a lot of unknowns out of the equation.
Thanks very much for this suggestion! I am using an Ubuntu WSL environment to test locally. I'll check the libjpeg versions, and will also check on opening the image.
I think the libjpeg difference might be it! The versions are different.
In torchserve:
In local WSL env:
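For reference, a quick way to check which libjpeg Pillow is linked against in each environment (an illustrative snippet, not from the original comment; `features.version` assumes Pillow 8.0+):

```python
from PIL import features

# Prints the version of the libjpeg / libjpeg-turbo that Pillow was built against.
print(features.version("jpg"))
```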
The correct result comes from the WSL environment, where Pillow is installed from conda; I think that's because that's where the original author of the model installed Pillow from. What I might look into next is how to relink TorchServe to use the Python from conda, since that seems like the quickest way to resolve this version mismatch.
This was the issue ^
I am also trying to implement a handler for a yolov5 model, but I am getting an error that the response object type is not supported. Can you please tell me what the response format is? Or could you share your code?
🐛 Describe the bug
I am getting slightly but significantly different results when running inference with Torchserve vs. locally, due to the image input being slightly modified somewhere within the Torchserve environment. This is a bit of an involved issue, so apologies for the long explanation; any help is much appreciated.
Below is the image I am using for inference. When running inference locally, I open this with PIL.Image.open()
When I load the image with a custom preprocess handler as a bytearray and open it with PIL, I get the above image, but with slight differences within the Torchserve environment. I've highlighted these by setting any nonzero difference to 1 or -1, as in the sketch below.
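A sketch of that comparison (file names are placeholders; the int16 cast avoids uint8 wraparound before taking the sign):

```python
import numpy as np
from PIL import Image

# Load the locally-decoded and torchserve-decoded versions of the same image.
local = np.asarray(Image.open("local_decode.png"), dtype=np.int16)
served = np.asarray(Image.open("torchserve_decode.png"), dtype=np.int16)

# Any nonzero per-pixel difference becomes 1 or -1, matching the
# highlighting described above.
diff = np.sign(served - local)
print("differing values:", np.count_nonzero(diff))
```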
These differences occur before any torch-specific image transforms are applied, from what I can tell. I've also made sure that the torchserve and local environments have the same numpy and PIL versions; these are the only non-standard libraries I can tell are being used in the preprocess handler up until the error occurs.
Below is my preprocess handler, where I save out the intermediate preprocessed result that has differences. As can be seen, the only operations are `load_image` and `io.BytesIO`. `load_image` eventually just calls `image = Image.open()` after checking that the image is in RGB mode and does not have rotations.
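The full handler is in the gist linked below; a minimal sketch of the preprocess path it describes (not the author's exact code) looks like this:

```python
import io

from PIL import Image

def preprocess(self, data):
    """Sketch of the preprocess step: raw request bytes -> PIL image."""
    images = []
    for row in data:
        # TorchServe passes each request's payload under "data" or "body".
        image_bytes = row.get("data") or row.get("body")
        # Equivalent to the load_image / io.BytesIO path described above.
        image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
        images.append(image)
    return images
```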
My question is whether there are other handler steps that could be occurring before preprocess.
Error logs
There are no tracebacks from the torchserve container, before or during the preprocess handler.
Installation instructions
This is my Dockerfile
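The base image comes from the Versions section below; everything else in this sketch is an assumption for illustration, not a reproduction of the actual file:

```dockerfile
FROM pytorch/torchserve:0.5.3-cpu

# Assumed: pin Pillow/numpy to match the local environment being compared against.
RUN pip install --no-cache-dir pillow numpy

# Assumed layout matching the torchserve command used below.
COPY model_store /app/model_store
```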
Model Packaging
full custom handler: https://gist.github.com/rbavery/351563cd36e23216243d3587c14a0a55
Model packaging step: the custom handler and the non-torchserve local test use torch hub to load the model, as sketched below.
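Loading yolov5 weights via torch hub typically looks like the following (the weight path is a placeholder; this is a sketch, not the author's exact call):

```python
import torch

# Load custom yolov5 weights through the ultralytics/yolov5 hub entry point.
model = torch.hub.load("ultralytics/yolov5", "custom", path="md_v5a.0.0.pt")
model.eval()
```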
config.properties
I don't think I changed any of these. I start the server within docker with:
torchserve --start --model-store /app/model_store --no-config-snapshots --models mdv5=/app/megadetectorv5-yolov5-1-batch-1280-1280.mar
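For reference, the defaults in the torchserve docker image look roughly like this (an assumption; the actual file wasn't shown):

```properties
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
```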
Versions
I'm using torchserve via Docker, so I'm not sure this applies. The container is torchserve:0.5.3-cpu.
Repro instructions
Below is copied from the readme. The S3 bucket with the weights isn't publicly accessible, so I'm mostly looking to document the issue and to ask whether this could be related to image processing steps that occur before the preprocess handler.
Setup Instructions
Download weights and torchscript model
From this directory, run:
Export yolov5 weights as torchscript model
First, clone yolov5 and install its dependencies, following these instructions: https://docs.ultralytics.com/tutorials/torchscript-onnx-coreml-export/
Then, if running locally, make sure to install the same versions of torch and torchvision that were used to save the torchscript megadetector model; we need these to load the torchscript model. Check the Dockerfile for the versions.
The image size needs to be the same as in mdv5_handler.py for good performance. Run the export from this directory; a plausible command is sketched below.
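An illustrative export invocation (flag names vary across yolov5 releases, and the 1280 size is inferred from the .mar name, so treat this as a sketch):

```bash
python export.py --weights md_v5a.0.0.pt --include torchscript --img 1280
```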
This will create models/megadetectorv5/md_v5a.0.0.torchscript.
Run model archiver
First, `pip install torch-model-archiver`, then run the archiver; the resulting .mar file is what is served by torchserve. A plausible invocation is sketched below.
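An illustrative invocation (the model name and file paths are inferred from elsewhere in this issue; the actual flags may have differed):

```bash
torch-model-archiver \
  --model-name megadetectorv5-yolov5-1-batch-1280-1280 \
  --version 1.0 \
  --serialized-file models/megadetectorv5/md_v5a.0.0.torchscript \
  --handler mdv5_handler.py \
  --export-path model_store
```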
Serve the torchscript model with torchserve
Returns predictions in normalized coordinates with a category integer and a confidence score.
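Once the server is up, a request against the default inference port looks like this (the image name is a placeholder; the model name matches the torchserve command above):

```bash
curl -X POST http://127.0.0.1:8080/predictions/mdv5 -T test_image.jpg
```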
Possible Solution
No response