Support JSON serialization for all kinds of HuggingFace pipeline inputs/outputs #692
Conversation
Force-pushed from 57178f2 to d548ea6.
Hey @pepesi ,
Nice spot! Thanks a lot for contributing this one.
Changes look good to me! 👍
Before merging though, would you be able to provide a test case that covers the issue that this PR fixes?
I fixed the lint error and added some tests for the NumpyEncoder, but I can't run the tests successfully in my local environment because of an error.
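For context, a rough sketch of what such a test could look like. The NumpyEncoder name comes from this PR, but the import path and the assertions here are illustrative, not the PR's actual test code:

import json

import numpy as np

# Assumed import path for the encoder added in this PR
from mlserver_huggingface.common import NumpyEncoder


def test_numpy_encoder_handles_numpy_values():
    # Plain json.dumps raises TypeError for NumPy types, even when they
    # only appear as dict values (the failure reported in #656)
    payload = {
        "score": np.float32(0.99),
        "label_id": np.int64(3),
        "logits": np.array([0.1, 0.9]),
    }

    deserialized = json.loads(json.dumps(payload, cls=NumpyEncoder))

    assert abs(deserialized["score"] - 0.99) < 1e-6
    assert deserialized["label_id"] == 3
    assert deserialized["logits"] == [0.1, 0.9]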
Force-pushed from 26c1562 to 92e3b72.
Tests passed on my desktop.
Force-pushed from 92e3b72 to d02db95.
After testing more HuggingFace models, I think the NumpyEncoder should be renamed to …
Force-pushed from d3e97e8 to 42e1df7.
A small script I tested passed locally, so it seems OK now, but I won't add it to the tests because it's too heavy.
Force-pushed from 42e1df7 to 553ec92.
Hey @pepesi ,
Massive thanks for the effort you're putting in to make the encoder for HuggingFace models fully complete! It's certainly not an easy task! 🚀 💪
Force-pushed from a17834e to 120301e.
WIP
Force-pushed from bb07b1e to 5814182.
Force-pushed from 5814182 to 6b480bf.
Force-pushed from 601af5c to 02a8314.
Force-pushed from 84f6be5 to a3176e7.
What changes in this PR: I'm using Seldon and MLServer in a project that aims to quickly deploy and try out any HuggingFace model. When I tested some vision-related models, I got JSON serialization errors like #656. At first I just wanted to fix the JSON serialization error, but as I got further I ran into more trouble with the inputs. Changes: …
May not be compatible: a remedy here is to change the is_single default value to None, and then change NumpyCodec's default behavior so that when is_single is None it is treated as True; but this may be confusing. @adriangonz, any advice here?
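A minimal sketch of that remedy, assuming a decode signature along these lines (the function and helper names are illustrative, not the PR's actual code):

from typing import Any, List, Optional, Union


def _decode_list(payload: Any) -> List[Any]:
    # Stand-in for the real decoding logic: assume payload is already list-like
    return list(payload)


def decode_input(payload: Any, is_single: Optional[bool] = None) -> Union[Any, List[Any]]:
    # None means "not specified": fall back to the old single-item
    # behaviour, while an explicit True/False keeps working as before
    if is_single is None:
        is_single = True
    decoded = _decode_list(payload)
    return decoded[0] if is_single else decoded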
Hey @pepesi , Thanks a lot for the time you've put into this PR - I know it hasn't been an easy journey! It's grown quite massively, so it will take us a bit of time to review it, particularly considering that there are breaking changes to some of the existing functionality. We will do our best though!
Dockerfile (outdated)
@@ -25,7 +25,7 @@ ENV MLSERVER_MODELS_DIR=/mnt/models \
 RUN apt-get update && \
     apt-get -y --no-install-recommends install \
         unattended-upgrades \
-        libgomp1 libgl1-mesa-dev libglib2.0-0 build-essential && \
+        libgomp1 libgl1-mesa-dev libglib2.0-0 build-essential ffmpeg && \
We have now moved to a different base image for Docker, but it should already install ffmpeg (although in a different way). Therefore, feel free to pick up master's version of the Dockerfile when rebasing 👍
mlserver/batch_processing.py (outdated)
if request_input.parameters is not None:
    new_input._parameters.update(request_input.parameters.dict())
Wondering about this change and the exact motivation behind it. It is true that currently there is no option to set parameters on the inputs and the request as a whole, but I am worried that modifying Triton's internal parameters (as indicated by the leading _) could lead to unexpected breaking changes in the future.
Hey @pepesi ,
I've finally found the time to make a first pass at this one.
Before anything else, I have to say this PR is really impressive! The attention to detail and the thoroughness shown are incredible! Massive thanks for spending the time on these changes! 🚀
I've added a couple of comments below. Would be great to hear your thoughts on those ones. Besides that, I'm not 100% convinced on the introduction of the is_single parameter and wanted to hear your reasoning behind why it is required.
My view is that codecs should always operate on lists of things (i.e. batches), and it is then up to the runtime to deal with these. This simplifies other features like batching, and it also simplifies the code, which doesn't need to deal with multiple types (i.e. type T or list of type T).
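To illustrate that batch-first principle, a sketch with made-up names (not MLServer's actual interface):

from typing import Any, List


class BatchFirstCodec:
    """A codec in this style always decodes to a list, never a bare item."""

    @staticmethod
    def decode(data: List[Any]) -> List[Any]:
        # Always return a batch, even when it holds a single element
        return list(data)


def runtime_predict(data: List[Any]) -> Any:
    batch = BatchFirstCodec.decode(data)
    # Unwrapping a single element is the runtime's decision, e.g. when
    # calling a HuggingFace pipeline that expects one input
    return batch[0] if len(batch) == 1 else batch


print(runtime_predict(["hello"]))           # -> "hello"
print(runtime_predict(["hello", "world"]))  # -> ["hello", "world"]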
if JSONCodec.can_encode(data):
    return JSONCodec
if ImageCodec.can_encode(data):
    return ImageCodec
if ConversationCodec.can_encode(data):
    return ConversationCodec
return find_input_codec_by_payload(data)
If we decorate them with @register_input_codec or @register_request_codec, we should then be able to find them with find_input_codec(payload=data) (as in, they'll go into the general codec registry, which is not a bad thing).
Some data can be encoded by multiple codecs, but there is no priority among codecs; mlserver.codecs._find_codec_by_payload just returns the first codec that matches. I want to try the codecs in a specific priority order, and that's the reason. If my understanding of the codec registry is wrong, please let me know.
Oh, I see. OK, that makes sense. In that case, it may be worth dropping a comment on that method to briefly explain that reasoning (i.e. to avoid people changing it in the future).
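For instance, reusing the snippet above, the comment could look something like this (a sketch; the enclosing function name is assumed):

def find_codec(data):
    # NOTE: the order of these checks is deliberate. Some payloads can be
    # encoded by more than one codec, and the general registry has no notion
    # of priority (it returns the first match), so the more specific codecs
    # are tried first. Please preserve this ordering when adding codecs.
    if JSONCodec.can_encode(data):
        return JSONCodec
    if ImageCodec.can_encode(data):
        return ImageCodec
    if ConversationCodec.can_encode(data):
        return ConversationCodec
    return find_input_codec_by_payload(data)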
@register_input_codec
class ImageCodec(InputCodecTy):
    ContentType = CONTENT_TYPE_IMAGE
Is there any reason why we need separate constants for these values? As in, instead of just keeping the actual value here and referring to ImageCodec.ContentType everywhere else?
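That is, something along these lines (a sketch; the literal value is illustrative):

@register_input_codec
class ImageCodec(InputCodecTy):
    # Keep the literal on the class itself instead of a module-level
    # CONTENT_TYPE_IMAGE constant...
    ContentType = "pil_image"  # illustrative value

# ...and refer to ImageCodec.ContentType from everywhere else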
@register_input_codec
class ImageCodec(InputCodecTy):
I was thinking that, to help future maintainers, we should split these codecs into their own files. As in, instead of keeping them all under mlserver_huggingface/codecs.py, we should create a new mlserver_huggingface/codecs package (i.e. folder), and then have:
- mlserver_huggingface/codecs/image.py
- mlserver_huggingface/codecs/conversation.py
- mlserver_huggingface/codecs/json.py
- ...
Each of these files could also have any codec-specific helpers (like get_img_bytes, which could live in mlserver_huggingface/codecs/image.py).
This will also help with the code review 👍
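Under that layout, the package's __init__.py could simply re-export the codecs (a sketch, assuming the codec names above):

# mlserver_huggingface/codecs/__init__.py
from .image import ImageCodec
from .conversation import ConversationCodec
from .json import JSONCodec

__all__ = ["ImageCodec", "ConversationCodec", "JSONCodec"]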
Your suggestion is right; I will revise it accordingly.
mlserver/batch_processing.py (outdated)
if request_input.parameters is not None:
    new_input._parameters.update(request_input.parameters.dict())
Good catch!
We initially considered leveraging the _parameters field, but then decided not to, in order to avoid using Tritonclient's internal interface. Instead, the general advice when using mlserver infer is to just configure the content_type through the model's metadata (i.e. the model-settings.json file). Would be great to hear your thoughts behind this change though.
Firstly, I agree with the opinion that codecs should always operate on lists of things (i.e. batches), and that it is then up to the runtime to deal with these. Before this, I didn't realize it was a principle to follow, so I think I will remove is_single. And let me explain why I added it: because of the is_single parameter, the Triton HTTP client request needs to pass it to the server as well, which is why I modified InferInput._parameters. As @RafalSkolasinski said, it may cause breaking changes.
Got it! Thanks for that explanation @pepesi . In that case, it's probably best to handle the single-vs-batch logic within the runtime itself. Looking forward to the next batch of changes! Keep up the great work! We are getting closer. 🚀
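For reference, MLServer's own request types already allow per-input parameters such as content_type to be set through a public interface, without touching the Triton client's internals. A sketch (the content_type value is illustrative):

from mlserver.types import InferenceRequest, Parameters, RequestInput

request = InferenceRequest(
    inputs=[
        RequestInput(
            name="image",
            datatype="BYTES",
            shape=[1],
            data=[b"..."],
            # Public, per-input parameters; no need for InferInput._parameters
            parameters=Parameters(content_type="pil_image"),
        )
    ]
)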
Force-pushed from 0a99402 to 766a0cd.
Force-pushed from 766a0cd to adc1424.
This looks great @pepesi ! Thanks a lot for making those changes.
I know this has been a long journey - what started as a fix for numpy is now a full-blown update to the HF runtime - but it's such a great addition! Massive thanks for all the effort that has gone behind this contribution. 🚀
I think the changes look great. I've added a minor question, but I don't think it's necessarily a blocker; it would be great to have your thoughts on that one. Besides that, once tests are green, this should be good to go ahead! 👍
Force-pushed from adc1424 to c10674b.
Thanks a lot for making those changes @pepesi . Once again, massive thanks for all the effort you've put into this one. The PR looks great and tests are green, so this should be good to go! 🚀
Linked issue: huggingface_runtime output JSON serializer does not support NumPy basic datatypes when the data is a dict value