PandasCodec unexpected behavior #737
Comments
Hey @pablobgar , Thanks for providing that high level of detail! 👍 The Therefore, my guess is that either the PS: within an |
Hi @adriangonz Thanks for your quick response!! This is the content of i_request:
{'id': None, 'parameters': {'content_type': 'pd', 'headers': None}, 'inputs': [{'name': 'col1', 'shape': [2], 'datatype': 'BYTES', 'parameters': None, 'data': [b'test_string', b'test_string']}, {'name': 'col2', 'shape': [2], 'datatype': 'BYTES', 'parameters': None, 'data': [b'test_string', b'test_string']}], 'outputs': None}
And this is inference_request_g:
model_name: "test"
parameters {
key: "content_type"
value {
string_param: "pd"
}
}
inputs {
name: "col1"
datatype: "BYTES"
shape: 2
contents {
bytes_contents: "test_string"
bytes_contents: "test_string"
}
}
inputs {
name: "col2"
datatype: "BYTES"
shape: 2
contents {
bytes_contents: "test_string"
bytes_contents: "test_string"
}
} |
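The dump above shows a request-level content_type of "pd" while both inputs have parameters set to None, so neither column carries a content type of its own. As a minimal sketch for spotting this before sending, the helper below walks the dict form of the request posted above and lists the inputs that have no content_type. The helper name inputs_missing_content_type is hypothetical and not part of MLServer.

```python
from typing import Any, Dict, List


def inputs_missing_content_type(request: Dict[str, Any]) -> List[str]:
    """Return the names of inputs that carry no content_type of their own.

    Hypothetical helper for illustration; it only inspects a plain dict.
    """
    missing = []
    for inp in request.get("inputs") or []:
        # "parameters" may be absent or None, as in the dump above
        params = inp.get("parameters") or {}
        if not params.get("content_type"):
            missing.append(inp["name"])
    return missing


# Dict form of i_request, copied from the comment above
i_request_dict = {
    "id": None,
    "parameters": {"content_type": "pd", "headers": None},
    "inputs": [
        {
            "name": "col1",
            "shape": [2],
            "datatype": "BYTES",
            "parameters": None,
            "data": [b"test_string", b"test_string"],
        },
        {
            "name": "col2",
            "shape": [2],
            "datatype": "BYTES",
            "parameters": None,
            "data": [b"test_string", b"test_string"],
        },
    ],
    "outputs": None,
}

print(inputs_missing_content_type(i_request_dict))  # ['col1', 'col2']
```

Both columns are reported, which matches the dump: only the request as a whole is tagged as "pd".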
Hey @pablobgar , Right, that seems aligned with what was described in my comment above. The Could you try explicitly setting
from mlserver.codecs import StringCodec
from mlserver.types import Parameters

# Tag every input explicitly as a string column
for input in i_request.inputs:
    input.parameters = Parameters(content_type=StringCodec.ContentType) |
This is the resulting request:
model_name: "arte"
parameters {
key: "content_type"
value {
string_param: "pd"
}
}
inputs {
name: "col1"
datatype: "BYTES"
shape: 2
parameters {
key: "content_type"
value {
string_param: "str"
}
}
contents {
bytes_contents: "test_string"
bytes_contents: "test_string"
}
}
inputs {
name: "col2"
datatype: "BYTES"
shape: 2
parameters {
key: "content_type"
value {
string_param: "str"
}
}
contents {
bytes_contents: "test_string"
bytes_contents: "test_string"
}
}
But I have the same result on the server:
             col1            col2
0  b'test_string'  b'test_string'
1  b'test_string'  b'test_string' |
Hey @pablobgar, Thanks for trying that out. Instead of calling the |
Hi @adriangonz I have the same result with It seems that decoding is not performed in any case when the content type is bytes and PandasCodec is used. |
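While the cause is being tracked down, one possible stopgap on the server side is to decode the byte values by hand after the DataFrame has been built. This is only a sketch under the assumption that the payload is UTF-8 text; decode_bytes_values is a hypothetical helper, not an MLServer function, and it works on a plain list of column values.

```python
from typing import Any, List


def decode_bytes_values(values: List[Any], encoding: str = "utf-8") -> List[Any]:
    """Decode bytes elements to str and leave every other element untouched.

    Hypothetical helper; assumes the bytes hold text in the given encoding.
    """
    return [v.decode(encoding) if isinstance(v, bytes) else v for v in values]


# Values of one column as they arrived on the server in this thread
column = [b"test_string", b"test_string"]
print(decode_bytes_values(column))  # ['test_string', 'test_string']
```

With pandas, the same per-element check could be applied to each affected column through Series.map. This hides the symptom rather than fixing the codec behavior discussed here.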
The decoding of the individual columns should occur within the MLServer/mlserver/codecs/utils.py Lines 129 to 130 in 6bb9836
So it's strange why that still doesn't work for you... Just to be on the safe side, I've added a small test case on the PR linked below, which should handle the same use case. Do you see any differences with your custom code? |
Hi @adriangonz Now it is working correctly if I add This is my code:
import pandas as pd
from mlserver import types
from mlserver.codecs import PandasCodec, StringCodec

data = pd.DataFrame(
    data={
        "col1": ["test_string", "test_string"],
        "col2": ["test_string", "test_string"],
    }
)
i_request = PandasCodec.encode_request(data)
i_request.parameters = types.Parameters(content_type="pd")
# Mark each column explicitly as a string input
for input in i_request.inputs:
    input.parameters = types.Parameters(content_type=StringCodec.ContentType)
But then where I see a problem is the |
Hey @pablobgar , Thanks for the update! On the last point, I totally agree. And TBH, the We'll keep this issue open to tackle this last point. |
Hi,
I am having some problems with PandasCodec and grpc requests.
On the client side, I have the following code to send a pandas df:
And I have this code fragment to receive it on the server side:
I get this output:
The same dataframe, but with byte strings. I expected to get back the same dataframe that I have on the client, since I am using the same codec for encoding and decoding.
Is this expected behavior, or am I doing something wrong?
Thanks for your help.