[Enhancement]: schema validation is fragile #2299

Open
1 task done
zhengbuqian opened this issue Oct 15, 2024 · 3 comments
Labels: kind/enhancement (New feature or request)

Comments

@zhengbuqian
Collaborator

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

Currently we validate the server schema by comparing it directly with the user-provided schema:

if server_schema != schema:
.

With doc in doc out, we introduced tokenizer_params in params. It is a dict in the user input but a JSON string in the server response, so a direct comparison fails.

For now tokenizer_params is simple, so I used a temporary workaround in #2298 that converts the JSON string back to a dict. That will likely fail once we introduce more configs in tokenizer_params: keys in the JSON may be reordered and the resulting dict will no longer compare equal.
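A minimal sketch of a comparison that tolerates the dict versus JSON string mismatch described above. The helper names `normalize_params` and `params_equal` are hypothetical and not part of pymilvus, and the sample values are made up rather than real Milvus tokenizer settings. The sketch parses the string form before comparing, so the check works on Python objects instead of text; Python dict equality does not depend on key order, whereas comparing the raw strings does.

```python
import json
from typing import Any, Dict


def normalize_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with a JSON-string tokenizer_params parsed.

    Hypothetical helper: only the key name comes from the issue text.
    """
    normalized = dict(params)
    value = normalized.get("tokenizer_params")
    if isinstance(value, str):
        try:
            normalized["tokenizer_params"] = json.loads(value)
        except json.JSONDecodeError:
            # Leave unparseable strings untouched so the mismatch surfaces.
            pass
    return normalized


def params_equal(user_params: Dict[str, Any], server_params: Dict[str, Any]) -> bool:
    """Compare two params dicts after normalizing tokenizer_params."""
    return normalize_params(user_params) == normalize_params(server_params)


# Illustrative values only: the user passes a dict, the server returns a
# JSON string whose keys happen to be in a different order.
user = {"max_length": 100, "tokenizer_params": {"a": 1, "b": {"c": 2}}}
server = {"max_length": 100, "tokenizer_params": '{"b": {"c": 2}, "a": 1}'}

print(params_equal(user, server))
```

This only handles one level of string encoding. If nested values inside tokenizer_params were themselves JSON strings, the normalization would need to recurse.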

Why is this needed?

No response

Anything else?

No response

@zhengbuqian zhengbuqian added the kind/enhancement New feature or request label Oct 15, 2024
@zhengbuqian
Collaborator Author

/assign @zhengbuqian
/assign @XuanYang-cn

@zhengbuqian
Collaborator Author

this should be fixed before the Milvus 2.5 release

sre-ci-robot pushed a commit that referenced this issue Oct 15, 2024
also fix schema comparison of tokenizer_params: tokenizer_params
returned by the server is in string format, while the user may provide
it in dict format. But this fix is not perfect and will likely fail when
tokenizer_params becomes more complicated due to possible JSON key
reordering.

@XuanYang-cn any idea? tracking this issue in
#2299

Signed-off-by: Buqian Zheng <[email protected]>
@XuanYang-cn
Contributor

XuanYang-cn commented Oct 17, 2024

@zhengbuqian we could implement an __eq__ method in the schema to customize what needs to be compared and what can be ignored.

  • Is Function going to change for the same collection? If so, we should probably ignore it when validating the schema.
  • What does tokenizer_params look like in Milvus? Please give an example of a typical tokenizer_params. Thanks.
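The __eq__ idea above could look roughly like the sketch below. `SchemaSketch` is a hypothetical stand-in, not the pymilvus schema class, and the dict layout (`fields`, `params`, `functions`) is assumed for illustration. Whether functions should really be ignored is exactly the open question in the first bullet; the sketch just shows where such a decision would live.

```python
import json
from typing import Any, Dict


def _parse_if_json(value: Any) -> Any:
    """Parse value when it is a JSON string; otherwise return it unchanged."""
    if isinstance(value, str):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value
    return value


class SchemaSketch:
    """Hypothetical schema wrapper with a customized equality check."""

    # Top-level keys excluded from comparison (an assumption, see lead-in).
    IGNORED_KEYS = frozenset({"functions"})

    def __init__(self, description: Dict[str, Any]) -> None:
        self.description = description

    def _comparable(self) -> Dict[str, Any]:
        """Build the view of the schema that takes part in comparison."""
        out = {k: v for k, v in self.description.items()
               if k not in self.IGNORED_KEYS}
        fields = []
        for field in out.get("fields", []):
            field = dict(field)
            params = dict(field.get("params", {}))
            if "tokenizer_params" in params:
                # Accept both the dict form and the JSON-string form.
                params["tokenizer_params"] = _parse_if_json(
                    params["tokenizer_params"])
            field["params"] = params
            fields.append(field)
        out["fields"] = fields
        return out

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SchemaSketch):
            return NotImplemented
        return self._comparable() == other._comparable()


# Illustrative values only, not real Milvus configuration.
user_schema = SchemaSketch({
    "fields": [{"name": "text", "params": {"tokenizer_params": {"a": 1, "b": 2}}}],
    "functions": ["f1"],
})
server_schema = SchemaSketch({
    "fields": [{"name": "text", "params": {"tokenizer_params": '{"b": 2, "a": 1}'}}],
    "functions": [],
})

print(user_schema == server_schema)
```

Defining __eq__ without __hash__ makes instances unhashable in Python, so a real implementation would need to decide whether schemas are ever used as dict keys or set members.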
