feat: support self-hosted embedding service via BentoML #324
base: main
Conversation
BentoML does not support loading pydantic models from URLs; output will be a normal dictionary.
Looks like the embed function is not compatible.
The client returns a dictionary, but we can load that dict into a pydantic model if needed.
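Loading the client's dict response into a pydantic model could look like the sketch below. The `EmbeddingOutput` schema and its fields are hypothetical here; the real field names depend on the embedding service's actual response format.

```python
from pydantic import BaseModel

# Hypothetical response schema; the actual fields depend on
# what the embedding service returns.
class EmbeddingOutput(BaseModel):
    model: str
    embeddings: list[list[float]]

# The BentoML client hands back a plain dict; validate it into the model.
resp_dict = {"model": "all-MiniLM-L6-v2", "embeddings": [[0.1, 0.2, 0.3]]}
output = EmbeddingOutput(**resp_dict)
```

This keeps the rest of the code working against typed attributes instead of raw dict keys.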
Co-authored-by: shaun <[email protected]>
Co-authored-by: Aaron Pham <[email protected]>
LGTM
I still run into errors.
cc @parano. The port inside container has to be 3000
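Since the service listens on port 3000 inside the container, the host-side mapping has to target that port. A minimal sketch of the docker invocation, with an illustrative image name:

```shell
# Map a host port to container port 3000, where the embedding
# service listens. The image name below is a placeholder.
docker run -p 3000:3000 <embedding-service-image>
```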
Thanks @Shaunwei, will look into this.
Hi - any updates to this PR?
Hey, you can try out this patch.
This PR adds an option for RealChar to use a self-hosted embedding service powered by SentenceBert and BentoML.
By default, this integration uses the docker image published here
The default model that comes with the docker image is all-MiniLM-L6-v2. RealChar users may customize it to use a different text embedding model based on their needs. Check out the source code for the embedding service here: https://github.com/bentoml/sentence-embedding-bento
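A RealChar client would then call the self-hosted service over HTTP. The sketch below only builds the request against the local endpoint; the `/encode` path and the payload shape (a JSON list of sentences) are assumptions, not the confirmed API of the service.

```python
import json

# Hypothetical endpoint: port 3000 matches the containerized service
# discussed above; the path and payload shape are assumptions.
EMBED_URL = "http://localhost:3000/encode"

def build_embed_request(sentences):
    """Build the (url, body) pair for a POST to the embedding service."""
    return EMBED_URL, json.dumps(sentences)

url, body = build_embed_request(["Hello, RealChar!"])
```

Swapping the default all-MiniLM-L6-v2 model for another SentenceBert model would not change this calling convention, only the embeddings returned.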
TODO: