Add text embedding serving #214
The only thing I would add is an option to include a function which transforms the output attribute as a part of the serving. I'm not sure HF has this, but it seems common with sentence transformer embeddings |
Thanks for the great work! A beginner question about this - if someone could help. This line - However, if I'm trying to get CLIP text embedding with
{:ok, %{model: text_model, params: text_params, spec: text_spec}} =
  Bumblebee.load_model({:hf, clip_model_name},
    module: Bumblebee.Text.ClipText,
    architecture: :base
  )

my_model =
  text_model
  |> Axon.nx(& &1.pooled_state)
  |> Axon.dense(512, use_bias: false, name: "text_projection")
Here, the output of
my_model_2 =
  text_model
  |> Axon.nx(& &1.pooled_state)
  |> Axon.dense(512, use_bias: false, name: "text_projection")
  |> Axon.nx(fn x -> %{my_output_attribute: x} end)
And then pass |
I appreciate the suggestions @jonatanklosko and @seanmor5 — I've implemented those changes! @seanmor5 I know folks will sometimes apply L2 normalization to their embeddings, so I added support for it; let me know if there were other functions you had in mind. And @rajrajhans, thank you for bringing that up. For models like CLIP that have a projection head, it would be helpful to have the option to directly retrieve the model output. My guess is that for most models, the pooled state (as an attribute of the output) would be used as the embedding, so perhaps it would be best to add a non-default option |
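For readers unfamiliar with the L2 normalization mentioned above, here is a minimal sketch using plain Nx. It is not the code from this PR: the module `L2NormSketch`, the function `l2_normalize/2`, and the `eps` guard are hypothetical names chosen for illustration, and the Nx version in `Mix.install` is an assumption.

```elixir
# Assumption: run as a standalone script; the Nx version is illustrative.
Mix.install([{:nx, "~> 0.7"}])

defmodule L2NormSketch do
  # Hypothetical helper, not part of Bumblebee.
  # Scales each row of a {batch, hidden} tensor to unit Euclidean length.
  def l2_normalize(embeddings, eps \\ 1.0e-12) do
    norm =
      embeddings
      |> Nx.pow(2)
      |> Nx.sum(axes: [-1], keep_axes: true)
      |> Nx.sqrt()

    # Clamp the norm from below so an all-zero row does not divide by zero.
    Nx.divide(embeddings, Nx.max(norm, eps))
  end
end

embeddings = Nx.tensor([[3.0, 4.0], [0.0, 2.0]])
normalized = L2NormSketch.l2_normalize(embeddings)
IO.inspect(normalized)
```

After normalization, the dot product of two embeddings equals their cosine similarity, which is the usual reason for applying this step.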
This is looking great @coderrg! I was curious about the postprocessing and mean pooling. Should this be a postprocessing function or is this out of scope? |
Thanks @trodrigu — that makes sense. I've added support for Also, since mean pooling and other functions applied to output embeddings are not mutually exclusive, I changed |
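As an illustration of the mean pooling discussed here, the following sketch averages token-level hidden states while ignoring padded positions. It is a sketch under stated assumptions rather than the PR's implementation: `MeanPoolingSketch` and `mean_pool/2` are hypothetical names, the input shapes are assumed to be `{batch, seq_len, hidden}` for the hidden state and `{batch, seq_len}` for the attention mask, and the Nx version is illustrative.

```elixir
# Assumption: run as a standalone script; the Nx version is illustrative.
Mix.install([{:nx, "~> 0.7"}])

defmodule MeanPoolingSketch do
  # Hypothetical helper, not part of Bumblebee.
  # hidden_state: {batch, seq_len, hidden}, attention_mask: {batch, seq_len}
  def mean_pool(hidden_state, attention_mask) do
    # Add a trailing axis so the mask broadcasts over the hidden dimension.
    mask = Nx.new_axis(attention_mask, -1)

    hidden_state
    |> Nx.multiply(mask)
    |> Nx.sum(axes: [1])
    # Divide by the number of real (non-padding) tokens per sequence.
    |> Nx.divide(Nx.sum(mask, axes: [1]))
  end
end

# One sequence of three tokens; the last token is padding and is masked out.
hidden_state = Nx.tensor([[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]])
attention_mask = Nx.tensor([[1, 1, 0]])
pooled = MeanPoolingSketch.mean_pool(hidden_state, attention_mask)
IO.inspect(pooled)
```

Because pooling produces one vector per sequence, a step such as L2 normalization can still be applied to its result, which is consistent with the point that the two are not mutually exclusive.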
Awesome, thanks for the feedback everyone! @coderrg I left a couple more comments, but it's looking great :D |
Thanks @jonatanklosko, I've implemented your suggestions. I also added |
Perfect, a couple of final comments and we should be good to go!
Co-authored-by: Jonatan Kłosko <[email protected]>
Awesome, thanks!
Resolves #206