Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add text embedding serving #214

Merged
merged 17 commits into from
Jun 2, 2023
Merged

Add text embedding serving #214

merged 17 commits into from
Jun 2, 2023

Conversation

coderrg
Copy link
Contributor

@coderrg coderrg commented May 26, 2023

Resolves #206

@jonatanklosko
Copy link
Member

@coderrg awesome thanks, just one nitpick from me!

@seanmor5 you've been using embeddings, so please check if this matches what you'd expect :D

@seanmor5
Copy link
Contributor

The only thing I would add is an option to include a function which transforms the output attribute as a part of the serving. I'm not sure HF has this, but it seems common with sentence transformer embeddings

@rajrajhans
Copy link
Contributor

Thanks for the great work!

A beginner question about this - if someone could help.

This line - encoder.(params, input)[output_attribute] mandatorily extracts output_attribute from encoder.(params, input).

However, if I'm trying to get CLIP text embedding with textual_projection, then my model would look something like -

{:ok, %{model: text_model, params: text_params, spec: text_spec}} =
        Bumblebee.load_model({:hf, clip_model_name},
          module: Bumblebee.Text.ClipText,
          architecture: :base
        )

my_model = text_model
           |> Axon.nx(& &1.pooled_state)
           |> Axon.dense(512, use_bias: false, name: "text_projection")

Here, the output of my_model would be the actual Nx tensor output that I would like. But, since the text embedding always tries to extract output_attribute, I will have to add another layer to shift the actual output inside an object

my_model_2 =
        text_model
        |> Axon.nx(& &1.pooled_state)
        |> Axon.dense(512, use_bias: false, name: "text_projection")
        |> Axon.nx(fn x -> %{my_output_attribute: x} end)

And then pass :my_output_attribute as the value for output_attribute option.

  1. Is there another way around this? Please correct me if I'm wrong.
  2. Maybe having output_attribute optional here would be a good way out?

@coderrg
Copy link
Contributor Author

coderrg commented May 27, 2023

I appreciate the suggestions @jonatanklosko and @seanmor5 — I've implemented those changes! @seanmor5 I know folks will sometimes apply L2 normalization to their embeddings, so I added support for it; let me know if there were other functions you had in mind. And @rajrajhans, thank you for bringing that up. For models like CLIP that have a projection head, it would be helpful to have the option to directly retrieve the model output.

My guess is that for most models, the pooled state (as an attribute of the output) would be used as the embedding, so perhaps it would be best to add a non-default option :none for output_attribute to account for the projection head case (e.g., CLIP). I've gone ahead and done this for now, but I can also make output_attribute optional altogether if that would be preferred. Curious to hear what other folks think.

@trodrigu
Copy link

This is looking great @coderrg! I was curious about the postprocessing and mean pooling. Should this be a postprocessing function or is this out of scope?

@coderrg
Copy link
Contributor Author

coderrg commented May 28, 2023

Thanks @trodrigu — that makes sense. I've added support for mean_pooling as an embedding function based on your example since it seems useful for sentence transformers. Does my implementation line up with what you'd expect?

Also, since mean pooling and other functions applied to output embeddings are not mutually exclusive, I changed embedding_function to embedding_functions so that it takes a list of functions to be applied to the output embedding in order. The supported functions are now :l2_normalization and :mean_pooling.

@jonatanklosko
Copy link
Member

Awesome, thanks for the feedback everyone!

@coderrg I left a couple more comments, but it's looking great :D

@coderrg
Copy link
Contributor Author

coderrg commented Jun 1, 2023

Thanks @jonatanklosko, I've implemented your suggestions. I also added nil as the default option for :output_pool and :embedding_processor — please let me know if I misunderstood that suggestion, though.

Copy link
Member

@jonatanklosko jonatanklosko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, a couple final comment and we should be good to go!

lib/bumblebee/text.ex Outdated Show resolved Hide resolved
lib/bumblebee/text.ex Outdated Show resolved Hide resolved
lib/bumblebee/text/text_embedding.ex Outdated Show resolved Hide resolved
test/bumblebee/text/text_embedding_test.exs Outdated Show resolved Hide resolved
test/bumblebee/text/text_embedding_test.exs Show resolved Hide resolved
Copy link
Member

@jonatanklosko jonatanklosko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!

@jonatanklosko jonatanklosko merged commit 23de64b into elixir-nx:main Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add text embedding serving
5 participants