docs: Fastembed Ranker - update readme and fix example #1198

Merged: 1 commit, Nov 18, 2024
37 changes: 32 additions & 5 deletions integrations/fastembed/README.md
@@ -8,6 +8,7 @@
**Table of Contents**

- [Installation](#installation)
+- [Usage](#Usage)
- [License](#license)

## Installation
@@ -33,7 +34,7 @@ embedding = text_embedder.run(text)["embedding"]

```python
from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder
-from haystack.dataclasses import Document
+from haystack import Document

embedder = FastembedDocumentEmbedder(
model="BAAI/bge-small-en-v1.5",
@@ -50,24 +51,50 @@ from haystack_integrations.components.embedders.fastembed import FastembedSparse

text = "fastembed is supported by and maintained by Qdrant."
text_embedder = FastembedSparseTextEmbedder(
-model="prithvida/Splade_PP_en_v1"
+model="prithivida/Splade_PP_en_v1"
)
text_embedder.warm_up()
-embedding = text_embedder.run(text)["embedding"]
+embedding = text_embedder.run(text)["sparse_embedding"]
```

```python
from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder
-from haystack.dataclasses import Document
+from haystack import Document

embedder = FastembedSparseDocumentEmbedder(
-model="prithvida/Splade_PP_en_v1",
+model="prithivida/Splade_PP_en_v1",
)
embedder.warm_up()
doc = Document(content="fastembed is supported by and maintained by Qdrant.", meta={"long_answer": "no",})
result = embedder.run(documents=[doc])
```

+You can use `FastembedRanker` by importing as:
+
+```python
+from haystack import Document
+
+from haystack_integrations.components.rankers.fastembed import FastembedRanker
+
+query = "Who is maintaining Qdrant?"
+documents = [
+Document(
+content="This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc."
+),
+Document(content="fastembed is supported by and maintained by Qdrant."),
+]
+
+ranker = FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2")
+ranker.warm_up()
+reranked_documents = ranker.run(query=query, documents=documents)["documents"]
+
+print(reranked_documents[0])
+
+# Document(id=...,
+# content: 'fastembed is supported by and maintained by Qdrant.',
+# score: 5.472434997558594..)
+```

## License

`fastembed-haystack` is distributed under the terms of the [Apache-2.0](https://spdx.org/licenses/Apache-2.0.html) license.
2 changes: 1 addition & 1 deletion integrations/fastembed/examples/ranker_example.py
@@ -15,7 +15,7 @@
reranked_documents = ranker.run(query=query, documents=documents)["documents"]


-print(reranked_documents["documents"][0])
+print(reranked_documents[0])

# Document(id=...,
# content: 'fastembed is supported by and maintained by Qdrant.',
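Note on the `ranker_example.py` fix: the sketch below shows why the old `print` line would likely have failed. It assumes the ranker's `run` result behaves like a dict with a list under `"documents"`, and it uses plain strings in place of Haystack `Document` objects so that it runs without any dependencies. The names `first_old` and `first_new` are hypothetical helpers for illustration only.

```python
# Stand-in for the value returned by ranker.run(...): a dict with a list
# under the "documents" key. Plain strings replace Document objects here
# (an assumption made so the sketch has no dependencies).
result = {
    "documents": [
        "fastembed is supported by and maintained by Qdrant.",
        "This is built to be faster and lighter than other embedding libraries.",
    ]
}

# The example already unwraps the dict, so this name is bound to a list.
reranked_documents = result["documents"]


def first_old(docs):
    # Old line: indexes the list with a string key a second time.
    return docs["documents"][0]


def first_new(docs):
    # Fixed line: indexes the list directly.
    return docs[0]


try:
    first_old(reranked_documents)
    old_error = None
except TypeError as exc:
    # Lists only accept integer or slice indices.
    old_error = exc

print(first_new(reranked_documents))
print(type(old_error).__name__)
```

Because `reranked_documents` is already the unwrapped list, indexing it directly is the only form that works; the same applies to the README snippet added in this PR.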