ElasticsearchWarning: text_expansion is deprecated. Use sparse_vector instead. #42

Open
kazdam opened this issue Sep 9, 2024 · 4 comments

kazdam commented Sep 9, 2024

I'm using Elasticsearch 8.15 and ElasticsearchStore, which generates the following warning. Can you suggest how to mitigate this warning other than ignoring it? Is it ignorable?

venv/lib/python3.11/site-packages/langchain_elasticsearch/vectorstores.py:883: ElasticsearchWarning: text_expansion is deprecated. Use sparse_vector instead.
  hits = self._store.search(

This appears to come from this line of my code:

vector_store.similarity_search_with_score(
    query=question,
    doc_builder=custom_doc_builder,
    filter=filter,
    k=n_results,
)

The following are the current package levels.

% pip freeze |grep -E '(langchain|elastic)'
elastic-transport==8.15.0
elasticsearch==8.15.0
langchain==0.2.14
langchain-community==0.2.12
langchain-core==0.2.34
langchain-elasticsearch==0.2.2
langchain-huggingface==0.0.3
langchain-text-splitters==0.2.2

@miguelgrinberg
Collaborator

First of all, you should be using the SparseVectorStrategy class. From your report my guess is that you are using the older SparseRetrievalStrategy.

Aside from that, the text_expansion query is now deprecated. It continues to be available and working, but you'll see the warning. We have not yet updated this package to use the sparse_vector query, but we will, and at that point the warning will go away.

@kazdam
Author

kazdam commented Sep 10, 2024

Thanks for your response. Yes, I am already using the SparseVectorStrategy. I should have pasted that earlier for completeness.

To ingest I am doing:

strategy = SparseVectorStrategy(model_id='.elser_model_2_linux-x86_64')
vector_store = ElasticsearchStore.from_documents(
    documents=documents,
    index_name=collection_name,
    es_connection=elastic_client,
    strategy=strategy,
    bulk_kwargs={'request_timeout': 50000},
)

and during search, I do the following:

vector_store = ElasticsearchStore(
    index_name=collection_name,
    es_connection=elastic_client,
    strategy=self.strategy,
)

Would you suggest a different API? I started to try the lower-level ones that Elasticsearch publishes in its tutorials, but being new at this, I appear to be making mistakes. I was going to suppress the warning, but then I would not know if it becomes a problem.
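
For reference, one way to quiet only this message, while leaving every other Elasticsearch warning visible, is a targeted filter from the standard `warnings` module. This is a minimal sketch, not something the package provides: the helper name `silence_text_expansion_warning` is made up for illustration, and the fallback class exists only so the snippet runs where the `elasticsearch` client is not installed.

```python
import warnings

try:
    # Warning category emitted by the official Python client (8.x).
    from elasticsearch import ElasticsearchWarning
except ImportError:
    # Assumption: stand-in so the sketch runs without the client installed.
    class ElasticsearchWarning(Warning):
        pass

# Regex matched (case-insensitively) against the start of the warning text.
DEPRECATION_PATTERN = r"text_expansion is deprecated"


def silence_text_expansion_warning() -> None:
    """Hypothetical helper: ignore only the text_expansion deprecation notice."""
    warnings.filterwarnings(
        "ignore",
        message=DEPRECATION_PATTERN,
        category=ElasticsearchWarning,
    )
```

Calling `silence_text_expansion_warning()` once at startup, before the first search, should be enough. Because the filter matches on both the message and the category, any new or different deprecation notice from Elasticsearch would still be shown.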

@sh0umik

sh0umik commented Oct 3, 2024

I am following the latest docs and still getting the same warning with the latest SDK:

from llama_index.vector_stores.elasticsearch import (
    AsyncSparseVectorStrategy,
    ElasticsearchStore,
)

sparse_vector_store = ElasticsearchStore(
    es_url="http://localhost:9200",  # for Elastic Cloud authentication see above
    index_name="movies_sparse",
    retrieval_strategy=AsyncSparseVectorStrategy(model_id=".elser_model_2"),
)

@miguelgrinberg
Collaborator

miguelgrinberg commented Oct 3, 2024

The LangChain and LlamaIndex integrations for Elasticsearch still use text_expansion, which is deprecated (but continues to be available). The update to the newer sparse_vector query is planned for a future release. See elastic/elasticsearch-py#2657 for more details.
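
To illustrate the difference between the two queries, here is a rough sketch of the request bodies as documented in the Elasticsearch query DSL. The function names are hypothetical, and the field name and inference ID in the usage note are placeholders; the query that the integrations build internally may differ in its details.

```python
def text_expansion_query(field: str, model_id: str, text: str) -> dict:
    """Deprecated form: the tokens field is the key of the inner object."""
    return {
        "text_expansion": {
            field: {"model_id": model_id, "model_text": text},
        }
    }


def sparse_vector_query(field: str, inference_id: str, text: str) -> dict:
    """Newer form (8.15+): field, inference ID and query text are flat keys."""
    return {
        "sparse_vector": {
            "field": field,
            "inference_id": inference_id,
            "query": text,
        }
    }
```

For example, `sparse_vector_query("ml.tokens", "my-elser-endpoint", "how do I reset my password")` builds a body that could be passed as the `query` argument of the client's `search` method. Both `ml.tokens` and `my-elser-endpoint` are assumptions here and depend on how the index and the model deployment were set up.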
