Search experiences are all around us, empowering us to quickly find the documents, websites, products and answers that we're looking for.
For years, search engines have employed complex machine learning techniques to more deeply understand what users are searching for and to help them find it. These techniques enable search engines to semantically understand a user's query, to know that Debussy was a classical musician, that a dog is a pet, and that Python and JavaScript are programming languages.
In catalog search, semantically understanding the intent and extracting the entities in a query is crucial to providing useful results. When a user types "blue girl cardigan", they expect the search engine to correctly identify the color and the specific product type, as well as the intent to get back only products classified as female.
Though these techniques help us find what we're looking for, they have not been readily available to enterprises that want to build their own semantic search capabilities.
Azure Cognitive Search aims to provide semantic understanding capabilities to search, so that any enterprise can build much more natural search experiences.
This sample demonstrates how we can use word embeddings from Large Language Models (LLMs) with Azure Cognitive Search to create search experiences that understand the relationships between the entities in a query and the products in a catalog.
The project combines OpenAI embedding models and Azure Cognitive Search to enhance search relevancy by adding a semantic understanding layer to query composition. Whenever a user searches for something, it is crucial to understand their intent to provide a tailored result set, considering both the entities in the query and their semantic similarity to the available products.
The solution creates an embedding representation of a product catalog to improve search relevancy by applying an implicit semantic classification step whenever a user submits a new query. The code extracts product data from a search index, computes and stores the embeddings based on an OpenAI model (ada, babbage, or curie), and applies the same embedding technique to every new, unseen query coming to the engine. The query embeddings are evaluated against the embedding matrix to determine semantic similarity scores and return candidate filter criteria, boost criteria, or query expansions, each gated by confidence thresholds.
This project also converts natural language into Lucene queries using a few-shot approach with OpenAI text generation models, as an alternative approach to query composition.
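As a rough sketch of that few-shot approach (the engine name and the single in-context example here are illustrative; the full example set lives in the gpt3_prompt setting described in the configuration table below), a completion call with the pre-1.0 openai library might look like this:

```python
# Hypothetical sketch of the few-shot query-generation approach; the engine
# name is an assumption, and the in-context examples come from the
# gpt3_prompt setting shown in the configuration table below.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

FEW_SHOT_PROMPT = (
    "black hoodie -> $search=black hoodie"
    "&$filter=color eq 'Black' and productTypeName eq 'Hoodie'\n"
)

def natural_language_to_query(user_query: str) -> str:
    # Ask the text generation model to continue the few-shot pattern for the
    # new query; stop at the newline that terminates each example.
    response = openai.Completion.create(
        engine="text-davinci-002",  # assumed model; any completion engine works
        prompt=FEW_SHOT_PROMPT + f"{user_query} -> ",
        temperature=0,
        max_tokens=128,
        stop="\n",
    )
    return response["choices"][0]["text"].strip()
```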
The code is split into two logical components to facilitate re-use and easy integration into any application layer of your choice:
- docker-api: the core component, which computes the embedding matrix for the product catalog and exposes an API for query embedding and semantic scoring
- docker-web: a sample experimentation UI to visualize embedding results and thresholds, test your dataset, and define the right filtering, boosting, and query expansion logic
Both components are dockerized for easy deployment on multiple platforms and integration with existing applications.
The purpose of this repository is to grow understanding of how Large Language Models can be used with Azure Cognitive Search, by providing an example implementation and references to support the Microsoft Build 2022 conference. It is not intended to be a released product. Therefore, this repository is not the place to discuss the OpenAI API or Azure Cognitive Search, or to request new features.
Let's see how semantic understanding changes the results set with a sample query.
The "girl yellow cardigan" query is analyzed and three entities are extracted as the most probable match, also with their attribute key. So, "yellow" is classified as color, "cardigan" as a product type and "girl" clearly indicates the need for "female" as product gender.
Using this information, a filter on the product gender is applied to return just products classified as "Female" and a boost is pushing "yellow" products on top of the results set. You can easily compare the results in the following image.
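Concretely, and purely as an illustration since the field names depend on your index, the rewritten request could combine an OData filter on gender with a Lucene term boost on color (the `^` boost requires the full Lucene query syntax):

```
$search=girl yellow cardigan colors:Yellow^2&$filter=gender eq 'Female'&queryType=full
```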
- Azure Cognitive Search with Standard SKU and Semantic Search
- Python 3.8+
- Docker
- OpenAI API key to make API calls
To run the sample locally:

1. `git clone` the repo (`git clone https://github.com/microsoft/azure-search-query-classification`) and open the project folder
2. Create a `.env` file in the root directory of the project, copying the contents of the `.env.example` file (see the Configure the .env file section)
3. Install the backend component's dependencies
   - Windows users: run `pip install -r docker-api\code\requirements.txt`
   - macOS users: run `pip install -r docker-api/code/requirements.txt`
4. Install the frontend component's dependencies
   - Windows users: run `pip install -r docker-web\code\requirements.txt`
   - macOS users: run `pip install -r docker-web/code/requirements.txt`
5. Run `python local-run.py` to serve the backend and launch the web application (a sketch of what such a helper might do follows this list). NOTE for macOS users: by default no alias for python is defined, so run `python3 local-run.py` instead
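For orientation, a helper like local-run.py plausibly starts the FastAPI backend and the Streamlit UI side by side. The sketch below is a guess at that behavior under assumed ports and paths, not the repo's actual script:

```python
# Hypothetical sketch of a local-run helper: start the FastAPI backend with
# uvicorn and the Streamlit UI as subprocesses. Ports and working directories
# are assumptions; the repo's actual local-run.py may differ.
import subprocess
import sys

api = subprocess.Popen(
    [sys.executable, "-m", "uvicorn", "api:app", "--port", "8000"],
    cwd="docker-api/code",
)
web = subprocess.Popen(
    [sys.executable, "-m", "streamlit", "run", "ui.py"],
    cwd="docker-web/code",
)
try:
    web.wait()
finally:
    api.terminate()
```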
To run everything with Docker:

1. Check whether Docker is running
2. `git clone` the repo (`git clone https://github.com/microsoft/azure-search-query-classification`) and open the project folder
3. Create a `.env` file in the root directory of the project, copying the contents of the `.env.example` file (see the Configure the .env file section)
4. Build the server Docker image
   - Windows users: run `docker build docker-api\. -t YOUR_REGISTRY/YOUR_REPO/YOUR_API:TAG`
   - macOS users: run `docker build docker-api/. -t YOUR_REGISTRY/YOUR_REPO/YOUR_API:TAG`
5. Build the client Docker image
   - Windows users: run `docker build docker-web\. -t YOUR_REGISTRY/YOUR_REPO/YOUR_WEB:TAG`
   - macOS users: run `docker build docker-web/. -t YOUR_REGISTRY/YOUR_REPO/YOUR_WEB:TAG`
6. Modify the `docker-compose.yml` to point to the image tags you defined in steps 4 and 5 (a minimal sketch follows this list)
7. Run `docker compose up` to serve the backend and launch the web application
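For reference, a minimal docker-compose.yml along these lines should work; the repo's actual file may differ. The service name `api` matches the `prod_url` setting (`http://api:80`) described below, and the image names are the placeholders from steps 4 and 5:

```yaml
# Minimal sketch of a docker-compose.yml for the two images; the actual file
# in the repo may differ. The service name "api" matches prod_url=http://api:80.
version: "3.8"
services:
  api:
    image: YOUR_REGISTRY/YOUR_REPO/YOUR_API:TAG
    env_file: .env
    ports:
      - "8000:80"
  web:
    image: YOUR_REGISTRY/YOUR_REPO/YOUR_WEB:TAG
    env_file: .env
    ports:
      - "80:80"
    depends_on:
      - api
```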
Please use your own settings in the fields marked as "TO UPDATE" in the Note column in the following table.
| App Setting | Value | Note |
|---|---|---|
| search_service | YOUR_AZURE_COGNITIVE_SEARCH_SERVICE | TO UPDATE Azure Cognitive Search service name, e.g. for https://XXX.search.windows.net use just XXX |
| index_name | YOUR_AZURE_COGNITIVE_SEARCH_INDEX | TO UPDATE A new index that will be created by the code in your Azure Cognitive Search resource |
| api_key | YOUR_AZURE_COGNITIVE_SEARCH_API_KEY | TO UPDATE Azure Cognitive Search admin key |
| api_version | 2021-04-30-Preview | Azure Cognitive Search API version |
| LoadSampleData | true | Load sample data into the Azure Cognitive Search index |
| sample_data_url | | Link to download sample data if LoadSampleData is true |
| SentenceTransformer | msmarco-distilbert-dot-v5, all-mpnet-base-v2, nq-distilbert-base-v1, all-MiniLM-L6-v2 | List of models used to compute the embeddings. You can add any Sentence Transformers model |
| GPT_3 | text-search-curie-query-001, text-search-babbage-query-001, text-search-ada-query-001 | List of models used to compute the embeddings. You can add any GPT-3 embedding model |
| OPENAI_API_KEY | YOUR_OPENAI_API_KEY | TO UPDATE OpenAI GPT-3 key |
| fields | [ | JSON list of all the fields to be used for computing embeddings |
| select | articleId,type,name,description,quality,style,gender,colors | List of all Azure Cognitive Search index fields to visualize in the UI results table |
| searchFields | articleId,type,name,description,quality,style,gender,colors | List of all Azure Cognitive Search fields to search in |
| boost | type,name,quality,style,gender,colors | List of all Azure Cognitive Search fields to be used for attribute boosting in the UI |
| filters | type,name,quality,style,gender,colors | List of all Azure Cognitive Search fields to be used for attribute filtering in the UI |
| prod_url | http://api:80 or http://localhost:8000 | URL for the server-side component (docker-api microservice) as defined in the docker-compose file. When executing locally without Docker, use http://localhost:8000 instead |
| gpt3_prompt | girl yellow cardian -> $search=girl yellow cardigan&$filter=color eq 'Yellow' and productTypeName eq 'Cardigan' and productGender eq 'Female'\nblu man t-shirt -> $search=blu man t-shirt&$filter=color eq 'Blue' and productTypeName eq 'T-shirt' and productGender eq 'Male'\nblack hoodie -> $search=black hoodie&$filter=color eq 'Black' and productTypeName eq 'Hoodie'\ncotton black pant -> $search=cotton black pant&$filter=color eq 'black' and productTypeName eq 'Black' and productQuality eq 'cotton'\nlight blue cotton polo shirt -> $search=light blue cotton polo shirt&$filter=color eq 'Light Blue' and productQuality eq 'Cotton' and productTypeName eq 'Polo Shirt'\ngreen cashmere polo shirt -> $search=green cashmere polo shirt&$filter=color eq 'Green' and productQuality eq 'Cashmere' and productTypeName eq 'Polo Shirt'\n | OpenAI prompt for query generation |
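Putting the TO UPDATE settings together, a minimal `.env` could look like the following (all values are placeholders):

```
search_service=YOUR_AZURE_COGNITIVE_SEARCH_SERVICE
index_name=YOUR_AZURE_COGNITIVE_SEARCH_INDEX
api_key=YOUR_AZURE_COGNITIVE_SEARCH_API_KEY
api_version=2021-04-30-Preview
LoadSampleData=true
OPENAI_API_KEY=YOUR_OPENAI_API_KEY
prod_url=http://localhost:8000
```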
Example queries:
- girl yellow cardigan
- women white belted coat
When running the application, you can use the UI to tweak the filters and boosts logic according to your needs.
You can enable filtering and/or boosting using the menu on the left-hand side or modify the threshold for each field in the "Threshold definition" section.
In the following image, you can compare the results on the left-hand side ("Keyword-based Search") with those on the right-hand side ("Search with Semantic Understanding"), using the following settings:
- Boost on extracted colors with confidence > 0.85 (light blue marker)
- Filter on extracted gender with confidence > 0.75 (green marker)
The UI provides a sample playground to tweak the threshold parameters and define your own logic to be embedded in your application query layer.
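As a minimal sketch of what that query layer could do with the scored terms, the snippet below turns high-confidence matches into filter and boost clauses. The thresholds mirror the example above; the field names and the filter/boost assignments are assumptions, not part of the repo:

```python
# Hypothetical sketch of a query layer consuming the scored terms returned by
# the API. Thresholds mirror the example above (0.85 for colors, 0.75 for
# gender); field names and actions are illustrative, not part of the repo.
THRESHOLDS = {"colors": 0.85, "gender": 0.75}
ACTIONS = {"colors": "boost", "gender": "filter"}

def build_query_hints(terms_score):
    """Split high-confidence terms into OData filters and Lucene boosts."""
    filters, boosts = [], []
    for item in terms_score:
        threshold = THRESHOLDS.get(item["key"])
        # Scores arrive as strings, so convert before comparing.
        if threshold is None or float(item["score"]) < threshold:
            continue
        if ACTIONS[item["key"]] == "filter":
            filters.append(f"{item['key']} eq '{item['term']}'")
        else:
            boosts.append(f"{item['key']}:{item['term']}^2")
    return " and ".join(filters), " ".join(boosts)
```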
The core of the repo uses the encode function from the sentence_transformers library and get_embedding from the openai library to compute the embedding representations of the product attributes available in a sample product catalog.
```python
# docker-api\code\utilities\utils.py
import datetime
import logging
import os

import numpy as np
import pandas as pd
from openai.embeddings_utils import get_embedding

def compute_terms_embeddings(model, terms, path=os.getcwd()):
    if model['family'] == 'SentenceTransformer':
        start_time = datetime.datetime.now()
        # Encode every attribute value in a single batch
        terms_embedding = model['model'].encode([term['value'] for term in terms])
        difference_in_ms = (datetime.datetime.now() - start_time).total_seconds() * 1000
        logging.info(f"terms encoded in {difference_in_ms} ms")
        # Persist the embedding matrix so it can be reloaded without recomputing
        np.save(os.path.join(path, 'embeddings', model['name']), terms_embedding)
        return terms_embedding
    elif model['family'] == 'GPT-3':
        df_terms = pd.DataFrame(terms)
        start_time = datetime.datetime.now()
        # One embedding call per attribute value against the OpenAI API
        df_terms['gpt_3_search'] = df_terms.value.apply(lambda x: get_embedding(x, engine=model['name']))
        difference_in_ms = (datetime.datetime.now() - start_time).total_seconds() * 1000
        logging.info(f"terms encoded in {difference_in_ms} ms")
        df_terms.to_csv(os.path.join(path, 'embeddings', f"{model['name']}.csv"))
        return df_terms
```
Whenever a user submits a query, the same libraries are used to compute embeddings for the whole sentence and for the individual words in it, and to semantically rank the resulting list of terms against all available product attributes.
```python
# docker-api\code\utilities\utils.py
import datetime
import logging

from openai.embeddings_utils import cosine_similarity, get_embedding
from sentence_transformers import util

def process_query(query, terms_embedding, model, terms):
    if model['family'] == 'SentenceTransformer':
        start_time = datetime.datetime.now()
        query_embedding = model['model'].encode(query)
        difference_in_ms = (datetime.datetime.now() - start_time).total_seconds() * 1000
        logging.info(f"{query} encoded in {difference_in_ms} ms")
        logging.info(f'Using cos_sim for {model["name"]}')
        # Cosine similarity between the query and every attribute embedding
        scores = util.cos_sim(query_embedding, terms_embedding).numpy()[0]
        terms_score = [
            {
                "term": term['value'],
                "key": term['key'],
                "score": str(score),
                "query": query,
                "model_name": model['name'],
            }
            for term, score in zip(terms, scores)
        ]
        # Sort numerically; the scores are stored as strings
        return sorted(terms_score, key=lambda i: float(i['score']), reverse=True)
    else:
        start_time = datetime.datetime.now()
        query_embedding = get_embedding(query, engine=model['name'])
        difference_in_ms = (datetime.datetime.now() - start_time).total_seconds() * 1000
        logging.info(f"{query} encoded in {difference_in_ms} ms")
        # Score every stored attribute embedding against the query embedding
        terms_embedding['score'] = terms_embedding.gpt_3_search.apply(lambda x: cosine_similarity(x, query_embedding))
        res = terms_embedding.sort_values('score', ascending=False).head(5)
        res = res.drop(['gpt_3_search'], axis=1)
        res['query'] = query
        res['model_name'] = model['name']
        return res.to_dict('records')
```
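Under those definitions, a hypothetical end-to-end call might look like this; the layout of the model dict is inferred from how the two functions read it, and the attribute list is a toy example:

```python
# Hypothetical end-to-end usage of the two functions above; the layout of the
# model dict is inferred from how compute_terms_embeddings and process_query
# read it, and the attribute list is a toy example.
import os
from sentence_transformers import SentenceTransformer

os.makedirs("embeddings", exist_ok=True)  # compute_terms_embeddings saves here

model = {
    "family": "SentenceTransformer",
    "name": "all-MiniLM-L6-v2",
    "model": SentenceTransformer("all-MiniLM-L6-v2"),
}
terms = [
    {"key": "colors", "value": "Yellow"},
    {"key": "gender", "value": "Female"},
    {"key": "type", "value": "Cardigan"},
]
terms_embedding = compute_terms_embeddings(model, terms)
ranked = process_query("girl yellow cardigan", terms_embedding, model, terms)
print(ranked[0])  # highest-scoring attribute for the query
```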
To build and run each image on its own:

```
docker build docker-api\. -t YOUR_REGISTRY/YOUR_REPO/YOUR_API:TAG
docker run -p 80:80 --env-file .env -t YOUR_REGISTRY/YOUR_REPO/YOUR_API:TAG

docker build docker-web\. -t YOUR_REGISTRY/YOUR_REPO/YOUR_WEB:TAG
docker run -p 80:80 --env-file .env -t YOUR_REGISTRY/YOUR_REPO/YOUR_WEB:TAG
```
To debug the web application, you can use the VS Code debugger.
For the backend (docker-api):
- `api.py` is the main entry point for the app; it uses FastAPI to serve RESTful APIs
- `utilities` is the module with utilities to extract product catalog data, compute embeddings, and compute the semantic similarity score for a new query
For the frontend (docker-web):
- `ui.py` is the entry point that bootstraps the Streamlit web application
- `utilities` is the module with utilities to interact with the search index and the server-side component
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.