This repository contains code for a Knowledge Assistant system that uses language models to answer user queries from a provided knowledge base. The system uses OpenAI's GPT-3.5 Turbo model for response generation, Pinecone for efficient vector similarity search, and Cohere for reranking retrieved results.
- Tokenization: The `token_len` function calculates the number of tokens in a given text using the specified tokenizer.
- Document Splitting: The `split_docs` function splits a list of documents into chunks based on parameters such as chunk size and overlap.
- Inserting Data: The `insert_data_to_pinecone` function inserts data into the Pinecone index: it generates embeddings for document chunks, assigns UUIDs, and attaches metadata before upserting into the index.
- Querying with Filtering: The `query_pinecone_with_filter` function queries the Pinecone index using an embedding of the user query, then reranks the results with Cohere based on relevance scores.
- Response Generation: The `generate_LLM_response` function builds a prompt template for the assistant's role, incorporates the user query and the retrieved knowledge-base texts, and calls the GPT model to produce the final answer. Hedged sketches of each of these functions follow this list.
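The repository's exact signatures are not reproduced here, so the following are minimal sketches of each function under stated assumptions. First, `token_len`, assuming tiktoken supplies the tokenizer:

```python
import tiktoken

# Encoding for GPT-3.5 Turbo (assumed; the repo may pin a different encoding)
tokenizer = tiktoken.encoding_for_model("gpt-3.5-turbo")

def token_len(text: str) -> int:
    # Number of tokens the model would see for this text
    return len(tokenizer.encode(text))
```

`split_docs` could be built on LangChain's `RecursiveCharacterTextSplitter`, measuring chunk length in tokens rather than characters; the default sizes below are illustrative:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

def split_docs(documents, chunk_size=400, chunk_overlap=50):
    # Prefer paragraph/sentence boundaries and measure length in tokens
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
        length_function=token_len,
    )
    return splitter.split_documents(documents)
```

A sketch of `insert_data_to_pinecone`, assuming the current `openai` and `pinecone` Python clients, `text-embedding-ada-002` embeddings, and an index named `knowledge-base` (all assumptions):

```python
import os
import uuid
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index("knowledge-base")

def insert_data_to_pinecone(chunks, batch_size=100):
    for i in range(0, len(chunks), batch_size):
        batch = chunks[i:i + batch_size]
        texts = [c.page_content for c in batch]
        # Embed the batch of chunk texts in one call
        embeds = openai_client.embeddings.create(
            model="text-embedding-ada-002", input=texts
        )
        vectors = [
            (str(uuid.uuid4()),   # random UUID as the vector id
             record.embedding,    # dense embedding values
             {"text": text})      # metadata keeps the raw chunk text
            for record, text in zip(embeds.data, texts)
        ]
        index.upsert(vectors=vectors)
```

`query_pinecone_with_filter` would then embed the query, retrieve candidates (optionally restricted by a metadata filter), and rerank them with Cohere; the model names and `top_k`/`top_n` defaults are assumptions:

```python
import cohere

co = cohere.Client(os.environ["COHERE_API_KEY"])

def query_pinecone_with_filter(query, top_k=10, metadata_filter=None, top_n=3):
    # openai_client and index come from the insertion sketch above
    query_embed = openai_client.embeddings.create(
        model="text-embedding-ada-002", input=[query]
    ).data[0].embedding
    # Retrieve candidate chunks, optionally restricted by metadata
    results = index.query(
        vector=query_embed,
        top_k=top_k,
        filter=metadata_filter,
        include_metadata=True,
    )
    docs = [m["metadata"]["text"] for m in results["matches"]]
    # Rerank the candidates by relevance score and keep the best top_n
    reranked = co.rerank(
        model="rerank-english-v2.0", query=query, documents=docs, top_n=top_n
    )
    return [docs[r.index] for r in reranked.results]
```

Finally, `generate_LLM_response` could assemble the assistant's role template plus the retrieved texts and call GPT-3.5 Turbo; the prompt wording here is illustrative:

```python
def generate_LLM_response(query, knowledge_texts):
    # System message frames the assistant's role and pastes in the context
    context = "\n\n".join(knowledge_texts)
    system_prompt = (
        "You are a knowledge assistant. Answer the user's question using "
        "only the context below. If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}"
    )
    # openai_client comes from the insertion sketch above
    response = openai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query},
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```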
- Setup:
  - Set the required environment variables for the OpenAI and Pinecone API keys (plus Cohere, which is needed for reranking).
  - Install the necessary Python packages (example commands below).
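For example, the package names below follow the libraries listed under Technologies Used; exact versions and the Cohere key name are assumptions:

```bash
export OPENAI_API_KEY="..."
export PINECONE_API_KEY="..."
export COHERE_API_KEY="..."

pip install openai pinecone-client cohere tiktoken langchain
```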
- Data Preparation:
  - Prepare your knowledge base text.
  - Use the provided tokenization and document-splitting functions to chunk it (see the sketch below).
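A hedged example of this step, assuming LangChain's `Document` type and the `split_docs` and `token_len` sketches above (the file path is illustrative):

```python
from langchain.docstore.document import Document

# Wrap the raw knowledge-base text so split_docs can chunk it
raw_text = open("knowledge_base.txt", encoding="utf-8").read()
chunks = split_docs([Document(page_content=raw_text)],
                    chunk_size=400, chunk_overlap=50)
print(f"{len(chunks)} chunks; largest is "
      f"{max(token_len(c.page_content) for c in chunks)} tokens")
```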
- Pinecone Indexing:
  - Insert the chunked data into Pinecone for efficient similarity search (see the sketch below).
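One-time index creation plus the insert call might look like this, assuming a serverless index and the 1536-dimension vectors produced by `text-embedding-ada-002` (the cloud and region values are placeholders):

```python
import os
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create the index once; 1536 matches text-embedding-ada-002's output size
if "knowledge-base" not in pc.list_indexes().names():
    pc.create_index(
        name="knowledge-base",
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )

insert_data_to_pinecone(chunks)  # chunks from the data-preparation step
```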
- User Interaction:
  - Use the `query_pinecone_with_filter` function to get relevant documents based on a user query.
  - Use the `generate_LLM_response` function to generate responses from the language model (end-to-end example below).
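Putting it together, a minimal end-to-end interaction using the sketches above (the query string is just an example):

```python
query = "What does the knowledge base say about refund policies?"

relevant_texts = query_pinecone_with_filter(query, top_k=10, top_n=3)
answer = generate_LLM_response(query, relevant_texts)
print(answer)
```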
- Technologies Used:
  - OpenAI GPT-3.5 Turbo
  - Pinecone
  - Cohere
  - Tiktoken
  - LangChain (open-source framework used here for document processing and interaction with Pinecone)
This code is provided under the MIT License.