QueryBot: Text retrieval is incomplete #67

Open
reka opened this issue Jul 14, 2024 · 2 comments

@reka
Contributor

reka commented Jul 14, 2024

Some QueryBot responses suggest that it did not retrieve all the chunks relevant to the prompt.
Debug output shows that it retrieved only 4 RetrievedMessage instances instead of the configured 20.

Test Case: What is the relation between the classes SimpleBot, ChatBot, and QueryBot?

Steps to reproduce:

  1. Create the QueryBot instance.
from llamabot import QueryBot
from pathlib import Path

bot = QueryBot(
  system_prompt="You are an expert Python developer explaining the llamabot repository to an expert Python dev, who isn't familiar with the codebase.", 
  collection_name="llamabot_bot_classes",
  document_paths=[
    Path("/home/user/llamabot/llamabot/bot/simplebot.py"),
    Path("/home/user/llamabot/llamabot/bot/chatbot.py"),
    Path("/home/user/llamabot/llamabot/bot/querybot.py"),
  ],
  model_name="ollama/mistral"
) # This creates a new embedding
  2. Ask a question about the relationship between the Bot classes.
q1 = """
What is the relation between the classes SimpleBot, ChatBot, and QueryBot?
"""
r1 = bot(q1)
r1

In both test runs, the responses suggest that QueryBot did not retrieve some relevant parts of the text:

  • Response 1: "SimpleBot is not explicitly defined in this code snippet,"
  • Response 2: "Although there are no classes named ChatBot or QueryBot in this code snippet,"
@reka
Contributor Author

reka commented Jul 14, 2024

Test Run 1

The provided code defines a class named QueryBot, which inherits from two other classes: SimpleBot and ChatUIMixin.

  1. SimpleBot is not explicitly defined in this code snippet, but it seems to be a custom class that takes some parameters like system_prompt, temperature, model_name, stream_target, etc., and has a __call__ method for querying the documents within its document store.

  2. ChatUIMixin is also not explicitly defined, but it appears to be a mixin that provides some functionality related to chat interfaces, as indicated by its name. It has an initializer (__init__) and seems to be responsible for handling the initial message of the conversation.

The QueryBot class itself is designed to use the DocumentStore to answer questions about a document. It initializes a LanceDBDocStore with a given collection name, adds documents to it if provided, and has a method (__call__) for querying the documents based on a given query. The results are returned as an AIMessage object.

In summary, SimpleBot and ChatUIMixin are base classes providing some functionality that is extended by the QueryBot class to create a bot capable of answering questions about a document stored in its DocumentStore.

AIMessage(content=' The provided code defines a class named QueryBot, which inherits from two other classes: SimpleBot and ChatUIMixin.\n\n1. SimpleBot is not explicitly defined in this code snippet, but it seems to be a custom class that takes some parameters like system_prompt, temperature, model_name, stream_target, etc., and has a __call__ method for querying the documents within its document store.\n\n2. ChatUIMixin is also not explicitly defined, but it appears to be a mixin that provides some functionality related to chat interfaces, as indicated by its name. It has an initializer (__init__) and seems to be responsible for handling the initial message of the conversation.\n\nThe QueryBot class itself is designed to use the DocumentStore to answer questions about a document. It initializes a LanceDBDocStore with a given collection name, adds documents to it if provided, and has a method (__call__) for querying the documents based on a given query. The results are returned as an AIMessage object.\n\nIn summary, SimpleBot and ChatUIMixin are base classes providing some functionality that is extended by the QueryBot class to create a bot capable of answering questions about a document stored in its DocumentStore.', role='assistant')

Test Run 2

AIMessage(content=" In this code snippet, SimpleBot is the main class that interacts with a language model to generate responses based on user input. It takes in various parameters such as the model name, temperature, API key, etc., and has methods for creating a response from a list of messages, streaming the response to stdout or a Panel app, and handling user input.\n\nThe _make_response function is used within the SimpleBot class to create a response based on the given messages using the Llama model. It takes in a bot instance and a list of messages as arguments and returns a response object.\n\nAlthough there are no classes named ChatBot or QueryBot in this code snippet, it's possible that they could be subclasses or variations of SimpleBot with different functionalities or configurations. For example, ChatBot might handle chat-based interactions, while QueryBot could be designed for answering specific questions or queries. However, without more context or additional code, it's difficult to say for certain.", role='assistant')

For Test Run 2, see also the debug output:
debug_output_67_text_retrieval_incomplete.txt

The debug output was created with this git diff:

(llamabot-example) reka@reka-laptop:~/reka/llamabot$ git diff
diff --git a/llamabot/bot/querybot.py b/llamabot/bot/querybot.py
index 4da38f2..648c056 100644
--- a/llamabot/bot/querybot.py
+++ b/llamabot/bot/querybot.py
@@ -1,4 +1,5 @@
 """Class definition for QueryBot."""
+
 import contextvars
 from pathlib import Path
 from typing import Optional
@@ -69,8 +70,12 @@ class QueryBot(SimpleBot, ChatUIMixin):
         retrieved_messages = retreived_messages.union(
             self.lancedb_store.retrieve(query, n_results)
         )
+        print(len(retrieved_messages))
+        for i in retrieved_messages:
+            print(i)
         retrieved = [RetrievedMessage(content=chunk) for chunk in retrieved_messages]
         messages.extend(retrieved)
+        print(retrieved)
         messages.append(HumanMessage(content=query))
         if self.stream_target == "stdout":
             response: AIMessage = self.stream_stdout(messages)
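
The ad hoc prints in the diff above could be folded into a small helper that reports how many chunks came back compared with the configured number. This is only a minimal sketch: `summarize_retrieval` and its parameters are hypothetical names, not part of llamabot, and it works on any iterable of strings rather than calling the document store itself.

```python
from typing import Iterable


def summarize_retrieval(chunks: Iterable[str], n_results: int, preview: int = 60) -> dict:
    """Compare the number of retrieved chunks with the requested number.

    Hypothetical helper: `chunks` is whatever iterable of strings the
    store returned, and `n_results` is the configured number of results.
    """
    chunk_list = list(chunks)
    unique = set(chunk_list)
    return {
        "requested": n_results,
        "retrieved": len(chunk_list),
        "unique": len(unique),
        # How many chunks short of the requested number the result is.
        "shortfall": max(n_results - len(unique), 0),
        # First characters of each unique chunk, sorted for stable output.
        "previews": sorted(c[:preview].replace("\n", " ") for c in unique),
    }


if __name__ == "__main__":
    # Made-up chunks for illustration only.
    report = summarize_retrieval(
        {"class SimpleBot:", "def _make_response(bot, messages):"},
        n_results=20,
    )
    print(report["retrieved"], "of", report["requested"], "requested chunks retrieved")
```

Since the code in the diff appears to collect results with a set `union`, identical chunks would collapse into a single entry, so the unique count is the one to compare against the configured value.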

@reka
Contributor Author

reka commented Jul 14, 2024

Looking at the LanceDB database, the problem might be with the chunking.
The 3 .py files were split into 10 chunks, none of which contains the class definition for QueryBot.
llamabot-bot-classes-documents-all.txt
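
A check like this could be scripted against the stored chunks. The sketch below assumes the chunks have already been exported as plain strings (for example from the attached text file); `missing_class_definitions` is a hypothetical helper, not a llamabot or LanceDB function, and it only looks for a `class Name` statement at the start of a line.

```python
import re
from typing import Iterable


def missing_class_definitions(chunks: Iterable[str], class_names: Iterable[str]) -> list[str]:
    """Return the class names whose `class Name` statement appears in no chunk."""
    chunk_list = list(chunks)
    missing = []
    for name in class_names:
        # Match "class Name" at the start of a line, followed by "(" or ":".
        pattern = re.compile(rf"^\s*class\s+{re.escape(name)}\s*[(:]", re.MULTILINE)
        if not any(pattern.search(chunk) for chunk in chunk_list):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Made-up chunks for illustration; real chunks would come from the store.
    example_chunks = [
        '"""Class definition for SimpleBot."""\nclass SimpleBot:\n    ...',
        "        messages.append(HumanMessage(content=query))",
    ]
    print(missing_class_definitions(example_chunks, ["SimpleBot", "ChatBot", "QueryBot"]))
```

Running it over the 10 stored chunks with the three class names would show which definitions never made it into the store, as opposed to being stored but not retrieved.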
