
Use chainlit to index our codebases (PMAT, PMA) and provide chatbot-based documentation #554

Open
kongzii opened this issue Nov 13, 2024 · 1 comment
Labels: documentation (Improvements or additions to documentation), good first issue (Good for newcomers)

Comments

kongzii (Contributor) commented Nov 13, 2024

Personally, I think that having proper documentation and a tutorial is key, but this could be a low-hanging-fruit way to let users know how to use our tools, and it can be automatically re-indexed.

https://github.com/Chainlit/chainlit
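On the "automatically re-indexed" point, a minimal sketch of what a scheduled rebuild could look like. The reindex helper is hypothetical, not part of the issue; documents would come from GithubRepositoryReader as in the reproduction code below:

import shutil
from pathlib import Path

from llama_index.core import VectorStoreIndex


def reindex(documents: list, persist_dir: str = "./storage") -> None:
    # Drop the old index so chunks from deleted or renamed files don't linger.
    if Path(persist_dir).exists():
        shutil.rmtree(persist_dir)
    # Re-embed and persist; a cron job or CI workflow could call this nightly.
    VectorStoreIndex.from_documents(documents).storage_context.persist(persist_dir)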

@kongzii kongzii added the documentation Improvements or additions to documentation label Nov 13, 2024
@kongzii kongzii self-assigned this Nov 16, 2024
@gabrielfior gabrielfior added the good first issue Good for newcomers label Nov 18, 2024
@kongzii kongzii removed their assignment Nov 21, 2024
kongzii (Contributor, Author) commented Nov 21, 2024

I gave it a quick try. It's not bad, but it would need more time to get it working fully. Also, writing more docstrings across the codebases would help a lot:

[Screenshot by Dropbox Capture]
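Since the reader embeds source files verbatim, a plain-language docstring is exactly what the embedding model matches against user questions. A hypothetical example of the kind of docstring that would help (illustrative only, not an actual PMAT function):

def get_open_binary_markets(limit: int) -> list:
    """Fetch up to `limit` open yes/no prediction markets.

    A plain-language summary like this is what retrieval matches
    against questions such as "how do I list open markets?".
    """
    raise NotImplementedError  # illustrative only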

Code to reproduce:

import os

import chainlit as cl
from dotenv import load_dotenv
from llama_index.core import (
    Settings,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.query_engine.retriever_query_engine import RetrieverQueryEngine
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.readers.github import GithubClient, GithubRepositoryReader

load_dotenv()

Settings.llm = OpenAI(
    model="gpt-4o",
    temperature=0.1,
    max_tokens=4096,
    streaming=True,
    system_prompt="""You are an assistant helping to build prediction market agents.
Agents are based on the PMAT (prediction-market-agent-tooling) library, with implementations in PMA (the prediction-market-agent repository).
Assume that any question is in the context of prediction market agent building.
""",
)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")
Settings.context_window = 8196

try:
    # Reuse a previously persisted index if one exists on disk.
    storage_context = StorageContext.from_defaults(persist_dir="./storage")
    index = load_index_from_storage(storage_context)
except FileNotFoundError:
    # No persisted index yet: fetch both repos and build one from scratch.
    github_token = os.environ.get("GITHUB_TOKEN")
    owner = "gnosis"

    github_client = GithubClient(github_token=github_token)

    # Load only the source and test directories from each repository.
    documents_pmat = GithubRepositoryReader(
        github_client=github_client,
        owner=owner,
        repo="prediction-market-agent-tooling",
        filter_directories=(
            [
                "prediction_market_agent_tooling",
                "tests",
                "tests_integration",
                "tests_integration_with_local_chain",
            ],
            GithubRepositoryReader.FilterType.INCLUDE,
        ),
    ).load_data(branch="main")
    documents_pma = GithubRepositoryReader(
        github_client=github_client,
        owner=owner,
        repo="prediction-market-agent",
        filter_directories=(
            [
                "prediction_market_agent",
                "tests",
            ],
            GithubRepositoryReader.FilterType.INCLUDE,
        ),
    ).load_data(branch="main")

    all_documents = documents_pmat + documents_pma

    # Embed everything into a single index and persist it for future runs.
    index = VectorStoreIndex.from_documents(all_documents)
    index.storage_context.persist("./storage")


@cl.on_chat_start
async def start():
    # Create one query engine per chat session and stash it in the session.
    query_engine = index.as_query_engine(
        streaming=True,
        similarity_top_k=5,
    )
    cl.user_session.set("query_engine", query_engine)

    await cl.Message(
        author="Assistant", content="Hello! I'm an AI assistant. How may I help you?"
    ).send()


@cl.on_message
async def main(message: cl.Message):
    query_engine = cl.user_session.get("query_engine")  # type: RetrieverQueryEngine

    msg = cl.Message(content="", author="Assistant")

    # query() blocks, so run it in a worker thread via cl.make_async.
    res = await cl.make_async(query_engine.query)(message.content)

    # Stream tokens to the UI as they arrive.
    for token in res.response_gen:
        await msg.stream_token(token)
    await msg.send()
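To try it locally, assuming the script is saved as app.py and OPENAI_API_KEY plus GITHUB_TOKEN are set in .env (package names are the standard PyPI ones, worth double-checking against your setup):

pip install chainlit llama-index llama-index-readers-github python-dotenv
chainlit run app.py -w

The first run builds and persists the index under ./storage; subsequent runs load it from disk.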
