In a group chat among humans, you can raise a topic, let people discuss it, and interrupt or contribute to the conversation at any time.
This is not trivial with LLMs, because:
- "Chat completion" models are usually trained on datasets with only two speakers ("user"/"human" and "assistant"/"AI").
- Chatbot UI usually disables the input field while the AI is responding, which prevents humans from interrupting the AI participants.
This project demonstrates how you can solve these problems by:
- manipulating the chat history before each LLM inference, so that the model knows which speaker said what.
- looping through the AI participants in a separate thread, which frees up the main thread for user input (both ideas are sketched below).
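Here is a minimal, self-contained sketch of both techniques, assuming nothing about besties' actual code (every name below is hypothetical, and `fake_complete` stands in for a real chat-completion call): each AI participant sees its own past messages as "assistant" turns and everyone else's as name-prefixed "user" turns, while the round-robin over participants runs on a daemon thread.

```python
# Minimal sketch, not besties' actual implementation.
import queue
import threading
import time

history = []              # shared chat history: list of (speaker, text) tuples
incoming = queue.Queue()  # human messages flow in from the main thread

def view_for(participant):
    """Rewrite the shared history from one AI participant's perspective:
    its own lines become 'assistant' turns; everyone else's become
    name-prefixed 'user' turns, so a two-role model can track speakers."""
    return [
        {"role": "assistant", "content": text} if speaker == participant
        else {"role": "user", "content": f"{speaker}: {text}"}
        for speaker, text in history
    ]

def fake_complete(messages):
    """Stand-in for a real chat-completion call (OpenAI, Ollama, ...)."""
    last = messages[-1]["content"] if messages else "(silence)"
    return f"my thoughts on {last!r}"

def ai_loop(participants, complete):
    """Round-robin over the AI participants on a background thread."""
    while True:
        while not incoming.empty():  # fold in any human interruptions
            history.append(("human", incoming.get()))
        for name in participants:
            reply = complete(view_for(name))
            history.append((name, reply))
            print(f"{name}: {reply}")
        time.sleep(1.0)              # pace the conversation

threading.Thread(
    target=ai_loop, args=(["Alice", "Bob"], fake_complete), daemon=True
).start()

while True:  # the main thread stays free to accept user input at any time
    incoming.put(input("> "))
```

The actual chatbot talks to a Chainlit UI rather than `input()`, but the division of labor is the same: inference happens off the main thread, so the human can type at any moment.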
There are a few things you have to do manually before you can start using the chatbot:
- Clone the repository (how).
- Install the required binary, standalone programs. These are not Python packages, so they aren't managed by `pyproject.toml`.
- Self-serve a text embedding model. This model "translates" your text into numbers, so that the computer can understand you.
- Choose a way to serve a large language model (LLM). You can either use OpenAI's API or self-host a local LLM with Ollama.
No need to explicitly install Python packages: `uv`, the package manager of our choice, will implicitly install the required packages when you boot up the chatbot for the first time.
These are the binary programs that you need to have ready before running besties:
- Written in Python, this project uses the Rust-based package manager `uv`, which does not require you to explicitly create a virtual environment.
- As mentioned above, if you decide to self-host an LLM, install Ollama.
If you are on macOS, you can install these programs using Homebrew:
```
brew install uv ollama
```
Ensure that you have a local Ollama server running:

```
ollama serve
```

and then pull the embedding model:

```
ollama pull nomic-embed-text
```
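To sanity-check the embedding model, you can request a vector directly from Ollama's native embeddings endpoint (route and payload as documented by Ollama; adjust if your version differs):

```
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "hello world"}'
```

The response should contain an `embedding` array of floats: the "numbers" your text gets translated into.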
The easiest (and perhaps highest-quality) option is to use OpenAI's API. Simply add `OPENAI_API_KEY=sk-...` to a `.env` file in the project root.
In the absence of an OpenAI API key, the chatbot will fall back to Ollama, a program that serves LLMs locally. Ensure that your local Ollama server has already downloaded the `llama3.1` model. If it hasn't (or you aren't sure), run:

```
ollama pull llama3.1
```
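To confirm that the model loads and generates, you can run a one-off prompt from the command line:

```
ollama run llama3.1 "Say hello in exactly five words."
```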
Create a separate terminal for each command:
- Start serving Ollama (for local inference of the embedding & language models) by running `ollama serve`. It should be listening at `http://localhost:11434/v1`.
- Start serving Phoenix (for debugging thought chains) by running `uv run phoenix serve`.
- Finally, start serving the chatbot by running `uv run chainlit run main.py -w`. The `-w` flag reloads the app whenever you edit the source files.
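If you want to double-check that the first two servers came up before opening the chatbot, you can probe them (assuming the default ports shown above):

```
curl http://localhost:11434/v1/models   # Ollama's OpenAI-compatible endpoint
curl -I http://localhost:6006           # Phoenix web UI
```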
If you see:

```
File ".../llvmlite-0.43.0.tar.gz/ffi/build.py", line 142, in main_posix
    raise RuntimeError(msg) from None
RuntimeError: Could not find a `llvm-config` binary. There are a number of reasons this could occur, please see: https://llvmlite.readthedocs.io/en/latest/admin-guide/install.html#using-pip for help.
error: command '.../bin/python' failed with exit code 1
```

then run:

```
brew install llvm
```
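Homebrew installs LLVM keg-only, i.e. it is not symlinked into your `PATH`, so if the build still cannot find `llvm-config`, you may also need to point it at Homebrew's copy before retrying (llvmlite honors the `LLVM_CONFIG` environment variable):

```
export LLVM_CONFIG="$(brew --prefix llvm)/bin/llvm-config"
```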
If your `uv run phoenix serve` command fails with:

```
Traceback (most recent call last):
  File "besties/.venv/bin/phoenix", line 5, in <module>
    from phoenix.server.main import main
  File "besties/.venv/lib/python3.11/site-packages/phoenix/__init__.py", line 12, in <module>
    from .session.session import (
  File ".venv/lib/python3.11/site-packages/phoenix/session/session.py", line 41, in <module>
    from phoenix.core.model_schema_adapter import create_model_from_inferences
  File ".venv/lib/python3.11/site-packages/phoenix/core/model_schema_adapter.py", line 11, in <module>
    from phoenix.core.model_schema import Embedding, Model, RetrievalEmbedding, Schema
  File ".venv/lib/python3.11/site-packages/phoenix/core/model_schema.py", line 554, in <module>
    class ModelData(ObjectProxy, ABC):  # type: ignore
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
```
then you can work around the problem for now by serving Arize Phoenix from a Docker container instead (port 6006 serves the Phoenix UI; port 4317 accepts OTLP traces):

```
docker run -p 6006:6006 -p 4317:4317 -i -t arizephoenix/phoenix:latest
```