fix: start.sh fails in some environments #24

Merged · 3 commits · Mar 6, 2024
91 changes: 64 additions & 27 deletions README.md
@@ -1,53 +1,89 @@
<div align="center">
<img alt="selfie" src="./docs/images/hero.png" height="300px">
<br>
<a href="https://discord.gg/GhYDaDqENx" target="_blank"><img alt="selfie" src="https://dcbadge.vercel.app/api/server/GhYDaDqENx?style=flat&compact=true"></a>
<a href="https://discord.gg/GhYDaDqENx" target="_blank"><img alt="Join our Discord" src="https://dcbadge.vercel.app/api/server/GhYDaDqENx?style=flat&compact=true"></a>

[//]: # ( <a href="https://vana.com/" target="_blank"><img alt="selfie" src="https://assets-global.website-files.com/62dfa5318bb52f5fea8dc489/62dfb34210f09278d8bce721_Vana_Logo.svg" style="background-color: #dbff00; padding: 5px; height: 20px; border-radius: 2px"></a>)
</div>

# Selfie

Bring your personal data to life! Selfie offers OpenAI-compatible APIs that bring your data into LLM awareness. Selfie also empowers you to directly search your data with natural language. Selfie runs 100% locally by default to keep your data private.
[Jump to Quick Start](#quick-start)

<img alt="selfie-augmentation" src="./docs/images/playground-augmentation.png" width="100%">
Imagine AI that is not just smart, but personal. Selfie turns your data into APIs for text generation and natural language search that can power chatbots, storytelling experiences, games, and more.

Selfie is a local-first, open-source project that runs on your device.

## Core Features

Selfie offers a more personalized interaction between you and the digital world via:

* **Text Completions:** Mix your data into text completions using any OpenAI-compatible tool (like [OpenAI SDKs](https://platform.openai.com/docs/libraries) and [SillyTavern](https://sillytavernai.com)) or the API.
* **Simple Data Import**: Quickly import any text file, with enhanced support for messaging platform exports.
* **Use Any LLM**: Use local (default) or hosted LLMs from OpenAI, Replicate, etc.
* **Direct Queries**: Search your data with natural language.

### Web-Based UI

Selfie comes with a local UI for importing and interacting with your data.

**Personalized Chat**

<img alt="selfie-augmentation" src="./docs/images/playground-use-data.png" height="300px">

**Semantic Search**

<img alt="selfie-search" src="./docs/images/playground-search.png" height="250px">

### Full API Support

Selfie provides a full API for OpenAI-style text completions and search.

```bash
curl -X POST 'http://localhost:8181/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
"messages": [{"role": "user", "content": "As Alice, what is your proudest garden achievement?"}]
}' | jq '.choices[0].message.content'

# "I grew a 10-pound tomato!"
```

[Jump to API Usage](#api-usage-guide)

[//]: # (TODO: build out integration recipes)
[//]: # (*Check out [Integration Recipes]&#40;#integration-recipes&#41; for some example of what you can do with Selfie.*)

[//]: # (* Load data using any [LlamaHub loader]&#40;https://llamahub.ai/?tab=loaders&#41;.)
[//]: # (* Easy deployment with Docker and pre-built executables.)

## Quick Start

For macOS and Linux:

1. Install [python](https://www.python.org) 3.9+, [poetry](https://python-poetry.org), and [Node.js](https://nodejs.org).
2. Clone or [download](https://github.com/vana-com/selfie/archive/refs/heads/main.zip) the repository.
3. Run `start.sh`.
4. http://localhost:8181 will open in your default web browser.
5. Head to the Add Data page in the UI and follow the instructions.
6. Chat with your data in the Playground, connect it to a tool like SillyTavern, or integrate it with your own application.

> **Tip**: Python 3.11 is recommended.

> **Tip**: On macOS you can run `brew install poetry nodejs` with [brew](https://brew.sh).

## Features

* Mix your data into text completions using OpenAI-compatible clients like [OpenAI SDKs](https://platform.openai.com/docs/libraries) and [SillyTavern](https://sillytavernai.com).
* Quickly drop in any text file, with enhanced support for conversations exported from messaging platforms.
* Runs locally by default to keep your data private.
* Unopinionated compatibility with hosted LLMs from OpenAI, Replicate, etc.
* APIs for directly and selectively querying your data in natural language.

[//]: # (* Load data using any [LlamaHub loader]&#40;https://llamahub.ai/?tab=loaders&#41;.)
[//]: # (* Easy deployment with Docker and pre-built executables.)
For Windows, please follow the instructions in [Installation](#installation).

## Overview

Selfie is designed to compose well with tools on both sides of the text generation process. You can think of it as middleware that intelligently mixes your data into a request.
Selfie's core feature is personalized text generation. You can think of it as middleware that intelligently mixes your data into a request.

A typical request:
```
Application -(request)-> LLM
Application --prompt--> LLM
```

A request through Selfie:
```
Application -(request)-> Selfie -(request x data)-> LLM
|
Your Data
Application --prompt--> Selfie --prompt+data--> LLM
```
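In code terms, the middleware step in the diagram above amounts to rewriting the message list before it reaches the LLM. The sketch below is purely illustrative — Selfie's real retrieval and prompt-construction logic is more sophisticated:

```javascript
// Illustrative only: how middleware like Selfie might mix retrieved
// personal data into a request before forwarding it to the LLM.
// Selfie's actual retrieval and prompting logic differs.
const mixDataIntoRequest = (request, retrievedFacts) => ({
  ...request,
  messages: [
    {
      role: "system",
      content: `Relevant personal data:\n${retrievedFacts.join("\n")}`,
    },
    ...request.messages,
  ],
});

const enriched = mixDataIntoRequest(
  { messages: [{ role: "user", content: "Favorite ice cream?" }] },
  ["Alice enjoys Bahn Mi and Vietnamese coffee."]
);
console.log(enriched.messages[0].role); // "system"
```

The application only ever sees the original request shape; the enrichment happens transparently in the middle.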

On the application side, Selfie exposes text generation APIs, including OpenAI-compatible endpoints.
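Because the endpoints are OpenAI-compatible, any HTTP client can build the same request shown in the curl example earlier. A minimal sketch, assuming Selfie's default address from the Quick Start:

```javascript
// Sketch: constructing the chat-completions request from the curl example
// above using plain fetch. Assumes the default local address; adjust the
// port if you started Selfie differently.
const buildChatRequest = (messages) => ({
  url: "http://localhost:8181/v1/chat/completions",
  options: {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  },
});

const req = buildChatRequest([
  { role: "user", content: "As Alice, what is your proudest garden achievement?" },
]);
// To send it (requires a running Selfie instance):
//   const res = await fetch(req.url, req.options);
//   const data = await res.json();
//   console.log(data.choices[0].message.content);
```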
@@ -59,6 +95,8 @@ On the LLM side, Selfie uses tools like LiteLLM and txtai to support forwarding

For most users, the easiest way to install Selfie is to follow the [Quick Start](#quick-start) instructions. If that doesn't work, or if you just want to install Selfie manually, follow the detailed instructions below.

> **Tip**: Python 3.11 is recommended.

<details>
<summary>Manual Installation</summary>

@@ -196,7 +234,7 @@ Selfie can be used to augment text generation in a variety of applications. Here

### Powering the OpenAI SDK

The OpenAI SDK is a popular way to access OpenAI's text generation models. You can use Selfie to augment the text completions that the SDK generates by setting the `apiBase` and `apiKey` parameters.
The OpenAI SDK is a popular way to access OpenAI's text generation models. You can use Selfie to augment the text completions that the SDK generates simply by setting the `apiBase` and `apiKey` parameters.

```js
import OpenAI from 'openai';
@@ -206,17 +244,16 @@ const openai = new OpenAI({
apiKey: ''
});

const name = 'Alice';
const chatCompletion = await openai.chat.completions.create({
// model: 'TheBloke/Mistral-7B-Instruct-v0.2-GGUF/mistral-7b-instruct-v0.2.Q5_K_M.gguf', // Optionally, customize the model used
messages: [
{ role: 'system', content: `Write ${name}'s next reply in a fictional chat with ${name} and their friends.` },
{ role: 'user', content: 'Favorite ice cream?' },
{ role: 'system', content: `Write Alice's next reply.` },
{ role: 'user', content: 'What are your favorite snacks?' },
]
} as any);
});

console.log(chatCompletion.choices[0].message.content);
// "Alice enjoys Bahn Mi and Vietnamese coffee."

// "I enjoy Bahn Mi and Vietnamese coffee."
```

### Powering SillyTavern
Binary file added docs/images/playground-search.png
Binary file added docs/images/playground-use-data.png
Empty file added pre-commit.yaml
Empty file.
58 changes: 9 additions & 49 deletions scripts/llama-cpp-python-cublas.sh
@@ -21,76 +21,36 @@ detect_cpu_arch() {
echo $CPU_ARCH
}

detect_platform() {
OS_NAME=$(uname -s)
OS_ARCH=$(uname -m)
if [ "$OS_NAME" == "Linux" ]; then
PLATFORM="manylinux_2_31_x86_64"
elif [ "$OS_NAME" == "Darwin" ]; then
PLATFORM="macosx_$(sw_vers -productVersion | cut -d. -f1-2)_$(uname -m)"
else
PLATFORM="unsupported"
fi
echo $PLATFORM
}

detect_gpu_acceleration() {
CUDA_VERSION=""
ROCM_VERSION=""
ACCELERATION="cpu"

if command -v nvcc &> /dev/null; then
CUDA_VERSION=$(nvcc --version | grep "release" | awk '{print $6}' | cut -d'.' -f1-2 | sed 's/[^0-9]//g')
CUDA_VERSION=$(nvcc --version | awk '/release/ {print $5}' | cut -d',' -f1 | tr -cd '[0-9]')

ACCELERATION="cu$CUDA_VERSION"
elif command -v rocm-info &> /dev/null; then
ROCM_VERSION=$(rocm-info | grep -oP 'Version:\s+\K[0-9.]+')
ROCM_VERSION=$(rocm-info | awk '/Version:/ {print $2}' | tr -d '.')
ACCELERATION="rocm$ROCM_VERSION"
elif [ "$(uname -s)" == "Darwin" ]; then
ACCELERATION="metal"
ACCELERATION="cpu"
fi

echo "$ACCELERATION"
}

detect_latest_accelerated_version() {
get_index_url() {
CPU_ARCH=$(detect_cpu_arch)
PLATFORM=$(detect_platform)
ACCELERATION=$(detect_gpu_acceleration)
PYTHON_VERSION=$(python --version 2>&1 | grep -oP 'Python \K[0-9]+\.[0-9]+')
PYTHON_VERSION_CONCATENATED=$(echo $PYTHON_VERSION | tr -d '.') # Convert to e.g., 311

URL="https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/${CPU_ARCH}/${ACCELERATION}/llama-cpp-python/"
LATEST_WHEEL=$(curl -s $URL | grep -oP "href=\"\K(.*?cp${PYTHON_VERSION_CONCATENATED}.*?${PLATFORM}.*?\.whl)" | sort -V | tail -n 1)

if [ -z "$LATEST_WHEEL" ]; then
echo "No suitable wheel file found for the current configuration."
exit 1
fi

echo "$LATEST_WHEEL"
echo "https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/${CPU_ARCH}/${ACCELERATION}"
}

check_and_install() {
LATEST_WHEEL=$(detect_latest_accelerated_version)
if [ -z "$LATEST_WHEEL" ]; then
echo "WARNING: Unable to find a compatible wheel file, installing an unaccelerated version."
python -m pip install llama-cpp-python
fi

WHL_FILE=$(basename "$LATEST_WHEEL")
LATEST_VERSION=$(echo "$WHL_FILE" | grep -oP "llama_cpp_python-\K([0-9]+\.[0-9]+\.[0-9]+(\+[a-z0-9]+)?)")

INSTALLED_VERSION=$(pip list --format=freeze | grep "llama_cpp_python==" | cut -d'=' -f3 || echo "")

if [ "$INSTALLED_VERSION" = "$LATEST_VERSION" ]; then
echo "The latest version of llama-cpp-python ($LATEST_VERSION) is already installed."
else
echo "Installing the latest version of llama-cpp-python ($LATEST_VERSION) for your system ($INSTALLED_VERSION) is installed)"
python -m pip install --prefer-binary --force-reinstall "$LATEST_WHEEL"
fi
}

echo "Checking for llama-cpp-python installation..."
check_and_install
echo "Installing accelerated llama-cpp-python..."
poetry run python -m pip install llama-cpp-python --prefer-binary --force-reinstall --extra-index-url="$(get_index_url)"

echo "Installation complete. Please check for any errors above."

15 changes: 8 additions & 7 deletions selfie-ui/src/app/components/Chat.tsx
@@ -35,20 +35,21 @@ export const Chat = ({

const chatMessageStyle = {
default: {
ai: { bubble: { backgroundColor: 'oklch(var(--b2))', color: 'oklch(var(--bc))' } }, // Slightly darker base color for AI bubble
ai: { bubble: { backgroundColor: 'oklch(var(--b3))', color: 'oklch(var(--bc))' } },
user: { bubble: { backgroundColor: 'oklch(var(--p))', color: 'oklch(var(--pc))' } },
},
loading: {
bubble: { backgroundColor: 'oklch(var(--b2))', color: 'oklch(var(--bc))' },
}
// loading: {
// bubble: { backgroundColor: 'oklch(var(--b3))', color: 'oklch(var(--bc))' },
// }
};

const chatInputStyle = {
styles: {
container: {
backgroundColor: 'oklch(var(--b3))', // Even more darker base color for input container
backgroundColor: 'oklch(var(--b3))',
border: 'unset',
color: 'oklch(var(--bc))' // Base content color
}
color: 'oklch(var(--bc))'
},
},
placeholder: { text: "Say anything here...", style: { color: 'oklch(var(--bc))' } } // Use base-200 color for placeholder
};
8 changes: 4 additions & 4 deletions selfie-ui/src/app/components/DocumentTable/DocumentTable.tsx
@@ -1,4 +1,4 @@
import React, { useEffect, useMemo, useState } from 'react';
import React, { useCallback, useEffect, useMemo, useState } from 'react';
import { FaRegTrashAlt } from 'react-icons/fa';
import { ChevronDownIcon, ChevronUpIcon } from "@heroicons/react/20/solid";
import { Document } from "@/app/types";
@@ -80,13 +80,13 @@ const DocumentTable: React.FC<DocumentTableProps> = ({ data, onDeleteDocuments,
return acc;
}, {});
});
}, [data]);;
}, [data]);

useEffect(() => {
onSelectionChange(Object.keys(selectedRows).filter((id) => selectedRows[id]));
}, [selectedRows, onSelectionChange]);

const toggleAllRowsSelected = () => {
const toggleAllRowsSelected = useCallback(() => {
if (allRowsSelected) {
setSelectedRows({});
} else {
@@ -96,7 +96,7 @@ const DocumentTable: React.FC<DocumentTableProps> = ({ data, onDeleteDocuments,
});
setSelectedRows(newSelectedRows);
}
};
}, [allRowsSelected, data, setSelectedRows])

const handleSelectRow = (id: string) => {
setSelectedRows((prev) => ({
2 changes: 1 addition & 1 deletion selfie-ui/src/app/components/Playground/PlaygroundChat.tsx
@@ -49,7 +49,7 @@ const PlaygroundChat = ({ disabled = false, hasIndexedDocuments = true }: { disa

<input
type="checkbox"
className="toggle mx-2 toggle-sm"
className="toggle mx-2 toggle-sm toggle-primary"
title={!hasIndexedDocuments ? 'Add and index some documents to enable augmentation.' : ''}
disabled={disabled}
checked={hasIndexedDocuments && !disableAugmentation}
@@ -123,7 +123,7 @@ const PlaygroundQuery = () => {
<span className="label-text">Include Summary</span>
<input
type="checkbox"
className="toggle toggle-sm"
className="toggle toggle-sm toggle-primary"
checked={includeSummary}
onChange={(e) => setIncludeSummary(e.target.checked)}
/>
1 change: 0 additions & 1 deletion selfie/embeddings/__init__.py
@@ -58,7 +58,6 @@ async def completion_async(prompt):

self.character_name = character_name
self.embeddings = Embeddings(
path="sentence-transformers/all-MiniLM-L6-v2",
sqlite={"wal": True},
# For now, sqlite w/the default driver is the only way to use WAL.
content=True
23 changes: 10 additions & 13 deletions start.sh
@@ -8,7 +8,9 @@ else
fi

MISSING_DEPENDENCIES=""
command -v python >/dev/null 2>&1 || MISSING_DEPENDENCIES="${MISSING_DEPENDENCIES} Python (https://www.python.org/downloads/)\n"

PYTHON_COMMAND=$(command -v python3 || { command -v python &>/dev/null && python --version 2>&1 | grep -q "Python 3" && echo python; })
[ -z "$PYTHON_COMMAND" ] && MISSING_DEPENDENCIES="${MISSING_DEPENDENCIES} Python 3 (https://www.python.org/downloads/)\n"
command -v poetry >/dev/null 2>&1 || MISSING_DEPENDENCIES="${MISSING_DEPENDENCIES} Poetry (https://python-poetry.org/docs/#installation)\n"
command -v yarn >/dev/null 2>&1 || MISSING_DEPENDENCIES="${MISSING_DEPENDENCIES} Yarn (https://yarnpkg.com/getting-started/install)\n"

@@ -17,25 +19,20 @@ if [ ! -z "$MISSING_DEPENDENCIES" ]; then
exit 1
fi

if command -v nvcc &>/dev/null || command -v rocm-info &>/dev/null || [ "$(uname -m)" = "arm64" ]; then
GPU_FLAG="--gpu"
else
GPU_FLAG=""
fi

echo "Installing Python dependencies with Poetry..."
poetry check || poetry install

echo "Building UI with Yarn..."
./scripts/build-ui.sh

ACCELERATION_FLAG=""

echo "Running llama-cpp-python-cublas.sh to enable hardware acceleration..."
./scripts/llama-cpp-python-cublas.sh

LLAMA_CPP_VERSION=$(poetry run pip list --format=freeze | grep "llama_cpp_python==" | cut -d'=' -f3)

if [[ $LLAMA_CPP_VERSION == *"-gpu"* ]]; then
echo "Accelerated version of llama_cpp_python detected. Enabling GPU support."
ACCELERATION_FLAG="--gpu"
else
echo "No accelerated version of llama_cpp_python detected. Running without GPU support."
fi

echo "Running selfie..."
poetry run python -m selfie $ACCELERATION_FLAG
poetry run python -m selfie $GPU_FLAG