✨ feat: files and knowledge base (lobehub#3487)
* ✨ feat: add files and knowledge base

Update edge.ts

Update test.yml

🎨 chore: fix locale

Update index.tsx

Test pgvector workflow

* 💄 style: improve upload detail

* ✨ feat: support delete s3 file when delete files

* 💄 style: improve chunks in message

* ♻️ refactor: refactor the auth method

* ✨ feat: support use user client api key

* 💄 style: fix image list in mobile

* ✨ feat: support file upload on mobile

* ✅ test: fix test

* fix vercel build

* docs: update docs

* 👷 build: improve docker

* update i18n
arvinxx authored Aug 21, 2024
1 parent d8950b2 commit 6574c01
Showing 352 changed files with 20,334 additions and 1,727 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/test.yml
@@ -8,7 +8,7 @@ jobs:

  services:
  postgres:
- image: postgres:16
+ image: pgvector/pgvector:pg16
  env:
  POSTGRES_PASSWORD: postgres
  options: >-
@@ -39,6 +39,7 @@ jobs:
  DATABASE_DRIVER: node
  NEXT_PUBLIC_SERVICE_MODE: server
  KEY_VAULTS_SECRET: LA7n9k3JdEcbSgml2sxfw+4TV1AzaaFU5+R176aQz4s=
+ NEXT_PUBLIC_S3_DOMAIN: https://example.com

  - name: Upload Server coverage to Codecov
  uses: codecov/codecov-action@v4
2 changes: 2 additions & 0 deletions Dockerfile.database
@@ -92,6 +92,7 @@ COPY --from=builder /deps/node_modules/drizzle-orm /app/node_modules/drizzle-orm
  # Copy database migrations
  COPY --from=builder /app/src/database/server/migrations /app/migrations
  COPY --from=builder /app/scripts/migrateServerDB/docker.cjs /app/docker.cjs
+ COPY --from=builder /app/scripts/migrateServerDB/errorHint.js /app/errorHint.js

  ## Production image, copy all the files and run next
  FROM base
@@ -107,6 +108,7 @@ ENV HOSTNAME="0.0.0.0" \

  # General Variables
  ENV ACCESS_CODE="" \
+ APP_URL="" \
  API_KEY_SELECT_MODE="" \
  DEFAULT_AGENT_CONFIG="" \
  SYSTEM_AGENT="" \
65 changes: 65 additions & 0 deletions docs/self-hosting/advanced/knowledge-base.zh-CN.mdx
@@ -0,0 +1,65 @@
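# Knowledge Base / File Upload

LobeChat supports file upload and knowledge base management. This feature depends on the core technical components below; understanding them will help you deploy and maintain the knowledge base system successfully.

## Core Components

### 1. PostgreSQL and PGVector

PostgreSQL is a powerful open-source relational database system, and PGVector is an extension for it that adds support for vector operations.

- **Purpose**: store structured data and vector indexes
- **Deployment recommendation**: use the official Docker image to deploy PostgreSQL and PGVector quickly

Example deployment script: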

```
docker run -p 5432:5432 -d --name pg -e POSTGRES_PASSWORD=mysecretpassword pgvector/pgvector:pg16
```

- **Note**: make sure to allocate enough resources to handle vector operations
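
Once the container above is running, the application needs a connection string that points at it, and the `vector` extension has to be enabled in the target database (for example with `CREATE EXTENSION IF NOT EXISTS vector;`, unless the migrations already take care of it). The sketch below builds such a connection string. It assumes the default `postgres` user and database created by the image; `PgConnection` and `buildDatabaseUrl` are hypothetical names used only for illustration.

```typescript
// Hypothetical connection settings; not part of the LobeChat codebase.
interface PgConnection {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Builds a PostgreSQL connection URL from its parts.
function buildDatabaseUrl(c: PgConnection): string {
  // Percent-encode credentials so characters such as "@" or ":" do not break the URL.
  const user = encodeURIComponent(c.user);
  const password = encodeURIComponent(c.password);
  return `postgres://${user}:${password}@${c.host}:${c.port}/${c.database}`;
}

// Values match the `docker run` example above; "postgres" is assumed to be
// the default user and database of the image.
const databaseUrl = buildDatabaseUrl({
  user: "postgres",
  password: "mysecretpassword",
  host: "localhost",
  port: 5432,
  database: "postgres",
});
```

Encoding the credentials matters in practice because generated passwords often contain reserved URL characters.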

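### 2. S3-Compatible Object Storage

S3 (or any storage service compatible with the S3 protocol) is used to store uploaded files.

- **Purpose**: store the original files
- **Options**: AWS S3, MinIO, or any other S3-compatible storage service
- **Note**: configure appropriate access permissions and security policies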
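
The workflow change in this commit sets `NEXT_PUBLIC_S3_DOMAIN` (to `https://example.com` in CI), which suggests that uploaded files are addressed relative to a public storage domain. A minimal sketch of joining such a domain with an object key follows. The `buildFileUrl` helper and the example key are hypothetical, and the real application may construct URLs differently.

```typescript
// Hypothetical helper: joins a public storage domain and an object key into a file URL.
function buildFileUrl(domain: string, key: string): string {
  // Drop trailing slashes from the domain and leading slashes from the key
  // so the result contains exactly one separator between them.
  const base = domain.replace(/\/+$/, "");
  const path = key
    .replace(/^\/+/, "")
    .split("/")
    .map((segment) => encodeURIComponent(segment)) // keep "/" separators, escape the rest
    .join("/");
  return `${base}/${path}`;
}

// Example key with a space, which must be percent-encoded in the URL.
const fileUrl = buildFileUrl("https://example.com/", "/files/report 2024.pdf");
```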

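### 3. OpenAI Embedding

OpenAI's embedding service converts text into vector representations.

- **Purpose**: generate vector representations of text for semantic search
- **Notes**:
  - A valid OpenAI API key is required
  - Implement appropriate API rate limiting and error handling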
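
Semantic search over embeddings usually ranks chunks by vector similarity; in pgvector, for instance, the `<=>` operator returns cosine distance. The sketch below shows the underlying computation in plain TypeScript. It is illustrative only: `cosineSimilarity`, `rankBySimilarity`, and `EmbeddedChunk` are hypothetical names, the vectors are toy values rather than real embeddings, and in a real deployment the database would normally perform this ranking.

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat similarity with a zero vector as 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical shape of a text chunk together with its embedding.
interface EmbeddedChunk {
  text: string;
  embedding: number[];
}

// Returns chunks sorted from most to least similar to the query embedding.
function rankBySimilarity(query: number[], chunks: EmbeddedChunk[]): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.chunk);
}

// Toy 3-dimensional vectors (real embeddings have many more dimensions).
const ranked = rankBySimilarity(
  [1, 0, 0],
  [
    { text: "unrelated", embedding: [0, 1, 0] },
    { text: "close match", embedding: [0.9, 0.1, 0] },
  ],
);
```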

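### 4. Unstructured.io (Optional)

Unstructured.io is a powerful document processing tool.

- **Purpose**: process complex document formats and extract structured information
- **Use cases**: documents in non-plain-text formats such as PDF and Word
- **Note**: assess your processing needs and decide whether to deploy it based on document complexity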
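
Whichever tool extracts the text, it is typically split into chunks before embedding (the locale strings in this commit refer to chunks being prepared and vectorized). The following is a minimal sketch of fixed-size chunking with overlap. It is an assumption for illustration, not the strategy Unstructured.io or LobeChat actually uses; `chunkText` and its parameters are hypothetical.

```typescript
// Splits text into chunks of at most `size` characters, where each chunk
// starts `size - overlap` characters after the previous one.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0) throw new Error("size must be positive");
  if (overlap < 0 || overlap >= size) {
    throw new Error("overlap must satisfy 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this chunk already reached the end of the text
  }
  return chunks;
}

// Ten characters, chunk size 4, overlap 1: consecutive chunks share one character.
const exampleChunks = chunkText("abcdefghij", 4, 1);
```

The overlap keeps a little shared context between neighboring chunks, so a sentence cut at a boundary is still partly visible from both sides.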

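## Deployment Considerations

1. **Data security**: make sure every component has appropriate security measures in place, especially when sensitive data is involved.

2. **Performance optimization**

   - Provision sufficient compute resources for PostgreSQL and PGVector
   - Optimize the access policies and caching for S3 storage

3. **Scalability**: design the architecture with future growth in data and users in mind.

4. **Monitoring and maintenance**

   - Set up logging and monitoring
   - Back up the database and object storage regularly

5. **Compliance**: make sure the deployment complies with the relevant data protection regulations and privacy policies.

With these core components correctly configured and integrated, you can build a powerful and efficient knowledge base system for LobeChat. Each component plays a key role in the overall architecture, and together they support advanced document management and intelligent retrieval.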
16 changes: 13 additions & 3 deletions locales/ar/chat.json
@@ -62,6 +62,7 @@
  "pin": "تثبيت",
  "pinOff": "إلغاء التثبيت",
  "rag": {
+ "referenceChunks": "مراجع",
  "userQuery": {
  "actions": {
  "delete": "حذف الاستعلام",
@@ -155,9 +156,18 @@
  },
  "updateAgent": "تحديث معلومات المساعد",
  "upload": {
- "actionFiletip": "تحميل المستند",
- "actionTooltip": "تحميل الصورة",
- "disabled": "النموذج الحالي لا يدعم التعرف على الرؤية، يرجى تغيير النموذج المستخدم",
+ "action": {
+ "fileUpload": "رفع ملف",
+ "folderUpload": "رفع مجلد",
+ "imageDisabled": "النموذج الحالي لا يدعم التعرف على الصور، يرجى تغيير النموذج لاستخدامه",
+ "imageUpload": "رفع صورة",
+ "tooltip": "رفع"
+ },
+ "clientMode": {
+ "actionFiletip": "رفع ملف",
+ "actionTooltip": "رفع",
+ "disabled": "النموذج الحالي لا يدعم التعرف على الصور وتحليل الملفات، يرجى تغيير النموذج لاستخدامه"
+ },
  "preview": {
  "prepareTasks": "تحضير الأجزاء...",
  "status": {
2 changes: 2 additions & 0 deletions locales/ar/components.json
@@ -50,6 +50,8 @@
  "chunks": {
  "embeddingStatus": {
  "empty": "لم يتم تحويل كتل النص بالكامل إلى متجهات، مما سيؤدي إلى عدم توفر وظيفة البحث الدلالي، لتحسين جودة البحث، يرجى تحويل كتل النص إلى متجهات",
+ "error": "فشل في تحويل البيانات إلى متجهات",
+ "errorResult": "فشل في تحويل البيانات إلى متجهات، يرجى التحقق والمحاولة مرة أخرى. سبب الفشل:",
  "processing": "يتم تحويل كتل النص إلى متجهات، يرجى الانتظار",
  "success": "تم تحويل جميع كتل النص الحالية إلى متجهات"
  },
16 changes: 13 additions & 3 deletions locales/bg-BG/chat.json
@@ -62,6 +62,7 @@
  "pin": "Закачи",
  "pinOff": "Откачи",
  "rag": {
+ "referenceChunks": "Цитирани източници",
  "userQuery": {
  "actions": {
  "delete": "Изтрий Query",
@@ -155,9 +156,18 @@
  },
  "updateAgent": "Актуализирай информацията за агента",
  "upload": {
- "actionFiletip": "Загрузите файл",
- "actionTooltip": "Качи изображение",
- "disabled": "Текущият модел не поддържа визуално разпознаване. Моля, превключи моделите, за да използваш тази функция.",
+ "action": {
+ "fileUpload": "Качване на файл",
+ "folderUpload": "Качване на папка",
+ "imageDisabled": "Текущият модел не поддържа визуално разпознаване, моля, превключете модела и опитайте отново",
+ "imageUpload": "Качване на изображение",
+ "tooltip": "Качване"
+ },
+ "clientMode": {
+ "actionFiletip": "Качване на файл",
+ "actionTooltip": "Качване",
+ "disabled": "Текущият модел не поддържа визуално разпознаване и анализ на файлове, моля, превключете модела и опитайте отново"
+ },
  "preview": {
  "prepareTasks": "Подготовка на парчета...",
  "status": {
2 changes: 2 additions & 0 deletions locales/bg-BG/components.json
@@ -50,6 +50,8 @@
  "chunks": {
  "embeddingStatus": {
  "empty": "Текстовите блокове все още не са напълно векторизирани, което ще доведе до недостъпност на семантичното търсене. За подобряване на качеството на търсенето, моля, векторизирайте текстовите блокове.",
+ "error": "Неуспешна векторизация",
+ "errorResult": "Неуспешна векторизация, моля проверете и опитайте отново. Причина за неуспеха:",
  "processing": "Текстовите блокове се векторизират, моля, бъдете търпеливи.",
  "success": "Текущите текстови блокове са напълно векторизирани."
  },
16 changes: 13 additions & 3 deletions locales/de-DE/chat.json
@@ -62,6 +62,7 @@
  "pin": "Anheften",
  "pinOff": "Anheften aufheben",
  "rag": {
+ "referenceChunks": "Referenzstücke",
  "userQuery": {
  "actions": {
  "delete": "Abfrage löschen",
@@ -155,9 +156,18 @@
  },
  "updateAgent": "Assistenteninformationen aktualisieren",
  "upload": {
- "actionFiletip": "Laden Sie die Datei hoch",
- "actionTooltip": "Bild hochladen",
- "disabled": "Das aktuelle Modell unterstützt keine visuelle Erkennung. Bitte wechseln Sie das Modell, um es zu verwenden.",
+ "action": {
+ "fileUpload": "Datei hochladen",
+ "folderUpload": "Ordner hochladen",
+ "imageDisabled": "Das aktuelle Modell unterstützt keine visuelle Erkennung. Bitte wechseln Sie das Modell, um diese Funktion zu nutzen.",
+ "imageUpload": "Bild hochladen",
+ "tooltip": "Hochladen"
+ },
+ "clientMode": {
+ "actionFiletip": "Datei hochladen",
+ "actionTooltip": "Hochladen",
+ "disabled": "Das aktuelle Modell unterstützt keine visuelle Erkennung und Dateianalyse. Bitte wechseln Sie das Modell, um diese Funktionen zu nutzen."
+ },
  "preview": {
  "prepareTasks": "Vorbereitung der Teile...",
  "status": {
2 changes: 2 additions & 0 deletions locales/de-DE/components.json
@@ -50,6 +50,8 @@
  "chunks": {
  "embeddingStatus": {
  "empty": "Textblöcke sind noch nicht vollständig vektorisiert, was die Funktion der semantischen Suche beeinträchtigen kann. Um die Suchqualität zu verbessern, vektorisieren Sie die Textblöcke.",
+ "error": "Vektorisierung fehlgeschlagen",
+ "errorResult": "Vektorisierung fehlgeschlagen, bitte überprüfen Sie und versuchen Sie es erneut. Grund für das Scheitern:",
  "processing": "Textblöcke werden vektorisiert, bitte haben Sie Geduld.",
  "success": "Alle aktuellen Textblöcke sind vektorisiert."
  },
16 changes: 13 additions & 3 deletions locales/en-US/chat.json
@@ -62,6 +62,7 @@
  "pin": "Pin",
  "pinOff": "Unpin",
  "rag": {
+ "referenceChunks": "Reference Source",
  "userQuery": {
  "actions": {
  "delete": "Delete Query Rewrite",
@@ -155,9 +156,18 @@
  },
  "updateAgent": "Update Assistant Information",
  "upload": {
- "actionFiletip": "Update File",
- "actionTooltip": "Upload Image",
- "disabled": "The current model does not support visual recognition. Please switch models to use this feature.",
+ "action": {
+ "fileUpload": "Upload File",
+ "folderUpload": "Upload Folder",
+ "imageDisabled": "The current model does not support visual recognition. Please switch models to use this feature.",
+ "imageUpload": "Upload Image",
+ "tooltip": "Upload"
+ },
+ "clientMode": {
+ "actionFiletip": "Upload File",
+ "actionTooltip": "Upload",
+ "disabled": "The current model does not support visual recognition and file analysis. Please switch models to use this feature."
+ },
  "preview": {
  "prepareTasks": "Preparing chunks...",
  "status": {
2 changes: 2 additions & 0 deletions locales/en-US/components.json
@@ -50,6 +50,8 @@
  "chunks": {
  "embeddingStatus": {
  "empty": "Text chunks have not been fully vectorized, which will render the semantic search feature unavailable. To improve search quality, please vectorize the text chunks.",
+ "error": "Vectorization failed",
+ "errorResult": "Vectorization failed, please check and try again. Reason for failure:",
  "processing": "Text chunks are being vectorized, please be patient.",
  "success": "All current text chunks have been vectorized."
  },
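
The four `embeddingStatus` keys in `locales/en-US/components.json` (`empty`, `error`, `processing`, `success`) suggest a small set of states for a file's vectorization. Below is a hedged sketch of how a UI might choose which key to display. The `EmbeddingState` shape, its field names, and `embeddingStatusKey` are hypothetical and are not taken from the LobeChat source.

```typescript
// Hypothetical state for one file's chunks; field names are illustrative only.
interface EmbeddingState {
  totalChunks: number;
  embeddedChunks: number;
  running: boolean;
  errorMessage?: string;
}

// The locale keys under `chunks.embeddingStatus` shown in the diff.
type EmbeddingStatusKey = "empty" | "error" | "processing" | "success";

// Maps a state to one of the locale keys.
function embeddingStatusKey(state: EmbeddingState): EmbeddingStatusKey {
  if (state.errorMessage !== undefined) return "error";
  if (state.running) return "processing";
  if (state.totalChunks > 0 && state.embeddedChunks >= state.totalChunks) return "success";
  return "empty"; // chunks exist but are not (fully) vectorized yet
}

// All 12 chunks embedded and no task running: resolves to "success".
const resolvedKey = embeddingStatusKey({ totalChunks: 12, embeddedChunks: 12, running: false });
```

Checking the error first means a failed run is reported even if some chunks were embedded before the failure, which matches the intent of the `errorResult` string ("Reason for failure:").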
16 changes: 13 additions & 3 deletions locales/es-ES/chat.json
@@ -62,6 +62,7 @@
  "pin": "Fijar",
  "pinOff": "Desfijar",
  "rag": {
+ "referenceChunks": "Fragmentos de referencia",
  "userQuery": {
  "actions": {
  "delete": "Eliminar reescritura de consulta",
@@ -155,9 +156,18 @@
  },
  "updateAgent": "Actualizar información del asistente",
  "upload": {
- "actionFiletip": "Sube el archivo",
- "actionTooltip": "Subir imagen",
- "disabled": "El modelo actual no admite reconocimiento visual. Por favor, cambia de modelo para usar esta función",
+ "action": {
+ "fileUpload": "Subir archivo",
+ "folderUpload": "Subir carpeta",
+ "imageDisabled": "El modelo actual no soporta reconocimiento visual, por favor cambie de modelo para usar esta función",
+ "imageUpload": "Subir imagen",
+ "tooltip": "Subir"
+ },
+ "clientMode": {
+ "actionFiletip": "Subir archivo",
+ "actionTooltip": "Subir",
+ "disabled": "El modelo actual no soporta reconocimiento visual ni análisis de archivos, por favor cambie de modelo para usar esta función"
+ },
  "preview": {
  "prepareTasks": "Preparando fragmentos...",
  "status": {
2 changes: 2 additions & 0 deletions locales/es-ES/components.json
@@ -50,6 +50,8 @@
  "chunks": {
  "embeddingStatus": {
  "empty": "Los bloques de texto aún no están completamente vectorizados, lo que hará que la función de búsqueda semántica no esté disponible. Para mejorar la calidad de búsqueda, por favor vectorice los bloques de texto.",
+ "error": "Error de vectorización",
+ "errorResult": "Error de vectorización, por favor verifica y vuelve a intentarlo. Motivo del fallo:",
  "processing": "Los bloques de texto están siendo vectorizados, por favor, tenga paciencia.",
  "success": "Todos los bloques de texto actuales han sido vectorizados."
  },
16 changes: 13 additions & 3 deletions locales/fr-FR/chat.json
@@ -62,6 +62,7 @@
  "pin": "Épingler",
  "pinOff": "Désépingler",
  "rag": {
+ "referenceChunks": "Références",
  "userQuery": {
  "actions": {
  "delete": "Supprimer la réécriture de la requête",
@@ -155,9 +156,18 @@
  },
  "updateAgent": "Mettre à jour les informations de l'agent",
  "upload": {
- "actionFiletip": "Télécharger le fichier",
- "actionTooltip": "Télécharger une image",
- "disabled": "Le modèle actuel ne prend pas en charge la reconnaissance visuelle. Veuillez changer de modèle pour utiliser cette fonctionnalité.",
+ "action": {
+ "fileUpload": "Télécharger un fichier",
+ "folderUpload": "Télécharger un dossier",
+ "imageDisabled": "Le modèle actuel ne prend pas en charge la reconnaissance visuelle, veuillez changer de modèle pour l'utiliser",
+ "imageUpload": "Télécharger une image",
+ "tooltip": "Télécharger"
+ },
+ "clientMode": {
+ "actionFiletip": "Télécharger un fichier",
+ "actionTooltip": "Télécharger",
+ "disabled": "Le modèle actuel ne prend pas en charge la reconnaissance visuelle et l'analyse de fichiers, veuillez changer de modèle pour l'utiliser"
+ },
  "preview": {
  "prepareTasks": "Préparation des morceaux...",
  "status": {
2 changes: 2 additions & 0 deletions locales/fr-FR/components.json
@@ -50,6 +50,8 @@
  "chunks": {
  "embeddingStatus": {
  "empty": "Les blocs de texte n'ont pas encore été entièrement vectorisés, ce qui rendra la fonction de recherche sémantique indisponible. Pour améliorer la qualité de la recherche, veuillez vectoriser les blocs de texte.",
+ "error": "Échec de la vectorisation",
+ "errorResult": "Échec de la vectorisation, veuillez vérifier et réessayer. Raison de l'échec :",
  "processing": "Les blocs de texte sont en cours de vectorisation, veuillez patienter.",
  "success": "Tous les blocs de texte sont maintenant vectorisés."
  },
16 changes: 13 additions & 3 deletions locales/it-IT/chat.json
@@ -62,6 +62,7 @@
  "pin": "Fissa in alto",
  "pinOff": "Annulla fissaggio in alto",
  "rag": {
+ "referenceChunks": "Citazioni di riferimento",
  "userQuery": {
  "actions": {
  "delete": "Elimina la Query riscritta",
@@ -155,9 +156,18 @@
  },
  "updateAgent": "Aggiorna informazioni assistente",
  "upload": {
- "actionFiletip": "Carica il file",
- "actionTooltip": "Carica immagine",
- "disabled": "Il modello attuale non supporta il riconoscimento visivo, si prega di cambiare modello prima di utilizzarlo",
+ "action": {
+ "fileUpload": "Carica file",
+ "folderUpload": "Carica cartella",
+ "imageDisabled": "Il modello attuale non supporta il riconoscimento visivo, si prega di cambiare modello per utilizzare questa funzione",
+ "imageUpload": "Carica immagine",
+ "tooltip": "Carica"
+ },
+ "clientMode": {
+ "actionFiletip": "Carica file",
+ "actionTooltip": "Carica",
+ "disabled": "Il modello attuale non supporta il riconoscimento visivo e l'analisi dei file, si prega di cambiare modello per utilizzare questa funzione"
+ },
  "preview": {
  "prepareTasks": "Preparazione dei blocchi...",
  "status": {
2 changes: 2 additions & 0 deletions locales/it-IT/components.json
@@ -50,6 +50,8 @@
  "chunks": {
  "embeddingStatus": {
  "empty": "I blocchi di testo non sono stati completamente vettorizzati, il che comporterà l'impossibilità di utilizzare la funzione di ricerca semantica. Per migliorare la qualità della ricerca, si prega di vettorizzare i blocchi di testo.",
+ "error": "Errore di vettorizzazione",
+ "errorResult": "Vettorizzazione fallita, controlla e riprova. Motivo del fallimento:",
  "processing": "I blocchi di testo sono in fase di vettorizzazione, ti preghiamo di attendere",
  "success": "Attualmente tutti i blocchi di testo sono stati vettorizzati"
  },