Commit
feat: rag with internet
leoguillaumegouv committed Oct 2, 2024
1 parent a492159 commit 3fbe2c0
Showing 65 changed files with 2,203 additions and 1,624 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -200,4 +200,5 @@ terraform.rc
**/config.yml
.DS_Store
.idea
.vscode
+*.json
30 changes: 30 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,30 @@
# Changelog

All notable changes to the application are documented in this file.

**Legend:**
- 💣 Breaking changes
- 🎉 New features
- 🐛 Bug fixes
- 📚 Documentation
- 🧪 Tests
- 🤖 CI/CD
- 🔄 Refactoring
- ❌ Deprecated

## [Alpha] - 2024-10-01

- 💣 Collections are now referenced by their collection ID rather than by their name (see the sketch after this list)
- 💣 The POST `/v1/files` endpoint no longer creates a collection if it does not exist
- 🎉 The POST `/v1/files` endpoint now accepts all chunking parameters
- 🎉 Added user and admin roles for creating public collections
- 🎉 Added an "internet" collection that runs a web search to supplement the model's response
- 🎉 Sources are displayed in the chat UI
- 🐛 Errors are reported more clearly when uploading files
- 🔄 The endpoints' pydantic models are harmonized and made more restrictive
- 🔄 Clients are instantiated through a ClientsManager class
- 🧪 Added unit tests
- 📚 Added a tutorial for importing knowledge bases
- ❌ Docx files are no longer supported for file upload
- ❌ Removed uploading several files in a single request
- ❌ Removed the POST `/v1/chunks` endpoint for retrieving several chunks in a single request
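
A rough sketch of the new collection-by-ID flow, assuming a local deployment at `http://localhost:8080/v1`, a hypothetical `API_KEY`, and placeholder name/model/type values; the ID returned by POST `/v1/collections` is what subsequent endpoints now expect:

```python
import httpx

BASE_URL = "http://localhost:8080/v1"  # assumed local deployment
HEADERS = {"Authorization": "Bearer API_KEY"}  # hypothetical API key

# Create a collection: the API generates and returns the collection ID.
payload = {"name": "my-docs", "model": "my-embeddings-model", "type": "private"}  # placeholder values
response = httpx.post(f"{BASE_URL}/collections", headers=HEADERS, json=payload)
response.raise_for_status()
collection_id = response.json()["id"]

# Subsequent endpoints take the collection ID, not the collection name.
chunk = httpx.get(f"{BASE_URL}/chunks/{collection_id}/some-chunk-id", headers=HEADERS)  # hypothetical chunk ID
```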
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -39,7 +39,7 @@ Before each pull request, please verify that your API deploys correctly
2. Run the unit tests

```bash
-PYTHONPATH=. pytest -v --exitfirst app/tests --base-url http://localhost:8080/v1 --api-key API_KEY
+PYTHONPATH=. pytest -v --exitfirst app/tests --base-url http://localhost:8080/v1 --api-key-user API_KEY_USER --api-key-admin API_KEY_ADMIN --log-cli-level=INFO
```

# Linter
26 changes: 20 additions & 6 deletions README.md
@@ -8,32 +8,46 @@ Albert API is an open source generative AI API developed by Etalab. It

### OpenAI conventions

Based on the conventions defined by OpenAI, the Albert API exposes endpoints that can be called with the [official OpenAI Python client](https://github.com/openai/openai-python/tree/main). This formalism makes it easy to integrate the Albert API with third-party libraries such as [Langchain](https://www.langchain.com/) or [LlamaIndex](https://www.llamaindex.ai/).

## ⚙️ Features

### Conversing with a language model (chat memory)

The Albert API lets you converse with various language models.

<a target="_blank" href="https://colab.research.google.com/github/etalab-ia/albert-api/blob/main/docs/tutorials/chat_completions.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Albert API natively handles message memory for conversations, without adding arguments to the `/v1/chat/completions` endpoint beyond OpenAI's documentation. This consists of sending the conversation history with each request so the model has the full context.
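
For illustration, a minimal sketch with the official OpenAI Python client, assuming a deployment at `http://localhost:8080/v1`, a hypothetical `API_KEY`, and a placeholder model name:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="API_KEY")  # assumed deployment and key

# The whole conversation history is sent with each request so the model keeps the context.
messages = [{"role": "user", "content": "Quelle est la capitale de la France ?"}]
response = client.chat.completions.create(model="my-language-model", messages=messages)  # placeholder model

messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "Et sa population ?"})
response = client.chat.completions.create(model="my-language-model", messages=messages)
print(response.choices[0].message.content)
```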

### Accessing multiple language models (multi models)

The Albert API provides access to a set of language and embedding models through a single API.

<a target="_blank" href="https://colab.research.google.com/github/etalab-ia/albert-api/blob/main/docs/tutorials/models.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Through a configuration file (*[config.example.yml](./config.example.yml)*) you can connect as many model APIs as you like. The Albert API pools access to all these models behind a single API. You can get the list of available models by calling the `/v1/models` endpoint.
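
A minimal sketch, under the same assumed deployment and hypothetical key, listing every model the gateway pools:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="API_KEY")  # assumed deployment and key

# One API call lists all models, whichever backend API actually serves each one.
for model in client.models.list():
    print(model.id)
```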

### Querying documents (RAG)

The Albert API lets you query documents stored in a vector database. These documents are organized into collections. You can create your own private collections and use the existing public collections. Finally, an "internet" collection runs a web search to supplement the model's response.

<a target="_blank" href="https://colab.research.google.com/github/etalab-ia/albert-api/blob/main/docs/tutorials/retrival_augmented_generation.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
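
A minimal sketch, under the same assumptions, listing the collections visible to a user; the built-in "internet" collection appears alongside private and public ones:

```python
import httpx

BASE_URL = "http://localhost:8080/v1"  # assumed local deployment
HEADERS = {"Authorization": "Bearer API_KEY"}  # hypothetical API key

# The "internet" pseudo-collection is returned with the others and can be
# targeted like any collection to trigger a web search.
response = httpx.get(f"{BASE_URL}/collections", headers=HEADERS)
response.raise_for_status()
for collection in response.json()["data"]:
    print(collection["id"], collection["name"], collection["type"])
```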

### Importing your knowledge base into Albert (knowledge database)

The Albert API lets you import your knowledge base into a vector database. That vector database can then be used for RAG (Retrieval Augmented Generation).

<a target="_blank" href="https://colab.research.google.com/github/etalab-ia/albert-api/blob/main/docs/tutorials/import_knowledge_database.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
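
A minimal sketch of an import, under the same assumptions and with hypothetical multipart field names (the exact schema is defined by the files endpoint); note that POST `/v1/files` no longer creates the collection, so it must exist first:

```python
import httpx

BASE_URL = "http://localhost:8080/v1"  # assumed local deployment
HEADERS = {"Authorization": "Bearer API_KEY"}  # hypothetical API key

collection_id = "..."  # ID returned when the collection was created; POST /v1/files no longer creates it

# Hypothetical field names: check the files endpoint for the exact multipart schema.
with open("knowledge.pdf", "rb") as f:
    response = httpx.post(
        f"{BASE_URL}/files",
        headers=HEADERS,
        files={"file": f},
        data={"collection": collection_id},
    )
response.raise_for_status()
```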

## 🧑‍💻 Contributing to the project

Albert API is an open source project and you can contribute to it; please read our [contribution guide](./CONTRIBUTING.md).

## Installation

To deploy the Albert API on your own infrastructure, follow the [documentation](./docs/deployment.md).
18 changes: 9 additions & 9 deletions app/endpoints/chat.py
@@ -5,21 +5,22 @@
import httpx

from app.schemas.chat import ChatCompletion, ChatCompletionChunk, ChatCompletionRequest
-from app.schemas.config import LANGUAGE_MODEL_TYPE
+from app.schemas.security import User
from app.utils.lifespan import clients
from app.utils.security import check_api_key
+from app.utils.variables import LANGUAGE_MODEL_TYPE

router = APIRouter()


@router.post("/chat/completions")
-async def chat_completions(request: ChatCompletionRequest, user: str = Security(check_api_key)) -> Union[ChatCompletion, ChatCompletionChunk]:
+async def chat_completions(request: ChatCompletionRequest, user: User = Security(check_api_key)) -> Union[ChatCompletion, ChatCompletionChunk]:
    """Completion API similar to OpenAI's API.
    See https://platform.openai.com/docs/api-reference/chat/create for the API specification.
    """

    request = dict(request)
-    client = clients["models"][request["model"]]
+    client = clients.models[request["model"]]
    if client.type != LANGUAGE_MODEL_TYPE:
        raise HTTPException(status_code=400, detail="Model is not a language model")

@@ -31,12 +32,11 @@ async def chat_completions(request: ChatCompletionRequest, user: str = Security(

    # non stream case
    if not request["stream"]:
-        async_client = httpx.AsyncClient(timeout=20)
-        response = await async_client.request(method="POST", url=url, headers=headers, json=request)
-        print(response.text)
-        response.raise_for_status()
-        data = response.json()
-        return ChatCompletion(**data)
+        async with httpx.AsyncClient(timeout=20) as async_client:
+            response = await async_client.request(method="POST", url=url, headers=headers, json=request)
+            response.raise_for_status()
+            data = response.json()
+            return ChatCompletion(**data)

    # stream case
    async def forward_stream(url: str, headers: dict, request: dict):
23 changes: 17 additions & 6 deletions app/endpoints/chunks.py
@@ -1,33 +1,44 @@
-from fastapi import APIRouter, Security
+from uuid import UUID
+
+from fastapi import APIRouter, HTTPException, Security
from qdrant_client.http.models import Filter, HasIdCondition

-from app.helpers._vectorstore import VectorStore
+from app.helpers import VectorStore
from app.schemas.chunks import Chunk, ChunkRequest, Chunks
from app.utils.lifespan import clients
from app.utils.security import check_api_key
+from app.schemas.security import User

router = APIRouter()


@router.get("/chunks/{collection}/{chunk}")
-async def get_chunk(collection: str, chunk: str, user: str = Security(check_api_key)) -> Chunk:
+async def get_chunk(collection: UUID, chunk: str, user: User = Security(check_api_key)) -> Chunk:
    """
    Get a single chunk.
    """
+    collection = str(collection)
    vectorstore = VectorStore(clients=clients, user=user)
    ids = [chunk]
    filter = Filter(must=[HasIdCondition(has_id=ids)])
-    chunks = vectorstore.get_chunks(collection_name=collection, filter=filter)
+    try:
+        chunks = vectorstore.get_chunks(collection_id=collection, filter=filter)
+    except AssertionError as e:
+        raise HTTPException(status_code=400, detail=str(e))
    return chunks[0]


@router.post("/chunks/{collection}")
-async def get_chunks(collection: str, request: ChunkRequest, user: str = Security(check_api_key)) -> Chunks:
+async def get_chunks(collection: UUID, request: ChunkRequest, user: User = Security(check_api_key)) -> Chunks:
    """
    Get multiple chunks.
    """
+    collection = str(collection)
    vectorstore = VectorStore(clients=clients, user=user)
    ids = request.chunks
    filter = Filter(must=[HasIdCondition(has_id=ids)])
-    chunks = vectorstore.get_chunks(collection_name=collection, filter=filter)
+    try:
+        chunks = vectorstore.get_chunks(collection_id=collection, filter=filter)
+    except AssertionError as e:
+        raise HTTPException(status_code=400, detail=str(e))
    return Chunks(data=chunks)
79 changes: 44 additions & 35 deletions app/endpoints/collections.py
@@ -1,71 +1,80 @@
-from typing import Optional, Union
+from typing import Literal, Optional, Union
+import uuid
+from uuid import UUID

-from fastapi import APIRouter, Response, Security, HTTPException
+from fastapi import APIRouter, HTTPException, Response, Security
+from fastapi.responses import JSONResponse

from app.helpers import VectorStore
-from app.schemas.collections import Collection, Collections, CreateCollectionRequest
+from app.schemas.collections import Collection, CollectionRequest, Collections
+from app.schemas.security import User
from app.utils.lifespan import clients
from app.utils.security import check_api_key
+from app.utils.variables import PUBLIC_COLLECTION_TYPE, INTERNET_COLLECTION_ID

router = APIRouter()


@router.post("/collections")
async def create_collection(request: CollectionRequest, user: User = Security(check_api_key)) -> Response:
"""
Create a new collection.
"""
vectorstore = VectorStore(clients=clients, user=user)
collection_id = str(uuid.uuid4())
try:
vectorstore.create_collection(
collection_id=collection_id, collection_name=request.name, collection_model=request.model, collection_type=request.type
)
except AssertionError as e:
raise HTTPException(status_code=400, detail=str(e))

return JSONResponse(status_code=201, content={"id": collection_id})


@router.get("/collections/{collection}")
@router.get("/collections")
async def get_collections(collection: Optional[str] = None, user: str = Security(check_api_key)) -> Union[Collection, Collections]:
async def get_collections(
collection: Optional[Union[UUID, Literal["internet"]]] = None, user: User = Security(check_api_key)
) -> Union[Collection, Collections]:
"""
Get list of collections.
Args:
collection (str): ID of the collection.
"""

internet_collection = Collection(
id=INTERNET_COLLECTION_ID,
name=INTERNET_COLLECTION_ID,
model=None,
type=PUBLIC_COLLECTION_TYPE,
description="Use this collection to search on the internet.",
)
if collection == "internet":
return internet_collection

collection_ids = [str(collection)] if collection else []
vectorstore = VectorStore(clients=clients, user=user)
try:
collection_ids = [collection] if collection else []
data = vectorstore.get_collection_metadata(collection_ids=collection_ids)
except AssertionError as e:
# TODO: return a 404 error if collection not found
raise HTTPException(status_code=400, detail=str(e))

if collection:
return data[0]

data.append(internet_collection)
return Collections(data=data)


@router.delete("/collections/{collection}")
-async def delete_collections(collection: Optional[str] = None, user: str = Security(check_api_key)) -> Response:
+async def delete_collections(collection: UUID, user: User = Security(check_api_key)) -> Response:
    """
-    Delete private collections.
-    Args:
-        collection (str, optional): ID of the collection. If not provided, all collections for the user are deleted.
+    Delete a collection.
    """
+    collection = str(collection)
    vectorstore = VectorStore(clients=clients, user=user)
    try:
        vectorstore.delete_collection(collection_id=collection)
    except AssertionError as e:
        raise HTTPException(status_code=400, detail=str(e))

    return Response(status_code=204)


@router.post("/collections")
async def create_collection(request: CreateCollectionRequest, user: str = Security(check_api_key)) -> Response:
"""
Create a new private collection.
Args:
request (CreateCollectionRequest): Request body.
"""
vectorstore = VectorStore(clients=clients, user=user)

try:
vectorstore.create_collection(
collection_id=request.id, collection_name=request.name, collection_model=request.model, collection_type=request.type
)
except AssertionError as e:
raise HTTPException(status_code=400, detail=str(e))

return Response(status_code=201)
21 changes: 16 additions & 5 deletions app/endpoints/completions.py
@@ -1,22 +1,33 @@
-from fastapi import APIRouter, Security
+from fastapi import APIRouter, HTTPException, Security
+import httpx

from app.schemas.completions import CompletionRequest, Completions
+from app.schemas.security import User
from app.utils.lifespan import clients
from app.utils.security import check_api_key
+from app.utils.variables import LANGUAGE_MODEL_TYPE

router = APIRouter()


@router.post("/completions")
-async def completions(request: CompletionRequest, user: str = Security(check_api_key)) -> Completions:
+async def completions(request: CompletionRequest, user: User = Security(check_api_key)) -> Completions:
    """
    Completion API similar to OpenAI's API.
    See https://platform.openai.com/docs/api-reference/completions/create for the API specification.
    """

    request = dict(request)
-    client = clients["models"][request["model"]]
-    response = client.completions.create(**request)
-
-    return response
+    client = clients.models[request["model"]]
+
+    if client.type != LANGUAGE_MODEL_TYPE:
+        raise HTTPException(status_code=400, detail="Model is not a language model")
+
+    url = f"{client.base_url}completions"
+    headers = {"Authorization": f"Bearer {client.api_key}"}
+
+    async with httpx.AsyncClient(timeout=20) as async_client:
+        response = await async_client.request(method="POST", url=url, headers=headers, json=request)
+        response.raise_for_status()
+        data = response.json()
+        return Completions(**data)
22 changes: 16 additions & 6 deletions app/endpoints/embeddings.py
@@ -1,25 +1,35 @@
from fastapi import APIRouter, HTTPException, Security
+import httpx

-from app.schemas.config import EMBEDDINGS_MODEL_TYPE
from app.schemas.embeddings import Embeddings, EmbeddingsRequest
+from app.schemas.security import User
from app.utils.lifespan import clients
from app.utils.security import check_api_key
+from app.utils.variables import EMBEDDINGS_MODEL_TYPE

router = APIRouter()


-# @ TODO pass to async with httpsx
@router.post("/embeddings")
-async def embeddings(request: EmbeddingsRequest, user: str = Security(check_api_key)) -> Embeddings:
+async def embeddings(request: EmbeddingsRequest, user: User = Security(check_api_key)) -> Embeddings:
    """
    Embedding API similar to OpenAI's API.
    See https://platform.openai.com/docs/api-reference/embeddings/create for the API specification.
    """

    request = dict(request)
-    client = clients["models"][request["model"]]
+    client = clients.models[request["model"]]
    if client.type != EMBEDDINGS_MODEL_TYPE:
        raise HTTPException(status_code=400, detail=f"Model type must be {EMBEDDINGS_MODEL_TYPE}")
-    response = client.embeddings.create(**request)
-
-    return response
+
+    url = f"{client.base_url}embeddings"
+    headers = {"Authorization": f"Bearer {client.api_key}"}
+
+    if not client.check_context_length(model=request["model"], messages=request["messages"]):
+        raise HTTPException(status_code=400, detail="Context length too large")
+
+    async with httpx.AsyncClient(timeout=20) as async_client:
+        response = await async_client.request(method="POST", url=url, headers=headers, json=request)
+        response.raise_for_status()
+        data = response.json()
+        return Embeddings(**data)