Releases: zylon-ai/private-gpt
v0.6.2
0.6.2 (2024-08-08)
We are excited to announce the release of PrivateGPT 0.6.2, a “minor” version, which brings significant enhancements to our Docker setup, making it easier than ever to deploy and manage PrivateGPT in various environments.
Key Improvements
Our latest version introduces several key improvements that will streamline your deployment process:
- Simplified cold start with better Docker Compose integration: Easily manage multiple services with improved Docker Compose configurations, reducing complexity and increasing efficiency.
- Environment-Specific Profiles: Tailor your setup to different environments, including CPU, CUDA (Nvidia GPU), and macOS, ensuring optimal performance and compatibility in one click (see the sketch below).
- Pre-built Docker Hub Images: Take advantage of ready-to-use Docker images for faster deployment and reduced setup time. More information can be found here.
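As a rough illustration of the profile-based setup, the commands below assume profile names like ollama-cpu and ollama-cuda in the repository's docker-compose.yaml; check the Quickstart Guide for the exact profile names and images.

```shell
# Minimal sketch, assuming profiles named "ollama-cpu" and "ollama-cuda" exist in
# docker-compose.yaml; verify the actual names in the Quickstart Guide.
git clone https://github.com/zylon-ai/private-gpt.git
cd private-gpt

# CPU-only setup
docker compose --profile ollama-cpu up

# Nvidia GPU (CUDA) setup
docker compose --profile ollama-cuda up
```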
Docker Demo
This demo will give you a firsthand look at the simplicity and ease of use that our tool offers, allowing you to get started with PrivateGPT + Ollama quickly and efficiently.
demo-docker.mp4
Get Started Quickly
To quickly get started with PrivateGPT 0.6.2 using Docker Compose, including our pre-built profiles, please visit our Quickstart Guide for more information on how to run PrivateGPT.
We hope these improvements enhance your experience and streamline your deployment process. Thank you for your continued support!
Bug Fixes
v0.6.1
v0.6.0
0.6.0 (2024-08-02)
What's new
Introducing Recipes!
Recipes are high-level APIs that represent AI-native use cases. Under the hood, recipes execute complex pipelines to get the work done.
With the introduction of the first recipe, summarize, our aim is not only to include that useful use case in PrivateGPT but also to get the project ready to onboard community-built recipes!
Summarization Recipe
summarize is the first recipe included in PrivateGPT. The new API lets users summarize ingested documents, customize the resulting summary, and stream the result. Read the full documentation here.
POST /v1/summarize
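For illustration only, a request might look like the sketch below; the request-body fields and the default port are assumptions based on the description above, not the confirmed schema, so check the full API documentation.

```shell
# Hypothetical request sketch: the "text" and "stream" fields and port 8001 are
# assumptions; consult the summarize API reference for the actual request body.
curl -X POST http://localhost:8001/v1/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Long document text to summarize ...", "stream": false}'
```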
Improved cold-start
We've put a lot of effort into making PrivateGPT run from a fresh clone as straightforwardly as possible: defaulting to Ollama, auto-pulling models, making the tokenizer optional...
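As a sketch of what that cold start looks like, the commands below assume a typical Ollama-based local setup; the exact Poetry extras are an assumption, so follow the installation docs for the flags that match your configuration.

```shell
# Sketch of a local run from a fresh clone; the extras names are assumptions taken
# from a typical Ollama-based setup, not a confirmed invocation.
poetry install --extras "ui llms-ollama embeddings-ollama vector-stores-qdrant"
# With Ollama running locally, required models are pulled automatically on first start.
make run
```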
More models and databases support
Added support for Gemini (both LLM and Embeddings) and for the Milvus and ClickHouse vector databases.
Breaking changes
- The minimum required Python version is now 3.11.9, and Poetry must be >= 1.7.1; however, we recommend updating to Poetry 1.8.3. Instructions for updating (see the consolidated sketch after this list):
  - Python 3.11.9:
    - Before proceeding, make sure pyenv is installed on your system. If it isn't, you can install it by following the instructions in the PrivateGPT documentation.
    - Use pyenv to install the specific version of Python: pyenv install 3.11.9
    - Verify the installation by running python --version in your terminal.
  - Poetry 1.8.3:
    - Update Poetry if already installed: poetry self update 1.8.3
    - Verify the installation by running poetry --version in your terminal.
- The default LLM model is now LLaMA 3.1 for both Ollama and Llamacpp local setups. If you want to keep using the v0.5.0 defaults, place this settings-legacy.yaml file next to your settings.yaml file and run PrivateGPT with PGPT_PROFILES=legacy make run. Learn more about profiles here.
- The default Embeddings model is now nomic-embed-text for both Ollama and Llamacpp local setups. This embeddings model may use a different dimension than the one you were using before, making it incompatible with already ingested files. If you want to keep the v0.5.0 defaults and not lose your ingested files, place this settings-legacy.yaml file next to your settings.yaml file and run PrivateGPT with PGPT_PROFILES=legacy make run. Learn more about profiles here. Alternatively, if you prefer to start fresh, you can wipe your existing vector database by removing the local_data folder (see the sketch after this list).
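The sketch below consolidates the upgrade and migration steps above; the pyenv local pin is an assumption about how you select the interpreter in your checkout, and settings-legacy.yaml is assumed to already be placed next to settings.yaml.

```shell
# Toolchain upgrade (commands from the notes above; "pyenv local" is an assumption
# about how the interpreter is pinned in your checkout).
pyenv install 3.11.9
pyenv local 3.11.9
python --version           # should report 3.11.9
poetry self update 1.8.3
poetry --version           # should report 1.8.3

# Option A: keep the v0.5.0 model defaults via the legacy profile
# (assumes settings-legacy.yaml has been placed next to settings.yaml).
PGPT_PROFILES=legacy make run

# Option B: adopt the new defaults and start fresh by wiping the local vector database.
rm -rf local_data
make run
```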
Full Changelog
Features
- bump dependencies (#1987) (b687dc8)
- docs: add privategpt-ts sdk (#1924) (d13029a)
- docs: Fix setup docu (#1926) (067a5f1)
- docs: update doc for ipex-llm (#1968) (19a7c06)
- docs: update documentation and fix preview-docs (#2000) (4523a30)
- llm: add progress bar when ollama is pulling models (#2031) (cf61bf7)
- llm: autopull ollama models (#2019) (20bad17)
- llm: Support for Google Gemini LLMs and Embeddings (#1965) (fc13368)
- make llama3.1 as default (#2022) (9027d69)
- prompt_style applied to all LLMs + extra LLM params. (#1835) (e21bf20)
- recipe: add our first recipe Summarize (#2028) (8119842)
- vectordb: Milvus vector db Integration (#1996) (43cc31f)
- vectorstore: Add clickhouse support as vectore store (#1883) (2612928)
Bug Fixes
- "no such group" error in Dockerfile, added docx2txt and cryptography deps (#1841) (947e737)
- config: make tokenizer optional and include a troubleshooting doc (#1998) (01b7ccd)
- docs: Fix concepts.mdx referencing to installation page (#1779) (dde0224)
- docs: Update installation.mdx (#1866) (c1802e7)
- ffmpy dependency (#2020) (dabf556)
- light mode (#2025) (1020cd5)
- LLM: mistral ignoring assistant messages (#1954) (c7212ac)
- llm: special tokens and leading space (#1831) (347be64)
- make embedding_api_base match api_base when on docker (#1859) (2a432bf)
- nomic embeddings (#2030) (5465958)
- prevent to ingest local files (by default) (#2010) (e54a8fe)
- Replacing unsafe eval() with json.loads() (#1890) (9d0d614)
- settings: enable cors by default so it will work when using ts sdk (spa) (#1925) (966af47)
- ui: gradio bug fixes (#2021) (d4375d0)
- unify embedding models (#2027) (40638a1)
v0.5.0
0.5.0 (2024-04-02)
Features
- code: improve concat of strings in ui (#1785) (bac818a)
- docker: set default Docker to use Ollama (#1812) (f83abff)
- docs: Add guide Llama-CPP Linux AMD GPU support (#1782) (8a836e4)
- docs: Feature/upgrade docs (#1741) (5725181)
- docs: upgrade fern (#1596) (84ad16a)
- ingest: Created a faster ingestion mode - pipeline (#1750) (134fc54)
- llm - embed: Add support for Azure OpenAI (#1698) (1efac6a)
- llm: adds several settings for llamacpp and ollama (#1703) (02dc83e)
- llm: Ollama LLM-Embeddings decouple + longer keep_alive settings (#1800) (b3b0140)
- llm: Ollama timeout setting (#1773) (6f6c785)
- local: tiktoken cache within repo for offline (#1467) (821bca3)
- nodestore: add Postgres for the doc and index store (#1706) (68b3a34)
- rag: expose similarity_top_k and similarity_score to settings (#1771) (087cb0b)
- RAG: Introduce SentenceTransformer Reranker (#1810) (83adc12)
- scripts: Wipe qdrant and obtain db Stats command (#1783) (ea153fb)
- ui: Add Model Information to ChatInterface label (f0b174c)
- ui: add sources check to not repeat identical sources (#1705) (290b9fb)
- UI: Faster startup and document listing (#1763) (348df78)
- ui: maintain score order when curating sources (#1643) (410bf7a)
- unify settings for vector and nodestore connections to PostgreSQL (#1730) (63de7e4)
- wipe per storage type (#1772) (c2d6948)
Bug Fixes
v0.4.0
v0.3.0
0.3.0 (2024-02-16)
Features
- add mistral + chatml prompts (#1426) (e326126)
- Add stream information to generate SDKs (#1569) (24fae66)
- API: Ingest plain text (#1417) (6eeb95e)
- bulk-ingest: Add --ignored Flag to Exclude Specific Files and Directories During Ingestion (#1432) (b178b51)
- llm: Add openailike llm mode (#1447) (2d27a9f), closes #1424
- llm: Add support for Ollama LLM (#1526) (6bbec79)
- settings: Configurable context_window and tokenizer (#1437) (4780540)
- settings: Update default model to TheBloke/Mistral-7B-Instruct-v0.2-GGUF (#1415) (8ec7cf4)
- ui: make chat area stretch to fill the screen (#1397) (c71ae7c)
- UI: Select file to Query or Delete + Delete ALL (#1612) (aa13afd)
Bug Fixes
- Adding an LLM param to fix broken generator from llamacpp (#1519) (869233f)
- deploy: fix local and external dockerfiles (fde2b94)
- docker: docker broken copy (#1419) (059f358)
- docs: Update quickstart doc and set version in pyproject.toml to 0.2.0 (0a89d76)
- minor bug in chat stream output - python error being serialized (#1449) (6191bcd)
- settings: correct yaml multiline string (#1403) (2564f8d)
- tests: load the test settings only when running tests (d3acd85)
- UI: Updated ui.py. Frees up the CPU to not be bottlenecked. (24fb80c)
v0.2.0
v0.1.0
0.1.0 (2023-11-30)
Features
- Improved documentation using Fern
- Fastest ingestion through different ingestion modes (#1309)
- Add sources to completions APIs and UI
- Add simple Basic auth
- Add basic CORS
- Add "search in docs" to UI
- LLM and Embeddings model separate configuration
- Allow using a system prompt in the API to modify the LLM behaviour
- Expose configuration of the model execution such as max_new_tokens
- Multiple prompt styles support for different models
- Update to Gradio 4
- Document deletion API
- Sagemaker support
- Disable Gradio Analytics (#1165) (6583dc8)
- Drop loguru and use builtin logging (#1133) (64c5ae2)
- enable resume download for hf_hub_download (#1249) (4197ada)
- move torch and transformers to local group (#1172) (0d677e1)
- Qdrant support (#1228) (03d1ae6)
- Added wipe command to ease vector database reset
Bug Fixes
- Docker and sagemaker setup (#1118) (895588b)
- fix pytorch version to avoid wheel bug (#1123) (24cfddd)
- Remove global state (#1216) (022bd71)
- sagemaker config and chat methods (#1142) (a517a58)
- typo in README.md (#1091) (ba23443)
- Windows 11 failing to auto-delete tmp file (#1260) (0d52002)
- Windows permission error on ingest service tmp files (#1280) (f1cbff0)
v0.0.2
v0.0.1
0.0.1 (2023-10-20)
Features
- Get answers using preferred number of chunks (cf709a6)
- Release GitHub action (#1078) (b745091)
- ui: add LLM mode to UI (#1080) (d249a17)
Bug Fixes
- 294 (tested) (4cda348)
- Add TARGET_SOURCE_CHUNKS to example.env (2027ac5)
- Disable Chroma Telemetry (8c6a81a)
- make docs more visible (#1081) (aa4bb17)
Miscellaneous Chores
- Initial version (490d93f)