Commit

Merge pull request #6 from AstraBert/december-2024
v1.0.0
AstraBert authored Dec 10, 2024
2 parents cc587fc + 50ed2a0 commit a460e73
Showing 25 changed files with 530 additions and 200 deletions.
3 changes: 3 additions & 0 deletions .env.example
@@ -0,0 +1,3 @@
PG_DB="postgres"
PG_USER="pgql_usr"
PG_PASSWORD="pgql_psw"
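The three variables above are the ones `compose.yaml` forwards to the Postgres container. As a minimal sketch of how a script might consume them, assuming the file only contains plain `KEY="value"` lines: `parse_env` and `postgres_url` are hypothetical helpers (they are not part of this commit), and the host and port defaults are taken from the `5432:5432` mapping in `compose.yaml`.

```python
from urllib.parse import quote


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, as used in .env.example.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def postgres_url(env: dict[str, str], host: str = "localhost", port: int = 5432) -> str:
    """Build a connection URL (hypothetical helper; host/port are assumptions)."""
    user = quote(env["PG_USER"], safe="")
    password = quote(env["PG_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{env['PG_DB']}"


example = 'PG_DB="postgres"\nPG_USER="pgql_usr"\nPG_PASSWORD="pgql_psw"\n'
print(postgres_url(parse_env(example)))
# prints: postgresql://pgql_usr:pgql_psw@localhost:5432/postgres
```

The example credentials are placeholders from `.env.example`; a real `.env` would normally carry different values and stay out of version control, which the `.gitignore` change below takes care of.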
11 changes: 10 additions & 1 deletion .gitignore
@@ -1,2 +1,11 @@
model/
__pycache__/
lib/scripts/__pycache__/
lib/docker/__pycache__/
lib/scripts/.env
lib/docker/.env
.env
virtualenv/
qdrant_storage/
lib/docker/florence-2/
lib/docker/qwen/
lib/docker/labse/
10 changes: 0 additions & 10 deletions .gitpod.yml

This file was deleted.

20 changes: 20 additions & 0 deletions .gradio/flagged/dataset1.csv
@@ -0,0 +1,20 @@
Search Query,Image Search Query,Maximum Number of Search Results,Enable RAG,Debug,Search Results,timestamp
,.gradio\flagged\Image Search Query\41ddcc66f19720a7c3bd\Components-of-a-nuclear-power-plant-1400x803.png,3,false,true,"### Understanding Nuclear Power Plants and Reactors

Nuclear power plants generate electricity through controlled nuclear fission in large, specialized facilities. These systems consist of multiple components including nuclear reactors which convert uranium or other fissile materials into usable energy.

#### Key Components:
- **Reactor Core**: The heart of any nuclear power plant where the actual reaction takes place.
- *Bangombe Deposit*: A notable location for potential future use as an additional source of fuel.

- **Heat Transfer Fluids**: Various types such as heavy water (deuterium oxide) and light water (ordinary H₂O), depending on specific designs like RBMK reactors mentioned here.

- **Power Outputs**: Units can produce significant amounts of electrical output; e.g., one reactor might be rated at around 150 MWe (Megawatts Electrical).

- **Safety Measures & Regulations**: Strict guidelines ensure safety during operation by managing coolant flow rates carefully (""Follow Loads Reasonably Easily Without Burning"").

- **Fuel Cycle Management**: Ensures smooth transition between different phases of operations using detailed guides available online.

This type of technology has been extensively studied over decades leading up to today's advanced models. For instance, there have been numerous reports documenting its development process from early stages all the way to current state-of-the-art technologies.

If interested further details about individual aspects could refer directly to World-Nuclear Organization resources linked within this summary. They provide comprehensive documentation covering everything from theoretical principles down to practical applications across diverse contexts worldwide.",2024-12-10 16:31:14.503594
10 changes: 5 additions & 5 deletions CONTRIBUTING.md
@@ -1,4 +1,4 @@
# Contributing to SearchPhi
# Contributing to PrAIvateSearch

Do you want to contribute to this project? Make sure to read these guidelines first :)

@@ -32,15 +32,15 @@ Do you want to contribute to this project? Make sure to read this guidelines fir
3. Submit pull request (make sure to provide a thorough description of the changes)


## Showcase your SearchPhi
## Showcase your PrAIvateSearch

**When to do it**:

- You modified the base application with new features but you don't want/can't merge them with the original SearchPhi
- You modified the base application with new features but you don't want/can't merge them with the original PrAIvateSearch

**How to do it**:

- Go to [_GitHub Discussions > Show and tell_](https://github.com/AstraBert/SearchPhi/discussions/categories/show-and-tell) page
- Open a new discussion there, describing your SearchPhi application
- Go to [_GitHub Discussions > Show and tell_](https://github.com/AstraBert/PrAIvateSearch/discussions/categories/show-and-tell) page
- Open a new discussion there, describing your PrAIvateSearch application

### Thanks for contributing!
16 changes: 0 additions & 16 deletions Dockerfile

This file was deleted.

8 changes: 0 additions & 8 deletions Dockerfile.gitpod

This file was deleted.

91 changes: 37 additions & 54 deletions README.md
@@ -1,42 +1,52 @@
<h1 align="center">SearchPhi</h1>
<h2 align="center">Open source and AI-powered web search engine🌐</h2>

<h1 align="center">PrAIvateSearch</h1>
<h2 align="center">Own your AI, search the web with it🌐😎</h2>

<div align="center">
<img src="https://img.shields.io/github/languages/top/AstraBert/SearchPhi" alt="GitHub top language">
<img src="https://img.shields.io/github/commit-activity/t/AstraBert/SearchPhi" alt="GitHub commit activity">
<img src="https://img.shields.io/badge/Status-stable_beta-green" alt="Static Badge">
<img src="https://img.shields.io/badge/Release-v0.0_beta.0-purple" alt="Static Badge">
<img src="https://img.shields.io/docker/image-size/astrabert/searchphi
" alt="Docker image size">
<img src="https://img.shields.io/badge/Supported_platforms-Windows/POSIX-brown" alt="Static Badge">
<div>
<img src="./imgs/SearchPhi_logo.png" alt="Logo" align="center">
<img src="./imgs/PrAIvateSearch_logo.png" alt="Logo" align="center">
</div>
</div>

## About SearchPhi

SearchPhi is a Streamlit application that aims to implement similar features to SearchGPT, but in an open-source, local and private way.
## About PrAIvateSearch

PrAIvateSearch is a Gradio application that aims to implement similar features to SearchGPT, but in an open-source, local and private way.

## Flowchart

<div align="center">
<img src="./imgs/PrAIvateSearch_Flowchart.png" alt="Logo" align="center">
<p><i>Flowchart for PrAIvateSearch</i></p>
</div>

How PrAIvateSearch was built and how it works is explained in this [blog post on HuggingFace](https://huggingface.co/blog/as-cle-bert/build-an-ai-powered-search-engine-from-scratch).

## Installation and usage

### Source code

1. Clone the repository:

```bash
git clone https://github.com/AstraBert/SearchPhi.git
cd SearchPhi
git clone https://github.com/AstraBert/PrAIvateSearch.git
cd PrAIvateSearch
```

2. Create a `model` folder, download [this GGUF file](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/blob/main/Phi-3-mini-4k-instruct-q4.gguf) and move the GGUF file in the `model` folder:
2. Move `.env.example` to `.env`...

```bash
mkdir model
mv /path/to/Downloads/Phi-3-mini-4k-instruct-q4.gguf model/
mv .env.example .env
```

...and specify PostgreSQL related variables:

```bash
# .env file
PG_DB="postgres"
PG_USER="pgql_usr"
PG_PASSWORD="pgql_psw"
```


3. Install necessary dependencies:
- Linux:
```bash
@@ -56,59 +66,32 @@ source c:\path\to\SearchPhi\Scripts\activate # For Git
python3 -m pip install -r requirements.txt
```


4. Run the application:
4. Start third-party services:

```bash
python3 -m streamlit run app.py
docker compose up -d
```

You'll see the application on `http://localhost:8501`.

**PROs**: You can customize the application code (change the GGUF model, change CPU/GPU settings, change generation kwargs, modify the app interface...)

**CONs**: Longer and more complex installation process

### Docker

1. Pull the image

```bash
docker pull astrabert/searchphi:latest
```

2. Run the container:
5. Run the application:

```bash
docker run -p 8501:8501 astrabert/searchphi:latest
python3 scripts/app.py
```

Shortly after you submit the `docker run` command, the container logs will tell you that the application is up and running on `http://localhost:8501`.
Once the models have been downloaded and loaded on your hardware, you'll see the application at `http://localhost:7860`.

**PROs**: Shorter and simpler installation process
**PROs**: You can customize the application code (change the model, change CPU/GPU settings, change generation kwargs, modify the app interface...)

**CONs**: You cannot customize the application code

### Run in cloud

- **GitPod workspaces**: Click [here](https://gitpod.io/#https://github.com/AstraBert/SearchPhi) to open the GitPod workspace

**PROs**: No local installation and you can exploit better hardwares

**CONs**: Limited resources
**CONs**: Longer and more complex installation process

### Usage note

> ⚠️ _The Streamlit application was successfully developed and tested on a Windows 10.0.22631 machine, with 32GB RAM, 16 core CPU and Nvidia GEFORCE RTX4050 GPU (6GB, cuda version 12.3), python version 3.11.9_
> ⚠️ _The Docker container was successfully tested on a Windows 10.0.22631 machine and on a Ubuntu 22.04.3 machine_
> ⚠️ _The Gradio application was successfully developed and tested on a Windows 10.0.22631 machine, with 32GB RAM, 16 core CPU and Nvidia GEFORCE RTX4050 GPU (6GB, cuda version 12.3), python version 3.11.9_
Although it is at a good stage of development, the application is still a `beta` and might contain bugs and have OS/hardware/Python version incompatibilities.

## Demo

You can try out SearchPhi on [this HuggingFace Space](https://huggingface.co/spaces/as-cle-bert/SearchPhi).

Here's a video demo of what it can do:

![Video demo for SearechPhi](./imgs/demo.gif)
27 changes: 0 additions & 27 deletions app.py

This file was deleted.

40 changes: 40 additions & 0 deletions compose.yaml
@@ -0,0 +1,40 @@
networks:
  mynet:
    driver: bridge

services:
  db:
    image: postgres
    restart: always
    ports:
      - "5432:5432"
    networks:
      - mynet
    environment:
      POSTGRES_DB: $PG_DB
      POSTGRES_USER: $PG_USER
      POSTGRES_PASSWORD: $PG_PASSWORD
    volumes:
      - pgdata:/var/lib/postgresql/data

  semantic_memory:
    image: qdrant/qdrant
    restart: always
    ports:
      - "6333:6333"
      - "6334:6334"
    networks:
      - mynet
    volumes:
      - "./qdrant_storage:/qdrant/storage"

  adminer:
    image: adminer
    restart: always
    ports:
      - "8080:8080"
    networks:
      - mynet

volumes:
  pgdata:
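After `docker compose up -d`, it can be handy to confirm that the published ports are actually reachable before starting the app. The following is a rough sketch using only the standard library; `port_open` and `check_services` are hypothetical helpers (not part of this commit), and the port numbers are simply the host-side ports listed in the compose file above. A successful TCP connection only shows that something is listening, not that the service is fully ready.

```python
import socket

# Host ports published in compose.yaml (service name -> ports).
SERVICE_PORTS = {
    "db": [5432],
    "semantic_memory": [6333, 6334],
    "adminer": [8080],
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to True only if all of its ports accept connections."""
    return {
        name: all(port_open(host, port) for port in ports)
        for name, ports in SERVICE_PORTS.items()
    }


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'up' if ok else 'unreachable'}")
```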
Binary file added imgs/PrAIvateSearch_Flowchart.png
Binary file added imgs/PrAIvateSearch_logo.png
Binary file removed imgs/SearchPhi_logo.png
Binary file not shown.
Binary file modified imgs/demo.gif
50 changes: 50 additions & 0 deletions lib/scripts/app.py
@@ -0,0 +1,50 @@
import warnings
warnings.filterwarnings("ignore")

import gradio as gr
from text_inference import text_inference
from image_gen import caption_image
from PIL import Image
from websearching import web_search, date_for_debug

def reply(text_input, image_input=None, max_results=5, enable_rag=False, debug = True):
    if debug:
        print(f"[{date_for_debug()}] Started query processing...")
    if image_input is None:
        prompt, qdrant_success = web_search(text_input, max_results, enable_rag, debug)
        if debug:
            print(qdrant_success)
        results = text_inference(prompt, debug)
        results = results.replace("<|im_end|>","")
        if debug:
            print(f"[{date_for_debug()}] Finished query processing!")
        return results
    else:
        if text_input:
            img = Image.fromarray(image_input)
            caption = caption_image(img)
            full_query = caption +"\n\n"+text_input
            prompt, qdrant_success = web_search(full_query, max_results, enable_rag)
            if debug:
                print(qdrant_success)
            results = text_inference(prompt, debug)
            results = results.replace("<|im_end|>","")
            if debug:
                print(f"[{date_for_debug()}] Finished query processing!")
            return results
        else:
            img = Image.fromarray(image_input)
            caption = caption_image(img)
            prompt, qdrant_success = web_search(caption, max_results, enable_rag)
            if debug:
                print(qdrant_success)
            results = text_inference(prompt, debug)
            results = results.replace("<|im_end|>","")
            if debug:
                print(f"[{date_for_debug()}] Finished query processing!")
            return results


iface = gr.Interface(fn=reply, inputs=[gr.Textbox(value="",label="Search Query"), gr.Image(value=None, label="Image Search Query"), gr.Slider(1,10,value=5,label="Maximum Number of Search Results", step=1), gr.Checkbox(value=False, label="Enable RAG"), gr.Checkbox(value=True, label="Debug")], outputs=[gr.Markdown(value="Your output will be generated here", label="Search Results")], title="PrAIvateSearch")

iface.launch(server_name="0.0.0.0", server_port=7860)
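The three branches of `reply` mostly differ in how the search query is assembled: text only, caption plus text, or caption only. As a sketch of that logic in isolation, the hypothetical `build_query` helper below (not part of this commit) takes the captioning step as a parameter so it can run without the models; `fake_captioner` is a stand-in for `caption_image`, which in the commit receives `Image.fromarray(image_input)`. Note that the image branches in the commit also call `web_search` without the `debug` argument, so collapsing the branches this way would not be strictly behavior-preserving.

```python
from typing import Callable, Optional


def build_query(
    text_input: str,
    image_input: Optional[object],
    captioner: Callable[[object], str],
) -> str:
    """Return the query string that reply() would hand to web_search()."""
    if image_input is None:
        # Text-only search: the query is used as typed.
        return text_input
    caption = captioner(image_input)
    if text_input:
        # Image plus text: the caption comes first, then the user's text.
        return caption + "\n\n" + text_input
    # Image-only search: the caption is the whole query.
    return caption


def fake_captioner(_image: object) -> str:
    """Stand-in for caption_image (assumption: it returns a plain string)."""
    return "a diagram of a nuclear power plant"


print(repr(build_query("how does it work?", None, fake_captioner)))
print(repr(build_query("how does it work?", object(), fake_captioner)))
print(repr(build_query("", object(), fake_captioner)))
```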