Add to database endpoint #593

cparish312 · 2024-10-28T15:19:22Z

name: Add /add endpoint to database
about: Creates an endpoint to add frames, ocr_results, and transcription results to the screenpipe database from outside sources

description

Creates an endpoint to add frames, ocr_results, and transcription results to the screenpipe database from outside sources

related issue: #
/claim #467

type of change

new feature

checklist

i have read the CONTRIBUTING.md file
i have added the custom cursor AI prompt to my settings as mentioned in CONTRIBUTING.md and used to write this PR
my code follows the project's style guidelines
i have performed a self-review of my code
[] i have updated the documentation if necessary
my changes generate no new warnings
i have added tests that prove my fix is effective or that my feature works
all tests pass locally with my changes

additional notes

any other relevant information about the pr.

vercel · 2024-10-28T15:19:35Z

The latest updates on your projects.

Name	Status	Updated (UTC)
screenpipe	✅ Ready	Oct 29, 2024 11:52pm

cparish312 · 2024-10-28T15:28:55Z

@louis030195 Not sure how you want to add tests for the frame writing to mp4 bit.

Also, you currently can't specify the OcrEngine used to generate the ocr_results; the endpoint just records the default engine.

louis030195 · 2024-10-28T15:42:29Z

screenpipe-server/src/server.rs

+    match &payload.content {
+        AddContentData::Frames(frames) => {
+            if !frames.is_empty() {
+                let output_dir = state.screenpipe_dir.join("videos");


the dir is data

Ahh I pulled from merge_frames_handler in server.rs. Should I change that to data too or is that correct for that function?

oh

can you change merge_frames_handler to go to data

i programmed this in quick & dirty mode during a hackathon

louis030195 · 2024-10-28T15:45:38Z

how can i test?

cparish312 · 2024-10-28T20:26:15Z

Working on testing now. Right now there is a foreign key constraint requiring each row in the audio_transcriptions table to have an audio_chunk_id that exists in the audio_chunks table. How would you like to handle this for adding transcriptions without associated audio_chunks?

louis030195 · 2024-10-29T00:26:33Z

@cparish312

Working on testing now. Right now there is a foreign key constraint requiring each row in the audio_transcriptions table to have an audio_chunk_id that exists in the audio_chunks table. How would you like to handle this for adding transcriptions without associated audio_chunks?

hmm okay so the use case is that you don't have the .mp4 audio recording to share?

like maybe someone is syncing a manual recording from their iphone and doesn't have the .mp4 (or is lazy), and we want to let them sync just the transcription without an audio chunk

that's a bit annoying because all the code is based around this, like the search, in the UI we display result of path to video etc

dumb workaround is to generate the audio chunk from the transcription with AI TTS XD

what are the possible solutions?

louis030195 · 2024-10-29T00:32:49Z

i guess we have to make the column nullable and not show any audio chunk in the UI

cparish312 · 2024-10-29T01:12:26Z

@louis030195 Okay cool yeah that probably makes the most sense

cparish312 · 2024-10-29T01:32:51Z

Oh man didn't realize how much of a pain a nullable migration is in SQLite
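For context, SQLite's ALTER TABLE cannot drop a NOT NULL constraint from an existing column, so the usual approach is to rebuild the table. A minimal sketch of that rebuild using Python's `sqlite3` module follows; the two-table schema is a simplified assumption for illustration, not the actual screenpipe schema or migration file.

```python
import sqlite3

# Hypothetical, simplified schema; the real screenpipe tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE audio_chunks (
        id INTEGER PRIMARY KEY,
        file_path TEXT NOT NULL
    );
    CREATE TABLE audio_transcriptions (
        id INTEGER PRIMARY KEY,
        audio_chunk_id INTEGER NOT NULL,
        transcription TEXT NOT NULL,
        FOREIGN KEY (audio_chunk_id) REFERENCES audio_chunks(id)
    );
    INSERT INTO audio_chunks (file_path) VALUES ('a.mp4');
    INSERT INTO audio_transcriptions (audio_chunk_id, transcription) VALUES (1, 'hello');
""")

# Rebuild: create a copy with the relaxed column, move the rows,
# drop the old table, then rename the copy into place.
conn.executescript("""
    BEGIN;
    CREATE TABLE audio_transcriptions_new (
        id INTEGER PRIMARY KEY,
        audio_chunk_id INTEGER,  -- now nullable
        transcription TEXT NOT NULL,
        FOREIGN KEY (audio_chunk_id) REFERENCES audio_chunks(id)
    );
    INSERT INTO audio_transcriptions_new (id, audio_chunk_id, transcription)
        SELECT id, audio_chunk_id, transcription FROM audio_transcriptions;
    DROP TABLE audio_transcriptions;
    ALTER TABLE audio_transcriptions_new RENAME TO audio_transcriptions;
    COMMIT;
""")

# A transcription with no audio chunk is now accepted.
conn.execute(
    "INSERT INTO audio_transcriptions (audio_chunk_id, transcription) VALUES (NULL, 'no audio')"
)
rows = conn.execute(
    "SELECT audio_chunk_id, transcription FROM audio_transcriptions ORDER BY id"
).fetchall()
```

Every row is copied during the rebuild, which is why the cost of such a migration grows with the size of the table.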

cparish312 · 2024-10-29T02:21:56Z

Yeah this is causing some issues for searching

cparish312 · 2024-10-29T02:29:26Z

The transcription is showing up when hitting the search endpoint, but not seeing it in the UI. Assuming this is because there is no path / audio_chunk_id. How do you want to handle the results in the UI?

cparish312 · 2024-10-29T15:02:39Z

Updated so it just shows "No file path available for this audio." in the app search UI when there is no audio path

louis030195 · 2024-10-29T15:25:18Z

screenpipe-server/src/migrations/20241029015040_audio_transcription_audio_chunk_id_nullable.sql

+    transcription_engine TEXT NOT NULL DEFAULT 'Whisper',
+    device TEXT NOT NULL DEFAULT '',
+    is_input_device BOOLEAN NOT NULL DEFAULT TRUE,
+    FOREIGN KEY (audio_chunk_id) REFERENCES audio_chunks(id)


is it nullable?

kinda scary mutation, i have 120 gb of screenpipe data, can it take a while?

audio_chunk_id isn't currently nullable. Yeah this is super non-ideal but seems like the only way to make the column nullable in SQLite. Not sure how long it might take but I doubt too long considering it is just text data. Other options I can think of are removing the foreign key constraint or creating a dummy (or multiple dummy) audio_chunks

Never mind same issue removing the foreign key constraint. Dummy audio_chunks is the only solution I can think of.

maybe something clean is to remove foreign key and be optional string and in the audio_chunks table add a foreign key reference to audio_transcriptions

?

why can't we remove the foreign key?

otherwise if not possible yeah let's just add dummy audio_chunk with no file path / empty

You have to do the same process of creating a new table and copying over. The audio_chunks file_path is also not nullable so will need to put in "no_path" or just an empty string

let's just do a dummy chunk with an empty file path

Good call that migration was a nightmare
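The dummy-chunk approach agreed on above can be sketched roughly as follows. This is an illustration only: the schema is a simplified assumption, and `add_transcription_without_audio` is a hypothetical helper name, not a function from the screenpipe codebase.

```python
import sqlite3

# Hypothetical, simplified schema; the real screenpipe tables have more columns.
SCHEMA = """
CREATE TABLE audio_chunks (
    id INTEGER PRIMARY KEY,
    file_path TEXT NOT NULL
);
CREATE TABLE audio_transcriptions (
    id INTEGER PRIMARY KEY,
    audio_chunk_id INTEGER NOT NULL,
    transcription TEXT NOT NULL,
    transcription_engine TEXT NOT NULL DEFAULT 'Whisper',
    FOREIGN KEY (audio_chunk_id) REFERENCES audio_chunks(id)
);
"""

def add_transcription_without_audio(conn, transcription, engine):
    """Insert a placeholder chunk with an empty file_path, then the transcription that points at it."""
    cur = conn.execute("INSERT INTO audio_chunks (file_path) VALUES ('')")
    chunk_id = cur.lastrowid
    conn.execute(
        "INSERT INTO audio_transcriptions (audio_chunk_id, transcription, transcription_engine) "
        "VALUES (?, ?, ?)",
        (chunk_id, transcription, engine),
    )
    conn.commit()
    return chunk_id

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the foreign key, as the thread assumes
conn.executescript(SCHEMA)
chunk_id = add_transcription_without_audio(
    conn, "This is an example transcription of recorded audio.", "speech_to_text_v1"
)

# An inner JOIN still finds the row because the placeholder chunk exists.
row = conn.execute(
    "SELECT t.transcription, c.file_path FROM audio_transcriptions t "
    "JOIN audio_chunks c ON c.id = t.audio_chunk_id"
).fetchone()
```

Because the NOT NULL and foreign key constraints stay untouched, no table rebuild is needed; the cost is that consumers have to treat an empty file_path as "no audio available".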

louis030195 · 2024-10-29T15:28:26Z

screenpipe-server/src/server.rs

+    let mut success_messages = Vec::new();
+
+    match payload.content.content_type.as_str() {
+        "Frames" => {


why is that uppercase?

seems like it should be snake case

Yup sorry this was left over from a different way of doing this. Will fix!

cparish312 · 2024-10-29T17:57:02Z

Testing transcription insert:
curl -X POST "http://localhost:3030/add" \
  -H "Content-Type: application/json" \
  -d '{
    "device_name": "MacBook Pro Microphone (input)",
    "content": {
      "content_type": "transcription",
      "data": {
        "transcription": "This is an example transcription of recorded audio.",
        "transcription_engine": "speech_to_text_v1"
      }
    }
  }'
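The same request could also be sent from a script. Below is a minimal sketch using only the Python standard library; `build_transcription_payload` and `post_add` are hypothetical helper names, and the field names and default port are taken from the curl command above rather than from any API reference.

```python
import json
import urllib.request

def build_transcription_payload(device_name, transcription, engine):
    """Build the JSON body for a transcription insert (shape copied from the curl example)."""
    return {
        "device_name": device_name,
        "content": {
            "content_type": "transcription",
            "data": {
                "transcription": transcription,
                "transcription_engine": engine,
            },
        },
    }

def post_add(payload, base_url="http://localhost:3030"):
    """POST the payload to /add and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/add",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_transcription_payload(
    "MacBook Pro Microphone (input)",
    "This is an example transcription of recorded audio.",
    "speech_to_text_v1",
)
# post_add(payload)  # uncomment when a screenpipe server is running locally
```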

cparish312 · 2024-10-29T18:00:28Z

Testing frames insert. You will need to change the file_paths to paths that exist on your computer:
curl -X POST "http://localhost:3030/add" \
  -H "Content-Type: application/json" \
  -d '{
    "device_name": "hindsight_android",
    "content": {
      "content_type": "frames",
      "data": [
        {
          "file_path": "/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433244710.jpg",
          "timestamp": "2024-06-03T16:47:24.710000038Z",
          "app_name": "Clock",
          "window_name": "Clock",
          "ocr_results": [],
          "tags": ["hindsight", "Clock"]
        },
        {
          "file_path": "/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433242624.jpg",
          "timestamp": "2024-06-03T16:47:22.624000072Z",
          "app_name": "Clock",
          "window_name": "Clock",
          "ocr_results": [],
          "tags": ["hindsight", "Clock"]
        }
      ]
    }
  }'
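When the frames body is assembled in Python, serializing it with `json.dumps` (rather than pasting the dict's printed form) keeps it valid JSON, since JSON requires double-quoted strings. A small sketch follows; the file path is a placeholder and the field names are simply those used in the command above.

```python
import json

# Placeholder path; replace with an image that exists on your machine.
payload = {
    "device_name": "hindsight_android",
    "content": {
        "content_type": "frames",
        "data": [
            {
                "file_path": "/path/to/frame_1.jpg",
                "timestamp": "2024-06-03T16:47:24.710000038Z",
                "app_name": "Clock",
                "window_name": "Clock",
                "ocr_results": [],
                "tags": ["hindsight", "Clock"],
            }
        ],
    },
}

# json.dumps emits double-quoted keys and strings, which is what a JSON parser expects.
body = json.dumps(payload)

# The printed form of the dict uses single quotes and is rejected by a JSON parser.
try:
    json.loads(str(payload))
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False
```

The resulting `body` string can be passed to curl with `-d` or sent with any HTTP client.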

louis030195 · 2024-10-29T20:36:28Z

Testing frames insert. Will need to change the file_paths to paths that exist on your computer: `curl -X POST "http://localhost:3030/add" \ -H "Content-Type: application/json" \ -d '{'device_name': 'hindsight_android', 'content': {'content_type': 'frames', 'data': [{'file_path': '/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433244710.jpg', 'timestamp': '2024-06-03T16:47:24.710000038Z', 'app_name': 'Clock', 'window_name': 'Clock', 'ocr_results': [], 'tags': ['hindsight', 'Clock']}, {'file_path': '/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433242624.jpg', 'timestamp': '2024-06-03T16:47:22.624000072Z', 'app_name': 'Clock', 'window_name': 'Clock', 'ocr_results': [], 'tags': ['hindsight', 'Clock']}]}}'

will try today

louis030195 · 2024-10-29T21:34:50Z

audio

curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "MacBook Pro Microphone (input)",
  "content": {
    "content_type": "transcription",
    "data": {
      "transcription": "This is an example transcription of recorded audio.",
      "transcription_engine": "speech_to_text_v1"
    }
  }
}' | jq

curl -X GET "http://localhost:3035/search?q=example&content_type=audio" -H "Content-Type: application/json" | jq

{
  "data": [
    {
      "type": "Audio",
      "content": {
        "chunk_id": 8,
        "transcription": "This is an example transcription of recorded audio.",
        "timestamp": "2024-10-29T21:34:06.615182Z",
        "file_path": "",
        "offset_index": -1,
        "tags": [],
        "device_name": "MacBook Pro Microphone (input)",
        "device_type": "Input"
      }
    }
  ],
  "pagination": {
    "limit": 20,
    "offset": 0,
    "total": 1
  }
}

frames

curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "macbook_pro",
  "content": {
    "content_type": "frames",
    "data": [
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/02722091-76A7-4215-9CAB-E4A4DC5A37BA.png",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [],
        "tags": ["screenshot", "desktop"]
      },
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/0D7F899B-DE6B-494E-B70D-1F5338A54AEE.png",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [],
        "tags": ["screenshot", "desktop"]
      }
    ]
  }
}' | jq

curl -X GET "http://localhost:3035/search?window_name=screenshot&content_type=ocr&limit=1000" -H "Content-Type: application/json" | jq

{
  "success": true,
  "message": "Frames added successfully"
}

{
  "data": [],
  "pagination": {
    "limit": 1000,
    "offset": 0,
    "total": 0
  }
}

not sure if i made a mistake, but i expected to get the frames here

also not seeing the merged video

(env) (base) louisbeaumont@mac:~/Documents/screen-pipe$ ls /tmp/sp/data/
Display 1 (output)_2024-10-29_21-27-01.mp4              monitor_1_2024-10-29_21-31-45.mp4                       monitor_1_2024-10-29_21-38-39.mp4
MacBook Pro Microphone (input)_2024-10-29_21-27-14.mp4  monitor_1_2024-10-29_21-32-53.mp4                       monitor_1_2024-10-29_21-39-49.mp4
macbook_pro_2024-10-29_21-40-21.mp4                     monitor_1_2024-10-29_21-34-22.mp4                       monitor_1_2024-10-29_21-41-04.mp4
macbook_pro_2024-10-29_21-43-37.mp4                     monitor_1_2024-10-29_21-35-34.mp4                       monitor_1_2024-10-29_21-42-15.mp4
monitor_1_2024-10-29_21-26-44.mp4                       monitor_1_2024-10-29_21-37-28.mp4

also don't you have OCR?

i assume this API might be used in a very broad range of use cases, so it should be flexible. for example:

adding family photo to my screenpipe just the frames
adding documents like legal stuff idk, here i'd want screenpipe to do OCR when i add it (using given engine)
adding raw audio files - screenpipe could extract the transcription to disk (using given engine)
etc.

for the scope of this PR we can stick to the minimum i think, not much post processing

cparish312 · 2024-10-29T22:09:47Z

Yeah I agree running OCR by default when OCR results are not provided would be ideal but sounds good to add in another PR.

Are the macbook_pro videos not the merged videos? I'm storing by "{device_name}_{current_time}.mp4"
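As an illustration of that naming scheme, here is a short sketch. `merged_video_name` is a hypothetical helper, and the timestamp format is inferred from the file names in the directory listing above, not taken from the server code.

```python
from datetime import datetime

def merged_video_name(device_name: str, now: datetime) -> str:
    """Format "{device_name}_{current_time}.mp4" with a YYYY-MM-DD_HH-MM-SS timestamp (assumed format)."""
    return f"{device_name}_{now.strftime('%Y-%m-%d_%H-%M-%S')}.mp4"

name = merged_video_name("macbook_pro", datetime(2024, 10, 29, 21, 40, 21))
```

Under that assumed format, the call above yields `macbook_pro_2024-10-29_21-40-21.mp4`, the same shape as the macbook_pro files in the listing.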

Maybe they aren't appearing in the search since there are no OCR results? Could you try putting in OCR results?

curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "macbook_pro",
  "content": {
    "content_type": "frames",
    "data": [
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/02722091-76A7-4215-9CAB-E4A4DC5A37BA.png",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [{"text": "test add frames with ocr results",
                         "text_json": "{}",
                         "ocr_engine": "apple_native"}],
        "tags": ["screenshot", "desktop"]
      },
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/0D7F899B-DE6B-494E-B70D-1F5338A54AEE.png",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [{"text": "test add frames with ocr results 2",
                         "text_json": "{}",
                         "ocr_engine": "apple_native"}],
        "tags": ["screenshot", "desktop"]
      }
    ]
  }
}' | jq

louis030195 · 2024-10-29T22:56:09Z

ref: BasedHardware/omi#1212

louis030195 · 2024-10-29T23:26:21Z

works!

{
  "data": [
    {
      "type": "OCR",
      "content": {
        "frame_id": 1,
        "text": "test add frames with ocr results",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "file_path": "/tmp/spp/data/macbook_pro_2024-10-29_23-25-31.mp4",
        "offset_index": 0,
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "tags": [
          "screenshot",
          "desktop"
        ],
        "frame": null
      }
    },
    {
      "type": "OCR",
      "content": {
        "frame_id": 2,
        "text": "test add frames with ocr results 2",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "file_path": "/tmp/spp/data/macbook_pro_2024-10-29_23-25-31.mp4",
        "offset_index": 1,
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "tags": [
          "screenshot",
          "desktop"
        ],
        "frame": null
      }
    }
  ],
  "pagination": {
    "limit": 1000,
    "offset": 0,
    "total": 2
  }
}

louis030195 · 2024-10-29T23:30:58Z

@cparish312 should i merge now?

cparish312 · 2024-10-29T23:52:32Z

@louis030195 Did some final cleanups should be good to go!

louis030195 · 2024-10-30T16:34:20Z

/approve

thx!

one use case i'd want to try (would need to add an OCR option) is to create an apple shortcut to add a document into screenpipe, maybe a pdf converted to an image

algora-pbc · 2024-10-30T16:34:22Z

@louis030195: The claim has been successfully added to reward-all. You can visit your dashboard to complete the payment.

cparish312 added 5 commits October 24, 2024 14:18

migration

e6d3f13

handle device_name column

3f1201f

server changes removed

4872463

starting add endpoint

7735e3f

add add endpoint to database

cb22f00

algora-pbc bot mentioned this pull request Oct 28, 2024

$200 - endpoint to ingest data in screenpipe #467

Closed

algora-pbc bot added the 🙋 Bounty claim label Oct 28, 2024

vercel bot deployed to Preview October 28, 2024 15:21 View deployment

Merge branch 'main' into add_to_database_endpoint

b07f39e

vercel bot deployed to Preview October 28, 2024 15:26 View deployment

minor cleanup

6665dca

louis030195 reviewed Oct 28, 2024

vercel bot deployed to Preview October 28, 2024 15:42 View deployment

fix endpoint json parsing

867ef98

vercel bot deployed to Preview October 28, 2024 20:53 View deployment

audio_chunk_id nullable in audio_transcriptions

4c1b31f

vercel bot deployed to Preview October 29, 2024 02:05 View deployment

cparish312 added 2 commits October 29, 2024 10:46

updated tests

56e1638

UI search handle no audio path

36d58cb

vercel bot deployed to Preview October 29, 2024 15:02 View deployment

louis030195 reviewed Oct 29, 2024

audio_chunk_id back to not nullable

75b3415

vercel bot deployed to Preview October 29, 2024 17:50 View deployment

app to handle empty string audio_chunk_id

ca59a80

vercel bot deployed to Preview October 29, 2024 17:55 View deployment

cleanup audio_chunk_id not nullable

c9fdb78

vercel bot deployed to Preview October 29, 2024 18:13 View deployment

cleanup comment

8558623

vercel bot deployed to Preview October 29, 2024 23:50 View deployment

JOIN instead of LEFT JOIN on audio_chunks_id

2b2f543

vercel bot deployed to Preview October 29, 2024 23:52 View deployment

louis030195 merged commit 1985efe into mediar-ai:main Oct 30, 2024
4 of 7 checks passed
