Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add to database endpoint #593

Merged
merged 16 commits into from
Oct 30, 2024

Conversation

cparish312
Copy link
Contributor


name: Add /add endpoint to database
about: Creates an endpoint to add frames, ocr_results, and transcription results to the screenpipe database from outside sources


description

Creates an endpoint to add frames, ocr_results, and transcription results to the screenpipe database from outside sources

related issue: #
/claim #467

type of change

  • new feature

checklist

  • i have read the CONTRIBUTING.md file
  • i have added the custom cursor AI prompt to my settings as mentioned in CONTRIBUTING.md and used to write this PR
  • my code follows the project's style guidelines
  • i have performed a self-review of my code
  • [] i have updated the documentation if necessary
  • my changes generate no new warnings
  • i have added tests that prove my fix is effective or that my feature works
  • all tests pass locally with my changes

additional notes

any other relevant information about the pr.

Copy link

vercel bot commented Oct 28, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
screenpipe ✅ Ready (Inspect) Visit Preview 💬 Add feedback Oct 29, 2024 11:52pm

@cparish312
Copy link
Contributor Author

@louis030195 Not sure how you want to add tests for the frame writing to mp4 bit.

Also, currently you can't specify the OcrEngine used to generate the ocr_results and it just inputs the default engine

match &payload.content {
AddContentData::Frames(frames) => {
if !frames.is_empty() {
let output_dir = state.screenpipe_dir.join("videos");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the dir is data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I pulled from merge_frames_handler in server.rs. Should I change that to data too or is that correct for that function?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh

can you change merge_frames_handler to go to data

i programmed this quick & dirty mode during a hackathon

@louis030195
Copy link
Collaborator

how can i test?

@cparish312
Copy link
Contributor Author

Working on testing now. So right now there is a foreign key constraint on the the audio_transcription table having an audio_chunk_id in the audio_chunks table. How would you like to handle this for adding transcriptions without associated audio_chunks

@louis030195
Copy link
Collaborator

louis030195 commented Oct 29, 2024

@cparish312

Working on testing now. So right now there is a foreign key constraint on the the audio_transcription table having an audio_chunk_id in the audio_chunks table. How would you like to handle this for adding transcriptions without associated audio_chunks

hmm okay so the use case is that you don't have the .mp4 audio recording to share?

like maybe someone is syncing their iphone manual recording and dont have .mp4 or lazy and we want to allow them to sync just the transcription without audio chunk

that's a bit annoying because all the code is based around this, like the search, in the UI we display result of path to video etc

dumb workaround is to generate TTS the chunk using AI XD

what are the possible solutions?

@louis030195
Copy link
Collaborator

louis030195 commented Oct 29, 2024

i guess we have to allow nullable and in the UI not showing any audio chunk

@cparish312
Copy link
Contributor Author

@louis030195 Okay cool yeah that probably makes the most sense

@cparish312
Copy link
Contributor Author

Oh man didn't realize how much of a pain a nullable migration is in SQLite

@cparish312
Copy link
Contributor Author

Yeah this is causing some issues for searching

@cparish312
Copy link
Contributor Author

The transcription is showing up when hitting the search endpoint, but not seeing it in the UI. Assuming this is because there is no path / audio_chunk_id. How do you want to handle the results in the UI?

@cparish312
Copy link
Contributor Author

Updated so it just shows "No file path available for this audio." in the app search UI when there is no audio path

transcription_engine TEXT NOT NULL DEFAULT 'Whisper',
device TEXT NOT NULL DEFAULT '',
is_input_device BOOLEAN NOT NULL DEFAULT TRUE,
FOREIGN KEY (audio_chunk_id) REFERENCES audio_chunks(id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it nullable?

kinda scary mutation, i have 120 gb of screenpipe data, can it take a while?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

audio_chunk_id isn't currently nullable. Yeah this is super non-ideal but seems like the only way to make the column nullable in SQLite. Not sure how long it might take but I doubt too long considering it is just text data. Other options I can think of are removing the foreign key constraint or creating a dummy (or multiple dummy) audio_chunks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind same issue removing the foreign key constraint. Dummy audio_chunks is the only solution I can think of.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe something clean is to remove foreign key and be optional string and in the audio_chunks table add a foreign key reference to audio_transcriptions

?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why cant we remove foreign key?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise if not possible yeah let's just add dummy audio_chunk with no file path / empty

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to do the same process of creating a new table and copying over. The audio_chunks file_path is also not nullable so will need to put in "no_path" or just an empty string

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets jsut do dummy chunk with empty file path

Copy link
Contributor Author

@cparish312 cparish312 Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call that migration was a nightmare

let mut success_messages = Vec::new();

match payload.content.content_type.as_str() {
"Frames" => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is that uppercase?

seems like it should be snake case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup sorry this was left over from a different way of doing this. Will fix!

@cparish312
Copy link
Contributor Author

cparish312 commented Oct 29, 2024

Testing transcription insert:
curl -X POST "http://localhost:3030/add" \ -H "Content-Type: application/json" \ -d '{ "device_name": "MacBook Pro Microphone (input)", "content": { "content_type": "transcription", "data": { "transcription": "This is an example transcription of recorded audio.", "transcription_engine": "speech_to_text_v1" } } }'

@cparish312
Copy link
Contributor Author

Testing frames insert. Will need to change the file_paths to paths that exist on your computer:
`curl -X POST "http://localhost:3030/add" \ -H "Content-Type: application/json" \ -d '{'device_name': 'hindsight_android',
'content': {'content_type': 'frames',
'data': [{'file_path': '/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433244710.jpg',
'timestamp': '2024-06-03T16:47:24.710000038Z',
'app_name': 'Clock',
'window_name': 'Clock',
'ocr_results': [],
'tags': ['hindsight', 'Clock']},
{'file_path': '/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433242624.jpg',
'timestamp': '2024-06-03T16:47:22.624000072Z',
'app_name': 'Clock',
'window_name': 'Clock',
'ocr_results': [],
'tags': ['hindsight', 'Clock']}]}}'

@louis030195
Copy link
Collaborator

Testing frames insert. Will need to change the file_paths to paths that exist on your computer: `curl -X POST "http://localhost:3030/add" \ -H "Content-Type: application/json" \ -d '{'device_name': 'hindsight_android', 'content': {'content_type': 'frames', 'data': [{'file_path': '/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433244710.jpg', 'timestamp': '2024-06-03T16:47:24.710000038Z', 'app_name': 'Clock', 'window_name': 'Clock', 'ocr_results': [], 'tags': ['hindsight', 'Clock']}, {'file_path': '/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433242624.jpg', 'timestamp': '2024-06-03T16:47:22.624000072Z', 'app_name': 'Clock', 'window_name': 'Clock', 'ocr_results': [], 'tags': ['hindsight', 'Clock']}]}}'

will try today

@louis030195
Copy link
Collaborator

louis030195 commented Oct 29, 2024

audio

curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "MacBook Pro Microphone (input)",
  "content": {
    "content_type": "transcription",
    "data": {
      "transcription": "This is an example transcription of recorded audio.",
      "transcription_engine": "speech_to_text_v1"
    }
  }
}' | jq

curl -X GET "http://localhost:3035/search?q=example&content_type=audio" -H "Content-Type: application/json" | jq
{
  "data": [
    {
      "type": "Audio",
      "content": {
        "chunk_id": 8,
        "transcription": "This is an example transcription of recorded audio.",
        "timestamp": "2024-10-29T21:34:06.615182Z",
        "file_path": "",
        "offset_index": -1,
        "tags": [],
        "device_name": "MacBook Pro Microphone (input)",
        "device_type": "Input"
      }
    }
  ],
  "pagination": {
    "limit": 20,
    "offset": 0,
    "total": 1
  }
}

frames

curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "macbook_pro",
  "content": {
    "content_type": "frames",
    "data": [
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/02722091-76A7-4215-9CAB-E4A4DC5A37BA.png",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [],
        "tags": ["screenshot", "desktop"]
      },
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/0D7F899B-DE6B-494E-B70D-1F5338A54AEE.png",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [],
        "tags": ["screenshot", "desktop"]
      }
    ]
  }
}' | jq

curl -X GET "http://localhost:3035/search?window_name=screenshot&content_type=ocr&limit=1000" -H "Content-Type: application/json" | jq
{
  "success": true,
  "message": "Frames added successfully"
}

{
  "data": [],
  "pagination": {
    "limit": 1000,
    "offset": 0,
    "total": 0
  }
}

not sure i did a mistake or, expected to get the frame here

also nto seeing the merged video

(env) (base) louisbeaumont@mac:~/Documents/screen-pipe$ ls /tmp/sp/data/
Display 1 (output)_2024-10-29_21-27-01.mp4              monitor_1_2024-10-29_21-31-45.mp4                       monitor_1_2024-10-29_21-38-39.mp4
MacBook Pro Microphone (input)_2024-10-29_21-27-14.mp4  monitor_1_2024-10-29_21-32-53.mp4                       monitor_1_2024-10-29_21-39-49.mp4
macbook_pro_2024-10-29_21-40-21.mp4                     monitor_1_2024-10-29_21-34-22.mp4                       monitor_1_2024-10-29_21-41-04.mp4
macbook_pro_2024-10-29_21-43-37.mp4                     monitor_1_2024-10-29_21-35-34.mp4                       monitor_1_2024-10-29_21-42-15.mp4
monitor_1_2024-10-29_21-26-44.mp4                       monitor_1_2024-10-29_21-37-28.mp4

also don't you have OCR?

i assume this API might be used in a very broad range of use case so should be flexible for example:

  • adding family photo to my screenpipe just the frames
  • adding documents like legal stuff idk, here i'd want screenpipe to do OCR when i add it (using given engine)
  • adding raw audio files - screenpipe could extract the transcription to disk (using given engine)
  • etc.

for the scope of this PR we can stick to the minimum i think, not much post processing

@cparish312
Copy link
Contributor Author

cparish312 commented Oct 29, 2024

Yeah I agree running OCR by default when OCR results are not provided would be ideal but sounds good to add in another PR.

Are the macbook_pro videos not the merged videos? I'm storing by "{device_name}_{current_time}.mp4"

Maybe they aren't appearing in the search since there are no ocr results? Could you try putting in OCR results.

curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "macbook_pro",
  "content": {
    "content_type": "frames",
    "data": [
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/02722091-76A7-4215-9CAB-E4A4DC5A37BA.png",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [{'text': 'test add frames with ocr results',
                                'text_json': '{}',
                                'ocr_engine': 'apple_native'}],
        "tags": ["screenshot", "desktop"]
      },
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/0D7F899B-DE6B-494E-B70D-1F5338A54AEE.png",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results":  [{'text': 'test add frames with ocr results 2',
                                'text_json': '{}',
                                'ocr_engine': 'apple_native'}],
        "tags": ["screenshot", "desktop"]
      }
    ]
  }
}' | jq

@louis030195
Copy link
Collaborator

ref: BasedHardware/omi#1212

@louis030195
Copy link
Collaborator

works!

{
  "data": [
    {
      "type": "OCR",
      "content": {
        "frame_id": 1,
        "text": "test add frames with ocr results",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "file_path": "/tmp/spp/data/macbook_pro_2024-10-29_23-25-31.mp4",
        "offset_index": 0,
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "tags": [
          "screenshot",
          "desktop"
        ],
        "frame": null
      }
    },
    {
      "type": "OCR",
      "content": {
        "frame_id": 2,
        "text": "test add frames with ocr results 2",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "file_path": "/tmp/spp/data/macbook_pro_2024-10-29_23-25-31.mp4",
        "offset_index": 1,
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "tags": [
          "screenshot",
          "desktop"
        ],
        "frame": null
      }
    }
  ],
  "pagination": {
    "limit": 1000,
    "offset": 0,
    "total": 2
  }
}

@louis030195
Copy link
Collaborator

@cparish312 should i merge now?

@cparish312
Copy link
Contributor Author

@louis030195 Did some final cleanups should be good to go!

@louis030195
Copy link
Collaborator

/approve

thx!

one use case i'd want to try (would need to add a OCR option) is to create an apple shortcut to add a document into screenpipe, maybe a pdf converted to image

Copy link

algora-pbc bot commented Oct 30, 2024

@louis030195: The claim has been successfully added to reward-all. You can visit your dashboard to complete the payment.

@louis030195 louis030195 merged commit 1985efe into mediar-ai:main Oct 30, 2024
4 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants