Support local vector search for the selection of key text segments of page and video context #36801

Closed
bbondy opened this issue Mar 14, 2024 · 5 comments · Fixed by brave/brave-core#24953
Assignees
Labels
browser-ai OS/Android Fixes related to Android browser functionality OS/Desktop OS/iOS Fixes related to iOS browser functionality priority/P3 The next thing for us to work on. It'll ride the trains. QA Pass - Android ARM QA Pass - iPhone QA Pass-Win64 QA/Test-Plan-Specified QA/Yes release-notes/include

Comments

@bbondy
Member

bbondy commented Mar 14, 2024

Key use case to support:
What is the Gnarly nutrition discount code in this video?
Revealed at 26:56 of the video, but because of our truncation method, Leo can't answer.

Our summarization and Q&A often truncate the input text we send up. For now we warn about it, but users see this warning very often, and we frequently can't produce the answers people need because the relevant text is not within the first N characters.

Your mission, should you choose to accept it, is to support this local vector search.

This should only be employed when the context is too long. For summarization tasks, it might be useful to select dissimilar segments instead. We could perhaps detect whether the input query is a summarization query using the same kind of similarity comparison.
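
One way to picture the summarization check is to compare the query against a few reference phrasings. This is only a minimal sketch: the reference prompts, the bag-of-words cosine measure, the function names, and the 0.5 threshold are all assumptions made for illustration, not anything the issue specifies.

```python
import math
import re
from collections import Counter

# Hypothetical reference phrasings for summarization requests (illustrative only).
SUMMARY_EXAMPLES = [
    "summarize this page",
    "give me a summary of this video",
    "what are the key points",
]


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def looks_like_summary_request(query, threshold=0.5):
    """Return True if the query is close to any reference phrasing.

    The threshold is an arbitrary placeholder and would need tuning.
    """
    q = bag_of_words(query)
    return max(cosine(q, bag_of_words(e)) for e in SUMMARY_EXAMPLES) >= threshold
```

A query that matches would then skip the "most similar chunks" selection, since a summary needs broad coverage rather than one relevant passage.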

It likely involves some steps like:

  • Preprocess the text: lowercase, remove special characters, sentence chunking
  • Convert the preprocessed text into vectors using TF-IDF (Term Frequency-Inverse Document Frequency), Word2Vec, or another method
  • Divide the long text into chunks
  • For each chunk, calculate the similarity to the prompt using cosine similarity or another measure
  • Find the best matches and send only that subset, which avoids the truncation warning we typically give and also supports better answers.
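
The steps above can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not the approach that was eventually implemented: it uses TF-IDF with a smoothed IDF, a naive regex sentence splitter, and fixed-size chunks, and every function name and default value here is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence chunking: split after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def make_chunks(sentences, per_chunk=2):
    """Group consecutive sentences into fixed-size chunks."""
    return [" ".join(sentences[i:i + per_chunk])
            for i in range(0, len(sentences), per_chunk)]


def tfidf_vectors(token_lists):
    """Return one sparse TF-IDF vector per token list, plus the IDF table."""
    n = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    # Smoothed IDF so terms present in every chunk still get a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(tokens).items()}
               for tokens in token_lists]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_chunks(text, query, top_k=2, per_chunk=2):
    """Return the top_k chunks most similar to the query, in document order."""
    chunks = make_chunks(split_sentences(text), per_chunk)
    vectors, idf = tfidf_vectors([tokenize(c) for c in chunks])
    # Query terms that never occur in the text cannot match and are dropped.
    query_vec = {t: c * idf[t]
                 for t, c in Counter(tokenize(query)).items() if t in idf}
    ranked = sorted(range(len(chunks)),
                    key=lambda i: cosine(query_vec, vectors[i]),
                    reverse=True)
    return [chunks[i] for i in sorted(ranked[:top_k])]
```

Calling `select_chunks(transcript, "What is the discount code?", top_k=1)` on a long transcript would return only the chunk that best matches the question. A real implementation would choose the chunk size and `top_k` from the model's context budget, and a purely lexical measure like TF-IDF misses paraphrases that an embedding model would catch.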

I'm not sure if we'd use TFLite or just custom code for this type of stuff, so it involves some investigation.

@bbondy bbondy added OS/Android Fixes related to Android browser functionality OS/Desktop OS/iOS Fixes related to iOS browser functionality labels Mar 14, 2024
@mattmcalister mattmcalister added priority/P3 The next thing for us to work on. It'll ride the trains. browser-ai labels Mar 15, 2024
@mattmcalister mattmcalister moved this to Todo in Browser AI Mar 15, 2024
@mattmcalister mattmcalister moved this from Todo to In Progress in Browser AI May 29, 2024
@darkdh darkdh self-assigned this May 29, 2024
@darkdh
Member

darkdh commented Jul 31, 2024

Summarization will be excluded from this PR and tracked in a separate issue.

@darkdh darkdh moved this from In Progress to In Review in Browser AI Aug 6, 2024
@github-project-automation github-project-automation bot moved this from In Review to Done in Browser AI Aug 22, 2024
@brave-builds brave-builds added this to the 1.71.x - Nightly milestone Aug 22, 2024
@srirambv
Contributor

Verification passed on

Brave 1.71.112 Chromium: 130.0.6723.44 (Official Build) (64-bit)
Revision 7d51df656b247f9432ee714a6d160142a1e11c13
OS Windows 11 Version 23H2 (Build 22631.4317)
  • Verified steps from brave/brave-core#24953
  • Verified brave-ai-chat-page-content-refine is disabled by default
  • Verified when the flag is enabled, profile folder creates AIChatLocalModels
  • Verified when the flag is disabled, profile folder removes AIChatLocalModels
  • Verified that when browser-ai-chat is disabled, AIChatLocalModels is also removed, even if brave-ai-chat-page-content-refine is enabled
  • Verified that "Page content is refined" or "Page content is truncated" is shown when the page refine flag is enabled and a very long article is used to summarize or ask questions
36801.-.Scenario.2.mp4
36801.-.Scenario.1.mp4

@hffvld hffvld added the QA/In-Progress Indicates that QA is currently in progress for that particular issue label Oct 15, 2024
@hffvld
Contributor

hffvld commented Oct 16, 2024

Verified only test case 1, which requires a rooted Android device. To verify test case 2, we need to wait for brave/brave-core#26006 and brave/brave-core#26041 to be uplifted into 1.71.x.


Verified on Pixel 7 using version(s):

Device/OS: Pixel 7 / panther_beta-user 15 AP41.240823.009 release-keys
Brave build: 1.71.112
Chromium: 130.0.6723.44 (Official Build) (64-bit) 

Component updater, the dir name would be "AIChatLocalModels"

STEPS:

  1. Follow the STR/TP from AI Chat feature: Page Content Refine brave-core#24953 (comment)
  2. Verify

ACTUAL RESULTS:

  • Verified that brave-ai-chat-page-content-refine is OFF by default
  • Verified that AIChatLocalModels folder is not present in the Profile folder if brave-ai-chat-page-content-refine is OFF
  • Verified that AIChatLocalModels folder was created and shown in the Profile folder if brave-ai-chat-page-content-refine is ON

[Screenshots: (1) brave-ai-chat-page-content-refine is OFF, (2) brave-ai-chat-page-content-refine is ON]

@hffvld hffvld removed the QA/In-Progress Indicates that QA is currently in progress for that particular issue label Oct 16, 2024
@srirambv
Contributor

Scenario 2 is verified as part of #41481 (comment)

@hffvld hffvld added the QA/In-Progress Indicates that QA is currently in progress for that particular issue label Oct 22, 2024
@hffvld
Contributor

hffvld commented Oct 22, 2024

Verified on iPhone 14 using version(s):

Device/OS: iPhone 14 / iOS 17.7
Brave build: 1.71 (116)
BraveCore: 1.71.116 (130.0.6723.58)

Component updater, the dir name would be "AIChatLocalModels"

STEPS:

  1. Follow the steps from AI Chat feature: Page Content Refine brave-core#24953 (comment)
  2. Verify

ACTUAL RESULTS:

  • Verified that brave-ai-chat-page-content-refine flag is OFF by default
  • Verified that AIChatLocalModels folder is not present in the Library > Application Support > Chromium folder if brave-ai-chat-page-content-refine is OFF
  • Verified that AIChatLocalModels folder was created/downloaded and shown in Library > Application Support > Chromium folder if brave-ai-chat-page-content-refine is ON

[Screenshots: (1) Default (Disabled), (2) Enabled flag]
Page content refine (truncated/refined indicator doesn't apply on iOS)

STEPS:

  1. Follow the steps from AI Chat feature: Page Content Refine brave-core#24953 (comment)
  2. Verify

ACTUAL RESULTS:

  • Verified that truncated/refined indicator doesn't apply on iOS when brave-ai-chat-page-content-refine flag is ON or OFF
  • Verified that the conversation results are shown as expected

2024-10-22_17-04-51.mp4

@hffvld hffvld added QA Pass - iPhone and removed QA/In-Progress Indicates that QA is currently in progress for that particular issue labels Oct 23, 2024
6 participants