feat(vscode): support collecting relevant snippets from recently changed files. #1844


icycodes
Member

@icycodes icycodes commented Apr 15, 2024

Complete TAB-512
Changes:

  • Agent: Added support for sending relevantSnippetsFromChangedFiles in completion requests.
    • Snippets that overlap with any snippet in declarations are filtered out before the request is sent.
  • VSCode: Added support to collect relevant snippets from recently edited code chunks.
    • Implemented an in-memory code search engine powered by orama.
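
The overlap filtering mentioned for the agent could look roughly like the sketch below. The `Snippet` shape and the function names `overlaps` and `filterOverlapping` are assumptions made for illustration, not the agent's actual types; "overlap" is taken here to mean the same file and intersecting character ranges.

```typescript
// Hypothetical snippet shape; the real agent types may differ.
interface Snippet {
  filepath: string;
  offset: number; // start offset of the snippet within the file
  text: string;
}

// Two snippets overlap when they come from the same file and their
// half-open character ranges [offset, offset + text.length) intersect.
function overlaps(a: Snippet, b: Snippet): boolean {
  if (a.filepath !== b.filepath) return false;
  const aEnd = a.offset + a.text.length;
  const bEnd = b.offset + b.text.length;
  return a.offset < bEnd && b.offset < aEnd;
}

// Keep only the changed-file snippets that overlap with no declaration snippet.
function filterOverlapping(snippets: Snippet[], declarations: Snippet[]): Snippet[] {
  return snippets.filter((s) => !declarations.some((d) => overlaps(s, d)));
}
```

Filtering on the agent side avoids sending the same code twice in one request when a recently edited chunk is also picked up as a declaration.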

How the code search works:

  • Set up a listener for text changes in current workspace text files to create an indexing job.
  • Each indexing job for a file is debounced by 1000 ms, so rapid successive edits do not trigger multiple jobs at once.
  • The indexing worker extracts code in a window around the edited location, from 20 lines before the first edited line to 20 lines after the last edited line.
  • The extracted code is split into chunks, each with at most 500 characters (stopping at a newline), with a 1 line overlap between neighboring chunks.
  • Each chunk is further divided into words. Reserved keywords (e.g. function, class) are filtered out, and the remaining words are used as symbols to be indexed in the in-memory database.
  • To manage memory, when the number of indexed chunks exceeds 100, all chunks from the oldest indexed file are removed.
  • When an indexed file needs updating, all old chunks from this file are deleted and the file is re-chunked, with the extraction window being the union of the previous window and the new one.
  • For every code completion request, the same method is used to extract symbols from prefix lines, which are then used as the search term to be matched. The search candidates are limited to other files with the same language, with up to 3 chunks selected.
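
The windowing and chunking steps above can be sketched as follows. This is a minimal illustration under assumptions: the names `editWindow`, `unionRange`, and `chunkLines` are hypothetical, line numbers are zero-based and inclusive, and the handling of a single line longer than the limit is a choice made for this sketch, not something the description specifies.

```typescript
const WINDOW_LINES = 20; // lines kept before and after the edit
const MAX_CHUNK_CHARS = 500; // upper bound on chunk length

// Inclusive, zero-based line range (an assumption of this sketch).
interface LineRange {
  start: number;
  end: number;
}

// Window around an edit, clamped to the bounds of the document.
function editWindow(firstEdited: number, lastEdited: number, lineCount: number): LineRange {
  return {
    start: Math.max(0, firstEdited - WINDOW_LINES),
    end: Math.min(lineCount - 1, lastEdited + WINDOW_LINES),
  };
}

// When a file is re-indexed, the old and new windows are merged.
function unionRange(a: LineRange, b: LineRange): LineRange {
  return { start: Math.min(a.start, b.start), end: Math.max(a.end, b.end) };
}

// Split lines into chunks of at most maxChars characters, breaking only at
// newlines, with neighboring chunks sharing one line. A single line longer
// than maxChars still becomes its own (oversized) chunk in this sketch.
function chunkLines(lines: string[], maxChars: number = MAX_CHUNK_CHARS): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < lines.length) {
    let end = start; // exclusive end of the current chunk
    let size = 0;
    while (end < lines.length) {
      // Every line after the first costs one extra character for the newline.
      const cost = lines[end].length + (end > start ? 1 : 0);
      if (end > start && size + cost > maxChars) break;
      size += cost;
      end++;
    }
    chunks.push(lines.slice(start, end).join("\n"));
    if (end >= lines.length) break;
    // Start the next chunk on the last line of this one (1-line overlap),
    // unless the chunk held a single line, in which case just move on.
    start = end - 1 > start ? end - 1 : end;
  }
  return chunks;
}
```

For example, with a limit of 9 characters the lines `aaaa`, `bbbb`, `cccc`, `dddd` yield three two-line chunks, each repeating the last line of its predecessor.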
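
Symbol extraction, eviction, and lookup can be illustrated with a small stand-in index. Note that this does not use Orama: a naive count of shared symbols replaces its search, and `ChunkIndex`, `extractSymbols`, and the keyword list are all hypothetical names and values chosen for the sketch.

```typescript
// Illustrative subset only; the actual keyword list is not given in the description.
const RESERVED = new Set([
  "function", "class", "const", "let", "var", "return",
  "if", "else", "for", "while", "import", "export",
]);

// Split code into identifier-like words, drop reserved keywords, and deduplicate.
function extractSymbols(code: string): string[] {
  const words = code.match(/[A-Za-z_][A-Za-z0-9_]*/g) ?? [];
  return Array.from(new Set(words.filter((w) => !RESERVED.has(w))));
}

interface IndexedChunk {
  filepath: string;
  language: string;
  text: string;
  symbols: string[];
}

class ChunkIndex {
  private chunks: IndexedChunk[] = [];
  private fileOrder: string[] = []; // oldest indexed file first
  private maxChunks: number;

  constructor(maxChunks: number = 100) {
    this.maxChunks = maxChunks;
  }

  get size(): number {
    return this.chunks.length;
  }

  // Re-indexing a file first deletes all of its old chunks.
  indexFile(filepath: string, language: string, texts: string[]): void {
    this.removeFile(filepath);
    this.fileOrder.push(filepath);
    for (const text of texts) {
      this.chunks.push({ filepath, language, text, symbols: extractSymbols(text) });
    }
    // Over the limit: drop every chunk of the oldest indexed file.
    // Never evicting the file just indexed is an assumption of this sketch.
    while (this.chunks.length > this.maxChunks && this.fileOrder.length > 1) {
      this.removeFile(this.fileOrder[0]);
    }
  }

  private removeFile(filepath: string): void {
    this.chunks = this.chunks.filter((c) => c.filepath !== filepath);
    this.fileOrder = this.fileOrder.filter((f) => f !== filepath);
  }

  // Candidates are chunks from other files in the same language,
  // ranked by how many symbols they share with the prefix.
  search(prefix: string, currentFile: string, language: string, limit: number = 3): IndexedChunk[] {
    const query = new Set(extractSymbols(prefix));
    return this.chunks
      .filter((c) => c.filepath !== currentFile && c.language === language)
      .map((c) => ({ chunk: c, score: c.symbols.filter((s) => query.has(s)).length }))
      .filter((r) => r.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit)
      .map((r) => r.chunk);
  }
}
```

A real full-text engine would also weigh term frequency and tolerate partial matches; the overlap count is only meant to show where the same-language filter, the other-files filter, and the 3-chunk limit apply.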

@icycodes icycodes marked this pull request as draft April 15, 2024 07:04
@icycodes icycodes marked this pull request as ready for review April 16, 2024 10:56
@icycodes icycodes requested a review from wsxiaoys April 16, 2024 10:56
@wsxiaoys wsxiaoys merged commit f29d24a into TabbyML:main Apr 16, 2024
3 checks passed
@rodion-m

Sounds great! Is a similar feature planned for JetBrains IDEs?

@icycodes
Member Author

Hi, @rodion-m

Sounds great! Is a similar feature planned for JetBrains IDEs?

Certainly! You can expect them to be included in the upcoming release.
