add search command #157
Conversation
Very cool! I have a lot of thoughts; some of these would be follow-up PRs:
- The first time I ran it I got `Embedding batch 1/13...` to `Embedding batch 13/13...` printed out. The second time it just printed `Embedding batch 1/1...`. I understand the first call had to embed everything, but why did the second call still have a batch? Maybe the batch algorithm always returns at least one batch, which might be empty?
- I don't think we are tracking embedding costs with the cost logger. I know they are very low, but it might be good to show what they are so users know? Relatedly, I worry someone will run this on a massive codebase and it'll actually cost them a lot, like if they have tons of json / data files checked in to the repo or something. Should we have some warning if the number of batches is actually crazy and it'll cost more than, say, $1?
- What do we do if a file is too big to embed all at once?
- When we split embedding sections from whole files into smaller parts, it'll be great to show the code in the search results. Formatting that might be tricky. In the VS Code extension we'd make it easy to jump to those files / sections.
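For context on the first question, the "always at least one batch" behavior can be sketched roughly like this. This is a hypothetical reconstruction, not the PR's actual code: `make_batches`, `cached_hashes`, and `EMBEDDING_BATCH_SIZE` are made-up names, but they illustrate why a brand-new query produces a batch of 1 even when every file is already cached.

```python
import hashlib

EMBEDDING_BATCH_SIZE = 100  # hypothetical batch size, not the PR's actual constant


def make_batches(texts: list[str], cached_hashes: set[str]) -> list[list[str]]:
    """Batch only texts whose hash is not already in the embeddings db.

    Because the search query itself is one of the texts, a query that has
    never been seen before always yields at least one batch (of size >= 1).
    """
    uncached = [
        t for t in texts
        if hashlib.sha256(t.encode()).hexdigest() not in cached_hashes
    ]
    return [
        uncached[i:i + EMBEDDING_BATCH_SIZE]
        for i in range(0, len(uncached), EMBEDDING_BATCH_SIZE)
    ]
```

With an empty cache, a single new query comes back as one batch of one; once its hash is cached, no batches are produced at all.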
mentat/terminal/client.py (Outdated)

```diff
@@ -229,7 +229,7 @@ def run_cli():
         help="Exclude the file structure/syntax map from the system prompt",
     )
     parser.add_argument(
-        "--embedding",
+        "--use-embedding",
```
Does `--use-embeddings` make more sense?
We talked about these, but for the record:
- It's embedding the prompt. So if it's a new prompt (hash not in the db), there'll always be at least a batch of 1.
- I've set this up as you said: give the option to skip embeddings if the cost is > $1, otherwise display the cost (to 4 decimal places) afterwards.
- These are ignored for now, but I'll make sure they're included when we implement file-splitting.

Agreed!
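The cost gate described above could look roughly like the following. This is a sketch based on the comment, not the PR's code; `format_embedding_cost` and the token-based pricing parameter are assumptions.

```python
COST_WARNING_THRESHOLD = 1.00  # dollars; the $1 threshold mentioned in the review


def format_embedding_cost(n_tokens: int, price_per_1k_tokens: float) -> str:
    """Return a 4-decimal cost string, or a warning if the threshold is crossed."""
    cost = n_tokens / 1000 * price_per_1k_tokens
    if cost > COST_WARNING_THRESHOLD:
        # Large repo (e.g. lots of json/data files): ask before embedding
        return f"Warning: embedding would cost ~${cost:.2f}. Continue? (y/N)"
    return f"Embedding cost: ${cost:.4f}"
```

For a typical small repo the cost prints as something like `$0.0500`; only an unusually large batch count would trip the warning path.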
Looks good to me, just a few small things.
One other note: I notice with small searches, like when I searched for just "parser", the init files score very highly. I wonder if we should not embed very small files, say ones under 10 characters?
Other than that, the searches I tried seemed to surface pretty relevant files.
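The suggested filter is small enough to sketch here. The 10-character threshold comes from the comment above; the helper name `should_embed` is hypothetical, not from the PR.

```python
MIN_EMBED_CHARS = 10  # threshold suggested in the review


def should_embed(file_text: str) -> bool:
    """Skip near-empty files (e.g. a bare __init__.py) whose embeddings
    otherwise dominate the scores for very short queries like "parser"."""
    return len(file_text.strip()) >= MIN_EMBED_CHARS
```

Filtering before embedding also saves a little on cost, since empty files would otherwise still occupy batch slots.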
mentat/commands.py (Outdated)

```python
        await stream.send(str(e), color="red")
        return

    for i, (feature, score) in enumerate(results):
```
I think we should 1-index instead of 0-index.
mentat/commands.py (Outdated)

```python
    for i, (feature, score) in enumerate(results):
        _i = f"{i}: " if i < 10 else f"{i}:"
        await stream.send(f"{_i} {score:.3f} | {feature.path}")
```
You can use `:2` to force `i` to take up the same width:

```python
await stream.send(f"{i:2} {score:.3f} | {feature.path}")
```

I think that'd be a bit cleaner.
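For the record, `:2` is a minimum field width in Python's format spec mini-language: numbers are right-aligned and padded with spaces to at least two characters, so one- and two-digit indices line up. A standalone illustration (the paths and scores are made up):

```python
# Width specifiers right-align numbers to a minimum field width,
# so single- and double-digit indices produce aligned columns.
results = [(1, 0.912, "mentat/commands.py"), (10, 0.871, "mentat/code_context.py")]
for i, score, path in results:
    print(f"{i:2} {score:.3f} | {path}")
# prints:
#  1 0.912 | mentat/commands.py
# 10 0.871 | mentat/code_context.py
```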
mentat/commands.py
Outdated
await stream.send("\nShow More results? ") | ||
if not await ask_yes_no(default_yes=True): | ||
break | ||
await stream.send("Search complete", color="green") |
I think from a UX perspective it's better not to send this message. The user will know the search is over. We don't send a similar message when they ask a general question and the model responds.
mentat/conversation.py (Outdated)

```diff
@@ -171,6 +171,7 @@ async def get_model_response(self) -> list[FileEdit]:
         conversation_history = "\n".join([m["content"] for m in messages_snapshot])
         tokens = count_tokens(conversation_history, self.model)
         response_buffer = 1000
+        print()
```
Should be removed
```python
SEARCH_RESULT_BATCH_SIZE = 10


class SearchCommand(Command, command_name="search"):
```
Could we add a `SearchCommandTest`?
mentat/code_context.py (Outdated)

```python
    ) -> list[tuple[CodeFile, float]]:
        """Return the top n features that are most similar to the query."""
        if not self.settings.use_embeddings:
            raise UserError(
```
I would prefer if we just sent a red error rather than crashing.
I agree with all of Jake's comments and left one of my own, but looks good to merge after all of that is done! Thanks for adding this!
Thanks for the great feedback, I think I hit everything.
Hmm... the git_root-relative path is included in the embedding. I think returning empty files, as it did, will be useful in some cases. Moving this convo to Slack.
This PR implements a `/search` command using embeddings, along with some follow-ups from #144. To use it, add the `--use-embedding` flag.