
Framework for adding context to LLM prompt #993

Merged (28 commits) on Sep 25, 2024

Conversation

@michaelchia (Collaborator) commented Sep 12, 2024

Description

Aims to solve the first task of #910.

Extendable framework for adding context to prompts. Allows users to define BaseContextProviders that are responsible for taking in user chat input and retrieving all relevant information that should be injected into the LLM prompt as context. BaseContextProvider has an abstract method make_context_prompt to be implemented, that takes in the HumanChatMessage and returns a string that should be added to the context. The BaseChatHandler will have a new method make_context_prompt that would loop over all the BaseContextProviders to append all the context prompts together. The DefaultChatHandler will pass the context to the prompt template that has a new optional {context} placeholder. The default provider prompt template has been modified to include this optional {context} placeholder.
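The description above maps to roughly the following shape. This is a minimal sketch based only on this description: the class names match the PR, but the exact signatures, module layout, and whether the methods are async are assumptions.

```python
from abc import ABC, abstractmethod


class HumanChatMessage:
    """Stand-in for jupyter_ai's HumanChatMessage; only `body` is used here."""

    def __init__(self, body: str):
        self.body = body


class BaseContextProvider(ABC):
    @abstractmethod
    def make_context_prompt(self, message: HumanChatMessage) -> str:
        """Return a context string to inject into the LLM prompt ('' if none)."""


class BaseChatHandler:
    def __init__(self, context_providers: list[BaseContextProvider]):
        self.context_providers = context_providers

    def make_context_prompt(self, message: HumanChatMessage) -> str:
        # Loop over all providers and join their context prompts together.
        return "\n\n".join(
            prompt
            for provider in self.context_providers
            if (prompt := provider.make_context_prompt(message))
        )
```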

The subclass BaseCommandContextProvider can be triggered via a command with the @ prefix, with auto-completion suggestions in the chat input UI, similar to slash commands. The differences are that it does not need to be at the start of the input, there can be multiple instances in the input, and it may have a single argument that is part of the command, e.g. "tell me what are the differences in @file:file1.py and @file:file2.py". A BaseCommandContextProvider can optionally modify or remove the commands from the user prompt before passing it to the LLM in DefaultChatHandler. This cleans up the prompt to make it more understandable to an LLM that might not know what the command means.

Two default BaseCommandContextProviders were implemented:

  • FileContextProvider allows you to add the contents of a file to the context by calling @file:<filepath>. The filepath uses the same base path config logic and supported extensions as /learn. It also supports filepath auto-complete suggestions (see demo).
  • LearnedContextProvider (not the final name; see discussion point below) is triggered with @learn as a replacement for /ask. It calls the same retriever and adds the snippets to the context. It is more flexible than /ask in that it can be used together with other context providers. However, it is not obvious that it should be added as a replacement for /ask; I mostly implemented it as a concrete example of a retriever-based context provider, and I'll have no issue removing it if needed.

Demo

FileContextProvider

Screen.Recording.2024-09-12.at.11.24.54.PM.mov

LearnedContextProvider

Screen.Recording.2024-09-12.at.11.30.00.PM.mov

Usage

For users who want to develop their own context provider:

For a context provider without a command, i.e. one that is always triggered or is triggered through some other form of inference/config, subclass BaseContextProvider and implement make_context_prompt.

To include a command, subclass BaseCommandContextProvider and similarly implement make_context_prompt. Use the _find_instances(text: str) method to get all instances of the command, either to detect the presence of the command or to generate and concatenate the context for each instance. For commands with an argument, you can optionally implement a get_arg_options(prefix: str) method that returns the list of auto-complete options for the argument. A sketch follows below.
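A hedged sketch of what such a subclass might look like. make_context_prompt, _find_instances(text: str), and get_arg_options(prefix: str) are the names given above; the stub base class, the id attribute, the @filesize command, and the plain-string return types are assumptions made so the snippet runs standalone.

```python
import os
import re
from abc import ABC, abstractmethod


class BaseCommandContextProvider(ABC):
    """Minimal stand-in for the real base class in jupyter_ai (assumption)."""

    id: str  # command name; typed as @<id>:<arg> in the chat input

    def _find_instances(self, text: str) -> list[str]:
        # Assumed behavior: return the argument of each @<id>:<arg> occurrence.
        return re.findall(rf"@{self.id}:(\S+)", text)

    @abstractmethod
    def make_context_prompt(self, message) -> str: ...


class FileSizeContextProvider(BaseCommandContextProvider):
    """Hypothetical provider: @filesize:<path> reports a file's size."""

    id = "filesize"

    def make_context_prompt(self, message) -> str:
        contexts = []
        for path in self._find_instances(message.body):
            if os.path.exists(path):
                contexts.append(f"File {path} is {os.path.getsize(path)} bytes.")
        return "\n".join(contexts)

    def get_arg_options(self, prefix: str) -> list[str]:
        # Auto-complete: files in the working directory matching the prefix.
        return [f for f in os.listdir(".") if f.startswith(prefix)]
```

For example, a message body of "what is the size of @filesize:data.csv" would yield a context line reporting the size of data.csv, assuming that file exists.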

Examples

I added some examples in 'jupyter_ai/context_providers/_examples.py'. These are to be removed before merging; I don't think any of them should be added by default.

They are meant to illustrate:

  • simplicity of creating retriever-based context providers
  • use of llm provider within context provider
  • example of non-command context provider

Some ideas for context providers are:

  • @var:<variable name> to add variables from an active notebook kernel into the context (see the hypothetical sketch after this list).
    • Probably requires passing the kernel_id of the active notebook via the human_msg.
    • For my use case I already have logic to pass the kernel id, and I will specifically be implementing @df:<var name> for adding pandas and pyspark dataframe schemas.
  • An ErrorContextProvider that checks whether an error is contained in the human_msg and uses it to query an internal database of errors and solutions for internal packages and error messages.
    • A rough outline example is in 'jupyter_ai/context_providers/_examples.py'.
  • Replace adding the selection into the message body with a SelectionContextProvider that adds the selection to the context section instead.
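To make the @var idea concrete, here is a purely hypothetical sketch; nothing in it is part of this PR. It reuses the stub BaseCommandContextProvider from the Usage sketch above, and both the kernel_id field on the message and the read_variable_repr helper are inventions for illustration.

```python
class VarContextProvider(BaseCommandContextProvider):
    """Hypothetical: @var:<name> injects a kernel variable's repr as context."""

    id = "var"

    def make_context_prompt(self, message) -> str:
        kernel_id = getattr(message, "kernel_id", None)  # assumed message field
        if not kernel_id:
            return ""
        contexts = []
        for name in self._find_instances(message.body):
            contexts.append(f"Variable `{name}`:\n{read_variable_repr(kernel_id, name)}")
        return "\n".join(contexts)


def read_variable_repr(kernel_id: str, name: str) -> str:
    # Invented helper: a real version would ask the kernel identified by
    # kernel_id to evaluate repr(<name>). Mocked here for illustration.
    return f"<repr of {name} from kernel {kernel_id}>"
```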

Implementation summary

  • Refactored auto-completion logic in 'chat-inputs.tsx'

    • Replaced the slashCommandOptions logic with a more general autocompleteOptions mechanism that also supports autocomplete for context providers or any other future autocompletion types.
    • Similarly, replaced the SlashCommandsInfoHandler with an AutocompleteOptionsHandler to provide the autocomplete options. (SlashCommandsInfoHandler was not removed yet, as I do not know whether it is intended to be used elsewhere in the future.)
  • Added a jai_context_providers entry to the extension's self.settings and as a param to BaseChatHandler, similar to the jai_chat_handlers.

  • Modified DefaultChatHandler with a step to make the context prompt and to replace/clean up the prompt as mentioned above.

  • Modified the default prompt templates in jupyter-ai-magics to include the {context} placeholder.

    • If a user is using their own custom template without the {context} placeholder, the context provider output will be ignored. This is for backward compatibility (see the sketch after this list).
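The backward-compatibility rule in the last bullet amounts to the following. This sketch assumes the templates behave like plain format-style strings; the actual templates in jupyter-ai-magics are LangChain prompt templates, so treat this only as an illustration of the rule.

```python
def render_prompt(template: str, question: str, context: str) -> str:
    if "{context}" in template:
        # Template opts in: inject the combined context-provider output.
        return template.format(context=context, question=question)
    # Custom templates without {context} ignore context providers entirely.
    return template.format(question=question)


legacy = "Answer the question: {question}"
modern = "Context:\n{context}\n\nAnswer the question: {question}"

print(render_prompt(legacy, "what is x?", "x = 42"))  # context silently ignored
print(render_prompt(modern, "what is x?", "x = 42"))  # context included
```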

Points of discussion

  • Whether to rename ContextProvider. @3coins was suggesting KnowledgeBase but I feel that Context is more generally applicable and conventional for this.
  • Whether to remove SlashCommandsInfoHandler and related code if not planned on being used elsewhere.
  • Whether LearnedContextProvider should be added.
  • What to name LearnedContextProvider if to be added.
    • This was just a placeholder name as I couldn't think of a name and didn't want to spend too much time on it.
  • What to set as the prompt templates for the default LLM provider and for the context providers.
    • Honestly, I didn't spend much time thinking about or formatting the templates.
  • Overall design. There are plenty of minor decisions I am not 100% certain about.

Future enhancements

  • Config management for this in the UI as mentioned in Add configurable knowledgebase management #910.
    • Ability to enable/disable non-command providers.
    • Ability to pin command providers to avoid having to type it repeatedly.
    • Ability to configure provider-specific config.
      • Currently there is also no mechanism to add provider-specific config to the objects.
    • Ability to save config profiles?
  • Add some style formatting to the commands in the chat input and messages so that they are easily identifiable as commands.
    • Can have different formats for '/' and '@' commands.
    • Consider applying some invisible tags around them so that they can be more easily extracted downstream.
  • Add limits to prevent accidentally adding too much context and blowing up the token count.
    • How should this be configured and what should the default be?
    • At what level? Entire context prompt? each context provider context? each instance context?

Let me know what you guys think and if there is anything you would like me to change.

@michaelchia michaelchia changed the title Contexthandler Framework for adding context to LLM prompt Sep 12, 2024
@michaelchia michaelchia added the enhancement New feature or request label Sep 12, 2024
@michaelchia michaelchia marked this pull request as draft September 12, 2024 20:28
@dlqqq (Member) commented Sep 16, 2024

@michaelchia Thank you for working on such a significant potential addition to Jupyter AI! Given the size and scope of this PR, it will take more time for us to review this and determine if this user experience is aligned with our longer-term vision for Jupyter AI. We appreciate your patience in the meantime. 🙏

I've rebased your PR for review purposes.

@dlqqq (Member) commented Sep 16, 2024

@michaelchia Wow, just watched the demo videos, and this feature is mind-blowingly awesome! 🎉 🎉 I love how easy it is to just point Jupyternaut at a file and allow its context to be used to answer your query.

Only two callouts regarding the UX:

  1. We may want to reserve the @ symbol because we are working on a multi-user collaborative experience where users can @ mention other users, kind of like on GitHub, Slack, and Discord. I'm thinking about alternative symbols we can leverage here, but if you have any suggestions, feel free to let us know. No need to change anything atm.
    • Perhaps we could explore using $? I think this makes a lot of sense to a developer audience, since $ is also used to interpolate strings in shell prompts, e.g. find -name $FILE_NAME. Using $ to provide context is almost like interpolating the content of the file into your prompt for the LLM.
    • $ is also used in LaTeX markup however, so the parsing logic could get tricky.
    • Other alternatives: ! (similar to Markdown syntax for images), # (similar to channel syntax for Slack & Discord).
  2. As you've pointed out, the @learned syntax being proposed is redundant with the /ask slash command we already have. However, I think that if we are exploring an entirely new chat syntax for pulling in context, we should consider how we can deprecate /ask in favor of the new syntax.

@ellisonbg (Contributor) commented:

@michaelchia wow, this is really amazing! Thanks for putting all the time into this, can't wait to try it out.

@michaelchia (Collaborator, Author) commented Sep 17, 2024

@dlqqq really appreciate your consideration for this PR. Take your time with it. There are definitely tons of details to iron out before it is ready.

@ is pretty much the de facto standard for this kind of functionality in all other AI assistant plugins. Personally, I would strongly favour it, both for user familiarity and for following general convention. But sure, we can explore some alternatives if it conflicts with future plans.

On a side note about the multi-user features: for my use case, I do not see us requiring or using such a feature, so I hope it won't add too many changes or other limitations, since this is primarily an AI assistant tool. This is just my personal opinion; I don't mean to speak for other users, who I am sure would find such a feature useful.

@dlqqq (Member) left a comment


@michaelchia Hey Michael, I've reviewed as much as I can today. I'm leaving a partial code review here, so you can address my feedback as I review the rest.

Overall, the code looks excellent! I'm impressed by how much work you've put into this. Left some feedback for you below.

@dlqqq (Member) left a comment


@michaelchia Hey Michael, I've responded to your comments above and resolved conversations as needed. I left some more feedback below for you. I'm about 70% done with this review, but still need more time to test & get feedback from other contributors.

I sincerely appreciate your patience in the meantime! 🤗

Review threads (outdated, resolved) on:
  • packages/jupyter-ai/jupyter_ai/models.py
  • packages/jupyter-ai/jupyter_ai/context_providers/base.py
  • packages/jupyter-ai/jupyter_ai/handlers.py
@dlqqq (Member) left a comment


@michaelchia Hey Michael, this PR looks great after your recent revisions. I've checked in with other members on the team, who all agree that this would be wonderful to include in the next 2.x release. I've also tested this branch locally and verified that it works well. Thank you for your hard work and patience! 🤗

ℹ️ I've left one last point of feedback below suggesting a safer way to ensure context providers are only triggered if the user explicitly typed a command to run them.
ℹ️ After deleting _examples.py and addressing that last point of feedback, I will approve and merge this PR.

There was only one usability issue that I noticed: the @ autocomplete options are still shown even when a slash command is used. However, context is only included when using DefaultChatHandler. I think it is fine for this issue to be addressed in a future PR, given that this feature is still early-stage and in-development.

Again, thank you for the outstanding effort you've invested into Jupyter AI thus far!

Review thread (outdated, resolved) on packages/jupyter-ai/jupyter_ai/context_providers/base.py
@michaelchia michaelchia marked this pull request as ready for review September 25, 2024 07:00
@dlqqq (Member) left a comment


@michaelchia Thank you! This will be included in a minor release tomorrow. 🎉

@dlqqq dlqqq merged commit 6e426ab into jupyterlab:main Sep 25, 2024
9 checks passed
Marchlak pushed a commit to Marchlak/jupyter-ai that referenced this pull request Oct 28, 2024
* context provider
* split base and base command context providers + replacing prompt
* comment
* only replace prompt if context variable in template
* [pre-commit.ci] auto fixes from pre-commit.com hooks (repeated after several commits; see https://pre-commit.ci)
* Run mypy on CI, fix or ignore typing issues (jupyterlab#987): run mypy on CI; rename, add mypy to test deps; fix typing jupyter-ai codebase (mostly); three more cases; update deepmerge version specifier
* context provider
* mypy
* black
* modify backtick logic
* allow for spaces in filepath
* refactor
* fixes
* fix test
* refactor autocomplete to remove hardcoded '/' and '@' prefix
* modify context prompt template
* refactor
* docstrings + refactor
* mypy
* add context providers to help
* remove _examples.py and remove @learned from defaults
* make find_commands unoverridable

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michał Krassowski <[email protected]>
Co-authored-by: David L. Qiu <[email protected]>
Co-authored-by: david qiu <[email protected]>
Labels: enhancement (New feature or request)
Projects: None yet
4 participants