Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow $ to literally denote quantities of USD in chat #1068

Merged
merged 5 commits into from
Oct 31, 2024

Conversation

dlqqq
Copy link
Member

@dlqqq dlqqq commented Oct 30, 2024

Description

  • Fixes Improve rendering of literal dollar signs in chat #1067
  • Allows for $ to literally denote multiple quantities of USD per line in the chat by escaping dollar signs in the frontend.
  • The rendering behavior now only allows for \( <math> \) to denote inline math, while allowing $$ <math> $$ and \[ <math> \] to denote display math.
  • Edits the system prompt to request that the LLM not use $ as an inline math delimiter, but instead use LaTeX math delimiters (i.e. \( <math> \) instead of $<math>$).

Demo

Screenshot 2024-10-31 at 10 56 23 AM

@dlqqq dlqqq added the bug Something isn't working label Oct 30, 2024
Copy link
Collaborator

@srdas srdas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the formatting for several LLMs: all Bedrock Chat Anthropic LLMs, Llama3 models, and mistral models. The formatting looks great and does not fall afoul of TeX formatting $ limiters.
image

Even if you prompt it to use TeX formatting it does so by displaying the formatting and not forcing math mode, as shown here:
image
So the modification in this PR is robuest and all LGTM.

@dlqqq
Copy link
Member Author

dlqqq commented Oct 30, 2024

@srdas Thanks for the review! Actually your testing pointed out a regression. While we don’t want to treat $ for inline math, we should still allow for $$ to be used for display math. This is aligned with Mathjax’s recommendation for delimiters here: https://docs.mathjax.org/en/latest/input/tex/delimiters.html

JupyterLab overrides the default delimiters set by MathJax, which is why we have to escape single dollar signs in our application despite MathJax being the default implementation of ILatexTypesetter.

I'll update this PR to allow for double dollar signs while still escaping single dollar signs.

@dlqqq
Copy link
Member Author

dlqqq commented Oct 30, 2024

I just noticed another regression below. We shouldn't escape dollar signs within Markdown code blocks. The LLM is trying to write \$ in its response, but this gets transformed to \\\$ when we escape the dollar sign by adding triple backticks.

Handling this will be more tricky because I don't think a regex can match this case.

Screenshot 2024-10-30 at 1 11 18 PM

@srdas
Copy link
Collaborator

srdas commented Oct 30, 2024

I just noticed another regression below. We shouldn't escape dollar signs within Markdown code blocks. The LLM is trying to write \$ in its response, but this gets transformed to \\\$ when we escape the dollar sign by adding triple backticks.

Handling this will be more tricky because I don't think a regex can match this case.

I also noticed this in my testing, because at one point it returned this sort of formatting, but I was unable to reproduce it again when I tried. Good you caught it again.

@dlqqq
Copy link
Member Author

dlqqq commented Oct 30, 2024

It may be easier to somehow change the MathJax config to simply not respect $ as an inline delimiter, which means we don't have to implement any complex escaping logic for $ symbols. However, the documentation is unclear at best: https://docs.mathjax.org/en/latest/web/configuration.html#configuring-mathjax-after-it-is-loaded

@dlqqq
Copy link
Member Author

dlqqq commented Oct 31, 2024

It may be easier to somehow change the MathJax config to simply not respect $ as an inline delimiter

This strategy proved to be too complex. I tried creating a new instance of MathJaxTypesetter with custom inline delimiters, but that also affected the rendering in notebooks. It seems only one MathJax configuration can be used at a time across the entire browser.

I've updated this PR and its description to address the 2 bugs called out earlier:

  • Escapes $ but not $$
  • Does not escape $ when inside of a "literal element", e.g. code, pre.

@dlqqq
Copy link
Member Author

dlqqq commented Oct 31, 2024

Demo showing dollar symbols are not escaped in code blocks:

Screenshot 2024-10-31 at 11 33 33 AM

Copy link
Collaborator

@srdas srdas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great use of AI to correct AI! Jupyter AI is now officially self-repairing!
LGTM!

@dlqqq dlqqq merged commit e1d99cc into jupyterlab:main Oct 31, 2024
10 checks passed
@dlqqq
Copy link
Member Author

dlqqq commented Oct 31, 2024

@meeseeksdev please backport to v3-dev

meeseeksmachine pushed a commit to meeseeksmachine/jupyter-ai that referenced this pull request Oct 31, 2024
dlqqq added a commit that referenced this pull request Oct 31, 2024
@dlqqq
Copy link
Member Author

dlqqq commented Oct 31, 2024

cc: @jtpio @brichet re: jupyter-chat

@brichet
Copy link

brichet commented Nov 4, 2024

cc: @jtpio @brichet re: jupyter-chat

Thanks for the ping @dlqqq

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve rendering of literal dollar signs in chat
3 participants