I wrote a bugfix for issue jupyter#980. Description follows.
Problem
The problem concerns the rendering of inline LaTeX in Markdown/text cells. The creator of the issue provided the following example to illustrate it:
This results in the following:
Background
One of the cell types supported by Jupyter is the text/Markdown cell. Using this type of cell we can write Markdown which gets processed when the cell is executed. Moreover, it's possible to write inline LaTeX which is rendered using MathJax.
Combining Markdown and LaTeX is a delicate operation, since some special characters are shared by the two languages, leading to conflicts.
The LaTeX code that needs to be rendered is specified in the text cell using delimiters. In particular, the following delimiters are supported:
1. `$`
2. `$$`
3. `\begin` and `\end`
4. `\(` and `\)`

A note on the last one: it's necessary to write `\\(` instead of `\(`, because in the latter case Markdown interprets the backslash as an escape character for the parenthesis.

There is a mechanism in place to keep the LaTeX code from being interpreted as Markdown. It works as follows:
1. The user writes some text in a cell, for example `*abc* $x_1 = 1, x_2 = 2$ _def_`, and clicks execute cell.
2. `remove_math` in `notebook/static/notebook/js/mathjaxutils.js` is used to extract the LaTeX groups from the text, put them in a separate array, and replace the groups in the text with placeholders. In this case, for example, the function will return `['*abc* @@0@@ _def_', ['$x_1 = 1, x_2 = 2$']]`.

This procedure is necessary because otherwise, in this example, the underscores in the LaTeX group would be interpreted as italic delimiters.
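The placeholder mechanism can be illustrated with a minimal sketch. This is not the actual `remove_math` code: the function names `removeMathSketch` and `replaceMathSketch` are hypothetical, and the sketch assumes only `$...$` and `$$...$$` groups, matched with a single regular expression.

```javascript
// Simplified sketch, NOT the real remove_math from mathjaxutils.js.
// Assumption: only $...$ and $$...$$ groups are handled.
function removeMathSketch(text) {
    const math = [];  // extracted LaTeX groups, in order of appearance
    // Try $$...$$ first, then $...$ (both non-greedy).
    const stripped = text.replace(/\$\$[\s\S]+?\$\$|\$[^$]+?\$/g, function (group) {
        math.push(group);
        // Leave a numbered placeholder that Markdown will not touch.
        return '@@' + (math.length - 1) + '@@';
    });
    return [stripped, math];
}

// Put the stored groups back in place of the placeholders
// (in the notebook this happens after Markdown has been rendered).
function replaceMathSketch(text, math) {
    return text.replace(/@@(\d+)@@/g, function (match, index) {
        return math[Number(index)];
    });
}

const result = removeMathSketch('*abc* $x_1 = 1, x_2 = 2$ _def_');
console.log(JSON.stringify(result));
// ["*abc* @@0@@ _def_",["$x_1 = 1, x_2 = 2$"]]
```

Since the underscores are no longer in the text that Markdown sees, they cannot be mistaken for italic delimiters.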
The core of the problem is that the function `remove_math` extracts from the text only the groups delimited by 1, 2 and 3 in the list above, but not 4 (`\(` and `\)`). In fact, it's easy to see that in the third line of the example the underscores were interpreted as italic delimiters by Markdown.
Fix
The changes are made to the `notebook/static/notebook/js/mathjaxutils.js` file.

The `remove_math` function contains some logic to identify the LaTeX blocks and extract them. The first step in this procedure is to split the text on all the possible group delimiters. This is done using the `MATHSPLIT` regular expression defined on line 62. As the comments in the code say, it's a bit "magical", in the sense that its workings are not crystal clear.

The `\\(` and `\\)` delimiters were missing from the regular expression, so I added them by appending `\\\\(?:\(|\)))` to it. Moreover, since the regular expression was already matching the text `\\` as a delimiter, I had to remove that alternative (otherwise the block `\\(` would not be grouped together and we would not be able to identify it as a group delimiter). The purpose of splitting on the `\\`s is not clear to me, since we're only looking for LaTeX group delimiters and `\\` is not one. I suspect it is the result of a blind copy & paste from somewhere else, since the comments cite different sources for the code.

After being split, the text is processed by running over each of the blocks and looking for start and end LaTeX delimiters. On line 181 I added the missing logic to handle the case in which the text `\\(` is the start delimiter and the text `\\)` is the end delimiter.

The last change is on line 208. It is necessary because the LaTeX code is extracted, backed up, and reinserted in the text after the Markdown is rendered, so the `\\(` that was written for Markdown's benefit is never interpreted by it (which would have resulted in `\(`). We therefore have to manually replace the instances of `\\(` and `\\)` with `\(` and `\)` respectively, which are the delimiters recognized by MathJax.

Thoughts?
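For reference, the handling of the `\\(` and `\\)` delimiters described above can be sketched in isolation. This is only an illustration, not the patched code: `ESCAPED_PAREN_GROUP` and `removeEscapedParenMath` are hypothetical names, the sketch uses one dedicated regular expression instead of extending `MATHSPLIT`, and it converts the delimiters at extraction time, which is a simplification of where the actual change happens.

```javascript
// Simplified sketch, NOT the patched code from mathjaxutils.js.
// In the raw cell text each delimiter is two backslashes followed by a
// parenthesis, so every backslash is escaped in the regex literal.
const ESCAPED_PAREN_GROUP = /\\\\\([\s\S]+?\\\\\)/g;

function removeEscapedParenMath(text) {
    const math = [];
    const stripped = text.replace(ESCAPED_PAREN_GROUP, function (group) {
        // Markdown never sees this group, so it will not turn \\( into \(.
        // Do that conversion by hand so MathJax finds \( and \).
        const forMathJax = group
            .replace(/^\\\\\(/, '\\(')
            .replace(/\\\\\)$/, '\\)');
        math.push(forMathJax);
        return '@@' + (math.length - 1) + '@@';
    });
    return [stripped, math];
}

// String.raw keeps the backslashes exactly as the user typed them.
const cellText = String.raw`*abc* \\(x_1 = 1, x_2 = 2\\) _def_`;
const [stripped, math] = removeEscapedParenMath(cellText);
console.log(stripped);  // *abc* @@0@@ _def_
console.log(math[0]);   // \(x_1 = 1, x_2 = 2\)
```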