-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Improve MathJax Support #626
Conversation
First commit in a branch dedicated to improving MathJax support. See issue rust-lang#400.
We differentiate between inline and block. And the legacy style and the intended style.
This defers to Mathematics for a replacement which is not implemented yet.
This is so we can strip the delimiters from the mathematics text
@Michael-F-Bryan The bulk of the work for this pull-request is done. There are a few points left to do, but maybe you want to take a look to get a feel for the overall quality of the work. I have taken the This still can't be merged for the following things need to be addressed for sure.
Furthermore, maybe squash the commits down to a single commit? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a couple comments about points we may want to spend more time on, but they mostly come under the dot-points you've already mentioned (e.g. test cases and checking the regex).
One thing I think we should do is add a bunch of mathjax and non-mathjax expressions to the "dummy" book then add tests to tests/rendered_output.rs
to make sure the appropriate <div class="math">$...$</div>
tag is rendered.
|
||
fn find_mathematics(content: &str) -> MathematicsIterator { | ||
lazy_static! { | ||
static ref REGEXP: Regex = Regex::new(r"(?x) # insignificant whitespace mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to split this regex up and spell out what it's doing! Regular expressions are usually unreadable, so when you mentioned you weren't too confident with the regex I was a little afraid of what I'd see 😜
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I has been said that regular expressions are a write only tool. I am pretty confident with them, but I know for a fact that it currently is incorrect.
src/preprocess/mathjax.rs
Outdated
match delimiter.as_str() { | ||
"$$" => Kind::Block, | ||
"$" => Kind::Inline, | ||
"\\\\[" => Kind::LegacyBlock, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can always use a raw string here, r"\\["
to avoid escaping backslashes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a very good suggestion. I wasn't aware of raw strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
src/preprocess/mathjax.rs
Outdated
start_index: m.start(), | ||
end_index: m.end(), | ||
kind: kind, | ||
text: kind.text(m.as_str()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the text
field be the body of the mathjax expression, ignoring the surrounding $
's? I've never used mathjax, so I don't know what form it wants things in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For mathjax to work properly it needs the surrounding dollar signs. I strip them here so we can put them in when we do text replacement. This way we can treat the intended and the legacy case the same. ✔️
src/preprocess/mathjax.rs
Outdated
|
||
#[test] | ||
fn should_find_legacy_inline_mathematics() { | ||
let content = "Pythagorean theorem: \\\\(a^{2} + b^{2} = c^{2}\\\\)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to use raw strings here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
src/preprocess/mathjax.rs
Outdated
let mut replacement = String::new(); | ||
match self.kind { | ||
Kind::Block | Kind::LegacyBlock => { | ||
replacement.push_str("<div class=\"math\">$$"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These match
arms can probably be turned into format!()
calls:
format!(r#"<div class="math">$${}$$</div>"#, self.text)
... Or do you think the incremental string building is more readable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think I prefer the format!
calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
src/preprocess/mathjax.rs
Outdated
} | ||
|
||
impl Kind { | ||
fn text<'a>(&self, delimited_text: &'a str) -> &'a str { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have thought this method belongs on the Mathematics
type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I placed it on Kind
because I needed the body of the delimited text when I construct a Mathematics
structure. This could well be done in a constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opted for a auxiliary function with the same functionality. ✔️
} | ||
|
||
#[cfg(test)] | ||
mod tests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you've mentioned, we probably want to flesh these test out a bit. Off the top of my head we probably want to check
- Strings with just one
$
- What happens when you have a newline between delimiters (e.g.
"$foo\nbar$
) - Is there a limit to the length of an expression? (e.g. you have a
$
at the start of your chapter and another towards the end of the page) - What happens if I overlap delimiters? e.g.
$a^{2} + b^{2} = c^{2}\\]
and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of these kinds of tests. I will add them in.
I will work on your suggestion and the ones I already had in mind. I have added the One further question. When the work for this issue is done, would you prefer a squashed commit? Or do you want to keep the history? |
I am starting to incorporate the mathjax pre-processor into the system and we need to discuss how, because there are several options. In Mathjax is only relevant for the HTML renderer and a In order to enable the mathjax pre-processor I see the following options. If a user wants mathjax support
Maybe other options are possible as well. @Michael-F-Bryan How would you like me to proceed? |
I usually merge PRs with the "squash and merge" button because it leads to a much cleaner git history. GitHub will do the squashing for us, so you shouldn't need to do anything.
This is one of the things I was referring to in #629 (comment) when I said:
I was thinking a user would add the Or expressed as code: // when generating the book
for renderer in &self.renderers {
let mut book = self.book.clone();
let name = renderer.name();
for preprocessor in &self.preprocessors {
let ctx = /* ... */;
preprocessor.run(ctx, name, &mut book)?;
}
renderer.render(...)?;
}
// in the MathJax preprocessor
const RENDERER_WHITELIST: &[&str] = &["html"];
impl Preprocessor for MathJax {
fn run(&self, ctx: &PreprocessorContext, renderer: &'static str, book: &mut Book) -> Result<()> {
if !RENDERER_WHITELIST.contains(renderer) {
return Ok(()); // there's nothing to do. Bail early.
}
/* do usual preprocessing */
}
} This is probably going to be the most scalable approach because we aren't using an ad-hoc side-channel for communicating whether the preprocessor should run. It does allow you to couple preprocessors to specific renderers, but in this case I'd say the MathJax preprocessor is already innately coupled to the HTML renderer so it's not necessarily a problem. If a preprocessor doesn't really care which renderer it's being run for (e.g. the current link rewriters), it can always ignore the @dvberkel sorry for the wall of text! It's a non-trivial problem and I can imagine as we start using preprocessors for more stuff we'll need to find a more generic solution. I'm kinda hesitant to add custom logic to the |
@Michael-F-Bryan Don't worry about the wall of text. It conveys your point very well. I would rather want to read more, think and do a well-informed job then a rush to a mediocre solution. Is what you describe already in place? Or is it something that needs to be done? |
It's one of the ideas I've been thinking about, but nothing is implemented yet. I started #631 so we have a place to discuss these sorts of things. Of itself, introducing this For example what do we do if a user wants to say "run the MathJax preprocessor when building an EPUB"? With this system the preprocessor's whitelist would be hard-coded and wouldn't allow that. A preprocessor could add a setting under its table in Another option is to let e.g. the user could manually specify which renderer to run the MathJax preprocessor for: [preprocessor.mathjax]
renderers = ["html", "epub"] And if the |
@Michael-F-Bryan I was away for a while. My mom died and I needed to take care of things. What is the best way forward? should we first decide on a #631? It seems like the best option. |
Hi @dvberkel I'm really sorry to hear about that. I'm thinking we should just make a decision on how renderers and preprocessors should interact and that'll unblock this issue. |
@dvberkel since this PR is still WIP and it was created a while back, would be a good idea to close it. Feel free to submit a new PR and thanks for contributing |
This pull request is work in progress towards solving issue #400.
Feedback is appreciated.