WIP: Improve MathJax Support #626

dvberkel · 2018-02-17T10:34:43Z

This pull request is work in progress towards solving issue #400.

Feedback is appreciated.

First commit in a branch dedicated to improving MathJax support. See issue rust-lang#400.

We differentiate between inline and block. And the legacy style and the intended style.

This defers to Mathematics for a replacement which is not implemented yet.

This is so we can strip the delimiters from the mathematics text

dvberkel · 2018-02-20T13:52:00Z

@Michael-F-Bryan The bulk of the work for this pull-request is done. There are a few points left to do, but maybe you want to take a look to get a feel for the overall quality of the work.

I have taken the links pre-processor as inspiration and modified it where I saw fit.

This still can't be merged for the following things need to be addressed for sure.

Furthermore, maybe squash the commits down to a single commit?

Michael-F-Bryan

I've added a couple comments about points we may want to spend more time on, but they mostly come under the dot-points you've already mentioned (e.g. test cases and checking the regex).

One thing I think we should do is add a bunch of mathjax and non-mathjax expressions to the "dummy" book then add tests to tests/rendered_output.rs to make sure the appropriate <div class="math">$...$</div> tag is rendered.

Michael-F-Bryan · 2018-02-21T15:50:59Z

src/preprocess/mathjax.rs

+
+fn find_mathematics(content: &str) -> MathematicsIterator {
+    lazy_static! {
+        static ref REGEXP: Regex = Regex::new(r"(?x) # insignificant whitespace mode


Thanks for taking the time to split this regex up and spell out what it's doing! Regular expressions are usually unreadable, so when you mentioned you weren't too confident with the regex I was a little afraid of what I'd see 😜

I has been said that regular expressions are a write only tool. I am pretty confident with them, but I know for a fact that it currently is incorrect.

Michael-F-Bryan · 2018-02-21T15:52:53Z

src/preprocess/mathjax.rs

+                 match delimiter.as_str() {
+                     "$$"    => Kind::Block,
+                     "$"     => Kind::Inline,
+                     "\\\\[" => Kind::LegacyBlock,


You can always use a raw string here, r"\\[" to avoid escaping backslashes.

That is a very good suggestion. I wasn't aware of raw strings.

Michael-F-Bryan · 2018-02-21T15:54:14Z

src/preprocess/mathjax.rs

+            start_index: m.start(),
+            end_index: m.end(),
+            kind: kind,
+            text: kind.text(m.as_str()),


Should the text field be the body of the mathjax expression, ignoring the surrounding $'s? I've never used mathjax, so I don't know what form it wants things in.

For mathjax to work properly it needs the surrounding dollar signs. I strip them here so we can put them in when we do text replacement. This way we can treat the intended and the legacy case the same. ✔️

Michael-F-Bryan · 2018-02-21T15:54:55Z

src/preprocess/mathjax.rs

+
+    #[test]
+    fn should_find_legacy_inline_mathematics() {
+        let content = "Pythagorean theorem: \\\\(a^{2} + b^{2} = c^{2}\\\\)";


Might want to use raw strings here as well.

Michael-F-Bryan · 2018-02-21T15:59:49Z

src/preprocess/mathjax.rs

+        let mut replacement = String::new();
+        match self.kind {
+            Kind::Block  | Kind::LegacyBlock  => {
+                replacement.push_str("<div class=\"math\">$$");


These match arms can probably be turned into format!() calls:

format!(r#"<div class="math">$${}$$</div>"#, self.text)

... Or do you think the incremental string building is more readable?

No, I think I prefer the format! calls.

Michael-F-Bryan · 2018-02-21T16:00:48Z

src/preprocess/mathjax.rs

+}
+
+impl Kind {
+    fn text<'a>(&self, delimited_text: &'a str) -> &'a str {


I would have thought this method belongs on the Mathematics type.

I placed it on Kind because I needed the body of the delimited text when I construct a Mathematics structure. This could well be done in a constructor.

I opted for a auxiliary function with the same functionality. ✔️

Michael-F-Bryan · 2018-02-21T16:04:16Z

src/preprocess/mathjax.rs

+}
+
+#[cfg(test)]
+mod tests {


As you've mentioned, we probably want to flesh these test out a bit. Off the top of my head we probably want to check

Strings with just one $

What happens when you have a newline between delimiters (e.g. "$foo\nbar$)

Is there a limit to the length of an expression? (e.g. you have a $ at the start of your chapter and another towards the end of the page)

What happens if I overlap delimiters? e.g. $a^{2} + b^{2} = c^{2}\\] and so on.

I was thinking of these kinds of tests. I will add them in.

dvberkel · 2018-02-21T17:34:28Z

I will work on your suggestion and the ones I already had in mind. I have added the tests/rendered_output.rs as a separate bullet point.

One further question. When the work for this issue is done, would you prefer a squashed commit? Or do you want to keep the history?

dvberkel · 2018-03-05T07:08:04Z

I am starting to incorporate the mathjax pre-processor into the system and we need to discuss how, because there are several options.

In src/book/mod.rs the determine_preprocessors function returns a list of pre-processors. It does this by a lookup on the name of the pre-processor configured in the config.rs BuildConfig.

Mathjax is only relevant for the HTML renderer and a mathjax_support option can be found in the HtmlConfig.

In order to enable the mathjax pre-processor I see the following options. If a user wants mathjax support

She should add the mathjax key in the pre-processors in the build configuration and set mathjax_support to true in the HTML configuration.
She should set the mathjax_support to true in the HTML configuration. This automatically add the mathjax pre-processor to the list of pre-processors.

Maybe other options are possible as well.

@Michael-F-Bryan How would you like me to proceed?

Michael-F-Bryan · 2018-03-05T07:58:49Z

When the work for this issue is done, would you prefer a squashed commit? Or do you want to keep the history?

I usually merge PRs with the "squash and merge" button because it leads to a much cleaner git history. GitHub will do the squashing for us, so you shouldn't need to do anything.

Mathjax is only relevant for the HTML renderer and a mathjax_support option can be found in the HtmlConfig.

This is one of the things I was referring to in #629 (comment) when I said:

... I'm stumped on a couple things, for instance some preprocessors will only be relevant when combined with a specific backend ...

I was thinking a user would add the mathjax preprocessor in their book.toml (or we'd add it as part of the default list if the user doesn't specify any preprocessors) and then for each backend we'll run the entire preprocessor pipeline independently. You can then tell each preprocessor which renderer it's being run for (the Renderer trait requires a name() method), and it then becomes the preprocessor's responsibility to change its behaviour (e.g. run or return early) based on that.

Or expressed as code:

// when generating the book

for renderer in &self.renderers {
  let mut book = self.book.clone();
  let name = renderer.name();
  
  for preprocessor in &self.preprocessors {
    let ctx = /* ... */;
    preprocessor.run(ctx, name, &mut book)?;
  }

  renderer.render(...)?;
}

// in the MathJax preprocessor

const RENDERER_WHITELIST: &[&str] = &["html"];

impl Preprocessor for MathJax {
  fn run(&self, ctx: &PreprocessorContext, renderer: &'static str, book: &mut Book) -> Result<()> {
    if !RENDERER_WHITELIST.contains(renderer) {
      return Ok(());  // there's nothing to do. Bail early.
    }

    /* do usual preprocessing */
  }
}

This is probably going to be the most scalable approach because we aren't using an ad-hoc side-channel for communicating whether the preprocessor should run. It does allow you to couple preprocessors to specific renderers, but in this case I'd say the MathJax preprocessor is already innately coupled to the HTML renderer so it's not necessarily a problem. If a preprocessor doesn't really care which renderer it's being run for (e.g. the current link rewriters), it can always ignore the renderer argument.

@dvberkel sorry for the wall of text! It's a non-trivial problem and I can imagine as we start using preprocessors for more stuff we'll need to find a more generic solution. I'm kinda hesitant to add custom logic to the determine_preprocessor() function (your second point) because that couples it to the HTML renderer and won't work for 3rd party preprocessors.

dvberkel · 2018-03-05T08:16:48Z

@Michael-F-Bryan Don't worry about the wall of text. It conveys your point very well. I would rather want to read more, think and do a well-informed job then a rush to a mediocre solution.

Is what you describe already in place? Or is it something that needs to be done?

Michael-F-Bryan · 2018-03-05T08:41:27Z

Is what you describe already in place? Or is it something that needs to be done?

It's one of the ideas I've been thinking about, but nothing is implemented yet. I started #631 so we have a place to discuss these sorts of things.

Of itself, introducing this renderer argument to Preprocessor::run() is a fairly trivial change to make. I just want to hear more people's opinions before implementing anything.

For example what do we do if a user wants to say "run the MathJax preprocessor when building an EPUB"? With this system the preprocessor's whitelist would be hard-coded and wouldn't allow that. A preprocessor could add a setting under its table in book.toml (e.g. renderers = ["html", "epub"]), but then we're just swapping one ad-hoc side-channel for another.

Another option is to let mdbook decide whether to run a preprocessor using the previous renderers = ... setting, falling back to some sort of hint from the preprocessor if it's not available.

e.g. the user could manually specify which renderer to run the MathJax preprocessor for:

[preprocessor.mathjax]
renderers = ["html", "epub"]

And if the renderers key isn't provided, we use a fallback, probably by adding a fn hint_renderer_is_supported(&self, render: &str) -> bool method to the Preprocessor trait.

dvberkel · 2018-03-26T08:39:59Z

@Michael-F-Bryan I was away for a while. My mom died and I needed to take care of things.

What is the best way forward? should we first decide on a #631? It seems like the best option.

Michael-F-Bryan · 2018-04-02T14:08:24Z

Hi @dvberkel I'm really sorry to hear about that.

I'm thinking we should just make a decision on how renderers and preprocessors should interact and that'll unblock this issue.

Dylan-DPC-zz · 2019-05-07T22:36:46Z

@dvberkel since this PR is still WIP and it was created a while back, would be a good idea to close it. Feel free to submit a new PR and thanks for contributing

dvberkel added 14 commits February 17, 2018 11:29

WIP: Improve MathJax Support

2f89760

First commit in a branch dedicated to improving MathJax support. See issue rust-lang#400.

Create a skeletal NOOP implementation of MathJaxPreprocessor

643edbe

Iterate over section without replacing mathematics

7d0a1f1

Create a skeletal find_mathematics function

16e22e1

Describe different kind of mathematics

7657044

We differentiate between inline and block. And the legacy style and the intended style.

Implement find_mathemtics loop inside replace_all

b4f77eb

This defers to Mathematics for a replacement which is not implemented yet.

Implement find_mathematics for inline maths

784b814

Include mathematics text

4aa0df1

Recognize both inline as block mathematics

534dfb4

Recognize both types of legacy mathematics

2a05964

Capture mathematics text in regular expression

92259f1

Reveal kind of mathematics earlier

e5be09d

This is so we can strip the delimiters from the mathematics text

Remove delimiters from mathematics text

121f638

Do actual replacement of text

ea4f998

Michael-F-Bryan suggested changes Feb 21, 2018

View reviewed changes

dvberkel added 9 commits February 22, 2018 06:47

Use raw strings to prevent excessive exscaping

a55e640

Use the format! macro instead of push_str

1b0abf7

Demote text method on kind to auxiliary function

e68bee5

Don't capture the mathematics text

1648126

Correct regexp for mathematics

2512b87

Test text with a single dollar sign

1871b87

Detect mathematics over multiple lines

92b5fad

Test non-matching delimiters

629c778

Run cargo fmt on preprocess/mathjax.rs

7540a3e

Michael-F-Bryan mentioned this pull request Mar 5, 2018

Next stage in the plugin architecture #631

Open

Michael-F-Bryan mentioned this pull request Apr 2, 2018

Configurable preprocessor #658

Merged

Dylan-DPC-zz closed this May 7, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Improve MathJax Support #626

WIP: Improve MathJax Support #626

dvberkel commented Feb 17, 2018

dvberkel commented Feb 20, 2018 •

edited

Loading

Michael-F-Bryan left a comment

Michael-F-Bryan Feb 21, 2018

dvberkel Feb 21, 2018

Michael-F-Bryan Feb 21, 2018

dvberkel Feb 21, 2018

dvberkel Feb 22, 2018

Michael-F-Bryan Feb 21, 2018

dvberkel Feb 21, 2018 •

edited

Loading

Michael-F-Bryan Feb 21, 2018

dvberkel Feb 21, 2018

dvberkel Feb 22, 2018

Michael-F-Bryan Feb 21, 2018

dvberkel Feb 21, 2018

dvberkel Feb 22, 2018

Michael-F-Bryan Feb 21, 2018

dvberkel Feb 21, 2018

dvberkel Feb 22, 2018

Michael-F-Bryan Feb 21, 2018

dvberkel Feb 21, 2018

dvberkel commented Feb 21, 2018

dvberkel commented Mar 5, 2018

Michael-F-Bryan commented Mar 5, 2018

dvberkel commented Mar 5, 2018

Michael-F-Bryan commented Mar 5, 2018

dvberkel commented Mar 26, 2018

Michael-F-Bryan commented Apr 2, 2018

Dylan-DPC-zz commented May 7, 2019

WIP: Improve MathJax Support #626

WIP: Improve MathJax Support #626

Conversation

dvberkel commented Feb 17, 2018

dvberkel commented Feb 20, 2018 • edited Loading

Michael-F-Bryan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dvberkel Feb 21, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dvberkel commented Feb 21, 2018

dvberkel commented Mar 5, 2018

Michael-F-Bryan commented Mar 5, 2018

dvberkel commented Mar 5, 2018

Michael-F-Bryan commented Mar 5, 2018

dvberkel commented Mar 26, 2018

Michael-F-Bryan commented Apr 2, 2018

Dylan-DPC-zz commented May 7, 2019

dvberkel commented Feb 20, 2018 •

edited

Loading

dvberkel Feb 21, 2018 •

edited

Loading