Optimizations #19

asterite · 2019-09-08T00:24:03Z

This PR optimizes Markd code. Each commit has one optimization. The most controversial one is about not indexing strings with indices like string[index] but instead using string.byte_at(index). Because Markdown is an ASCII format (well, there's unicode but the control characters like *, _, [, etc. are all ASCII) we can go byte per byte. But all optimizations here are good.

I used two benchmarks:

require "./src/markd"
require "benchmark"
require "markdown"

text = "hello world"
# text = File.read("README.md")

Benchmark.ips do |x|
  x.report("std") { Markdown.to_html(text) }
  x.report("markd") { Markd.to_html(text) }
end

The first one is the code above. The second one is using the uncommented line, where "README.md" is this repo's README.md (I actually wanted to use the "sample markdown file" mentioned in the README but it seems it doesn't exist anymore).

In the code above require "markdown" is just the std's (incomplete, buggy) markdown.

Results

Hello world

Before this PR

  std   3.49M (286.71ns) (± 2.95%)    384B/op        fastest
markd 342.15k (  2.92µs) (± 4.93%)  4.09kB/op  10.19× slower

After this PR

  std   3.59M (278.37ns) (± 0.81%)    384B/op        fastest
markd 962.57k (  1.04µs) (± 1.79%)  1.46kB/op   3.73× slower

README.md

Before this PR

  std  25.44k ( 39.31µs) (± 1.39%)  46.1kB/op        fastest
markd   2.22k (449.56µs) (± 1.48%)   408kB/op  11.44× slower

After this PR

  std  25.92k ( 38.58µs) (± 0.56%)  46.1kB/op        fastest
markd   5.22k (191.64µs) (± 1.64%)   175kB/op   4.97× slower

Conclusion

It seems time and memory went down by 3 times.

I tried to optimize more but it's harder and harder without me trying to understand more the code. But I think this is good enough: 3~4 times slower than std but without bugs and feature complete is really good! And I'm tempted to ask you to maybe move markd to the standard library so we have a really good doc generate and a good Markdown library in the standard library, but we'll have to discuss that in the Crystal repo.

Renderer#escape was each time passing a new hash instance to gsub, making it allocate more memory than neccessary. Moving the hash to a constant improves performance a bit. For `Markd.to_html("hello world")` before 350.46k ( 2.85µs) (± 6.31%) 4.09kB/op after 431.05k ( 2.32µs) (± 1.43%) 3.07kB/op

For `Markd.to_html("hello world")` before 431.05k ( 2.32µs) (± 1.43%) 3.07kB/op after 506.43k ( 1.97µs) (± 2.50%) 2.72kB/op

For `Markd.to_html("hello world")` before 506.43k ( 1.97µs) (± 2.50%) 2.72kB/op after 554.18k ( 1.80µs) (± 1.10%) 2.41kB/op

For `Markd.to_html("hello world")`: before 554.18k ( 1.80µs) (± 1.10%) 2.41kB/op after 640.45k ( 1.56µs) (± 6.70%) 2.06kB/op

This avoids initializing a Hash when not needed. For `Markd.to_html("hello world")`: before 640.45k ( 1.56µs) (± 6.70%) 2.06kB/op after 727.39k ( 1.37µs) (± 2.28%) 1.59kB/op

Array literals in tight loops allocate a lot of memory! Parsing markd's README.md: before 3.52k (283.87µs) (± 0.59%) 259kB/op after 3.66k (273.38µs) (± 0.54%) 244kB/op

For `Markd.to_html("hello world")`: before 726.36k ( 1.38µs) (± 1.86%) 1.59kB/op after 800.81k ( 1.25µs) (± 0.58%) 1.58kB/op

Markdown is an ASCII format: all control characters are ASCII so we can avoid decoding unicode characters when not needed. The old code was indexing bytes by unicode index position which is very slow because the string must be decoded each time. Here we change it to index by byte, and change all operations to do byte slices instead of unicode slices. For markd's README.md: before 3.86k (258.86µs) (± 0.52%) 242kB/op after 4.11k (243.25µs) (± 1.37%) 242kB/op

For `Markd.to_html("hello world")`: before 800.81k ( 1.25µs) (± 0.58%) 1.58kB/op after 829.33k ( 1.21µs) (±25.35%) 1.56kB/op

We can use an inline comparison or a tuple. For markd's README.md: before 4.57k (218.68µs) (± 0.64%) 198kB/op after 4.63k (215.84µs) (± 2.20%) 196kB/op

`values.each` will create an intermediary array

We can use `each_line` instead of `lines` and then iterating. For `Markd.to_html("hello world")`: before 882.84k ( 1.13µs) (± 1.46%) 1.56kB/op after 929.11k ( 1.08µs) (± 2.75%) 1.46kB/op For markd's README.md: before 4.85k (206.37µs) (± 2.24%) 187kB/op after 4.98k (200.82µs) (± 0.47%) 183kB/op

`String::Builder` is slightly faster when used for a one-time shot.

Because all escape chars are ASCII it's fine if we go byte per byte. For `Markd.to_html("hello world")`: before 942.28k ( 1.06µs) (± 5.45%) 1.46kB/op after 959.11k ( 1.04µs) (± 0.72%) 1.46kB/op For markd's README.md: before 4.97k (201.41µs) (± 0.74%) 176kB/op after 5.05k (197.95µs) (± 1.51%) 176kB/op

straight-shoota

Seems like you had some fun =)

straight-shoota · 2019-09-08T08:15:52Z

src/markd/renderers/html_renderer.cr

      if entering
        cr
        tag(tag_name, attrs(node))
        # toc(node) if @options.toc
      else
-        tag("/#{tag_name}")
+        tag(end_tag_name)


It would be even simpler to let tag add the slash when calling it with tag(tag_name, end_tag: true) similar to self_closing arg.

Good idea. Done, thanks!

For markd's README.md: before 5.14k (194.65µs) (± 1.25%) 174kB/op after 5.23k (191.14µs) (± 0.55%) 173kB/op

icyleaf

Awesome optimization that i learn from the PR, this MUST be merged ASAP.

And as my previous words, i will contribute Markd to the standard library, let's do it now 😆

asterite · 2019-09-09T10:59:51Z

Another fact: I tried benchmarking rendering markd's README.md with markd and crystal-cmark, which is just a wrapper around a C library, and these are the results:

      markd   5.18k (193.20µs) (± 3.62%)   170kB/op   2.42× slower
common_mark  12.53k ( 79.78µs) (± 4.53%)  6.48kB/op        fastest

So markd is pretty good, considering it wasn't written with optimizations in mind (I think there's still room for improvement). Just 2.42 times slower is actually pretty good. The memory consumption there isn't real because the benchmark just shows memory Crystal knows about, not memory consumed by C, so we can't compare that.

asterite added 22 commits September 7, 2019 10:31

Optimization: avoid calling gsub when not needed

66c6ea0

For `Markd.to_html("hello world")` before 431.05k ( 2.32µs) (± 1.43%) 3.07kB/op after 506.43k ( 1.97µs) (± 2.50%) 2.72kB/op

Optimization: avoid creating attributes Hash if not needed

530de4c

For `Markd.to_html("hello world")` before 506.43k ( 1.97µs) (± 2.50%) 2.72kB/op after 554.18k ( 1.80µs) (± 1.10%) 2.41kB/op

Optimization: avoid creating openers_bottom Hash if not needed

92f850e

For `Markd.to_html("hello world")`: before 554.18k ( 1.80µs) (± 1.10%) 2.41kB/op after 640.45k ( 1.56µs) (± 6.70%) 2.06kB/op

Optimization: make Node#data be lazily initialized

c096d59

This avoids initializing a Hash when not needed. For `Markd.to_html("hello world")`: before 640.45k ( 1.56µs) (± 6.70%) 2.06kB/op after 727.39k ( 1.37µs) (± 2.28%) 1.59kB/op

Optimization: extract array literal to constant

795b0ff

Array literals in tight loops allocate a lot of memory! Parsing markd's README.md: before 3.52k (283.87µs) (± 0.59%) 259kB/op after 3.66k (273.38µs) (± 0.54%) 244kB/op

Optimization: avoid double regex match

4a8e485

For `Markd.to_html("hello world")`: before 726.36k ( 1.38µs) (± 1.86%) 1.59kB/op after 800.81k ( 1.25µs) (± 0.58%) 1.58kB/op

Optimization: perform MAIN rule by hand

938ae0e

For `Markd.to_html("hello world")`: before 800.81k ( 1.25µs) (± 0.58%) 1.58kB/op after 829.33k ( 1.21µs) (±25.35%) 1.56kB/op

Optimization: avoid allocating unnecessary arrays

e3d247e

We can use an inline comparison or a tuple. For markd's README.md: before 4.57k (218.68µs) (± 0.64%) 198kB/op after 4.63k (215.84µs) (± 2.20%) 196kB/op

Optimization: use str.ends_with? instead of str[-1] ==

e451a58

Optimization: use each_value instead of values.each

ee76396

`values.each` will create an intermediary array

Optimization: avoid string interpolation for h1...h6 headings

8540dda

Optimization: avoid string interpolation for lists tag names

c0eb22f

Rename line_size to lines_size

86c700d

Optimization: use String::Builder instead of IO::Memory

fb8048d

`String::Builder` is slightly faster when used for a one-time shot.

Optimization: avoid creating Hash when not necessary

ba9d8c4

Optimization: avoid creating array when not necessary

ab2a2d1

Optimization: split(/\s+/) is the same as split

5e9d34c

Optimization: match byte_begin + bytesize is just byte_end

1f83e56

straight-shoota reviewed Sep 8, 2019

View reviewed changes

asterite added 4 commits September 8, 2019 08:56

Add end_tag to HtmlRenderer#tag

80ae327

HtmlRenderer headings can be an array

16ed2f6

Optimization: pass capacity to String.build in normalize_uri

8688ebb

Optimization: avoid gsub if not needed in decode_uri

50ee837

For markd's README.md: before 5.14k (194.65µs) (± 1.25%) 174kB/op after 5.23k (191.14µs) (± 0.55%) 173kB/op

icyleaf approved these changes Sep 9, 2019

View reviewed changes

asterite mentioned this pull request Sep 9, 2019

Improve Markdown implementation - Or: Should we separate compiler tools? crystal-lang/crystal#4613

Closed

icyleaf merged commit d6fc062 into icyleaf:master Sep 11, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimizations #19

Optimizations #19

asterite commented Sep 8, 2019

straight-shoota left a comment

straight-shoota Sep 8, 2019 •

edited

Loading

asterite Sep 8, 2019

icyleaf left a comment

asterite commented Sep 9, 2019 •

edited

Loading

Optimizations #19

Optimizations #19

Conversation

asterite commented Sep 8, 2019

Results

Hello world

Before this PR

After this PR

README.md

Before this PR

After this PR

Conclusion

straight-shoota left a comment

Choose a reason for hiding this comment

straight-shoota Sep 8, 2019 • edited Loading

Choose a reason for hiding this comment

asterite Sep 8, 2019

Choose a reason for hiding this comment

icyleaf left a comment

Choose a reason for hiding this comment

asterite commented Sep 9, 2019 • edited Loading

straight-shoota Sep 8, 2019 •

edited

Loading

asterite commented Sep 9, 2019 •

edited

Loading