Parser: Optimize JSON-attribute parsing
Alternate approach to #11355
Fixes: #11066

Parsing JSON attributes in the existing parsers can be slow when the
attribute list is very long. This is because the parser must look
ahead at each character and then backtrack while searching for the
end of the attribute section and the block-comment closer.

In this patch we're introducing an optimization that eliminates a
significant amount of memory usage and execution time when parsing
long attribute sections, by recognizing that _when inside the JSON
attribute area_ any character that _isn't_ a `}` can be consumed
without any further investigation.
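As a rough sketch of the idea (the function name and the closer-matching regex below are my own illustration, not the actual parser code): once the scanner is inside the JSON attribute area, it can jump directly from one `}` candidate to the next and pay for a lookahead only there.

```javascript
// Hypothetical sketch of the fast scan. In a block comment such as
//   <!-- wp:block {"key": "...very long JSON..."} /-->
// a character that isn't `}` can never begin the closer, so we skip
// straight to the next `}` instead of testing every position.
function findJsonAttrsEnd(input, start) {
  let at = start;
  while (true) {
    const brace = input.indexOf('}', at);
    if (brace === -1) {
      return -1; // unterminated attribute section
    }
    // Lookahead happens only at `}` candidates: does this brace
    // close the attributes right before the comment closer?
    if (/^\s*\/?-->/.test(input.slice(brace + 1))) {
      return brace;
    }
    at = brace + 1; // a `}` inside the JSON body: keep scanning
  }
}

const doc = `<!-- wp:test {"content": "${'x'.repeat(100000)}"} /-->`;
console.log(findJsonAttrsEnd(doc, doc.indexOf('{'))); // index of the closing `}`
```

Nested objects still work: an inner `}` fails the closer check and scanning simply continues. A `}` followed by `/-->` inside a JSON string would still need the string-aware handling the real parsers perform; this sketch omits that.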

I've added a test to make sure we can parse 100,000 characters inside
the JSON attributes. The default parser was fine in testing with more
than a million, but the spec parser was running out of memory. I'd
prefer to keep the tests running on all parsers, and definitely to let
our spec parser define the acceptance criteria, so I left the limit
low for now. It'd be great if we found a way to update php-pegjs so
that it could generate the code differently; until then I vote for a
weaker specification.

The default parser went from requiring hundreds of thousands of steps
and taking seconds to parse the attack input to requiring roughly 35
steps and taking a few milliseconds.
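To illustrate where that difference comes from (a sketch with invented helper names, not the patch's actual code), compare how many positions each scanning strategy has to examine on a long attribute payload:

```javascript
// Hypothetical instrumentation: count how many positions each
// strategy examines while locating the attribute closer.
function countNaiveSteps(input, start) {
  let steps = 0;
  for (let i = start; i < input.length; i++) {
    steps++; // a lookahead is attempted at every single character
    if (input[i] === '}' && /^\s*\/?-->/.test(input.slice(i + 1))) break;
  }
  return steps;
}

function countOptimizedSteps(input, start) {
  let steps = 0;
  let at = start;
  while (true) {
    const brace = input.indexOf('}', at); // jump, no per-character test
    if (brace === -1) return steps;
    steps++; // lookahead only at `}` candidates
    if (/^\s*\/?-->/.test(input.slice(brace + 1))) return steps;
    at = brace + 1;
  }
}

const doc = `<!-- wp:test {"content": "${'x'.repeat(100000)}"} /-->`;
console.log(countNaiveSteps(doc, doc.indexOf('{')));     // ~100k steps
console.log(countOptimizedSteps(doc, doc.indexOf('{'))); // 1 step
```

The optimized scan performs one closer check per `}` in the input, so its cost tracks the number of braces rather than the number of characters.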

There should be no functional changes to the parser outputs and no
breaking changes to the project. This is a performance/operational
optimization.
dmsnell committed Nov 1, 2018
1 parent b3e0e89 commit 345accb
Showing 6 changed files with 665 additions and 218 deletions.