Commit

[fix] re #331 eager scan for carriage return caused slow parsing
biojppm committed Nov 27, 2022
1 parent 64f037a commit 9082379
Showing 2 changed files with 5 additions and 7 deletions.
2 changes: 1 addition & 1 deletion changelog/0.5.0.md
@@ -60,7 +60,7 @@
- ~10x faster than `scanf()`
- ~30x-50x faster than a naive `stringstream::str()` followed by `stringstream::operator>>()`
For more details, see [the changelog for c4core 0.1.10](https://github.com/biojppm/c4core/releases/tag/v0.1.10).
-- Fix [#289](https://github.com/biojppm/rapidyaml/issues/289) - parsing of flow-style sequences had quadratic complexity, causing long parse times in ultra long lines [PR#293](https://github.com/biojppm/rapidyaml/pull/293).
+- Fix [#289](https://github.com/biojppm/rapidyaml/issues/289) and [#331](https://github.com/biojppm/rapidyaml/issues/331) - parsing of single-line flow-style sequences had quadratic complexity, causing long parse times in ultra long lines [PR#293](https://github.com/biojppm/rapidyaml/pull/293)/[PR#332](https://github.com/biojppm/rapidyaml/pull/332).
- This was due to scanning for the token `: ` before scanning for `,` or `]`, which caused line-length scans on every scalar scan. Changing the order of the checks was enough to address the quadratic complexity, and the parse times for flow-style are now in line with block-style.
- As part of this changeset, a significant number of runtime branches was eliminated by separating `Parser::_scan_scalar()` into several different `{seq,map}x{block,flow}` functions specific for each context. Expect some improvement in parse times.
- Also, on Debug builds (or assertion-enabled builds) there was a paranoid assertion calling `Tree::has_child()` in `Tree::insert_child()` that caused quadratic behavior because the assertion had linear complexity. It was replaced with a somewhat equivalent O(1) assertion.
10 changes: 4 additions & 6 deletions src/c4/yml/parse.cpp
@@ -4139,10 +4139,9 @@ csubstr Parser::_scan_squot_scalar()

// leading whitespace also needs filtering
needs_filter = needs_filter
-|| numlines > 1
+|| (numlines > 1)
|| line_is_blank
-|| (_at_line_begin() && line.begins_with(' '))
-|| (m_state->line_contents.full.last_of('\r') != csubstr::npos);
+|| (_at_line_begin() && line.begins_with(' '));

if(pos == npos)
{
@@ -4241,10 +4240,9 @@ csubstr Parser::_scan_dquot_scalar()

// leading whitespace also needs filtering
needs_filter = needs_filter
-|| numlines > 1
+|| (numlines > 1)
|| line_is_blank
-|| (_at_line_begin() && line.begins_with(' '))
-|| (m_state->line_contents.full.last_of('\r') != csubstr::npos);
+|| (_at_line_begin() && line.begins_with(' '));

if(pos == npos)
{
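Note on why the removed check was costly: `m_state->line_contents.full.last_of('\r')` searches the full current line, and it ran once per quoted scalar. On a single-line flow sequence with many quoted scalars and no `\r`, every search walks the whole line, so total work grows roughly with the square of the line length. The sketch below only illustrates that cost model. It uses `std::string_view` in place of `c4::csubstr`, and the helper names (`make_flow_seq`, `eager_cost`, `single_pass_cost`) are hypothetical and not part of rapidyaml.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: builds a single-line flow sequence ['a','a',...,'a'].
std::string make_flow_seq(std::size_t n)
{
    std::string s = "[";
    for(std::size_t i = 0; i < n; ++i)
    {
        if(i)
            s += ',';
        s += "'a'";
    }
    s += ']';
    return s;
}

// Characters examined when every scalar scan searches the FULL line for '\r',
// mirroring the shape of the removed check (assumption: one search per scalar).
std::size_t eager_cost(std::string_view line, std::size_t num_scalars)
{
    std::size_t examined = 0;
    for(std::size_t i = 0; i < num_scalars; ++i)
    {
        std::size_t pos = line.find_last_of('\r');
        // the search runs backwards from the end; with no '\r' it covers the whole line
        examined += (pos == std::string_view::npos) ? line.size() : line.size() - pos;
    }
    return examined;
}

// Baseline for comparison: every character of the line visited exactly once.
std::size_t single_pass_cost(std::string_view line)
{
    std::size_t examined = 0;
    for(char c : line)
    {
        (void)c;
        ++examined;
    }
    return examined;
}
```

Under this model, doubling the number of scalars roughly quadruples `eager_cost` while `single_pass_cost` only doubles. The commit does not replace the search with a cheaper one; it drops the check from `needs_filter` altogether.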
