unparse.js: don't excessively recompute depth cache for massive speedup #642

Open · wants to merge 1 commit into base: master
Conversation

@seansfkelley commented Aug 21, 2023

I noticed that the unparse script was brutally slow for even modestly sized grammars and modest depths. I would typically see 5-10s generation times for a grammar with 30-40 rules and depth 14.

min_depths_rule appears to be a memoizing cache, and it also seems to be cleared unnecessarily often. The function that populates it, min_depth_cache, depends only on rules (immutable), i (the rule index), visited (always passed as []), and the min_depths_rule cache itself (always reset to [] before each call). Thus, as used, every call to min_depth_cache returns the same value for the same i.
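
To make the call pattern concrete, here is a minimal runnable sketch. It assumes a simplified rule representation (a list of alternatives per rule, each alternative listing the rule indices it expands to); the real unparse.js structures and depth logic differ in detail:

```js
// Hypothetical reduction of the pattern described above, not the real
// unparse.js data model. Each rule is a list of alternatives; each
// alternative lists the rule indices it expands to (terminals omitted).
const rules = [
  [[1, 2], []], // rule 0 -> rule 1, rule 2 | (empty)
  [[2]],        // rule 1 -> rule 2
  [[0]],        // rule 2 -> rule 0 (a cycle, to exercise `visited`)
];

// Memoized depth walk. Given an empty `visited` and a fresh `cache`,
// the result is a pure function of `i`.
function min_depth_cache(rules, i, visited, cache) {
  if (cache[i] !== undefined) return cache[i];
  if (visited.includes(i)) return Infinity; // cut off recursive expansion
  visited.push(i);
  let best = Infinity;
  for (const alternative of rules[i]) {
    let depth = 1;
    for (const j of alternative) {
      depth = Math.max(depth, 1 + min_depth_cache(rules, j, visited, cache));
    }
    best = Math.min(best, depth);
  }
  visited.pop();
  return (cache[i] = best);
}

// The pre-fix call pattern: min_depths_rule is reset to [] before every
// query, so identical answers are recomputed from scratch each time.
function min_depth(rules, i) {
  const min_depths_rule = [];
  return min_depth_cache(rules, i, [], min_depths_rule);
}
```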

This commit adds an extra layer of caching, coupled with precomputation, to avoid calculating the min depth for each rule repeatedly. This results in a drop from 5-10s to a few milliseconds without any apparent change in result quality.

An earlier version of this change simply repurposed min_depths_rule as the precomputation cache by never clearing it. However, this subtly changed the results, which I eventually traced to the observation that this cache gets populated with different values depending on which rule's min depth is currently being calculated. That is, the min depth recorded for a rule depends on context, namely the "root" rule the calculation started from, so this cache should not be reused across rules.
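
The shape of the fix, sketched against the toy min_depth_cache above (the actual diff in unparse.js differs in detail): compute each rule's depth once up front, resetting the intermediate cache between rules since its entries are context-dependent:

```js
// Sketch of the fix's shape, reusing the toy min_depth_cache above. The
// top-level answer for each root rule is stable, so compute it once; the
// intermediate cache is passed in fresh per rule precisely because its
// entries vary with which root rule the walk started from.
const min_depth_rule_cache = [];

function precompute_min_depths(rules) {
  for (let i = 0; i < rules.length; i++) {
    min_depth_rule_cache[i] = min_depth_cache(rules, i, [], []);
  }
}

// During unparsing, depth queries then become O(1) lookups:
precompute_min_depths(rules);
console.log(min_depth_rule_cache); // should print [1, 3, 2] for the toy grammar
```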

Caveat: I don't really understand the unparsing algorithm at the conceptual level. My claim for the correctness of this change rests purely on the manual code analysis explained above.

seansfkelley added a commit to seansfkelley/seansfkelley.github.io that referenced this pull request Aug 21, 2023
@seansfkelley changed the title from "unparse.js: don't clear rule depth cache for massive speedup" to "unparse.js: don't excessively recompute depth cache for massive speedup" on Aug 27, 2023
The precomputed values for each rule's min depth should never change
over the lifetime of an unparse, so compute them up front instead.
Precomputing these values yields a speedup of multiple orders of
magnitude.

Note that there are two levels of caching here: the precomputed values
never change, but there is also an intermediate depth-value cache which
gets populated with differing values depending on which rule the
calculation started from. This latter cache is cleared before
precomputing each rule.

This code could be more time-efficient if it lazily computed the min
depth for each rule on demand, but I suspect any nontrivial depth will
hit almost every rule anyway, and I wanted to keep the code diff small
and avoid adding more spaghetti.
@seansfkelley (Author)

I'd be happy to restructure this code to avoid the kinda-fishy dependency of the function on min_depth_rule_cache from the outer scope, if you'd like, but for the time being I opted to keep the diff small and simple instead of undertaking a larger rewrite.
