[Lens] Improve performance for large formulas #141456

Merged: 2 commits, Sep 23, 2022

@dej611 (Contributor) commented Sep 22, 2022

Summary

Fixes #140875

Formulas are parsed multiple times during their lifecycle before rendering, and the expression parsing step turns out to be very slow:

[Image: tinymath_performance_explanation]

Note in particular the aggregated time of 5.96 seconds (!) for a basic formula like this:

sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes)

As noted in the linked issue, that formula is rewritten as this new formula:

add(add(add(add(add(add(add(add(add(add(add(add(add(add(add(add("3e090ea6-3faa-4941-902d-19bed42ead83X0","3e090ea6-3faa-4941-902d-19bed42ead83X1"),"3e090ea6-3faa-4941-902d-19bed42ead83X2"),"3e090ea6-3faa-4941-902d-19bed42ead83X3"),"3e090ea6-3faa-4941-902d-19bed42ead83X4"),"3e090ea6-3faa-4941-902d-19bed42ead83X5"),"3e090ea6-3faa-4941-902d-19bed42ead83X6"),"3e090ea6-3faa-4941-902d-19bed42ead83X7"),"3e090ea6-3faa-4941-902d-19bed42ead83X8"),"3e090ea6-3faa-4941-902d-19bed42ead83X9"),"3e090ea6-3faa-4941-902d-19bed42ead83X10"),"3e090ea6-3faa-4941-902d-19bed42ead83X11"),"3e090ea6-3faa-4941-902d-19bed42ead83X12"),"3e090ea6-3faa-4941-902d-19bed42ead83X13"),"3e090ea6-3faa-4941-902d-19bed42ead83X14"),"3e090ea6-3faa-4941-902d-19bed42ead83X15"),"3e090ea6-3faa-4941-902d-19bed42ead83X16")
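
For illustration, this kind of nesting is what falls out of serializing a left-associative chain of binary operations in function form. A minimal TypeScript sketch, assuming a hypothetical node shape (FormulaNode) and hypothetical helper names; this is not the actual Lens or tinymath code:

```typescript
// Hypothetical AST shape for illustration; not the actual tinymath/Lens types.
type FormulaNode =
  | string // placeholder id of an already-extracted operation, e.g. "X0"
  | { fn: 'add' | 'subtract' | 'multiply' | 'divide'; args: [FormulaNode, FormulaNode] };

// Serialize using the long function form, e.g. add("X0","X1").
function toFunctionForm(node: FormulaNode): string {
  if (typeof node === 'string') return JSON.stringify(node);
  return `${node.fn}(${node.args.map((arg) => toFunctionForm(arg)).join(',')})`;
}

// Build a left-associative chain of additions: ((first + a) + b) + ...
function leftAddChain(first: string, ...rest: string[]): FormulaNode {
  return rest.reduce<FormulaNode>((acc, id) => ({ fn: 'add', args: [acc, id] }), first);
}

// Each extra operand adds one more level of nesting to the output.
const functionFormExample = toFunctionForm(leftAddChain('X0', 'X1', 'X2'));
// functionFormExample === 'add(add("X0","X1"),"X2")'
```

With 17 operands, as in the formula above, the same fold yields 16 nested add() calls.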

After this rewriting, performance gets really bad, probably due to a recursion issue in the peg(gy) library used to generate the parser from the grammar, or rather due to the non-idiomatic way the tinymath grammar has been written.
Rewriting the grammar did not seem a trivial task, so this PR targets the way the formula is rewritten instead.

The first idea was to "flatten" the rewritten formula, but only the add function accepts more than 2 arguments (i.e. add(arg1, arg2, arg3, arg4, ...)). The intuition is that the timings of the first parse, on the original formula written with operators, were pretty good, so why not reuse the same technique here?
So this PR uses the +, -, * and / math operators in the rewrite instead of their long function versions, which proved to be way faster: in fact, the grammar is highly optimized to parse those 4 symbols compared with their function counterparts.

The new implementation is now in line with the Monaco editor performance, with a few ms of parsing time (see the overall timing of ~15 ms):

[Image: Screenshot 2022-09-22 at 16 44 46]

Taking as example the previous formula:

sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes) + sum(bytes)

The new rewrite will be something like:

(((((((((((((((("c87d5ed7-39f7-46f0-b81c-f04938586c17X0" + "c87d5ed7-39f7-46f0-b81c-f04938586c17X1") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X2") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X3") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X4") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X5") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X6") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X7") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X8") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X9") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X10") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X11") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X12") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X13") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X14") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X15") + "c87d5ed7-39f7-46f0-b81c-f04938586c17X16")

Round brackets in the new rewrite preserve the original explicit grouping (e.g. sum(bytes) + (sum(bytes) - sum(bytes)) / (sum(bytes) - sum(bytes) * sum(bytes))) and avoid operator-precedence issues.
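
A minimal sketch of this bracketed serialization, assuming a hypothetical binary-tree shape (MathNode) and helper name (toInfix) that are not taken from the Lens code. Every binary operation is wrapped in round brackets, so the output reflects the grouping of the tree without relying on operator precedence:

```typescript
// Hypothetical AST shape for illustration; not the actual tinymath/Lens types.
type MathNode =
  | string // placeholder id of an already-extracted operation, e.g. "X0"
  | { op: '+' | '-' | '*' | '/'; left: MathNode; right: MathNode };

// Emit infix notation, wrapping every binary operation in round brackets.
function toInfix(node: MathNode): string {
  if (typeof node === 'string') return JSON.stringify(node);
  return `(${toInfix(node.left)} ${node.op} ${toInfix(node.right)})`;
}

// X0 + (X1 - X2) / (X3 - X4 * X5), with the grouping made explicit in the tree.
const groupedFormula: MathNode = {
  op: '+',
  left: 'X0',
  right: {
    op: '/',
    left: { op: '-', left: 'X1', right: 'X2' },
    right: { op: '-', left: 'X3', right: { op: '*', left: 'X4', right: 'X5' } },
  },
};

const infixExample = toInfix(groupedFormula);
// infixExample === '("X0" + (("X1" - "X2") / ("X3" - ("X4" * "X5"))))'
```

Some of these brackets are redundant under normal precedence rules; the sketch trades a few extra characters for not having to reason about precedence at all.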

From the user's perspective, results now appear a few seconds earlier.

Before:

[Image: large_formula_fix_before]

After:

[Image: large_formula_fix_add]

The fix does not apply only to single-operation cases (as in the issue example), but to any mixed combination of the 4 basic operations:

[Image: large_formula_fix]

Note: this is only a client-side parsing optimization. For further optimization of formula requests, see #140859.

Note 2: this fix is not a general solution for the nesting performance issue, which has to be tackled at the grammar level. A formula like the following will still suffer from performance issues: pick_min(pick_max(abs(sqrt(log(fix(floor(sum(bytes)))))), 5), 10)

@dej611 added the Team:Visualizations (Visualization editors, elastic-charts and infrastructure), Feature:Lens, and v8.6.0 labels Sep 22, 2022
@kibana-ci (Collaborator)

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| lens | 1.2MB | 1.2MB | +107.0B |

@dej611 changed the title from "[Lens] Fix performance issue for large formulas" to "[Lens] Improve performance for large formulas" Sep 22, 2022
@dej611 marked this pull request as ready for review September 22, 2022 16:13
@dej611 requested a review from a team as a code owner September 22, 2022 16:13
@elasticmachine (Contributor)

Pinging @elastic/kibana-vis-editors @elastic/kibana-vis-editors-external (Team:VisEditors)

@flash1293 (Contributor) left a comment

Tested and this works perfectly fine, LGTM 👍

@flash1293 (Contributor)

> this fix is not a general solution for nesting performance issue, which has to be tackled at grammar level. A formula like the following will still suffer of performance issues: pick_min(pick_max(abs(sqrt(log(fix(floor(sum(bytes)))))), 5), 10)

True, but I think in practice this kind of formula is very rare, while the optimized ones are very realistic.

Labels
backport:skip (This commit does not require backporting), Feature:Lens, release_note:enhancement, Team:Visualizations (Visualization editors, elastic-charts and infrastructure), v8.6.0
Development

Successfully merging this pull request may close these issues.

[Lens] Poor performance for large formulas
5 participants