Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - perf: eagerly elaborate leaf nodes in linear_combination #15599

Closed
wants to merge 8 commits into from

Conversation

kmill
Copy link
Contributor

@kmill kmill commented Aug 7, 2024

Since the arithmetic type is known from the equality goal, we can eagerly elaborate all leaf nodes in expandLinearCombo while also avoiding re-elaborating them when deciding whether they are proofs or not. This brings the elaboration of a complicated example (see tests) from 3.91s to 0.29s.

The main issue is that large expressions can be slow to elaborate. In particular, exponentiation relies on the default instance mechanism, which appears can take quadratic time to resolve. Forcing the resolution of default instances within each coefficient of the linear combination bounds the default instance problems.

This is similar to PR #15570, but it sticks with the syntax transformation paradigm.

Co-authored-by: Heather Macbeth [email protected]


Open in Gitpod

Previously, `expandLinearCombo` would throw away the terms it would elaborate when determining which ones were proofs. Now, it saves these subproblems. This brings the elaboration of a complicated example from 3.91s to 1.85s.

We further bring this down to 0.286s by controling the elaboration of `^`, forcing the exponent to fully elaborate, which lets the HPow instances specialize more quickly. This is OK since we do not expect core Lean's HomogeneousPow instances to be applicable in the usual applications of `linear_combination`.

This is similar to PR #15570, but it uses the standard expression elaborator.

Co-authored-by: Heather Macbeth <[email protected]>
Copy link

github-actions bot commented Aug 7, 2024

PR summary 69100ddaf9

Import changes for modified files

No significant changes to the import graph

Import changes for all files
Files Import difference

Declarations diff

+ Expanded

You can run this locally as follows
## summary with just the declaration names:
./scripts/declarations_diff.sh <optional_commit>

## more verbose report:
./scripts/declarations_diff.sh long <optional_commit>

The doc-module for script/declarations_diff.sh contains some details about this script.

@kmill kmill changed the title perf: improve elaboration for linear_combination perf: improve ^ elaboration for linear_combination Aug 7, 2024
@hrmacbeth
Copy link
Member

This speedup is confusing to me, because, if I understand correctly, it's not just about the ^ elaboration! Case in point: in this branch
https://github.com/leanprover-community/mathlib4/tree/HM-linear-combination-variant
I tried to make the diff of this PR more minimal, by throwing away the Expanded data structure and using Option as before. In that variant, even using the tweaks to ^ elaboration, the big example still takes 58,000 heartbeats. So how is the Expanded structure helping?

@kmill
Copy link
Contributor Author

kmill commented Aug 7, 2024

@hrmacbeth I think the deal is that in your branch in the exponentiation case, it's only committing to the fully-elaborated exponent if the base of the exponent is a proof. Mine commits even if the base isn't a proof.

To spell out my understanding of the problem a bit more, when you have x ^ 2 for example, it might need to go through multiple rounds of default instances until it commits to 2 : Nat, and from there it's immediate that there's a Pow _ Nat instance since fields are monoids. The thing to keep in mind here is that the term elaborator state contains a big list of pending instance problems, and the elaborator is not very smart about how it makes progress: it iteratively goes through this entire list finding any that can make progress. Default instances only apply once it's gone through the entire list (and failed), and then once it finds any default instance, it returns to the beginning of the list and starts iterating again. This can have quadratic behavior (or worse?).

With my version of the code, we fully elaborate 2, which defaults to Nat fairly quickly. This prevents it from ever being added to the main list of instance problems, and we can see it turns out to have a significant effect.

@hrmacbeth
Copy link
Member

Thanks, that's instructive. So here's another version to consider: don't special-case exponentiation, just pre-elaborate every leaf term (noting that we know the expected type, because we can take it from the goal).
https://github.com/leanprover-community/mathlib4/tree/HM-linear-combination-variant-2

This is a little slower, 8,374 rather than 7,763 heartbeats, but not by much. And perhaps it might also avoid similar problems in situations other than exponentiation?

@kmill kmill changed the title perf: improve ^ elaboration for linear_combination perf: eagerly elaborate leaf nodes in linear_combination Aug 8, 2024
@kmill
Copy link
Contributor Author

kmill commented Aug 8, 2024

@hrmacbeth Good idea. I've updated the PR, and managed to re-use the elaboration in leaf nodes to bring down the heartbeats further. Let's see if mathlib builds!

@hrmacbeth
Copy link
Member

hrmacbeth commented Aug 8, 2024

By the way, in the long term I'm not in favour of the "exponentiation" capability this PR adds (like linear_combination h ^ 3) -- eventually I'd advocate for restricting the capabilities of linear_combination to only genuine linear operations, which would also involve disabling other capabilities Mario added in the port, like linear_combination h1 * h2 and linear_combination 5 / h.

Short version of my rationale: the congr term elaborator already provides support for arbitrary operations, but the particular way hypotheses are combined in linear_combination (apply the operation, move everything to LHS, convert, then normalize) becomes much less canonical/predictable for non-linear operations.

I'm happy for this PR to be merged as-is ... I can open another PR later to allow that discussion to happen separately.

@kmill
Copy link
Contributor Author

kmill commented Aug 8, 2024

@hrmacbeth This is so close to just generating expressions instead of going through macros, so I implemented a version that calls elaboration functions directly to do this, and for the big example it just saves 300 heartbeats, without measurably taking any less time. It adds a lot of complexity for no reason, so I'm not pushing it.

I think this proves that the fundamental slowdown is really just from instance synthesis and bounding the elaboration problems rather than using a macro-based approach. I benchmarked your Qq version too, and it takes 0.29s on my computer, with 7500 heartbeats. What do you think? Should we go with this version? Or go with the Qq one, and hopefully fix how it elaborates expressions so that you don't need to change any of mathlib?

@hrmacbeth
Copy link
Member

What do you think? Should we go with this version? Or go with the Qq one, and hopefully fix how it elaborates expressions so that you don't need to change any of mathlib?

@kmill This has been a very interesting exploration for me to follow, but I don't feel equipped to judge between the two implementations we have reached: the crude metric of speed is the only one I know for judging meta code, and they seem to be the same on this. So, your call -- do you think one version has an edge over the other in simplicity or in maintainability?

@kmill
Copy link
Contributor Author

kmill commented Aug 8, 2024

@hrmacbeth I think that this non-Qq version might be easier to maintain, and it has the bonus of being a drop-in replacement. I still don't completely feel like I have a complete handle on the issue... but I think we've pretty well localized it to elaborating large arithmetic expressions being slow, and the withSynthesize in expandLinearCombo really is the performance fix here.

@hrmacbeth
Copy link
Member

Sounds good to me! You made me a co-author on this, so I won't merge directly :)

maintainer merge

Copy link

github-actions bot commented Aug 8, 2024

🚀 Pull request has been placed on the maintainer queue by hrmacbeth.

@kim-em
Copy link
Contributor

kim-em commented Aug 9, 2024

bors merge

@github-actions github-actions bot added the ready-to-merge This PR has been sent to bors. label Aug 9, 2024
mathlib-bors bot pushed a commit that referenced this pull request Aug 9, 2024
Since the arithmetic type is known from the equality goal, we can eagerly elaborate all leaf nodes in `expandLinearCombo` while also avoiding re-elaborating them when deciding whether they are proofs or not. This brings the elaboration of a complicated example (see tests) from 3.91s to 0.29s.

The main issue is that large expressions can be slow to elaborate. In particular, exponentiation relies on the default instance mechanism, which appears can take quadratic time to resolve. Forcing the resolution of default instances within each coefficient of the linear combination bounds the default instance problems.

This is similar to PR #15570, but it sticks with the syntax transformation paradigm.

Co-authored-by: Heather Macbeth <[email protected]>
@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Aug 9, 2024

Pull request successfully merged into master.

Build succeeded:

@mathlib-bors mathlib-bors bot changed the title perf: eagerly elaborate leaf nodes in linear_combination [Merged by Bors] - perf: eagerly elaborate leaf nodes in linear_combination Aug 9, 2024
@mathlib-bors mathlib-bors bot closed this Aug 9, 2024
@mathlib-bors mathlib-bors bot deleted the kmill_fast_linear_combination branch August 9, 2024 02:01
bjoernkjoshanssen pushed a commit that referenced this pull request Sep 9, 2024
Since the arithmetic type is known from the equality goal, we can eagerly elaborate all leaf nodes in `expandLinearCombo` while also avoiding re-elaborating them when deciding whether they are proofs or not. This brings the elaboration of a complicated example (see tests) from 3.91s to 0.29s.

The main issue is that large expressions can be slow to elaborate. In particular, exponentiation relies on the default instance mechanism, which appears can take quadratic time to resolve. Forcing the resolution of default instances within each coefficient of the linear combination bounds the default instance problems.

This is similar to PR #15570, but it sticks with the syntax transformation paradigm.

Co-authored-by: Heather Macbeth <[email protected]>
bjoernkjoshanssen pushed a commit that referenced this pull request Sep 9, 2024
Since the arithmetic type is known from the equality goal, we can eagerly elaborate all leaf nodes in `expandLinearCombo` while also avoiding re-elaborating them when deciding whether they are proofs or not. This brings the elaboration of a complicated example (see tests) from 3.91s to 0.29s.

The main issue is that large expressions can be slow to elaborate. In particular, exponentiation relies on the default instance mechanism, which appears can take quadratic time to resolve. Forcing the resolution of default instances within each coefficient of the linear combination bounds the default instance problems.

This is similar to PR #15570, but it sticks with the syntax transformation paradigm.

Co-authored-by: Heather Macbeth <[email protected]>
bjoernkjoshanssen pushed a commit that referenced this pull request Sep 12, 2024
Since the arithmetic type is known from the equality goal, we can eagerly elaborate all leaf nodes in `expandLinearCombo` while also avoiding re-elaborating them when deciding whether they are proofs or not. This brings the elaboration of a complicated example (see tests) from 3.91s to 0.29s.

The main issue is that large expressions can be slow to elaborate. In particular, exponentiation relies on the default instance mechanism, which appears can take quadratic time to resolve. Forcing the resolution of default instances within each coefficient of the linear combination bounds the default instance problems.

This is similar to PR #15570, but it sticks with the syntax transformation paradigm.

Co-authored-by: Heather Macbeth <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintainer-merge ready-to-merge This PR has been sent to bors.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants