bottleneck of equality saturation #268

pca006132 · 2023-09-04T07:54:23Z

Hi, I was reading papers about e-graphs and got a bit confused about their performance. In the paper "Equality Saturation: a New Approach to Optimization" (10.1145/1480881.1480915), the authors used a 0/1 integer programming approach for e-graph extraction, which took 90% of the time according to Section 7.1. However, the introduction of the paper "Relational E-matching" (10.1145/3498696) says that "e-matching is responsible for 60–90% of the overall run time".

It seems to me that the default extractor in egg can only handle trees, as #128 added an LP extractor that can handle DAGs. I have two questions:

1. Does this mean the default extractor extracts a sub-optimal result for e-graphs that are more complicated than a tree? Or does this depend on the cost function?
2. How slow is the LP extractor compared with the default extractor?

yihozhang (Maintainer) · 2023-09-04T22:12:06Z

It depends on what "handle" means here. Both extractors take an e-graph and produce an extracted program. The difference is that the default extractor is optimal w.r.t. the tree size of the extracted program, whereas the LP extractor is optimal w.r.t. its DAG size (i.e., common subterms are counted only once). In many cases, the tree size of a program is a very good approximation of its DAG size.
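To make the two measures concrete, here is a minimal sketch in plain Rust. It does not use egg; `Term`, `tree_size`, `dag_size`, and `shared_example` are illustrative names made up for this example. It counts the nodes of `(x + y) * (x + y)` both ways.

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// A tiny term type (hypothetical, not an egg type).
/// `Rc` lets the same subterm be shared by several parents.
#[derive(Debug, PartialEq, Eq, Hash)]
enum Term {
    Var(&'static str),
    Add(Rc<Term>, Rc<Term>),
    Mul(Rc<Term>, Rc<Term>),
}

/// Tree size: every occurrence of a subterm is counted.
fn tree_size(t: &Term) -> usize {
    match t {
        Term::Var(_) => 1,
        Term::Add(a, b) | Term::Mul(a, b) => 1 + tree_size(a) + tree_size(b),
    }
}

/// DAG size: structurally equal subterms are counted only once.
fn dag_size(t: &Term) -> usize {
    fn visit<'a>(t: &'a Term, seen: &mut HashSet<&'a Term>) {
        // `insert` returns false if an equal subterm was already counted.
        if seen.insert(t) {
            if let Term::Add(a, b) | Term::Mul(a, b) = t {
                visit(a, seen);
                visit(b, seen);
            }
        }
    }
    let mut seen = HashSet::new();
    visit(t, &mut seen);
    seen.len()
}

/// Builds (x + y) * (x + y), with the sum shared between both operands.
fn shared_example() -> Rc<Term> {
    let sum = Rc::new(Term::Add(Rc::new(Term::Var("x")), Rc::new(Term::Var("y"))));
    Rc::new(Term::Mul(sum.clone(), sum))
}
```

For `shared_example()`, `tree_size` gives 7 (the sum is counted twice) while `dag_size` gives 4 (`*`, `+`, `x`, `y`). The two measures only diverge when subterms are shared.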

In many use cases of equality saturation, the tree size is what people are actually looking for, which is why the relational e-matching paper claimed that "e-matching is responsible for 60–90% of the overall run time". However, you are right that if you use a more complex extraction algorithm, extraction is likely to be the bottleneck.

> How slow is the LP extractor compared with the default extractor?

It is much slower, as DAG extraction is an NP-hard problem. Check out https://github.com/egraphs-good/extraction-gym for more detailed comparisons between different extraction algorithms.
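For contrast, tree-cost extraction can be done with a simple bottom-up fixpoint, which is why it is cheap. The following is a rough sketch of that idea on a toy e-graph, not egg's actual implementation; `ENode`, `EGraph`, `best_tree_costs`, `build`, `node`, and `toy_egraph` are hypothetical names, and the cost function is assumed to be plain tree size.

```rust
use std::collections::HashMap;

/// An e-node: an operator applied to child e-classes (referenced by id).
struct ENode {
    op: &'static str,
    children: Vec<usize>,
}

/// A toy e-graph: `classes[i]` lists the equivalent e-nodes of e-class `i`.
struct EGraph {
    classes: Vec<Vec<ENode>>,
}

/// For every e-class, pick the e-node minimising tree size
/// (1 + sum of the children's best costs), iterating to a fixpoint.
/// Returns: class id -> (best cost, index of the chosen e-node).
fn best_tree_costs(g: &EGraph) -> HashMap<usize, (usize, usize)> {
    let mut best: HashMap<usize, (usize, usize)> = HashMap::new();
    let mut changed = true;
    while changed {
        changed = false;
        for (id, nodes) in g.classes.iter().enumerate() {
            for (i, n) in nodes.iter().enumerate() {
                // `None` if some child class has no known cost yet.
                let child_sum: Option<usize> = n
                    .children
                    .iter()
                    .map(|c| best.get(c).map(|&(cost, _)| cost))
                    .sum();
                if let Some(sum) = child_sum {
                    let cost = 1 + sum;
                    if best.get(&id).map_or(true, |&(old, _)| cost < old) {
                        best.insert(id, (cost, i));
                        changed = true;
                    }
                }
            }
        }
    }
    best
}

/// Rebuild the chosen term for class `id` as an s-expression string.
fn build(g: &EGraph, best: &HashMap<usize, (usize, usize)>, id: usize) -> String {
    let n = &g.classes[id][best[&id].1];
    if n.children.is_empty() {
        return n.op.to_string();
    }
    let args: Vec<String> = n.children.iter().map(|&c| build(g, best, c)).collect();
    format!("({} {})", n.op, args.join(" "))
}

fn node(op: &'static str, children: &[usize]) -> ENode {
    ENode { op, children: children.to_vec() }
}

fn toy_egraph() -> EGraph {
    EGraph {
        classes: vec![
            // class 0: x = (x * 2) / 2 = x * 1 (note the self-reference)
            vec![node("/", &[3, 2]), node("*", &[0, 1]), node("x", &[])],
            // class 1: 1 = 2 / 2
            vec![node("1", &[]), node("/", &[2, 2])],
            // class 2: 2
            vec![node("2", &[])],
            // class 3: x * 2 = x << 1
            vec![node("*", &[0, 2]), node("<<", &[0, 1])],
        ],
    }
}
```

On this toy e-graph, `build(&g, &best_tree_costs(&g), 0)` yields `x` with cost 1, and class 3 yields `(* x 2)` with cost 3. Each pass only scans the e-nodes once, and costs can only decrease, so the loop terminates. Nothing in this scheme accounts for sharing between subterms, which is the part that makes optimal DAG extraction hard.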

mwillsey (Maintainer) · 2023-09-05T18:08:00Z

It's also worth noting that the paper cited above (the original eqsat paper) doesn't really do e-matching in a form similar to the way egg or modern SMT solvers do it. It instead takes a trigger-based approach that transforms terms as they are inserted. They also bounded the search phase to a size that (if I recall correctly) is orders of magnitude smaller than what we explore in egg. So we are spending more (relative) time in e-matching because 1. there is more to e-match and 2. we are using a simpler extraction approach.
