Make inlining decisions a bit more predictable in our main queries. #14023

jpountz · 2024-11-27T18:02:43Z

This implements a small contained hack to make sure that our compound scorers like MaxScoreBulkScorer, ConjunctionBulkScorer, BlockMaxConjunctionBulkScorer, WANDScorer and ConjunctionDISI only have two concrete implementations of DocIdSetIterator and Scorable to deal with.

This helps because it makes calls to DocIdSetIterator#nextDoc(), DocIdSetIterator#advance(int) and Scorable#score() bimorphic at most, and bimorphic calls are candidates for inlining.

This should help speed up boolean queries of term queries at the expense of boolean queries of other query types. This feels fair to me, as it gives more speedups than slowdowns in benchmarks and boolean queries of term queries are extremely typical. Boolean queries that mix term queries and other types of queries may get a slowdown or a speedup, depending on whether they gain more from the speedup on their term clauses than they lose on their other clauses.
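
To illustrate the idea, here is a minimal, self-contained sketch. It is not the code from this change and it does not use Lucene classes: `DocIterator`, `PostingsIterator`, `RangeIterator`, `WrappingIterator`, `limitToTwoTypes` and `countDocs` are hypothetical names made up for the example. It only assumes that one iterator class is the common case and that every other class can be hidden behind a single wrapper class.

```java
public class BimorphicSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical, minimal stand-in for DocIdSetIterator. */
  abstract static class DocIterator {
    abstract int nextDoc();
  }

  /** Stand-in for the common case: the iterator that term queries produce. */
  static final class PostingsIterator extends DocIterator {
    private final int[] docs;
    private int i = -1;

    PostingsIterator(int... docs) {
      this.docs = docs;
    }

    @Override
    int nextDoc() {
      if (i + 1 >= docs.length) {
        return NO_MORE_DOCS;
      }
      return docs[++i];
    }
  }

  /** Stand-in for any other implementation: all doc IDs in [min, max). */
  static final class RangeIterator extends DocIterator {
    private final int max;
    private int doc;

    RangeIterator(int min, int max) {
      this.doc = min - 1;
      this.max = max;
    }

    @Override
    int nextDoc() {
      if (doc + 1 >= max) {
        return NO_MORE_DOCS;
      }
      return ++doc;
    }
  }

  /** Single wrapper class that hides the concrete class of every other iterator. */
  static final class WrappingIterator extends DocIterator {
    private final DocIterator in;

    WrappingIterator(DocIterator in) {
      this.in = in;
    }

    @Override
    int nextDoc() {
      return in.nextDoc(); // the less predictable call is moved in here
    }
  }

  /** Returns an iterator whose class is PostingsIterator or WrappingIterator, never a third one. */
  static DocIterator limitToTwoTypes(DocIterator it) {
    if (it.getClass() == PostingsIterator.class || it.getClass() == WrappingIterator.class) {
      return it;
    }
    return new WrappingIterator(it);
  }

  /** Hot loop: the nextDoc() call site below sees at most two receiver classes. */
  static int countDocs(DocIterator it) {
    it = limitToTwoTypes(it);
    int count = 0;
    while (it.nextDoc() != NO_MORE_DOCS) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(countDocs(new PostingsIterator(1, 5, 9))); // prints 3
    System.out.println(countDocs(new RangeIterator(10, 14))); // prints 4
  }
}
```

In this sketch the common class reaches the hot loop unwrapped, while every other class pays for one extra level of indirection through the wrapper. That mirrors the trade-off described above: a likely speedup for term clauses, a possible slowdown for other clauses.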

jpountz · 2024-11-27T18:06:29Z

This gives a good speedup when running a tasks file that has very diverse queries, as the nightly benchmarks now do. Some less typical queries get a slowdown, which is fine in my opinion.

                            Task  QPS baseline      StdDev  QPS my_modified_version      StdDev                Pct diff p-value
                 AndHighOrMedMed       46.33      (1.1%)       42.47      (2.5%)   -8.3% ( -11% -   -4%) 0.000
                  FilteredPhrase       25.76      (2.9%)       25.16      (2.8%)   -2.3% (  -7% -    3%) 0.009
                DismaxOrHighHigh       69.11      (3.1%)       67.88      (3.7%)   -1.8% (  -8% -    5%) 0.101
               FilteredOrHighMed      139.21      (2.6%)      136.75      (2.1%)   -1.8% (  -6% -    3%) 0.019
                FilteredOr3Terms      152.47      (2.5%)      149.85      (2.0%)   -1.7% (  -6% -    2%) 0.016
                 DismaxOrHighMed       85.03      (3.2%)       83.86      (3.6%)   -1.4% (  -7% -    5%) 0.195
      FilteredOr2Terms2StopWords      150.78      (2.3%)      148.72      (1.7%)   -1.4% (  -5% -    2%) 0.035
              FilteredOrHighHigh       65.05      (2.7%)       64.16      (2.1%)   -1.4% (  -6% -    3%) 0.075
                    FilteredTerm      159.84      (2.2%)      157.75      (1.4%)   -1.3% (  -4% -    2%) 0.025
             FilteredOrStopWords       50.26      (2.8%)       49.61      (2.1%)   -1.3% (  -6% -    3%) 0.101
                  FilteredOrMany       12.07      (3.1%)       11.92      (3.9%)   -1.3% (  -8% -    5%) 0.257
                     CountPhrase        4.42      (2.0%)        4.39      (3.3%)   -0.7% (  -5% -    4%) 0.430
             CombinedAndHighHigh       15.36      (1.5%)       15.26      (1.5%)   -0.6% (  -3% -    2%) 0.186
                      DismaxTerm      645.14      (2.3%)      641.81      (2.5%)   -0.5% (  -5% -    4%) 0.490
                        PKLookup      276.07      (1.7%)      274.77      (2.1%)   -0.5% (  -4% -    3%) 0.425
                 CountAndHighMed      169.12      (1.6%)      168.44      (1.3%)   -0.4% (  -3% -    2%) 0.386
                  CountOrHighMed      142.97      (1.4%)      142.70      (1.0%)   -0.2% (  -2% -    2%) 0.623
                CountAndHighHigh       57.56      (1.2%)       57.46      (0.6%)   -0.2% (  -1% -    1%) 0.543
              CombinedAndHighMed       56.06      (1.7%)       56.02      (1.4%)   -0.1% (  -3% -    3%) 0.904
                 CountOrHighHigh       75.48      (1.0%)       75.66      (0.8%)    0.2% (  -1% -    2%) 0.402
                AndMedOrHighHigh       59.11      (1.6%)       59.38      (1.4%)    0.5% (  -2% -    3%) 0.340
                      OrHighRare      286.09      (3.7%)      287.56      (4.7%)    0.5% (  -7% -    9%) 0.701
                       CountTerm     9106.38      (4.7%)     9192.25      (4.0%)    0.9% (  -7% -   10%) 0.492
               FilteredAnd3Terms      191.49      (1.9%)      195.49      (2.0%)    2.1% (  -1% -    6%) 0.001
            FilteredAndStopWords       48.66      (2.4%)       49.95      (2.1%)    2.7% (  -1% -    7%) 0.000
                          OrMany       19.42      (5.2%)       20.00      (5.7%)    2.9% (  -7% -   14%) 0.087
     FilteredAnd2Terms2StopWords      197.27      (1.8%)      204.14      (1.6%)    3.5% (   0% -    6%) 0.000
             FilteredAndHighHigh       62.50      (2.3%)       64.82      (1.8%)    3.7% (   0% -    8%) 0.000
                    CombinedTerm       32.04      (3.1%)       33.31      (3.0%)    4.0% (  -2% -   10%) 0.000
              FilteredAndHighMed      125.80      (3.0%)      130.97      (3.0%)    4.1% (  -1% -   10%) 0.000
              Or2Terms2StopWords      163.16      (5.1%)      172.65      (5.7%)    5.8% (  -4% -   17%) 0.001
               CombinedOrHighMed       69.43      (4.6%)       73.56      (2.2%)    6.0% (   0% -   13%) 0.000
                        Or3Terms      172.44      (5.2%)      183.18      (5.8%)    6.2% (  -4% -   18%) 0.000
              CombinedOrHighHigh       18.17      (4.5%)       19.33      (2.0%)    6.4% (   0% -   13%) 0.000
             And2Terms2StopWords      159.48      (3.8%)      169.72      (4.0%)    6.4% (  -1% -   14%) 0.000
                       OrHighMed      197.10      (1.5%)      211.20      (3.8%)    7.2% (   1% -   12%) 0.000
                       And3Terms      168.66      (3.9%)      183.49      (4.6%)    8.8% (   0% -   18%) 0.000
                     OrStopWords       33.61      (8.0%)       36.79      (9.7%)    9.5% (  -7% -   29%) 0.001
                      OrHighHigh       53.00      (1.6%)       58.03      (4.9%)    9.5% (   2% -   16%) 0.000
                    AndStopWords       29.89      (5.6%)       32.87      (6.6%)   10.0% (  -2% -   23%) 0.000
                      AndHighMed      122.11      (1.7%)      135.85      (1.9%)   11.3% (   7% -   15%) 0.000
                     AndHighHigh       41.68      (1.5%)       46.61      (1.8%)   11.9% (   8% -   15%) 0.000

ChrisHegarty

LGTM

jpountz · 2024-11-29T12:14:06Z

Thanks @ChrisHegarty for looking, I was about to ping you. :)

I reflected more on this change. The part that specializes our bulk scorers that compute top-k hits by score for ImpactsEnum is not controversial, I believe, so I'm keeping it (MaxScoreBulkScorer, BlockMaxConjunctionBulkScorer, BlockMaxConjunctionScorer, WANDScorer). However, the part that specializes other types of queries for PostingsEnum is less obvious, since their iterators could plausibly be a BitSetIterator, a doc-value iterator, or something else (ConjunctionBulkScorer, ConjunctionDISI). So I reverted this part; we can look into these queries in a follow-up.

jpountz · 2024-11-29T12:19:17Z

Benchmark results are still good:

                            Task  QPS baseline      StdDev  QPS my_modified_version      StdDev                Pct diff p-value
                 AndHighOrMedMed       46.03      (0.9%)       44.05      (2.2%)   -4.3% (  -7% -   -1%) 0.000
                    CombinedTerm       32.13      (3.1%)       31.41      (4.5%)   -2.3% (  -9% -    5%) 0.216
                      OrHighRare      285.01      (1.8%)      279.80      (7.5%)   -1.8% ( -10% -    7%) 0.474
                DismaxOrHighHigh       71.98      (2.2%)       70.86      (3.8%)   -1.5% (  -7% -    4%) 0.284
                 DismaxOrHighMed       88.46      (2.1%)       87.20      (3.5%)   -1.4% (  -6% -    4%) 0.289
                  FilteredOrMany       17.06      (2.3%)       16.85      (3.9%)   -1.3% (  -7% -    5%) 0.402
                 CountAndHighMed      170.06      (1.3%)      168.00      (1.3%)   -1.2% (  -3% -    1%) 0.049
             CombinedAndHighHigh       15.38      (2.1%)       15.22      (1.8%)   -1.0% (  -4% -    2%) 0.262
              CombinedAndHighMed       56.15      (2.2%)       55.63      (1.5%)   -0.9% (  -4% -    2%) 0.300
                    FilteredTerm      160.08      (1.6%)      158.71      (2.3%)   -0.9% (  -4% -    3%) 0.367
                        PKLookup      276.03      (1.1%)      273.79      (1.0%)   -0.8% (  -2% -    1%) 0.096
                CountAndHighHigh       57.71      (1.0%)       57.33      (0.8%)   -0.7% (  -2% -    1%) 0.119
                  CountOrHighMed      142.86      (1.2%)      142.05      (1.0%)   -0.6% (  -2% -    1%) 0.262
                       OrHighMed      197.40      (1.1%)      196.55      (3.7%)   -0.4% (  -5% -    4%) 0.737
               CombinedOrHighMed       71.01      (6.2%)       70.71      (2.2%)   -0.4% (  -8% -    8%) 0.846
              FilteredOrHighHigh       66.54      (0.8%)       66.28      (1.4%)   -0.4% (  -2% -    1%) 0.480
                      OrHighHigh       53.05      (1.7%)       52.93      (5.0%)   -0.2% (  -6% -    6%) 0.905
                 CountOrHighHigh       75.24      (1.0%)       75.21      (0.4%)   -0.0% (  -1% -    1%) 0.914
              CombinedOrHighHigh       18.58      (6.3%)       18.58      (2.2%)    0.0% (  -7% -    9%) 0.985
                      DismaxTerm      636.83      (3.1%)      638.21      (3.7%)    0.2% (  -6% -    7%) 0.892
               FilteredOrHighMed      156.56      (1.0%)      156.91      (0.6%)    0.2% (  -1% -    1%) 0.564
      FilteredOr2Terms2StopWords      150.61      (0.9%)      151.00      (0.9%)    0.3% (  -1% -    2%) 0.540
             FilteredOrStopWords       44.91      (1.4%)       45.04      (1.5%)    0.3% (  -2% -    3%) 0.659
                  FilteredPhrase       25.49      (2.4%)       25.60      (2.5%)    0.4% (  -4% -    5%) 0.710
                FilteredOr3Terms      168.68      (0.9%)      169.46      (0.8%)    0.5% (  -1% -    2%) 0.259
                     CountPhrase        4.34      (1.6%)        4.37      (1.5%)    0.6% (  -2% -    3%) 0.382
                       CountTerm     8947.47      (3.5%)     9072.96      (4.2%)    1.4% (  -6% -    9%) 0.441
              Or2Terms2StopWords      163.56      (4.2%)      166.79      (2.5%)    2.0% (  -4% -    9%) 0.231
                        Or3Terms      172.60      (4.3%)      176.37      (1.9%)    2.2% (  -3% -    8%) 0.162
                     OrStopWords       33.82      (7.4%)       34.61      (2.9%)    2.3% (  -7% -   13%) 0.375
                AndMedOrHighHigh       59.30      (1.2%)       61.05      (1.3%)    3.0% (   0% -    5%) 0.000
            FilteredAndStopWords       48.71      (1.8%)       50.58      (0.7%)    3.9% (   1% -    6%) 0.000
     FilteredAnd2Terms2StopWords      197.16      (1.4%)      205.16      (0.9%)    4.1% (   1% -    6%) 0.000
                          OrMany       19.37      (4.5%)       20.16      (2.6%)    4.1% (  -2% -   11%) 0.018
             FilteredAndHighHigh       62.69      (1.9%)       65.50      (0.7%)    4.5% (   1% -    7%) 0.000
               FilteredAnd3Terms      191.55      (1.5%)      202.59      (1.4%)    5.8% (   2% -    8%) 0.000
              FilteredAndHighMed      126.44      (2.6%)      134.60      (1.4%)    6.5% (   2% -   10%) 0.000
             And2Terms2StopWords      159.08      (3.4%)      170.55      (1.5%)    7.2% (   2% -   12%) 0.000
                       And3Terms      169.46      (3.0%)      184.73      (1.3%)    9.0% (   4% -   13%) 0.000
                      AndHighMed      122.19      (1.3%)      134.05      (1.5%)    9.7% (   6% -   12%) 0.000
                     AndHighHigh       41.56      (1.2%)       45.79      (1.9%)   10.2% (   6% -   13%) 0.000
                    AndStopWords       30.03      (5.4%)       33.46      (1.8%)   11.4% (   4% -   19%) 0.000

jpountz · 2024-11-30T16:26:20Z

This helped on nightly benchmarks as well, e.g. AndHighHigh and AndStopWords got a 10% speedup. https://benchmarks.mikemccandless.com/AndHighHigh.html https://benchmarks.mikemccandless.com/AndStopWords.html

jpountz added this to the 10.1.0 milestone Nov 27, 2024

CHANGES

1e681a2

jpountz added 2 commits November 27, 2024 19:19

Fix test

318f1e1

fix

17f4d01

ChrisHegarty approved these changes Nov 29, 2024

jpountz added 3 commits November 29, 2024 11:17

iter

b3a228c

Merge branch 'main' into reduce_inlining_hazard

d756e47

iter

2122dbe

jpountz force-pushed the reduce_inlining_hazard branch from 7368254 to 2122dbe Compare November 29, 2024 12:11

jpountz merged commit f9869b5 into apache:main Nov 29, 2024
3 checks passed

jpountz deleted the reduce_inlining_hazard branch November 29, 2024 12:27
