Make inlining decisions a bit more predictable in our main queries. #14023

Merged (7 commits) on Nov 29, 2024

jpountz commented Nov 27, 2024

This implements a small, contained hack to make sure that our compound scorers like `MaxScoreBulkScorer`, `ConjunctionBulkScorer`, `BlockMaxConjunctionBulkScorer`, `WANDScorer` and `ConjunctionDISI` only have two concrete implementations of `DocIdSetIterator` and `Scorable` to deal with.

This helps because it makes calls to `DocIdSetIterator#nextDoc()`, `DocIdSetIterator#advance(int)` and `Scorable#score()` bimorphic at most, and bimorphic calls are candidates for inlining.

This should speed up boolean queries of term queries at the expense of boolean queries of other query types. This feels fair to me, as it gives more speedups than slowdowns in benchmarks and boolean queries of term queries are extremely typical. Boolean queries that mix term queries with other types of queries may get a slowdown or a speedup, depending on whether they gain more on their term clauses than they lose on their other clauses.
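
For readers unfamiliar with the trick, here is a minimal sketch of the idea described above. It is not the code from this PR: `DocIterator`, `PostingsIterator`, `AllDocsIterator`, `WrappedIterator` and `likelyPostings` are all hypothetical stand-ins that do not exist in Lucene. The expected implementation passes through untouched and every other implementation is hidden behind one wrapper class, so the hot loop only ever sees two concrete classes at its `nextDoc()` call site.

```java
public class BimorphicSketch {

  /** Minimal stand-in for a doc ID iterator (hypothetical, not Lucene's class). */
  public abstract static class DocIterator {
    public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Advances to the next matching doc ID, or returns NO_MORE_DOCS when exhausted. */
    public abstract int nextDoc();
  }

  /** The implementation we expect most often, playing the role of a postings iterator. */
  public static final class PostingsIterator extends DocIterator {
    private final int[] docs;
    private int index = -1;

    public PostingsIterator(int... docs) {
      this.docs = docs;
    }

    @Override
    public int nextDoc() {
      index++;
      return index < docs.length ? docs[index] : NO_MORE_DOCS;
    }
  }

  /** Some other implementation: matches every doc ID in [0, maxDoc). */
  public static final class AllDocsIterator extends DocIterator {
    private final int maxDoc;
    private int doc = -1;

    public AllDocsIterator(int maxDoc) {
      this.maxDoc = maxDoc;
    }

    @Override
    public int nextDoc() {
      if (doc + 1 >= maxDoc) {
        return NO_MORE_DOCS;
      }
      return ++doc;
    }
  }

  /** The single "everything else" class that the hot loop gets to see. */
  public static final class WrappedIterator extends DocIterator {
    private final DocIterator in;

    public WrappedIterator(DocIterator in) {
      this.in = in;
    }

    @Override
    public int nextDoc() {
      // This inner call may still be megamorphic; that cost is paid by the
      // less common iterator types only.
      return in.nextDoc();
    }
  }

  /** Returns the iterator as-is if it is the expected class, else wraps it. */
  public static DocIterator likelyPostings(DocIterator it) {
    return it instanceof PostingsIterator ? it : new WrappedIterator(it);
  }

  /** Hot loop: its nextDoc() call site sees at most two concrete classes. */
  public static int countMatches(DocIterator it) {
    int count = 0;
    while (it.nextDoc() != DocIterator.NO_MORE_DOCS) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(countMatches(likelyPostings(new PostingsIterator(1, 5, 9))));
    System.out.println(countMatches(likelyPostings(new AllDocsIterator(4))));
  }
}
```

In this sketch the wrapped iterators pay for an extra level of indirection, which matches the trade-off described above: the common case gets a call site the JIT can inline, and the other iterator types may get slower.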

@jpountz jpountz added this to the 10.1.0 milestone Nov 27, 2024

jpountz commented Nov 27, 2024

This gives a good speedup when running a tasks file that has very diverse queries, as we now do in the nightly benchmarks. Some less typical queries get a slowdown, which is fine in my opinion.

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                 AndHighOrMedMed       46.33      (1.1%)       42.47      (2.5%)   -8.3% ( -11% -   -4%) 0.000
                  FilteredPhrase       25.76      (2.9%)       25.16      (2.8%)   -2.3% (  -7% -    3%) 0.009
                DismaxOrHighHigh       69.11      (3.1%)       67.88      (3.7%)   -1.8% (  -8% -    5%) 0.101
               FilteredOrHighMed      139.21      (2.6%)      136.75      (2.1%)   -1.8% (  -6% -    3%) 0.019
                FilteredOr3Terms      152.47      (2.5%)      149.85      (2.0%)   -1.7% (  -6% -    2%) 0.016
                 DismaxOrHighMed       85.03      (3.2%)       83.86      (3.6%)   -1.4% (  -7% -    5%) 0.195
      FilteredOr2Terms2StopWords      150.78      (2.3%)      148.72      (1.7%)   -1.4% (  -5% -    2%) 0.035
              FilteredOrHighHigh       65.05      (2.7%)       64.16      (2.1%)   -1.4% (  -6% -    3%) 0.075
                    FilteredTerm      159.84      (2.2%)      157.75      (1.4%)   -1.3% (  -4% -    2%) 0.025
             FilteredOrStopWords       50.26      (2.8%)       49.61      (2.1%)   -1.3% (  -6% -    3%) 0.101
                  FilteredOrMany       12.07      (3.1%)       11.92      (3.9%)   -1.3% (  -8% -    5%) 0.257
                     CountPhrase        4.42      (2.0%)        4.39      (3.3%)   -0.7% (  -5% -    4%) 0.430
             CombinedAndHighHigh       15.36      (1.5%)       15.26      (1.5%)   -0.6% (  -3% -    2%) 0.186
                      DismaxTerm      645.14      (2.3%)      641.81      (2.5%)   -0.5% (  -5% -    4%) 0.490
                        PKLookup      276.07      (1.7%)      274.77      (2.1%)   -0.5% (  -4% -    3%) 0.425
                 CountAndHighMed      169.12      (1.6%)      168.44      (1.3%)   -0.4% (  -3% -    2%) 0.386
                  CountOrHighMed      142.97      (1.4%)      142.70      (1.0%)   -0.2% (  -2% -    2%) 0.623
                CountAndHighHigh       57.56      (1.2%)       57.46      (0.6%)   -0.2% (  -1% -    1%) 0.543
              CombinedAndHighMed       56.06      (1.7%)       56.02      (1.4%)   -0.1% (  -3% -    3%) 0.904
                 CountOrHighHigh       75.48      (1.0%)       75.66      (0.8%)    0.2% (  -1% -    2%) 0.402
                AndMedOrHighHigh       59.11      (1.6%)       59.38      (1.4%)    0.5% (  -2% -    3%) 0.340
                      OrHighRare      286.09      (3.7%)      287.56      (4.7%)    0.5% (  -7% -    9%) 0.701
                       CountTerm     9106.38      (4.7%)     9192.25      (4.0%)    0.9% (  -7% -   10%) 0.492
               FilteredAnd3Terms      191.49      (1.9%)      195.49      (2.0%)    2.1% (  -1% -    6%) 0.001
            FilteredAndStopWords       48.66      (2.4%)       49.95      (2.1%)    2.7% (  -1% -    7%) 0.000
                          OrMany       19.42      (5.2%)       20.00      (5.7%)    2.9% (  -7% -   14%) 0.087
     FilteredAnd2Terms2StopWords      197.27      (1.8%)      204.14      (1.6%)    3.5% (   0% -    6%) 0.000
             FilteredAndHighHigh       62.50      (2.3%)       64.82      (1.8%)    3.7% (   0% -    8%) 0.000
                    CombinedTerm       32.04      (3.1%)       33.31      (3.0%)    4.0% (  -2% -   10%) 0.000
              FilteredAndHighMed      125.80      (3.0%)      130.97      (3.0%)    4.1% (  -1% -   10%) 0.000
              Or2Terms2StopWords      163.16      (5.1%)      172.65      (5.7%)    5.8% (  -4% -   17%) 0.001
               CombinedOrHighMed       69.43      (4.6%)       73.56      (2.2%)    6.0% (   0% -   13%) 0.000
                        Or3Terms      172.44      (5.2%)      183.18      (5.8%)    6.2% (  -4% -   18%) 0.000
              CombinedOrHighHigh       18.17      (4.5%)       19.33      (2.0%)    6.4% (   0% -   13%) 0.000
             And2Terms2StopWords      159.48      (3.8%)      169.72      (4.0%)    6.4% (  -1% -   14%) 0.000
                       OrHighMed      197.10      (1.5%)      211.20      (3.8%)    7.2% (   1% -   12%) 0.000
                       And3Terms      168.66      (3.9%)      183.49      (4.6%)    8.8% (   0% -   18%) 0.000
                     OrStopWords       33.61      (8.0%)       36.79      (9.7%)    9.5% (  -7% -   29%) 0.001
                      OrHighHigh       53.00      (1.6%)       58.03      (4.9%)    9.5% (   2% -   16%) 0.000
                    AndStopWords       29.89      (5.6%)       32.87      (6.6%)   10.0% (  -2% -   23%) 0.000
                      AndHighMed      122.11      (1.7%)      135.85      (1.9%)   11.3% (   7% -   15%) 0.000
                     AndHighHigh       41.68      (1.5%)       46.61      (1.8%)   11.9% (   8% -   15%) 0.000

@ChrisHegarty ChrisHegarty left a comment

LGTM

@jpountz jpountz force-pushed the reduce_inlining_hazard branch from 7368254 to 2122dbe on November 29, 2024 12:11

jpountz commented Nov 29, 2024

Thanks @ChrisHegarty for looking, I was about to ping you. :)

I reflected more on this change. The part about specializing our bulk scorers that compute top-k hits by score with `ImpactsEnum` is not controversial, I believe, so I'm keeping it (`MaxScoreBulkScorer`, `BlockMaxConjunctionBulkScorer`, `BlockMaxConjunctionScorer`, `WANDScorer`). However, the part about specializing other types of queries for `PostingsEnum` (`ConjunctionBulkScorer`, `ConjunctionDISI`) is less obvious, since the iterators could well be a `BitSetIterator`, a doc-value iterator, or something else, so I reverted this part. We can look into these queries in a follow-up.


jpountz commented Nov 29, 2024

Benchmark results are still good:

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                 AndHighOrMedMed       46.03      (0.9%)       44.05      (2.2%)   -4.3% (  -7% -   -1%) 0.000
                    CombinedTerm       32.13      (3.1%)       31.41      (4.5%)   -2.3% (  -9% -    5%) 0.216
                      OrHighRare      285.01      (1.8%)      279.80      (7.5%)   -1.8% ( -10% -    7%) 0.474
                DismaxOrHighHigh       71.98      (2.2%)       70.86      (3.8%)   -1.5% (  -7% -    4%) 0.284
                 DismaxOrHighMed       88.46      (2.1%)       87.20      (3.5%)   -1.4% (  -6% -    4%) 0.289
                  FilteredOrMany       17.06      (2.3%)       16.85      (3.9%)   -1.3% (  -7% -    5%) 0.402
                 CountAndHighMed      170.06      (1.3%)      168.00      (1.3%)   -1.2% (  -3% -    1%) 0.049
             CombinedAndHighHigh       15.38      (2.1%)       15.22      (1.8%)   -1.0% (  -4% -    2%) 0.262
              CombinedAndHighMed       56.15      (2.2%)       55.63      (1.5%)   -0.9% (  -4% -    2%) 0.300
                    FilteredTerm      160.08      (1.6%)      158.71      (2.3%)   -0.9% (  -4% -    3%) 0.367
                        PKLookup      276.03      (1.1%)      273.79      (1.0%)   -0.8% (  -2% -    1%) 0.096
                CountAndHighHigh       57.71      (1.0%)       57.33      (0.8%)   -0.7% (  -2% -    1%) 0.119
                  CountOrHighMed      142.86      (1.2%)      142.05      (1.0%)   -0.6% (  -2% -    1%) 0.262
                       OrHighMed      197.40      (1.1%)      196.55      (3.7%)   -0.4% (  -5% -    4%) 0.737
               CombinedOrHighMed       71.01      (6.2%)       70.71      (2.2%)   -0.4% (  -8% -    8%) 0.846
              FilteredOrHighHigh       66.54      (0.8%)       66.28      (1.4%)   -0.4% (  -2% -    1%) 0.480
                      OrHighHigh       53.05      (1.7%)       52.93      (5.0%)   -0.2% (  -6% -    6%) 0.905
                 CountOrHighHigh       75.24      (1.0%)       75.21      (0.4%)   -0.0% (  -1% -    1%) 0.914
              CombinedOrHighHigh       18.58      (6.3%)       18.58      (2.2%)    0.0% (  -7% -    9%) 0.985
                      DismaxTerm      636.83      (3.1%)      638.21      (3.7%)    0.2% (  -6% -    7%) 0.892
               FilteredOrHighMed      156.56      (1.0%)      156.91      (0.6%)    0.2% (  -1% -    1%) 0.564
      FilteredOr2Terms2StopWords      150.61      (0.9%)      151.00      (0.9%)    0.3% (  -1% -    2%) 0.540
             FilteredOrStopWords       44.91      (1.4%)       45.04      (1.5%)    0.3% (  -2% -    3%) 0.659
                  FilteredPhrase       25.49      (2.4%)       25.60      (2.5%)    0.4% (  -4% -    5%) 0.710
                FilteredOr3Terms      168.68      (0.9%)      169.46      (0.8%)    0.5% (  -1% -    2%) 0.259
                     CountPhrase        4.34      (1.6%)        4.37      (1.5%)    0.6% (  -2% -    3%) 0.382
                       CountTerm     8947.47      (3.5%)     9072.96      (4.2%)    1.4% (  -6% -    9%) 0.441
              Or2Terms2StopWords      163.56      (4.2%)      166.79      (2.5%)    2.0% (  -4% -    9%) 0.231
                        Or3Terms      172.60      (4.3%)      176.37      (1.9%)    2.2% (  -3% -    8%) 0.162
                     OrStopWords       33.82      (7.4%)       34.61      (2.9%)    2.3% (  -7% -   13%) 0.375
                AndMedOrHighHigh       59.30      (1.2%)       61.05      (1.3%)    3.0% (   0% -    5%) 0.000
            FilteredAndStopWords       48.71      (1.8%)       50.58      (0.7%)    3.9% (   1% -    6%) 0.000
     FilteredAnd2Terms2StopWords      197.16      (1.4%)      205.16      (0.9%)    4.1% (   1% -    6%) 0.000
                          OrMany       19.37      (4.5%)       20.16      (2.6%)    4.1% (  -2% -   11%) 0.018
             FilteredAndHighHigh       62.69      (1.9%)       65.50      (0.7%)    4.5% (   1% -    7%) 0.000
               FilteredAnd3Terms      191.55      (1.5%)      202.59      (1.4%)    5.8% (   2% -    8%) 0.000
              FilteredAndHighMed      126.44      (2.6%)      134.60      (1.4%)    6.5% (   2% -   10%) 0.000
             And2Terms2StopWords      159.08      (3.4%)      170.55      (1.5%)    7.2% (   2% -   12%) 0.000
                       And3Terms      169.46      (3.0%)      184.73      (1.3%)    9.0% (   4% -   13%) 0.000
                      AndHighMed      122.19      (1.3%)      134.05      (1.5%)    9.7% (   6% -   12%) 0.000
                     AndHighHigh       41.56      (1.2%)       45.79      (1.9%)   10.2% (   6% -   13%) 0.000
                    AndStopWords       30.03      (5.4%)       33.46      (1.8%)   11.4% (   4% -   19%) 0.000

@jpountz jpountz merged commit f9869b5 into apache:main Nov 29, 2024
3 checks passed
@jpountz jpountz deleted the reduce_inlining_hazard branch November 29, 2024 12:27
jpountz added a commit that referenced this pull request Nov 29, 2024
…14023)

jpountz commented Nov 30, 2024

This helped on nightly benchmarks as well, e.g. AndHighHigh and AndStopWords got a 10% speedup. https://benchmarks.mikemccandless.com/AndHighHigh.html https://benchmarks.mikemccandless.com/AndStopWords.html
