terminate_after uses heuristic hits value instead of actual #57624
Pinging @elastic/es-search (:Search/Search)
Are you sure of the timing difference? What happens if you run the same query several times, are you still seeing a 10x diff? That shouldn't be the case since you have an aggregation in your request, so we should evaluate the same number of docs no matter what the value of
I still see roughly a 10x diff even after a few reruns. The boolean query is empty. I believe that's the actual query kibana runs to search for its autocomplete. Our non-redacted query that reproduces:
And
For each query, I fired off three requests serially (~5s downtime in between or so, human speed). First query: 24147ms, 21307ms, 23462ms. (So, not quite 10x but close enough.)
For #57772 I included the whole reproduction (although there isn't any interesting timing difference since the sample size is so small, I think it's helpful to look at the...). After looking at the LOC you pointed out, I'm... surprised that my fix works, but the results seem to show the fix. I'll try to dig in and see why a bit more.
`terminate_after` is ignored on search requests that don't return top hits (`size` set to 0) and do not track the number of hits accurately (`track_total_hits`). We use early termination when the number of hits to track is reached during collection, but this breaks the hard termination of `terminate_after` if it happens before we reach the `terminate_after` value. This change ensures that we continue to check `terminate_after` even if the tracking of total hits has reached the provided value. Closes elastic#57624
Thanks @travisby, I was able to reproduce the issue and opened a PR for the fix. Your assumption is correct.
tl;dr: I believe there's a regression with using `terminate_after` in searches from ES 6 -> 7 because of the changes to `track_total_hits` and `hits.total.value` now being a heuristic value.

Hello!
I've been looking into why Kibana filter autocomplete has been so slow after our transition from an ES/Logstash/Kibana 6.4 stack to an ES/Logstash/Kibana 7.6 stack. We had also made some different sharding decisions, so we weren't sure if it was us or an ES regression.
I've reverse-engineered the query kibana sends to roughly be:
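A rough sketch of that request shape, assuming the usual Kibana value-suggestion pattern; the index pattern, field name, prefix, and `terminate_after` / `timeout` values below are placeholders, not our actual ones:

```json
POST /index-pattern-*/_search
{
  "size": 0,
  "timeout": "1000ms",
  "terminate_after": 100000,
  "query": {
    "bool": {
      "filter": []
    }
  },
  "aggs": {
    "suggestions": {
      "terms": {
        "field": "some_keyword_field",
        "include": "some_prefix.*",
        "execution_hint": "map",
        "shard_size": 10
      }
    }
  }
}
```

Note the `"size": 0` and the terms aggregation: the request never asks for top hits, it only wants the aggregation, and it relies on `terminate_after` to cap the per-shard work.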
In our case, the index pattern matches something like 30 indices (750 shards on 6, ~220 on 7), and the field is a keyword field where every value is ~10 characters (this is all the same data, cloned between the two clusters).
With `"profile": true` we were able to get some interesting results in the head of the request.

In Elasticsearch 6:

In ES 7:
So we're taking much longer and timing out instead of terminating early, and that's leading to us taking 4x longer. We even noticed individual profiles where reading a single shard takes ~seconds. I'm surprised; I would have expected that we could read each individual shard at roughly the same speed (and thus hit `terminate_after`), even if our concurrency was down.
The `hits.total.value` only being 10k instead of something like `terminate_after * num_shards` seemed suspect.
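For reference, this is roughly what the head of an ES 7 response looks like when total hits are only tracked up to the default 10,000 threshold; the numbers here are illustrative rather than taken from our actual runs:

```json
{
  "took": 24147,
  "timed_out": true,
  "terminated_early": false,
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    }
  }
}
```

The `"relation": "gte"` means the value is a lower bound, not an exact count, which is the heuristic behavior described above.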
I wanted to see if we were getting anywhere near the same results, and added `"track_total_hits": true` to our query, hoping that with this we could get records/s from each shard and know exactly where our bottleneck is.

!!!! That was a lot faster! And we're terminating early now. The only change was adding `track_total_hits: true` to our query.
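In other words, the only functional difference between the slow and fast versions of the request is one line, sketched here with placeholder index and field names:

```json
POST /index-pattern-*/_search
{
  "size": 0,
  "track_total_hits": true,
  "terminate_after": 100000,
  "aggs": {
    "suggestions": {
      "terms": { "field": "some_keyword_field" }
    }
  }
}
```

With `track_total_hits: true`, hit counting never stops at the 10,000 threshold, and the request terminates early as expected.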
My theory is that the early-termination check for `terminate_after` is using the heuristic hit count, which now stops counting after 10k, so we end up searching the whole shard instead of respecting `terminate_after`.