Search Latency Tracking - Per Request Phase Took Time #9650
Labels
enhancement
Enhancement or improvement to existing feature or request
Search
Search query, autocomplete ...etc
Comments
dzane17 added the enhancement and untriaged labels on Aug 30, 2023
@dzane17 -- How does this work with profiling enabled? Does profiling already provide this info (and more)? Also, if we want to implement this, would it make sense to use the |
Hi @msfroh, there are a couple differences:
|
kkhatua added and removed the backport 2.x (Backport to 2.x branch) label on Oct 17, 2023
1 task
Is your feature request related to a problem? Please describe.
Today, we track search request latencies at the shard level via node stats. After each query or fetch phase completes on a shard, we record the time taken, accumulate those values, and maintain an overall average that is exposed under stats.
But we don't have a mechanism to track search latencies at the coordinator node. The coordinator node plays an important role: it fans out requests to individual shards/data nodes, aggregates their responses, and eventually sends the response back to the client. We have seen multiple issues in the past where it was hard or impossible to reason about latency-related problems because of the lack of insight into coordinator-level stats, and we ended up spending a lot of unnecessary time and bandwidth figuring them out. Clients using the search API can rely only on the overall took time (present in the search response), which doesn't offer much insight into the time taken by the different phases.
Parent RFC: #7334
Describe the solution you'd like
Per-request-level tracking: As part of this, we will offer a further breakdown of the existing took time in the search response. To do this, we will introduce a new field (phase_took) in the search response that gives clients more visibility into the time taken by the different search phases (query, fetch, canMatch, etc.).
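To make the idea concrete, here is a minimal Python sketch of how a coordinator could time each phase and attach the breakdown to the response. It is illustrative only: the class `PhaseTookTracker`, its methods, the phase names used as keys, and the exact shape of the `phase_took` object are all assumptions, not the actual OpenSearch implementation or response format.

```python
import time
from contextlib import contextmanager


class PhaseTookTracker:
    """Hypothetical coordinator-side tracker (not an OpenSearch class)."""

    def __init__(self, clock=time.monotonic):
        # clock returns seconds; injectable so the sketch can be tested.
        self._clock = clock
        self._start = clock()
        self._phase_ms = {}

    @contextmanager
    def phase(self, name):
        """Time one search phase and accumulate its duration in milliseconds."""
        started = self._clock()
        try:
            yield
        finally:
            elapsed_ms = int((self._clock() - started) * 1000)
            self._phase_ms[name] = self._phase_ms.get(name, 0) + elapsed_ms

    def to_response(self, include_phase_took):
        """Build the timing part of a response; phase_took is opt-in."""
        body = {"took": int((self._clock() - self._start) * 1000)}
        if include_phase_took:
            # Assumed layout: one entry per phase, values in milliseconds.
            body["phase_took"] = dict(self._phase_ms)
        return body


tracker = PhaseTookTracker()
with tracker.phase("query"):
    pass  # fan out the query phase to shards here
with tracker.phase("fetch"):
    pass  # fetch the matching documents here
response = tracker.to_response(include_phase_took=True)
```

Keeping `took` unchanged and adding `phase_took` as a separate, optional field means existing clients that only read `took` would be unaffected.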
Additional context
Request phase_took times will be disabled by default, since applications will not expect this new response field. Users can enable the feature via a query parameter or a cluster setting. This gives users the flexibility to set it at the cluster level while also turning it on or off as needed on individual requests.
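One plausible way to combine the two switches is sketched below in Python. The function name `phase_took_enabled` and the precedence rule (an explicit per-request value wins over the cluster-level value) are assumptions inferred from the description above, not confirmed behavior.

```python
from typing import Optional


def phase_took_enabled(request_param: Optional[bool],
                       cluster_setting: bool = False) -> bool:
    """Decide whether to include phase_took in a search response.

    request_param:   value of the (hypothetical) per-request query parameter,
                     or None when the client did not send it.
    cluster_setting: cluster-wide default; False matches "disabled by default".
    """
    if request_param is not None:
        # Assumption: an explicit per-request choice overrides the cluster.
        return request_param
    return cluster_setting
```

Under this rule, a cluster can turn the field on globally while a single request can still opt out, and vice versa.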