
The KQL autocomplete values can take a long time #46054

Closed
AlonaNadler opened this issue Sep 18, 2019 · 27 comments
@AlonaNadler

Currently, KQL autocomplete value suggestions can be slow to appear, especially when there is a lot of data to query to get the possible values. This results in a really slow and frustrating autocomplete experience.
One possible reason for the slowness (there might be others) is that the autocomplete doesn't take into account the time range the users are looking at and suggests all possible values.

Perhaps we can filter the values based on the time range the users are looking at when querying.
@Bargs this was raised as an issue in SIEM and APM in the past; it would be good to think about how we can improve it in order to keep the implementation consistent in Kibana and not have each solution build its own.

@Bargs @TinaHeiligers

@AlonaNadler AlonaNadler added the Team:Visualizations Visualization editors, elastic-charts and infrastructure label Sep 18, 2019
@elasticmachine
Contributor

Pinging @elastic/kibana-app

@Bargs
Contributor

Bargs commented Sep 18, 2019

@AlonaNadler by default the value suggestions should never take more than a second to appear because we have a timeout. However, there is a setting in kibana.yml for configuring this timeout. When you experienced the slowness, do you know whether this timeout had been increased above its default?
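
For reference, a minimal kibana.yml sketch of those knobs, assuming the Kibana 7.x setting names (check the docs for your version):

# kibana.yml
# Time budget, in milliseconds, for a value-suggestion request (default 1000).
kibana.autocompleteTimeout: 1000
# Max docs examined per shard before a suggestion request returns early (default 100000).
kibana.autocompleteTerminateAfter: 100000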

@smalenfant

+1 on time values. We are experiencing huge slowdowns in queries and high CPU usage on our cluster because suggestions take a long time to get values from the cluster (hot+cold with 50TB of data). It also queries for every single character you type in the filter box: 20 searches sent serially that need to complete before the final results come back. That results in 100% CPU across all my warm nodes. I'd like to be able to turn off "keystroke by keystroke" suggestions while keeping suggestions on; currently that doesn't seem possible.

All in-flight queries should also be terminated once a user has fully entered what they needed (or saved the filter).

This is a great feature, although it doesn't play well with big clusters used for time series data.

@smalenfant

See the firefox network console when I type phn: dukecdedge01.rd.at.cox.net:

[screenshot: Firefox network console showing the suggestion requests]

@atoom

atoom commented Oct 17, 2019

We are experiencing the same issue. I will copy and paste my comment from the Elastic discussion board thread here: https://discuss.elastic.co/t/kql-related-performance-issue/199420

We have just updated our ELK installation from version 6.7.1 to 7.3.2 and we are experiencing the same issue. After a couple of days looking at everything from segment count and GC settings to disk I/O on the hosts, I managed to pinpoint our high CPU usage and high response times to Kibana's auto completion of filter values together with KQL. When using Lucene syntax or setting filterEditor:suggestValues to Off as suggested above, everything is much more responsive!
I have attached a screenshot from Chrome DevTools showing a waterfall diagram of all requests created when trying to search from the Discover view using KQL in Kibana with filter value suggestions enabled. After the number of in-flight request threads hits the browser's max value, subsequent requests are stalled until a previous request completes; this can result in the actual search query request timing out after 30 seconds.

[screenshot: Kibana 7.3 XHR waterfall with filter editor suggest values enabled]

@markharwood
Contributor

Adding another case here.

It took a long time (weeks) to diagnose the reason for a slow response: KQL autocomplete was adding >30s to response times in this case.

@timroes timroes added the Feature:KQL KQL label Dec 2, 2019
@smalenfant

Any updates on when the suggest feature will use the "time range" selected instead of the full index scan?

@rayafratkina
Contributor

#48450 went into 7.6 and should resolve this issue

@timroes timroes added Team:AppArch and removed Team:Visualizations Visualization editors, elastic-charts and infrastructure labels Feb 20, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-app-arch (Team:AppArch)

@smalenfant

The fix provided some help by making sure 30 requests don't make it to the cluster while a user types.

The time range has not been addressed. Turning on the filter value suggestions brings our cluster to a crawl for hours since it's trying to hit all our indexes (cold and frozen).

@AlonaNadler
Author

Thanks for the feedback @smalenfant.
@elastic/kibana-app-arch I'm adding this to our short-term plans; this is a friction point that is important to solve.

@kustodian

Any updates on the progress of this issue?

@erickjordan

Still slow in Kibana 7.7.

@jimczi

jimczi commented Jun 17, 2020

We have a bug in 7.x that causes the terminate_after option to be ignored on search requests that use a size of 0. That explains, I think, why value suggestions are slower in 7.x (the bug was introduced in 7.0).
Although I agree with the comments made here, a value suggester that needs to hit all shards on every keystroke and retrieve 100k docs per shard will likely be slow on large deployments even with the fix. We should look at a more scalable solution and evaluate the cost of having this feature enabled by default on every deployment.
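
For context, a rough sketch of the shape of request affected here (Dev Tools style; the index pattern, field, prefix, and limits are illustrative, not the exact query Kibana sends). It is a terms aggregation with size: 0, which is exactly the case where an ignored terminate_after hurts:

# Value-suggestion style request: no hits, just a filtered terms aggregation.
POST /logs-*/_search
{
    "size": 0,
    "timeout": "1s",
    "terminate_after": 100000,
    "aggs": {
      "suggestions": {
        "terms": {
          "field": "host.name",
          "include": "web.*",
          "execution_hint": "map",
          "shard_size": 10
        }
      }
    }
}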

@lizozom
Contributor

lizozom commented Sep 28, 2020

@lukasolson made an interesting suggestion: use async search to fetch search results progressively.
We'd give a 1s initial timeout for the results and continue fetching them as long as the user isn't typing something new and hasn't chosen an option.
We also talked about applying some kind of sorting, to make sure "hot" data is queried first.
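
A minimal sketch of that flow against the Elasticsearch async search API (index, field, and timings are illustrative):

# Start the search and wait up to 1s for the first batch of results.
POST /logs-*/_async_search?wait_for_completion_timeout=1s&keep_alive=30s
{
    "size": 0,
    "aggs": {
      "suggestions": {
        "terms": { "field": "host.name", "shard_size": 10 }
      }
    }
}

# While the user is idle and hasn't picked an option, keep polling with the returned id;
# cancel as soon as they type something new.
GET /_async_search/<id>
DELETE /_async_search/<id>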

@jimczi does this make sense?

@lizozom
Contributor

lizozom commented Oct 4, 2020

For testing purposes, I used my large data cluster and replaced the query used to fetch autocomplete suggestions with one that simply gets the latest documents over a 3 year time range.

I used async search, but it takes ~10 seconds until the first result, even for this simple query. I'm getting similar results running this query in Dev Tools.

How can we improve this? Or is this the performance to be expected?

{
    "size": 50,
    "sort": [
      {
        "@timestamp": {
          "order": "desc"
        }
      }
    ],
    "docvalue_fields": [
      "@message.keyword"
    ],
    "_source": false,
    "query": {
      "bool": {
        "filter": []
      }
    }
}

@weltenwort
Member

I could see several ways to optimize the query (see the combined sketch after this list):

  • don't sort
  • use terminate_after
  • use a time range filter
  • disable total hits tracking (or set it to the same as size)
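
A sketch of the query above with those changes applied (the time range and terminate_after value are illustrative):

{
    "size": 50,
    "track_total_hits": false,
    "terminate_after": 100000,
    "docvalue_fields": [
      "@message.keyword"
    ],
    "_source": false,
    "query": {
      "bool": {
        "filter": [
          {
            "range": {
              "@timestamp": {
                "gte": "now-24h",
                "lte": "now"
              }
            }
          }
        ]
      }
    }
}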

@lizozom
Contributor

lizozom commented Oct 14, 2020

@weltenwort I posted this query as part of benchmarking different autocomplete query combinations :)
I definitely tried not sorting, using the time range filter, and disabling total hits tracking.
terminate_after would just yield partial results, correct?

@weltenwort
Member

Yes, AFAIK it would return as soon as the hit count is reached.

@kustodian

kustodian commented Oct 15, 2020

I don't understand why we are discussing how to optimize this query when Kibana 6 just limited the time range and it worked great. Reverting to how autocomplete worked before would fix most of the issues. Later on, we can discuss whether it can be optimized further.

@lizozom
Contributor

lizozom commented Oct 15, 2020

I've benchmarked the performance of our current terms aggregation autocomplete query.
I also tried fetching the latest 50 documents (to potentially combine it with the terms results to speed up the process) and played around with a significant_terms aggregation, with and without a sampler.

I tried out the following configurations:

  • With and without trackTotalHits
  • With and without a timerange applied
  • With and without sorting
  • With various shard_size configurations

The data used for these tests is a data set of ~40 million log documents, generated on a 7.10 staging cloud instance with the default configuration.
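
For reference, a sketch of the sampler + significant_terms variant tested here (the field and shard sizes are illustrative, not the exact benchmark queries):

{
    "size": 0,
    "track_total_hits": false,
    "aggs": {
      "sample": {
        "sampler": {
          "shard_size": 1000
        },
        "aggs": {
          "suggestions": {
            "significant_terms": {
              "field": "host.name"
            }
          }
        }
      }
    }
}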

Results

Times are taken from the took field on the Elasticsearch response in ms.

| Record # | TERMS w/totals, wo/timerange, wo/sort | TERMS w/totals, w/sort | LATEST w/totals, w/sort | TERMS wo/totals, w/sort | LATEST wo/totals, w/sort | TERMS wo/totals, wo/sort | SIG TERMS wo/totals, w/sort | SIG TERMS wo/totals, w/sort, sampler |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 12M | 6175 | 1862 | 275 | 1024 | 428 | 976 | 2458 | 3702 |
| 20M | 6155 | 4688 | 3003 | 3260 | 1312 | 3257 | 6365 | 4048 |
| 40M | 6208 | 7548 | 3465 | 5774 | 350 | 6104 | 9352 | 7023 |

So it's evident from this table that:

  • If a time range is not used, the terms aggregation runs at its maximal possible runtime, but even with the time range, performance does not reach acceptable levels on an average dataset with no other queries running on the cluster.
  • Fetching last X docs is a good way to improve time to initial results
  • We shouldn't fetch totals when fetching autocomplete results
  • Sorting doesn't have a visible impact on the terms aggregation
  • Using a significant terms agg with a sampler didn't seem to make a difference (at least with a basic configuration)

@kustodian

I'm more interested in what the results are when the time range is smaller, like 1 hour or 1 day. Currently, even if the selected time range is 1h, Kibana still queries all the unique terms in the whole cluster, which totally kills the cluster. That's why I'm saying the time range should be implemented first, and other optimizations should be added later.

@lizozom
Contributor

lizozom commented Nov 5, 2020

@lizozom
Contributor

lizozom commented Nov 5, 2020

@kustodian

I guess that Trello board is internal only?

@weltenwort
Member

@kustodian These are just an artifact of a misconfigured integration. This is still the main issue used to track and discuss.

@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Jun 2, 2021
@lukasolson
Member

Fixed by #100174.
