Re-enable filter editor suggestions #13376

lukasolson · 2017-08-07T22:10:22Z

Replaces #13286.
Fixes #12692.

This PR re-enables filter editor suggestions by default and adds parameters to the suggestions request to safeguard against long-running queries.

lukasolson · 2017-08-07T22:10:38Z

/cc @alexfrancoeur

alexfrancoeur · 2017-08-09T15:50:24Z

@lukasolson I see the advanced option for filterEditor:suggestValues is now set to true but I can't seem to find the missing additional parameters to modify the safeguard. Was that part of the PR or are we just providing a default and not allowing it to be configurable?

I have a little over 3 million events generated with makelogs. If I do a unique count of _id it matches the total number of documents. So this filter is working through ~3m strings.

Initial request payload: {"field":"_id"}
Response: ["-0-bx10BmxeOLc-l1n-3","-0-bx10BmxeOLc-l1n63","-0-bx10BmxeOLc-l1oC3","-0-bx10BmxeOLc-l1oG3","-0-bx10BmxeOLc-l1oK3","-0-bx10BmxeOLc-l1oO3","-0-bx10BmxeOLc-l1oS3","-0-bx10BmxeOLc-l1oW3","-0-bx10BmxeOLc-l1oa3","-0-bx10BmxeOLc-l1oe3"]
Network time (mostly TTFB): 613.35ms

I know you discussed using terminate_after but am unsure of what it's set to. Looks like 10 here, am I right? We may need to increase a bit.

Probably unrelated, but I found a few bugs. Let me know if you'd prefer a new issue (my guess is yes, but I figured I'd try 😄 )

Special characters don't seem to be working correctly

No values in filter editor

Incorrect(?) filters in query DSL

Also do we need to surface .raw separately? It'd be nice to automatically provide suggestions regardless. If users don't want to use them, they can type in their own input.

Are we rendering the HTML? Look at headings.raw and how it's presented in the structured filter

alexfrancoeur · 2017-08-09T16:31:58Z

I tested this on cloud with 5.5.0 using a different unique ID but similar number of documents and similar string size. Two requests were made at around 16s each.

Not the best apples-to-apples comparison, but there are certainly noticeable performance improvements.

lukasolson · 2017-08-09T20:27:22Z

> I can't seem to find the missing additional parameters to modify the safeguard. Was that part of the PR or are we just providing a default and not allowing it to be configurable?

I've decided to stick with a simple enable/disable for now. If we get requests coming in to be able to tweak these options, we can look into it further, but I think this is simple yet effective.

> I know you discussed using terminate_after but am unsure of what it's set to. Looks like 10 here, am I right? We may need to increase a bit.

Currently, it is being set to 100000. The number of results coming back in the request is 10 (which is the default). We can increase this if needed, but I think 10 should be fine for now, seeing as how you can type more and get more results.
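For illustration, a request with these settings might look roughly like the following. This is a minimal sketch, not the PR's actual code: the function name `buildSuggestionsBody` and the unescaped `include` pattern are assumptions; only `terminate_after: 100000` and the 10-result default come from the comment above.

```javascript
// Hypothetical helper: builds an Elasticsearch search body for value suggestions.
function buildSuggestionsBody(field, query) {
  return {
    size: 0, // only the aggregation buckets are needed, not the hits
    terminate_after: 100000, // stop collecting after this many docs per shard
    aggs: {
      suggestions: {
        terms: {
          field,
          include: `${query}.*`, // assumed prefix pattern; escaping omitted here
          size: 10 // number of suggestions returned (the terms agg default)
        }
      }
    }
  };
}

const suggestionsBody = buildSuggestionsBody('_id', '-0-bx');
console.log(JSON.stringify(suggestionsBody, null, 2));
```

Note that `terminate_after` limits how many documents are collected, while `size` limits how many buckets come back; the two numbers are independent.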

> Special characters don't seem to be working correctly.

Yeah, this is a bit tricky and I've noticed this before, but I can't find an issue for it. Feel free to open one.

> Also do we need to surface .raw separately? It'd be nice to automatically provide suggestions regardless. If users don't want to use them, they can type in their own input.

We talked about this in the original PR, but decided to punt on it. Can you open an issue for this as well?

> No values in filter editor
> Incorrect(?) filters in query DSL

Can you elaborate more on how to reproduce those issues?

Thanks for the thorough feedback!

Bargs

Current code looks good.

A couple questions about new params:

I was wondering if you'd looked into Trevan's comment here #12692 (comment). It's interesting that the execution_hint hurt performance. We might want to get Adrien's thoughts on that before merging, if you haven't already pinged him about it.

I was also wondering how terminate_after might affect the results on high cardinality fields. So I reduced terminate_after to 1 with my makelogs dataset to simulate a similar ratio of terminate_after to hits that someone with millions of docs would have when using a default of 100,000. When I start typing in the prefix of an existing _id, I get no suggestions:

That's a bit worrisome. If that's how it works, someone with a lot of docs is going to get empty suggestions pretty often. Adrien had mentioned that execution_hint: "map" would tell the regexp to evaluate lazily on the collected terms, so I thought maybe that was the problem. I removed that param, but still ran into the same problem. I think it'll be pretty confusing if a user gets no suggestions when he/she is typing in the start of a value that he/she can plainly see in the doc table.

weltenwort

Yes, it looks as if the regex is executed after the samples from the buckets are fetched. If that was not the case, the performance would probably be pretty terrible. But that also has the effect of not giving expected results when the diversity of values is high. Setting the terminate_after to 1 completely throws off the results though, so I am not sure it is an appropriate simplification of the "simulation".
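The effect discussed here can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not how Elasticsearch actually collects documents: it "collects" the first `terminateAfter` values and only then applies the prefix filter, which is enough to show why a high-cardinality field can yield no suggestions. The function name `suggest` and the sample data are made up for the example.

```javascript
// Toy model only: Elasticsearch's real collection logic is more involved.
// "Collect" at most `terminateAfter` docs, then keep values matching the prefix.
function suggest(docs, prefix, terminateAfter, size = 10) {
  const collected = docs.slice(0, terminateAfter);
  const matches = new Set();
  for (const value of collected) {
    if (value.startsWith(prefix)) matches.add(value);
  }
  return [...matches].slice(0, size);
}

// 1000 unique ids: "id-0" ... "id-999" (high cardinality, like _id).
const sampleDocs = Array.from({ length: 1000 }, (_, i) => `id-${i}`);

const early = suggest(sampleDocs, 'id-99', 10);   // only id-0..id-9 are collected
const full = suggest(sampleDocs, 'id-99', 1000);  // every doc is collected
console.log(early.length, full.length);
```

In this model the early-terminated call finds nothing because none of the collected values share the typed prefix, even though matching values exist further along.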

Have you investigated using timeout instead of terminate_after? It might make for a better user experience (and an easier-to-understand advanced setting) to just time out after something like 1s.
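A rough sketch of the timeout idea, under stated assumptions: the helper names `buildTimeoutBody` and `readSuggestions` are hypothetical, and the response below is a hand-written mock. The `timeout` body parameter and the `timed_out` response flag are standard Elasticsearch search fields; the flag could drive a "suggestions may be incomplete" hint in the UI.

```javascript
// Hypothetical helpers; names are illustrative, not taken from the PR.
function buildTimeoutBody(field, timeout = '1s') {
  return {
    size: 0,
    timeout, // shards return whatever they have gathered once this elapses
    aggs: { suggestions: { terms: { field } } }
  };
}

// Extracts suggestion values and reports whether the search timed out.
function readSuggestions(response) {
  const buckets = response.aggregations.suggestions.buckets;
  return {
    values: buckets.map((bucket) => bucket.key),
    truncated: response.timed_out === true
  };
}

// Hand-written mock of a partial response.
const mockResponse = {
  timed_out: true,
  aggregations: { suggestions: { buckets: [{ key: 'a', doc_count: 3 }] } }
};
const timeoutBody = buildTimeoutBody('_id');
const parsed = readSuggestions(mockResponse);
console.log(timeoutBody.timeout, parsed);
```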

Additionally, I noticed the values are not being escaped before insertion into the regexp. This can confuse users and leads to hidden "bad request" errors when the user enters something that is not produced by the regex grammar.

I am wondering about the reasoning behind the usage of map as an execution_hint. Could someone explain that to me?

In any case displaying a loading indicator (e.g. a grayed out loading... in the dropdown) during the request and a hint that the list of suggestions has been truncated due to performance reasons could be helpful.

lukasolson · 2017-08-22T23:21:20Z

@weltenwort @Bargs Should be ready for review again.

lukasolson · 2017-08-22T23:24:02Z

@trevan Would you mind taking a look at this PR and seeing if it is an acceptable solution to #12692?

weltenwort

It feels a lot more responsive than before and the spinner is a nice touch!

One thing that doesn't quite work as expected is terms containing whitespace. As soon as I enter a character after a space (e.g. "Mozilla/5.0 ("), the suggestions come back empty. Escaping the space character as well seems to fix that for the query_string query.

Another thing is that suggestions for the _index and _type fields do not yield the expected suggestions. The former actually results in an error like [query_shard_exception] Can only use prefix queries on keyword and text fields - not on [_index] which is of type [_index] while the latter just does not return any results.

Since we are displaying suggestions only for aggregatable fields anyway, we could consider using the prefix term-level query, which would make the escaping unnecessary. It would still need special treatment for _id and potentially other meta fields though.
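A minimal sketch of the prefix-query alternative, assuming a hypothetical helper named `buildPrefixBody`. Because the typed text is passed as a plain term prefix rather than parsed as query syntax, no escaping step appears here. Meta fields such as `_index` would still need separate handling, which this sketch does not attempt.

```javascript
// Hypothetical helper: filter docs with a term-level prefix query,
// then aggregate the values of the same field.
function buildPrefixBody(field, query) {
  return {
    size: 0,
    query: {
      prefix: { [field]: query } // the raw user input, no query-string escaping
    },
    aggs: {
      suggestions: { terms: { field } }
    }
  };
}

const prefixBody = buildPrefixBody('agent.raw', 'Mozilla/5.0 (');
console.log(JSON.stringify(prefixBody));
```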

weltenwort · 2017-08-23T09:26:03Z

src/core_plugins/kibana/server/routes/api/suggestions/register_value_suggestions.js

+    .replace(/\*/g, '\\*')
+    .replace(/\?/g, '\\?')
+    .replace(/\:/g, '\\:')
+    .replace(/\//g, '\\/');


How about shortening this a bit into one regexp, e.g.

query.replace(/[\\+\-=&|><!(){}[\]^"~*?:/ ]/g, (match) => `\\${match}`)
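As a quick check of the suggestion above, the single-regexp form can be wrapped in a function and tried on a few inputs. The wrapper name `escapeQueryString` is an assumption made for this sketch; the regexp itself is the one proposed in the comment.

```javascript
// Wraps the proposed one-liner: prefix each reserved character with a backslash.
function escapeQueryString(query) {
  return query.replace(/[\\+\-=&|><!(){}[\]^"~*?:/ ]/g, (match) => `\\${match}`);
}

const escapedAgent = escapeQueryString('Mozilla/5.0 (');
const escapedPair = escapeQueryString('a:b');
const escapedBrackets = escapeQueryString('[x]');
console.log(escapedAgent, escapedPair, escapedBrackets);
```

Since both bracket characters sit in the same character class, this form also avoids the kind of copy-paste slip that chained `.replace` calls invite.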

weltenwort · 2017-08-23T09:27:00Z

src/core_plugins/kibana/server/routes/api/suggestions/register_value_suggestions.js

+    .replace(/\{/g, '\\{')
+    .replace(/\}/g, '\\}')
+    .replace(/\[/g, '\\[')
+    .replace(/\[/g, '\\]')


should be /\]/g

Bargs · 2017-08-23T14:03:35Z

> we could consider using the prefix term-level query

I'd try out match_phrase_prefix as well https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query-phrase-prefix.html

I definitely think we should avoid the query string query unless it's absolutely necessary for some reason.

Bargs · 2017-08-23T15:11:34Z

Just a thought for the future: if we knew when fields had both analyzed and non-analyzed versions, we could always search against the analyzed version and do terms against the non-analyzed. That's a neat advantage of using search instead of the "include" param on the terms agg since searching the analyzed version will return more matches.
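A rough sketch of that idea, assuming the analyzed/non-analyzed pairing is already known. The helper name `buildAnalyzedSearchBody` is hypothetical, and the `headings` / `headings.raw` pair is borrowed from the makelogs field mentioned earlier in this thread purely as an example.

```javascript
// Hypothetical helper: search the analyzed field, aggregate on its keyword twin.
function buildAnalyzedSearchBody(analyzedField, keywordField, query) {
  return {
    size: 0,
    query: {
      // matches documents whose analyzed text contains the typed phrase prefix
      match_phrase_prefix: { [analyzedField]: query }
    },
    aggs: {
      // buckets come from the non-analyzed version so whole values are suggested
      suggestions: { terms: { field: keywordField } }
    }
  };
}

const analyzedBody = buildAnalyzedSearchBody('headings', 'headings.raw', 'moz');
console.log(JSON.stringify(analyzedBody));
```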

Bargs · 2017-08-23T15:30:56Z

src/core_plugins/kibana/server/routes/api/suggestions/register_value_suggestions.js

-      suggestions: { terms }
+      suggestions: {
+        terms: { field }
+      }


Hey @jpountz, do you see any gotchas with how we're using timeout here? We played around with terminate_after as you suggested. It helped a lot, but picking a value that would work for all users felt like a shot in the dark. timeout is nice because it allows us to control the user experience more explicitly (we can say "suggestions should take no longer than a second"). But I wanted to check with you, is there any reason why we shouldn't rely on timeout in this way?

jpountz

I think this is a good idea! For the record, we made timeouts better recently: elastic/elasticsearch#25776.

lukasolson · 2017-08-23T16:58:47Z

@weltenwort @Bargs @jpountz I've made the changes we talked about this morning. Please take another look!

jpountz

LGTM. I just left minor suggestions.

jpountz · 2017-08-23T17:15:58Z

src/core_plugins/kibana/server/routes/api/suggestions/register_value_suggestions.js

+        terms: {
+          field,
+          include: `${getEscapedQuery(query)}.*`,
+          execution_hint: 'map'


Since it's not obvious, I think it's worth adding a comment that the map execution hint helps ensure that the regexp is not evaluated eagerly against the terms dictionary. And then the terminate_after helps keep the number of buckets that need to be tracked at the shard level contained in case this is a high-cardinality field.

Also, since we do not care about the accuracy of the counts, we could set the shard_size to 10 explicitly in order to reduce the amount of information that needs to be transmitted from shards to the coordinating node.
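Put together, the aggregation with the comments requested above might read something like this. It is a sketch under assumptions, not the merged code: `buildTermsAgg` is a hypothetical name and the query is assumed to be escaped already.

```javascript
// Hypothetical helper: builds only the terms aggregation discussed in this review.
function buildTermsAgg(field, escapedQuery) {
  return {
    terms: {
      field,
      // prefix match expressed as a regexp on the bucket keys
      include: `${escapedQuery}.*`,
      // "map" keeps the regexp from being evaluated eagerly
      // against the whole terms dictionary
      execution_hint: 'map',
      // counts need not be accurate, so ask each shard for only 10 buckets
      // to reduce what is sent to the coordinating node
      shard_size: 10
    }
  };
}

const termsAgg = buildTermsAgg('_id', '-0-bx');
console.log(JSON.stringify(termsAgg));
```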

Bargs

LGTM

Pending @weltenwort's lgtm, let's try to get this merged tomorrow so it goes into beta2.

weltenwort

looks good, except that entering a < character without a matching > results in a bad request ([parsing_exception] [terms] failed to parse field [include]). I guess we should escape the characters used for the optional features as well (# @ & < > ~).
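One way to cover those characters is sketched below. The function name `escapeRegexpQuery` is an assumption, and the character list is my reading of the Lucene regexp syntax (the standard operators plus the optional `# @ & < > ~` named above), so it should be checked against the Elasticsearch documentation before being relied on.

```javascript
// Hypothetical helper: escape characters that are special in the regexp
// syntax used by the terms aggregation "include" parameter.
function escapeRegexpQuery(query) {
  return query.replace(/[.?+*|{}[\]()"\\#@&<>~]/g, (match) => `\\${match}`);
}

const escapedTag = escapeRegexpQuery('<html');
const escapedDot = escapeRegexpQuery('a.b');
const includePattern = `${escapeRegexpQuery('x@y')}.*`;
console.log(escapedTag, escapedDot, includePattern);
```

With this in place, an unmatched `<` is sent as a literal character instead of an interval operator.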

weltenwort · 2017-08-24T08:07:24Z

Feels like we should cover the escaping by an api test in case ES adds additional characters that require escaping in the future.

trevan · 2017-08-24T14:44:27Z

@lukasolson, you changed the query since you asked me to test. Do you want me to test with the current query or wait some more?

lukasolson · 2017-08-24T15:41:38Z

@trevan Should be good to take a look now.

weltenwort

LGTM pending passing tests

* Re-enable filter editor suggestions
* Use search instead of include
* Escape query
* Show spinner
* Use include rather than search
* Add additional regex and explanation for parameters
* Add suggestions API test
* Make sure test actually runs
* Use send instead of query
* Fix suggestions API test

lukasolson · 2017-08-25T23:45:29Z

6.x (6.1.0): 1625f64
6.0 (6.0.0): 043477e

Re-enable filter editor suggestions

4f4cb3f

lukasolson self-assigned this Aug 7, 2017

lukasolson mentioned this pull request Aug 7, 2017

Turn filter editor suggestions back on by default #13286

Closed

lukasolson requested review from Bargs, weltenwort and alexfrancoeur August 8, 2017 23:00

lukasolson added :Discovery Feature:Filters v6.0.0 v6.1.0 v7.0.0 review labels Aug 8, 2017

Bargs reviewed Aug 9, 2017

weltenwort requested changes Aug 10, 2017

lukasolson added 5 commits August 18, 2017 16:57

Use search instead of include

6480113

Merge remote-tracking branch 'upstream/master' into filterSuggestions

79a53e7

Merge branch 'master' into filterSuggestions

87bcc39

Escape query

dc06c33

Show spinner

b7203d4

weltenwort requested changes Aug 23, 2017

Bargs reviewed Aug 23, 2017

Use include rather than search

c3cbdcf

jpountz approved these changes Aug 23, 2017

Bargs approved these changes Aug 23, 2017

weltenwort requested changes Aug 24, 2017

lukasolson added 2 commits August 24, 2017 08:29

Merge branch 'master' into filterSuggestions

edc5cd0

Add additional regex and explanation for parameters

5da57b5

lukasolson added 3 commits August 24, 2017 19:57

Add suggestions API test

2371873

Make sure test actually runs

c6ed38c

Use send instead of query

4d4a831

weltenwort approved these changes Aug 25, 2017

Fix suggestions API test

44bd82d

lukasolson merged commit 19ac99a into elastic:master Aug 25, 2017

jimgoodwin added the v6.0.0-rc1 label Sep 28, 2017

lukasolson deleted the filterSuggestions branch March 27, 2018 21:09
