[KQL] Support fuzzy queries #54343
Comments
Pinging @elastic/kibana-app (Team:KibanaApp) |
I think this is the first I've heard a request for fuzzy search in KQL. Has it come up for you often? We intentionally left stuff like this out of the first iteration to keep KQL very simple. I think a lot of magic syntax makes the query language hard to understand. If it's a niche use case we could also consider making it possible in a filter instead of in KQL. I think it's easier to support more complex querying like this in filters since it's possible to put helpful descriptions in the filter editor UI, for both the person creating the filter and anyone looking at it later on. |
I agree with you. Haven't heard it coming up often yet. @tamros since you brought this up, what is your feeling of how often people are actually looking for this, or do you think it's coming up in trainings more often, because we kind of hint people towards it? |
Students ask about this while we teach the Kibana Data Analyst course. You could be right that we hint people towards it: once we teach Lucene fuzziness, they want to know how to do the same in KQL, since they really like the contextual search. As for use cases, it is useful for security, where you can match variations of a website. |
Pinging @elastic/kibana-app-arch (Team:AppArch) |
This would be very helpful for many infosec use cases. Attackers will do typo squatting or rename files to confuse defenders. Typo squatting use case: The following makes a dead easy detect: File name confusion: Fuzzy search to the rescue: I get that this might be a tricky request to implement, but there are many who would benefit from this. |
I call that the "Like this but not this" querying pattern and it's very useful. You can also use it to find mis-classified content. One example was examining police intel reports searching for those tagged with one of their "customers" - entity:31567. I used significant_text aggregation on text of reports to discover the name of the cafe where he sold heroin, the girlfriend's name etc. These discriminating words associated with 31567 are then ORed and searched for - but NOT-ing entity:31567. Results are a relevance-ranked list of reports that should have been tagged with 31567 but weren't. Shame we don't support significant_text and therefore this technique. |
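The "like this but not this" pattern described above can be sketched as an Elasticsearch bool query: OR the discriminating terms together and exclude documents that already carry the tag. This is a minimal illustration, not the commenter's actual code; the helper name `like_but_not_query`, the field names `text` and `entity`, and the placeholder terms are all assumptions.

```python
from typing import Any, Dict, List


def like_but_not_query(
    terms: List[str], text_field: str, tag_field: str, tag_value: str
) -> Dict[str, Any]:
    """Build a bool query body: match any of `terms`, but NOT the existing tag.

    Hypothetical helper; field names are supplied by the caller.
    """
    return {
        "query": {
            "bool": {
                # OR together the discriminating terms (at least one must match).
                "should": [{"match": {text_field: term}} for term in terms],
                "minimum_should_match": 1,
                # Exclude documents that are already tagged with the entity.
                "must_not": [{"term": {tag_field: tag_value}}],
            }
        }
    }


# Placeholder terms standing in for the output of a significant_text aggregation.
body = like_but_not_query(["cafe_name", "girlfriend_name"], "text", "entity", "31567")
```

The resulting body would be sent to the search endpoint; hits are relevance-ranked, so reports matching more of the terms appear first.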
I think this is useful for quick POCs. Although the final implementation may not require it, I have had to write code multiple times just to test my concepts. |
Would be very useful for our "non-scripting-abled" users. They have to look up people's names a lot, and many names are written in different ways. Fuzzy searching to the rescue, but then techies have to jump in to write the Lucene scripts for them. |
Pinging @elastic/kibana-data-discovery (Team:DataDiscovery) |
Closing this because it's not planned to be resolved in the foreseeable future. It will be tracked in our Icebox and will be re-opened if our priorities change. Feel free to re-open if you think it should be melted sooner. |
Currently, the Elasticsearch query string query supports fuzzy queries, which allows searching for terms similar to a search term.
For example,
quikc~
would search for terms similar to "quikc" (such as "quick"). The query string syntax also supports edit distances (see https://www.elastic.co/guide/en/elasticsearch/reference/8.0/query-dsl-fuzzy-query.html for more details).
This issue is a placeholder for adding support directly in KQL for fuzzy queries. The syntax may not remain the same, but the concept & functionality would.
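To make "edit distance" concrete, here is a rough sketch in Python. It computes an edit distance in which a swap of two adjacent characters counts as one change (the optimal string alignment variant of Damerau-Levenshtein), and builds a `query_string` request body using the `term~N` syntax. The function names are hypothetical, and the distance function is only an illustration of the concept, not Elasticsearch's or Lucene's actual implementation.

```python
from typing import Any, Dict, Optional


def osa_distance(a: str, b: str) -> int:
    """Edit distance counting insert, delete, substitute, and adjacent swap as 1 each."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def fuzzy_query_body(term: str, max_edits: Optional[int] = None) -> Dict[str, Any]:
    """Build a query_string body for `term~` or `term~N` (term is not escaped here)."""
    suffix = "~" if max_edits is None else f"~{max_edits}"
    return {"query": {"query_string": {"query": f"{term}{suffix}"}}}


distance = osa_distance("quikc", "quick")  # "kc" -> "ck" is a single adjacent swap
body = fuzzy_query_body("quikc", 1)
```

Under this measure "quikc" is one edit away from "quick", so a query such as `quikc~1` would be expected to match it.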
(Original description)