[KQL] Should wildcard queries default to case-insensitive search? #80591

Closed
Tracked by #166068
wylieconlon opened this issue Oct 14, 2020 · 9 comments
Labels
discuss, enhancement, Feature:KQL, Feature:Search, Icebox, impact:low, loe:needs-research, Team:DataDiscovery

Comments

@wylieconlon
Contributor

wylieconlon commented Oct 14, 2020

Starting in 7.10, Elasticsearch supports an option to set case_insensitive: true on the wildcard search query. This works internally by rewriting the searches to regular expressions that match upper and lower case characters.
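
As a rough illustration of the two ideas in this paragraph, the Python sketch below builds a wildcard query body with case_insensitive set, and separately mimics the rewriting idea by expanding each letter into a two-case character class. The helper names are hypothetical, and the regex uses Python's `re` syntax rather than Lucene's, so treat it as an analogy and not as Elasticsearch's actual implementation.

```python
import re


def wildcard_query(field: str, pattern: str, case_insensitive: bool = True) -> dict:
    """Build an Elasticsearch wildcard query body (hypothetical helper)."""
    return {
        "query": {
            "wildcard": {
                field: {"value": pattern, "case_insensitive": case_insensitive}
            }
        }
    }


def wildcard_to_case_insensitive_regex(pattern: str) -> str:
    """Mimic the rewrite: letters become two-case classes, * and ? become regex wildcards."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")  # any run of characters
        elif ch == "?":
            parts.append(".")  # exactly one character
        elif ch.isalpha() and ch.lower() != ch.upper():
            parts.append(f"[{ch.lower()}{ch.upper()}]")  # match either case
        else:
            parts.append(re.escape(ch))  # literal character
    return "".join(parts)


# Usage: the expanded pattern matches regardless of how the value was capitalized.
regex = wildcard_to_case_insensitive_regex("*bron*")
matched = re.fullmatch(regex, "LeBron James") is not None
```

The expansion makes the cost visible: every letter in the pattern doubles the alternatives the matcher has to consider, which is one reason the performance discussion below matters.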

Options for how to expose this

a. Set this flag to be the default in all KQL wildcard searches, without changing the KQL grammar. This has some potential performance issues as described in a comment by @markharwood in the related issue about wildcard fields:

For example - if they support a *foo* style query in the KQL bar and assume, like normal whole-term based queries, that can be run across multiple fields then it may result in slow results or timeouts. Wildcard fields will be fast but hitting other fields which are keyword will involve an expensive linear scan.

b. Only set this flag by default when the user is running wildcard query on wildcard type fields. This would be the most performant option, but it would potentially be confusing to have two different behaviors.

c. Add something to the KQL grammar, like this request to add an UPPER() function to KQL. This could let users enable case insensitive queries as needed. I don't have a proposed grammar.
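
To make the difference between options a and b concrete, here is a minimal sketch of how a query builder might decide whether to set the flag. The function names and the `policy` values are hypothetical and do not reflect Kibana's actual code; option c is left out because no grammar has been proposed.

```python
def use_case_insensitive(field_type: str, policy: str) -> bool:
    """Decide whether to set case_insensitive for a KQL wildcard clause.

    policy "always"        -> option a: every wildcard search
    policy "wildcard_only" -> option b: only fields mapped as `wildcard`
    policy "never"         -> the existing case-sensitive behavior
    """
    if policy == "always":
        return True
    if policy == "wildcard_only":
        return field_type == "wildcard"
    if policy == "never":
        return False
    raise ValueError(f"unknown policy: {policy}")


def build_clause(field: str, field_type: str, pattern: str, policy: str) -> dict:
    """Build one wildcard clause with the flag chosen by the policy."""
    flag = use_case_insensitive(field_type, policy)
    return {"wildcard": {field: {"value": pattern, "case_insensitive": flag}}}
```

Under "wildcard_only" the same pattern behaves differently on a keyword field than on a wildcard field, which is the potential confusion described in option b.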

cc @elastic/kibana-app-arch @markharwood

@markharwood
Contributor

No easy answers here:

option a)
Case sensitivity is not important - until it is. Where would the "off" switch be?
I had some benchmarks for case insensitive vs case sensitive on keyword fields here

option b)
Same point about the lack of off switch.
We have tried to make wildcard field == keyword field in terms of behaviour so in this respect it would not be good.

option c)
Adding a switch to the syntax is hard. If it's any consolation I'm struggling with adding this to regexes in Lucene's query parser too.

I don't think KQL/Lucene Query string should seek to expose all the search options available in Lucene - there are ways to resolve these syntaxes' shortcomings which I will address elsewhere.

I'd root for option A - or maybe make case insensitivity an index-time decision by adding the option for normalizers to the wildcard field (cc @jimczi )

@wylieconlon
Contributor Author

@markharwood thanks for the response. It seems like your main concern is that users should be able to choose when to use case-insensitive search, or at least to disable it. What if, instead of adding it to KQL, we added a wildcard query to the Kibana filters editor? It's a more form-based way of building filters and could have a checkbox asking if users want to do a case-insensitive search.

@markharwood
Contributor

Hard agree that graphical clause editors (aka filter panels) are the way to simplify the complexity in clause options.

Sadly, the challenge for Kibana is that filter pills can only be ANDed together. The only way of orchestrating custom assemblies of AND/OR logic is currently through text-based KQL. This is why I've been advocating graphical Boolean query builders for some time.
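
For illustration, a graphical builder of this kind would ultimately need to emit nested Elasticsearch bool queries. The sketch below shows one way AND and OR groups could be composed, with a per-clause case-insensitivity checkbox mapped to the wildcard flag. The helper names and the `player` and `team` fields are made up for the example.

```python
def wildcard(field: str, pattern: str, case_insensitive: bool = False) -> dict:
    """One wildcard clause; the flag would map to a checkbox in a form-based editor."""
    return {"wildcard": {field: {"value": pattern, "case_insensitive": case_insensitive}}}


def all_of(*clauses: dict) -> dict:
    """AND: every clause must match (what stacked filter pills do)."""
    return {"bool": {"filter": list(clauses)}}


def any_of(*clauses: dict) -> dict:
    """OR: at least one clause must match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


# (player matches *james*, ignoring case) AND (team is lakers OR team is heat)
query = all_of(
    wildcard("player", "*james*", case_insensitive=True),
    any_of(
        {"term": {"team": "lakers"}},
        {"term": {"team": "heat"}},
    ),
)
```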

@AlonaNadler

I have received multiple issues about the fact that Kibana is case sensitive. There are a few use cases where it is important, but for most users it's an obstacle that requires them to know exactly how the value was entered.

I prefer option A

a. Set this flag to be the default in all KQL wildcard searches, without changing the KQL grammar. This has some potential performance issues as described in a comment by @markharwood in the related issue about wildcard fields:

And introduce an advanced setting that allows turning that off

cc: @elastic-jb

@markharwood
Contributor

markharwood commented Oct 22, 2020

And introduce an advanced setting that allows turning that off

If we can't make the off switch an inline per-query-clause we need to consider the other possible scopes:

  1. An option on the user's KQL bar
  2. A specific field in the mapping
  3. A specific index pattern
  4. Kibana-wide
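
If more than one of these scopes carried a setting, something would have to decide which one wins. The sketch below assumes the narrowest scope takes precedence; that ordering, the scope keys, and the default value are all assumptions made for illustration and are not settled anywhere in this thread.

```python
# Narrowest scope first. This precedence is an assumption, not an agreed design.
SCOPES = ("query_bar", "field", "index_pattern", "kibana")


def resolve_case_insensitive(settings: dict, default: bool = True) -> bool:
    """Return the first explicitly configured value, walking from narrow to wide scope."""
    for scope in SCOPES:
        value = settings.get(scope)
        if value is not None:
            return value
    return default


# A field-level opt-out overrides a Kibana-wide opt-in.
effective = resolve_case_insensitive({"kibana": True, "field": False})
```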

While case sensitivity is not typically important for text it can matter for machine-facing content like

  • Base64 encoded values
  • Passwords
  • Unix-based file names
  • Cookies
  • Variable names in code or command line switches

We may want to be careful about simply introducing a Kibana-wide switch.
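
The Base64 case is easy to demonstrate. In this small sketch (the sample value is arbitrary), two strings that a case-insensitive comparison would treat as equal decode to different bytes:

```python
import base64


def encode(raw: bytes) -> str:
    """Base64-encode bytes into an ASCII string."""
    return base64.b64encode(raw).decode("ascii")


original = encode(b"Kibana")
lowered = original.lower()  # what a case-folding match would treat as equivalent

# Case-insensitive comparison says these are the same value...
same_when_folded = original.lower() == lowered.lower()

# ...but they decode to different payloads.
decoded_original = base64.b64decode(original)
decoded_lowered = base64.b64decode(lowered)
payloads_differ = decoded_original != decoded_lowered
```

A case-insensitive wildcard search over such a field could therefore return documents whose underlying values are unrelated.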

@elastic-jb

I am for option A also. I think the scope question is interesting. Could it be a per-space setting instead of Kibana-wide? I was hit by this quite a few times recently while working on a demo using NBA data. LeBron James' capital B kept tripping me up. Even after I was aware of it, I still sometimes didn't capitalize the B here and there, and the results were empty. More inclusive feels like a better default.

@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Jun 2, 2021
@exalate-issue-sync exalate-issue-sync bot removed the loe:small Small Level of Effort label May 17, 2022
@petrklapka petrklapka added Feature:Search Querying infrastructure in Kibana Team:DataDiscovery Discover, search (e.g. data plugin and KQL), data views, saved searches. For ES|QL, use Team:ES|QL. and removed Team:AppServicesSv labels Nov 23, 2022
@elasticmachine
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

@GeorgeGkinis

GeorgeGkinis commented Feb 7, 2023

Hello!

Is this issue related to the fact that I can do this in the Dev Console:

GET logs-my-index-grokking/_search
{
  "query": {
    "wildcard": {
      "message": {
        "value":  "*certs chain from Key*"
      }
    }
  }
}

but not in Kibana in the KQL bar:
message: "*certs chain from Key*"?

The message field is mapped as a wildcard type.

@davismcphee davismcphee added the loe:needs-research This issue requires some research before it can be worked on or estimated label Sep 8, 2023
@kertal kertal added the Icebox label Oct 1, 2024
@kertal
Member

kertal commented Oct 1, 2024

Closing this because it's not planned to be resolved in the foreseeable future. It will be tracked in our Icebox and will be re-opened if our priorities change. Feel free to re-open if you think it should be melted sooner.

@kertal kertal closed this as not planned Oct 1, 2024

9 participants