New Kibana Query Language #12282
Comments
My 2 cents: I'd rather it not be "functional syntax" but more "natural language". The problem with "functional syntax" is that it looks too foreign to non-programmers/engineers, so business users will be frightened away from it. The timelion syntax is an example of a "functional syntax" that several of my users dislike because it looks too "weird"/"complicated".
@trevan it is a good point that we should make sure to make the language as accessible as possible to non-technical users. The main problem when getting close to "natural language" often is the ambiguity of the grammar. But it should be possible to get quite far as long as we stick to something like
That looks like it should still be generatable by a context-free grammar.
I agree, it's worth playing around with an even more natural syntax. One thing I don't like about languages that try to go too far in that direction is that they become more difficult to understand at a glance. Without consistent separators like
@trevan, do the users who dislike timelion's syntax also dislike the lucene query syntax?
@Bargs, is the lucene query syntax https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax or https://lucene.apache.org/core/2_9_4/queryparsersyntax.html? I guess since we have an _all field, most just query for the values that they want and use AND/OR and grouping. Something like "(404 405) AND homepage". That syntax is pretty "natural" or at least Google and other search engines have instructed them in that manner. I think I've only heard disagreements about the range (field:[x TO y]) and gt/lt syntax (field:x). When I do point out that "exists:field" is possible or "field:value" is possible, they seem to grasp that fairly quickly.
I agree that most people are comfortable with the Google syntax. In fact, that's the phrase I most use to help my users: "It's not Splunk. Think of it like the Google search bar." That works very well. I've also been trying to get my users to use +/- instead of AND/OR, with some success.
How about field comparisons: (flow.bytes >= (flow.duration * 100))? (Even if we build natural forms, please keep the arithmetic forms for readability.)
Another way to look at this is that if non-technical users need to fall back to writing a text query in Kibana, then we have a deficiency in our search UI. I agree that the timelion syntax is challenging for non-technical users, but I also think it is highly effective for technical users. If Timelion had a UI to build expressions in addition to the ability to fall back to the query language, then both sets of users might be nearly completely satisfied. Personally, I'd prefer a powerful, unambiguous query language for raw input alongside a higher level UI for constructing and representing queries.
Here's a list of syntaxes used by other software (for comparison purposes): Query String, Simple Query String, Google Inbox, Google Search, GitHub, Slack.
We talked about this on Zoom a bit today. For the first iteration, I'm going to work towards the syntax I outlined in the issue description, along with some shorthand aliases for very common queries, like While even more natural looking queries seem appealing in theory, we think we'd run into a few issues:
Note that none of this is set in stone though. Like I said in the issue description, this will be an experimental feature that will change over time. This is just the direction I'll head in first. Also important to note, I've been developing #11915 in such a way that we should be able to add support for new languages via plugins in the future. If we decide to stay away from natural language queries but someone else really wants them, they could develop their own plugin for it.
@Bargs In line with the forthcoming SQL interface in Elasticsearch (and the general trend in big data), I believe it would make a lot of sense if the basic syntax were compatible with the SQL WHERE clause. SQL-like functions could be used for the extras. my2c. :)
Part of #10789. Motivations and overall goals are described in that ticket. This ticket is only for implementation of the language itself, additional enhancements like autocomplete will be separate.
This new query language will be merged as an experimental feature. It'll likely evolve over future iterations, so these are just our initial ideas.
The new language should have certain characteristics:
With that said, what should the language look like?
Here's my current thinking:
The new filter editor introduced a nice way to build queries in plain English.
I think we could mimic this in the query language with a more functional syntax. The general pattern would be `<function>(<params>)`.
Support for named parameters:
Provides a better way to support advanced options:
It's easy to read and understand, it follows a consistent pattern, and allows for an infinite number of query types. How might we support a geo bounding box query?
If we decide after testing this is too verbose for the simplest cases, we could introduce shorthand aliases for some of the most common queries.
`:` could be an alias for `is` so that you can still do `response:200`, for example.
Query Syntax
This is in development, but I'll try to keep this up to date as I flesh out the language.
Queries are represented as functions. Many functions take a field name as their first argument. Extremely common functions have shorthand notations.
`is("response", 200)` will match documents where the `response` field matches the value `200`. `response:200` does the same thing. `:` is essentially an alias for the `is` function.
Multiple search terms are separated by whitespace. `response:200 extension:php` will match documents where `response` matches `200` and `extension` matches `php`.
All terms must match by default. The language supports boolean logic with `and`/`or` operators. The above query is equivalent to `response:200 and extension:php`.
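As a rough illustration of the shorthand described above, the following Python sketch expands whitespace-separated `field:value` terms into the long-hand functional form. The tuple-based tree, the `desugar` and `render` helpers, and the rule that all-digit values become integers are assumptions made for this example; they are not taken from the actual Kibana implementation.

```python
import re

# Hypothetical tree shape for this sketch:
#   ("is", field, value)  and  ("and", [children])
TERM = re.compile(r"^(?P<field>[\w.]+):(?P<value>\S+)$")


def parse_value(text):
    """Treat all-digit values as integers, everything else as a string."""
    return int(text) if text.isdigit() else text


def desugar(query):
    """Expand whitespace-separated `field:value` terms into explicit calls."""
    terms = []
    for token in query.split():
        if token == "and":
            # An explicit `and` adds nothing over the default behaviour.
            continue
        match = TERM.match(token)
        if match is None:
            raise ValueError(f"unsupported token in this sketch: {token!r}")
        terms.append(("is", match["field"], parse_value(match["value"])))
    return terms[0] if len(terms) == 1 else ("and", terms)


def render(node):
    """Write the tree back out in the long-hand functional syntax."""
    if node[0] == "is":
        _, field, value = node
        shown = value if isinstance(value, int) else f'"{value}"'
        return f'is("{field}", {shown})'
    return "and(" + ", ".join(render(child) for child in node[1]) + ")"


print(render(desugar("response:200 extension:php")))
```

In this sketch, `response:200 extension:php` and `response:200 and extension:php` produce the same tree, which mirrors the "all terms must match by default" rule.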
We can make terms optional by using `or`. `response:200 or extension:php` will match documents where `response` matches `200`, `extension` matches `php`, or both.
By default, `and` has a higher precedence than `or`. `response:200 and extension:php or extension:css` will match documents where `response` is `200` and `extension` is `php` OR documents where `extension` is `css` and `response` is anything.
We can override the default precedence with grouping. `response:200 and (extension:php or extension:css)` will match documents where `response` is `200` and `extension` is either `php` or `css`.
Terms can be inverted by prefixing them with `!`. `!response:200` will match all documents where `response` is not `200`.
Entire groups can also be inverted. `response:200 and !(extension:php or extension:css)`
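The precedence, grouping, and negation rules above can be sketched with a small recursive-descent parser. This is only an illustration in Python, not the parser Kibana uses: the `parse` function, the `both`/`either` helpers, the choice to evaluate against plain dictionaries, and the string comparison of values are all assumptions made for the example.

```python
import re

# Parentheses and `!` are single-character tokens; anything else is a word.
TOKEN = re.compile(r"[()!]|[^\s()!]+")


def both(left, right):
    return lambda doc: left(doc) and right(doc)


def either(left, right):
    return lambda doc: left(doc) or right(doc)


def parse(query):
    """Turn a query string into a predicate over a document (a dict)."""
    tokens = TOKEN.findall(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        pos += 1
        return tokens[pos - 1]

    def or_expr():
        # Lowest precedence: `or` joins whole `and` groups.
        node = and_expr()
        while peek() == "or":
            take()
            node = either(node, and_expr())
        return node

    def and_expr():
        # Terms are joined by an explicit `and` or simply by whitespace.
        node = unary()
        while peek() not in (None, "or", ")"):
            if peek() == "and":
                take()
            node = both(node, unary())
        return node

    def unary():
        token = take()
        if token == "!":
            inner = unary()
            return lambda doc: not inner(doc)
        if token == "(":
            inner = or_expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return inner
        field, sep, value = token.partition(":")
        if not sep:
            raise ValueError(f"expected field:value, got {token!r}")
        # Assumption: compare as strings so 200 and "200" are treated alike.
        return lambda doc: field in doc and str(doc[field]) == value

    predicate = or_expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return predicate


matches = parse("response:200 and !(extension:php or extension:css)")
print(matches({"response": 200, "extension": "jpg"}))
```

Because `or_expr` calls `and_expr` for each operand, `and` binds more tightly than `or`, and parentheses restart parsing at the lowest-precedence level.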
Some query functions have named arguments. `range("bytes", gt=1000, lt=8000)` will match documents where the `bytes` field is greater than 1000 and less than 8000.
Notes: Terms without fields will be matched against all fields. For example, a query for `200` will search for the value `200` across all fields in your index.
Function Reference
Function name: `and`
Purpose: Match all given sub-queries
Alias: `and` as a binary operator
Example: `and(response:200, extension:php)` or `response:200 and extension:php`
Function name: `or`
Purpose: Match one or more sub-queries
Alias: `or` as a binary operator
Example: `or(extension:css, extension:php)` or `extension:css or extension:php`
Function name: `not`
Purpose: Negates a sub-query
Alias: `!` as a prefix operator
Example: `not(response:200)` or `!response:200`
Function name: `is`
Purpose: Matches a field with a given term
Alias: `:`
Example: `is("response", 200)` or `response:200`
Function name: `range`
Purpose: Match a field against a range of values.
Alias: `:[]`
Example: `range("bytes", gt=1000, lt=8000)` or `bytes:[1000 to 8000]`
Named arguments:
`gt` - greater than
`gte` - greater than or equal to
`lt` - less than
`lte` - less than or equal to
Function name: `exists`
Purpose: Match documents where a given field exists
Example: `exists("response")`
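To make the named arguments of `range` concrete, here is a minimal Python sketch that models them as keyword arguments and evaluates them against a document held in a plain dictionary. The names `range_query` and `exists`, the `BOUNDS` table, and the decision to treat a missing field as a non-match are assumptions for illustration only.

```python
import operator

# One comparison per named argument listed in the reference above.
BOUNDS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def range_query(field, **bounds):
    """Build a predicate that checks `field` against every given bound."""
    unknown = set(bounds) - set(BOUNDS)
    if unknown:
        raise TypeError(f"unknown named arguments: {sorted(unknown)}")

    def matches(doc):
        # Assumption: a document without the field never matches a range.
        if field not in doc:
            return False
        return all(BOUNDS[name](doc[field], limit) for name, limit in bounds.items())

    return matches


def exists(field):
    """Build a predicate that checks whether `field` has a value."""
    return lambda doc: doc.get(field) is not None


in_range = range_query("bytes", gt=1000, lt=8000)
print(in_range({"bytes": 5000}), in_range({"bytes": 1000}))
```

With `gt` and `lt` both bounds are exclusive, so a document with `bytes` equal to 1000 does not match; swapping in `gte` would include it.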
Function name: `geoBoundingBox`
Purpose: Creates a geo_bounding_box query
Example: `geoBoundingBox("coordinates", topLeft="40.73, -74.1", bottomRight="40.01, -71.12")`
Named arguments:
`topLeft` - the top left corner of the bounding box as a "lat, lon" string
`bottomRight` - the bottom right corner of the bounding box as a "lat, lon" string
Function name: `geoPolygon`
Purpose: Creates a geo_polygon query given 3 or more points as "lat, lon"
Example: `geoPolygon("geo.coordinates", "40.97, -127.26", "24.20, -84.375", "40.44, -66.09")`
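Since the two geo functions are described as creating `geo_bounding_box` and `geo_polygon` queries, one plausible translation into Elasticsearch query DSL bodies is sketched below in Python. The helper names (`parse_point`, `geo_bounding_box`, `geo_polygon`) and the choice to convert each "lat, lon" string into a `{"lat": ..., "lon": ...}` object are assumptions for this example; the issue does not specify how the translation is actually done.

```python
import json


def parse_point(text):
    """Convert a "lat, lon" string into a lat/lon object."""
    lat, lon = (float(part) for part in text.split(","))
    return {"lat": lat, "lon": lon}


def geo_bounding_box(field, top_left, bottom_right):
    """Build a geo_bounding_box query body for `field`."""
    return {
        "geo_bounding_box": {
            field: {
                "top_left": parse_point(top_left),
                "bottom_right": parse_point(bottom_right),
            }
        }
    }


def geo_polygon(field, *points):
    """Build a geo_polygon query body; a polygon needs at least 3 points."""
    if len(points) < 3:
        raise ValueError("geo_polygon needs 3 or more points")
    return {"geo_polygon": {field: {"points": [parse_point(p) for p in points]}}}


query = geo_bounding_box(
    "coordinates", top_left="40.73, -74.1", bottom_right="40.01, -71.12"
)
print(json.dumps(query, indent=2))
```

Parsing the strings up front means a malformed coordinate fails in the query language layer rather than surfacing later as an Elasticsearch error.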