EQL: Regex character classes are not supported #55231

Closed
jrodewig opened this issue Apr 15, 2020 · 5 comments
Labels
:Analytics/EQL EQL querying :Search/Search Search-related issues that do not fall into other categories Team:QL (Deprecated) Meta label for query languages team Team:Search Meta label for search team

Comments

@jrodewig

jrodewig commented Apr 15, 2020

In the Python implementation, the EQL match function supports shorthand character classes for regular expressions (e.g., [\s], [\w], [\d]).

In the Elasticsearch implementation, the match function is converted into a SQL RLIKE function. The RLIKE function is then converted into a regexp query that uses Lucene's regular expression engine.

However, Lucene's regular expression syntax does not support these character classes.
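As an illustration of the gap described above, here is a minimal Python sketch of one possible workaround: rewriting the shorthand classes as explicit character classes before the pattern reaches an engine that lacks them. It is not part of EQL or Elasticsearch. The helper names `expand_shorthand` and `regexp_query` and the field name `process.name` are hypothetical, the ASCII-only expansions are an assumption (Python's own `\d`, `\w`, and `\s` also match non-ASCII characters by default), and negated classes such as `\D` are not handled. The only real API shape used is the `regexp` query body.

```python
import re

# Explicit ASCII equivalents of the shorthand classes. This mapping is an
# assumption made for illustration, not something defined by EQL or Lucene.
SHORTHAND = {
    "d": "0-9",
    "w": "a-zA-Z0-9_",
    "s": " \t\n\r\f\v",  # literal whitespace characters, not regex escapes
}


def expand_shorthand(pattern: str) -> str:
    """Rewrite \\d, \\w, and \\s as explicit classes, inside or outside [...]."""
    out = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in SHORTHAND:
                body = SHORTHAND[nxt]
                # Inside brackets, splice the members; outside, wrap them.
                out.append(body if in_class else f"[{body}]")
            else:
                out.append(ch + nxt)  # leave every other escape untouched
            i += 2
            continue
        if ch == "[" and not in_class:
            in_class = True
        elif ch == "]" and in_class:
            in_class = False
        out.append(ch)
        i += 1
    return "".join(out)


def regexp_query(field: str, pattern: str) -> dict:
    """Build a regexp query body using the expanded pattern (hypothetical helper)."""
    return {"query": {"regexp": {field: {"value": expand_shorthand(pattern)}}}}


if __name__ == "__main__":
    print(expand_shorthand(r"[\d]+"))
    print(regexp_query("process.name", r"\w+\.exe"))
    # Sanity check in Python's own engine: the expanded form still matches.
    print(bool(re.fullmatch(expand_shorthand(r"[\s]\w+"), " abc")))
```

The sketch only checks the expanded patterns against Python's `re` module. Whether a given expansion behaves identically in Lucene's engine would still need to be verified there.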

Next steps

We should discuss whether it is reasonable to add support for shorthand character classes in Elasticsearch.

At the least, we should update the documentation for the EQL match function and the SQL RLIKE function to note that character classes are not supported and point users to our regexp syntax documentation.

@jrodewig jrodewig added >docs General docs changes :Search/Search Search-related issues that do not fall into other categories team-discuss :Analytics/EQL EQL querying labels Apr 15, 2020
@elasticmachine

Pinging @elastic/es-docs (>docs)

@elasticmachine

Pinging @elastic/es-search (:Search/Search)

@elasticmachine

Pinging @elastic/es-ql (:Query Languages/EQL)

@colings86 colings86 removed the >docs General docs changes label Apr 15, 2020
@jrodewig jrodewig changed the title from "EQL: Regexp character classes are not supported" to "EQL: Regex character classes are not supported" Apr 15, 2020
@markharwood

I opened a Lucene issue to discuss.

jrodewig added a commit that referenced this issue Apr 28, 2020
The `RLIKE` function docs point users to [Java’s Pattern class doc][0]
for regular expression syntax. However, these docs include shorthand
character classes, such as `[\d]`, `[\s]`, and `[\w]`. These character
classes are not supported in Elasticsearch, which may confuse users.

This updates the SQL `RLIKE` docs to refer to the ES [regular expression
syntax docs][1], which document only supported syntax.

[0]: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html
[1]: https://www.elastic.co/guide/en/elasticsearch/reference/master/regexp-syntax.html

Relates to #55231
@rjernst rjernst added Team:QL (Deprecated) Meta label for query languages team Team:Search Meta label for search team labels May 4, 2020
@jrodewig

jrodewig commented Jun 4, 2021

The ES EQL docs for the regex keyword point to our docs for Lucene's regular expression syntax. Now that LUCENE-9336 is resolved, we'll support the shorthand classes when ES updates to Lucene 9.0.

@jrodewig jrodewig closed this as completed Jun 4, 2021