ESQL: Functions! #98545

Open
64 of 75 tasks
nik9000 opened this issue Aug 16, 2023 · 11 comments
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@nik9000
Member

nik9000 commented Aug 16, 2023

Description


This is a list of "shovel ready" functions: ones we are fairly sure we want and should be able to start working on now. The list is deliberately unsorted, partly because building any one of these functions shouldn't be a huge effort, so the cost of maintaining a sorted list is comparatively high.

This list is not sacred. If you need a function to do something, stick it on the list. Maybe even build it yourself, it's fun!

NULL

IP

  • AUTO_BUCKET for IPs
  • MASK

Math

String

Date

  • DATE_ADD/DATEADD/TIMESTAMP_ADD/TIMESTAMPADD (surrogate: binary operators with time periods)
  • DATE_DIFF/DATEDIFF/TIMESTAMP_DIFF/TIMESTAMPDIFF
  • DATE_PART/DATEPART (for now folks can use EXTRACT)
  • DAY_OF_MONTH/DOM/DAY (for now folks can use EXTRACT)
  • DAY_OF_WEEK/DAYOFWEEK/DOW (for now folks can use EXTRACT)
  • DAY_OF_YEAR/DOY (for now folks can use EXTRACT)
  • DAY_NAME/DAYNAME
  • EXTRACT (DATE_EXTRACT)
  • HOUR_OF_DAY/HOUR (for now folks can use EXTRACT)
  • ISO_DAY_OF_WEEK/ISODAYOFWEEK/ISODOW/IDOW (for now folks can use EXTRACT)
  • ISO_WEEK_OF_YEAR/ISOWEEKOFYEAR/ISOWEEK/IWOY/IW (for now folks can use EXTRACT)
  • MINUTE_OF_DAY (for now folks can use EXTRACT)
  • MINUTE_OF_HOUR/MINUTE (for now folks can use EXTRACT)
  • MONTH_OF_YEAR/MONTH (for now folks can use EXTRACT)
  • MONTH_NAME/MONTHNAME
  • SECOND_OF_MINUTE/SECOND (for now folks can use EXTRACT)
  • QUARTER (for now folks can use EXTRACT)
  • WEEK_OF_YEAR/WEEK (for now folks can use EXTRACT)
  • YEAR (for now folks can use EXTRACT)
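
To illustrate why a single EXTRACT-style function can stand in for most of the entries above, here is a rough Java sketch. It assumes the part is named like a `java.time.temporal.ChronoField` constant; the class `DateExtractSketch` and its `extract` helper are hypothetical names made up for this example, not the ES|QL implementation. Parts that are not `ChronoField` constants (QUARTER, the ISO week fields) are left out of the sketch.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoField;
import java.util.Locale;

final class DateExtractSketch {
    // Hypothetical helper: resolves the part by its ChronoField name and reads it from the date.
    // Throws IllegalArgumentException if the name is not a ChronoField constant.
    static long extract(String part, ZonedDateTime dateTime) {
        ChronoField field = ChronoField.valueOf(part.toUpperCase(Locale.ROOT));
        return dateTime.getLong(field);
    }

    public static void main(String[] args) {
        // Arbitrary sample timestamp chosen for the example.
        ZonedDateTime sample = ZonedDateTime.of(2023, 8, 16, 14, 30, 45, 0, ZoneOffset.UTC);
        String[] parts = {
            "DAY_OF_MONTH", "DAY_OF_WEEK", "DAY_OF_YEAR", "HOUR_OF_DAY",
            "MINUTE_OF_DAY", "MINUTE_OF_HOUR", "MONTH_OF_YEAR", "SECOND_OF_MINUTE", "YEAR"
        };
        for (String part : parts) {
            System.out.println(part + " = " + extract(part, sample));
        }
    }
}
```

Each dedicated function in the list would then amount to a fixed first argument, which is why treating them as aliases is cheap.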

Encode/Decode

Secure Hashing

  • SHA_0, SHA_1, SHA_256, SHA_512, SHAKE_128, SHAKE_256 (or SHA(type, string) and SHAKE(type, string) - we can offer both styles, so the implementation type can either be parameterized or be locked into the query).
  • MD5
  • Generic SECURE_HASH(), relying on the underlying MessageDigest.getInstance
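
A minimal Java sketch of the two styles mentioned above: a generic function that passes the algorithm name through to `MessageDigest.getInstance`, and a fixed-name wrapper that locks the algorithm in. The class `SecureHashSketch`, the method names, and the choice of lowercase hex over UTF-8 bytes are assumptions made for this example; which algorithm names are accepted depends on the security providers of the running JVM.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class SecureHashSketch {
    // Generic style: the algorithm is a parameter, e.g. "MD5", "SHA-1", "SHA-256".
    // Assumption for this sketch: hash the UTF-8 bytes and return lowercase hex.
    static String secureHash(String algorithm, String input) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unsupported algorithm: " + algorithm, e);
        }
        byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            hex.append(Character.forDigit((b >> 4) & 0xF, 16));
            hex.append(Character.forDigit(b & 0xF, 16));
        }
        return hex.toString();
    }

    // Locked-in style: the algorithm is part of the function name.
    static String md5(String input) {
        return secureHash("MD5", input);
    }

    static String sha256(String input) {
        return secureHash("SHA-256", input);
    }

    public static void main(String[] args) {
        System.out.println(md5("abc"));
        System.out.println(sha256("abc"));
    }
}
```

With this shape the named variants are one-line wrappers, so offering both styles costs little.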

Aggregate (STATS ... BY)

@nik9000 nik9000 added >enhancement needs:triage Requires assignment of a team area label labels Aug 16, 2023
@not-napoleon not-napoleon added :Analytics/ES|QL AKA ESQL and removed needs:triage Requires assignment of a team area label labels Aug 16, 2023
@elasticsearchmachine elasticsearchmachine added the Team:QL (Deprecated) Meta label for query languages team label Aug 16, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@nik9000
Member Author

nik9000 commented Aug 19, 2023

I wrote up a guide to making new functions about six months ago that's gone stale. I'll try and build another guide for it on Monday.

@dreamquster
Contributor

dreamquster commented Aug 21, 2023

@nik9000 I have written part of the logic for the 'left' function. Now I wonder how I should write the documentation for it.
Also, should the parameter named 'length' be optional or mandatory?
This is my pull request: #98660

@nik9000
Member Author

nik9000 commented Aug 21, 2023

Here are some instructions for adding functions.

elasticsearchmachine pushed a commit that referenced this issue Aug 28, 2023
@nik9000 I checked out the main branch again and refactored the 'left'
function to cut the prefix string in place. But I ran into a problem: 'left'
fails the test case 'testEvaluateInManyThreads'. In the multi-threaded
case, in `EvalOperator.ExpressionEvaluator eval =
evalSupplier.get(); for (int c = 0; c < count; c++) {
assertThat(toJavaObject(eval.eval(page), 0), testCase.getMatcher()); }`,
the toJavaObject function returns a BytesRef with length=2 and content
[81,89]. However, the assertThat function in junit4 receives a BytesRef
whose length is 10. Can you give me some clues? I can't
find which variable is shared.

Command to rerun the failed test case: `gradlew ':x-pack:plugin:esql:test'
--tests
"org.elasticsearch.xpack.esql.expression.function.scalar.string.LeftTests.testEvaluateInManyThreads
{TestCase=Left basic test}" -Dtests.seed=44459C172243712
-Dtests.locale=lv-LV -Dtests.timezone=Asia/Irkutsk -Druntime.java=20`
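
For readers puzzling over the same symptom, here is a self-contained Java sketch of one way a returned value can change between the moment it is produced and the moment it is checked: the function hands back a reused, mutable "scratch" object, and a later call overwrites it. This is only an illustration of the general hazard, not a diagnosis of the failure above. `Slice`, `ReusingLeft`, and `leftCopy` are hypothetical names invented for the example; `Slice` merely imitates the bytes/offset/length shape of a byte-slice reference.

```java
final class SharedScratchSketch {
    // Hypothetical stand-in for a mutable byte-slice reference (bytes + offset + length).
    static final class Slice {
        byte[] bytes;
        int offset;
        int length;
    }

    // Hazardous variant: every call writes into, and returns, the same Slice instance.
    static final class ReusingLeft {
        private final Slice scratch = new Slice();

        Slice left(byte[] input, int n) {
            scratch.bytes = input;
            scratch.offset = 0;
            scratch.length = Math.min(n, input.length);
            return scratch; // callers all hold the same object
        }
    }

    // Safer variant: each call returns a fresh Slice, so earlier results cannot change.
    static Slice leftCopy(byte[] input, int n) {
        Slice result = new Slice();
        result.bytes = input;
        result.offset = 0;
        result.length = Math.min(n, input.length);
        return result;
    }

    public static void main(String[] args) {
        byte[] tenBytes = new byte[10];
        ReusingLeft reusing = new ReusingLeft();

        Slice first = reusing.left(tenBytes, 2);
        System.out.println("after first call:  " + first.length);
        reusing.left(tenBytes, 10); // a second evaluation, e.g. from another thread
        System.out.println("after second call: " + first.length);

        Slice copy = leftCopy(tenBytes, 2);
        leftCopy(tenBytes, 10);
        System.out.println("fresh object:      " + copy.length);
    }
}
```

If several threads share one such instance, the overwrite can land between the evaluation and the assertion, which would match a length that looks right in one place and wrong in another.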
@wchaparro wchaparro added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 2, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine elasticsearchmachine removed the Team:QL (Deprecated) Meta label for query languages team label Jan 2, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

ivancea added a commit that referenced this issue May 15, 2024
- Added the cube root function to ESQL (`CBRT(x)`). Nearly identical to SQRT, but without the exception for negative numbers
- Added docs generation support for Windows line endings (CRLF): within the examples, it was writing the "\r" without the "\n" (which was being converted to "\\n"), along with some other inconsistencies
- Some updates to the `package-info.java` documentation on how to create functions
- Fixes #108675

Functions issue: #98545
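
A small Java sketch of the difference the commit message describes: a cube root is defined for negative input, while a square root is not. The class `CbrtSketch` is a made-up name, and throwing `ArithmeticException` is only a stand-in for however SQRT reports negative input in ES|QL, which this thread does not specify.

```java
final class CbrtSketch {
    // Stand-in for SQRT: negative input is rejected (the exact ES|QL behavior is an assumption here).
    static double sqrt(double x) {
        if (x < 0) {
            throw new ArithmeticException("square root of negative number: " + x);
        }
        return Math.sqrt(x);
    }

    // CBRT needs no such check: every real number has a real cube root.
    static double cbrt(double x) {
        return Math.cbrt(x);
    }

    public static void main(String[] args) {
        System.out.println(cbrt(27.0));
        System.out.println(cbrt(-8.0));
        try {
            sqrt(-8.0);
        } catch (ArithmeticException e) {
            System.out.println("sqrt rejected: " + e.getMessage());
        }
    }
}
```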
@ioanatia
Contributor

With DATE_EXTRACT being available, I wonder if it still makes sense to support many of the functions listed under the DATE section here.

@nik9000
Member Author

nik9000 commented Jun 27, 2024

With DATE_EXTRACT being available, I wonder if it still makes sense to support many of the functions listed under the DATE section here.

Yeah, I think we probably can just zap them. At most they can be aliases to DATE_EXTRACT. But they aren't nearly as important.

@costin, what do you think of just crossing them out of the list?

@drewdaemon
Contributor

drewdaemon commented Oct 15, 2024

Does TYPEOF make sense on this list (comparable)? Union types can come up when using CASE (e.g. var0 = CASE(boolField, 1, "foo")). Also, potentially when using multiple indices with different types.

@nik9000
Member Author

nik9000 commented Oct 21, 2024

I think TYPEOF makes a lot of sense with union types. I'd probably want to develop it alongside some union type work that uses it, and have a look to see if what we need lines up exactly with the SQLite function. If, say, we decide we should have different handling for null somehow, we might not want to name it this. Not that we'd make that decision lightly - conforming to what folks are used to is quite an advantage. But I imagine a world where TYPEOF is always evaluated at query planning time on the data node.

7 participants