Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve MvPSeriesWeightedSum edge case and add more tests #111552

Merged
merged 7 commits into from
Aug 6, 2024

Conversation

machadoum
Copy link
Member

@machadoum machadoum commented Aug 2, 2024

  • Update MvPSeriesWeightedSum function to return null + warnings instead of Infinite values.
  • Add extra tests to MvPSeriesWeightedSum for edge case scenarios.

I ran the tests 100 times to ensure they didn't break due to random values.

@elasticsearchmachine elasticsearchmachine added external-contributor Pull request authored by a developer outside the Elasticsearch team v8.16.0 labels Aug 2, 2024
@nik9000 nik9000 added :Analytics/ES|QL AKA ESQL and removed external-contributor Pull request authored by a developer outside the Elasticsearch team labels Aug 2, 2024
@machadoum machadoum self-assigned this Aug 5, 2024
@machadoum machadoum added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) external-contributor Pull request authored by a developer outside the Elasticsearch team >enhancement labels Aug 5, 2024
@machadoum machadoum marked this pull request as ready for review August 5, 2024 08:22
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @machadoum, I've created a changelog YAML for you.

@machadoum machadoum changed the title Siem ea 9521 improve test Improve MvPSeriesWeightedSum edge case and add more tests Aug 5, 2024
@@ -1032,6 +1116,10 @@ private ProcessFunction(
args.add(new BlockProcessFunctionArg(type, name));
continue;
}
if (type.equals(WARNINGS)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure these changes are required after using warnExceptions (The other comment)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe they are because the other comment doesn't work. There might be another way, but I didn't see one at the time.

Comment on lines 99 to 101
if (p.dataType() == NULL) {
return resolution;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeResolutions.isType() already returns a "resolved" for nulls. So I believe this check isn't really doing anything

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! that one may be the case.

I'll have a look.

Comment on lines +37 to +49
cases = randomizeBytesRefsOffset(cases);
cases = anyNullIsNull(
cases,
(nullPosition, nullValueDataType, original) -> nullValueDataType == DataType.NULL ? DataType.NULL : original.expectedType(),
(nullPosition, nullData, original) -> {
if (nullData.isForceLiteral()) {
return equalTo("LiteralsEvaluator[lit=null]");
}
return nullData.type() == DataType.NULL ? equalTo("LiteralsEvaluator[lit=null]") : original;
}
);
cases = errorsForCasesWithoutExamples(cases, (valid, position) -> "double");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be easier to just not require a literal/foldable as the second parameter? We really don't need a literal there, everything should run as smoothly, and maybe we avoid having weird cases with nulls.

Saying this, as I find it weird that we need such extra testing logics for this function. Two birds with one stone (probably?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had talked with @machadoum about forcing a literal here because it's the way they run it and they can use the forced-literal-ness in the executor. We can always change it later if folks need it. I agree the force-literal-ness makes this function quite a bit more....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we're adding complexity, just to reduce the options. It's done now, so we can go with it, but I would personally remove the literal forcing before doing much more things

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivancea I pushed an update. There's a fun thing around null that kind of bothers me. I left a TODO around it though so we can get the extra testing in....

Copy link
Contributor

@ivancea ivancea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine now, but I'd do some continuations on it that should clean some code

Comment on lines +98 to +102
if (p.dataType() == NULL) {
// If the type is `null` this parameter doesn't have to be foldable. It's effectively foldable anyway.
// TODO figure out if the tests are wrong here, or if null is really different from foldable null
return resolution;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: if it's null, FoldNull rule will convert the full expression into a literal null. And we explicitly call that rule in the tests. See:

FoldNull checks if it's either NULL or is foldable and fold() == null. isFoldable() just checks foldability.

Btw, before touching anymore this method, I really think we should make it work with non-constants. Looks like we're "repeatedly" bumping into problems other functions don't have just because of it :')

@@ -130,10 +135,13 @@ protected NodeInfo<? extends Expression> info() {

@Override
public DataType dataType() {
if (p.dataType() == NULL) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this? Yes, it will return a NULL. But it's because of a planner rule, not because this functions can return nulls in any way (If that makes sense?).

This is not how other functions work, so I feel like we're overcomplicating things here

@nik9000 nik9000 removed the external-contributor Pull request authored by a developer outside the Elasticsearch team label Aug 6, 2024
@nik9000
Copy link
Member

nik9000 commented Aug 6, 2024

I'm going to merge this as is so we can add a few more test cases and move on. I do think this deserves more love though.

@nik9000 nik9000 merged commit b803c2a into elastic:main Aug 6, 2024
15 checks passed
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Aug 7, 2024
* upstream/main: (132 commits)
  Fix compile after several merges
  Update docs with new behavior on skip conditions (elastic#111640)
  Skip on any instance of node or version features being present (elastic#111268)
  Skip on any node capability being present (elastic#111585)
  [DOCS] Publishes Anthropic inference service docs. (elastic#111619)
  Introduce `ChunkedZipResponse` (elastic#109820)
  [Gradle] fix esql compile cacheability (elastic#111651)
  Mute org.elasticsearch.datastreams.logsdb.qa.StandardVersusLogsIndexModeChallengeRestIT testTermsQuery elastic#111666
  Mute org.elasticsearch.datastreams.logsdb.qa.StandardVersusLogsIndexModeChallengeRestIT testMatchAllQuery elastic#111664
  Mute org.elasticsearch.xpack.esql.analysis.VerifierTests testMatchCommand elastic#111661
  Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommandWithMultipleMatches {default} elastic#111660
  Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommand {default} elastic#111659
  Mute org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizerTests testMatchCommandWithWhereClause {default} elastic#111658
  LogsDB qa tests - add specific matcher for source (elastic#111568)
  ESQL: Move `randomLiteral` (elastic#111647)
  [ESQL] Clean up UNSUPPORTED type blocks (elastic#111648)
  ESQL: Remove the `NESTED` DataType (elastic#111495)
  ESQL: Move more out of esql-core (elastic#111604)
  Improve MvPSeriesWeightedSum edge case and add more tests (elastic#111552)
  Add link to flood-stage watermark exception message (elastic#111315)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
mhl-b pushed a commit that referenced this pull request Aug 8, 2024
* Update `MvPSeriesWeightedSum` function to return `null` + warnings instead of Infinite values.
* Add extra tests to `MvPSeriesWeightedSum` for edge case scenarios.

I ran the tests 100 times to ensure they didn't break due to random values.
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Sep 4, 2024
…1552)

* Update `MvPSeriesWeightedSum` function to return `null` + warnings instead of Infinite values.
* Add extra tests to `MvPSeriesWeightedSum` for edge case scenarios.

I ran the tests 100 times to ensure they didn't break due to random values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants