[CI] LogicalPlanOptimizerTests testSimplifyComparisionArithmetics_floatDivision failing #108524

not-napoleon · 2024-05-10T20:14:03Z

The SimplifyComparisonsArithmetics optimizer rule fails to optimize the expression 2 / float < 4. It seems that on the first pass it generates the expression float * 4.0 > 2, but then doesn't further optimize that to float > 0.5. I think it detects a floating point multiplication in the second expression, and simplifying across that is considered an unsafe optimization.
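
To illustrate why folding a floating point multiplication across a comparison can be treated as unsafe, here is a minimal standalone sketch. It does not use any Elasticsearch classes; the class name and the `original`/`folded` helpers are hypothetical and only model the two predicate shapes, `x * factor > bound` and `x > bound / factor`. With a power-of-two factor and a moderate value the two forms agree, but near the underflow boundary they can disagree.

```java
class FloatFoldingDemo {
    // Shape of the predicate before folding: x * factor > bound
    static boolean original(double x, double factor, double bound) {
        return x * factor > bound;
    }

    // Shape of the predicate after folding: x > bound / factor
    // (algebraically equivalent for factor > 0, but not always in IEEE 754 arithmetic)
    static boolean folded(double x, double factor, double bound) {
        return x > bound / factor;
    }

    public static void main(String[] args) {
        // Multiplying 0.75 by 4.0 is exact, so both forms give the same answer.
        System.out.println(original(0.75, 4.0, 2.0) + " vs " + folded(0.75, 4.0, 2.0));

        // Double.MIN_VALUE * 0.25 underflows to 0.0, so the original predicate is false,
        // while the folded predicate compares MIN_VALUE > 0.0 and is true.
        double tiny = Double.MIN_VALUE;
        System.out.println(original(tiny, 0.25, 0.0) + " vs " + folded(tiny, 0.25, 0.0));
    }
}
```

The second case is deliberately an edge case; it only shows that the rewrite is not guaranteed to preserve results for every double input, which is the kind of concern a conservative safety check guards against.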

Build scan:
https://gradle-enterprise.elastic.co/s/zn7kv6lwmdrow/tests/:x-pack:plugin:esql:test/org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizerTests/testSimplifyComparisionArithmetics_floatDivision

Reproduction line:

./gradlew ':x-pack:plugin:esql:test' --tests "org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizerTests.testSimplifyComparisionArithmetics_floatDivision" -Dtests.seed=A8AF70E4F3429963 -Dtests.locale=zh-TW -Dtests.timezone=Asia/Chungking -Druntime.java=21

Applicable branches:
main

Reproduces locally?:
Yes

Failure history:
Failure dashboard for org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizerTests#testSimplifyComparisionArithmetics_floatDivision

Failure excerpt:

java.lang.AssertionError: Expected left side of comparison to be a field attribute but found [4]

  at __randomizedtesting.SeedInfo.seed([A8AF70E4F3429963:57AA5A3E32A7E354]:0)
  at org.junit.Assert.fail(Assert.java:89)
  at org.junit.Assert.assertTrue(Assert.java:42)
  at org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizerTests.doTestSimplifyComparisonArithmetics(LogicalPlanOptimizerTests.java:4421)
  at org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizerTests.testSimplifyComparisionArithmetics_floatDivision(LogicalPlanOptimizerTests.java:4487)
  at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  at java.lang.reflect.Method.invoke(Method.java:580)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:1583)

elasticsearchmachine · 2024-05-10T20:14:26Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

fang-xing-esql · 2024-07-26T15:00:47Z

This test originally comes from OptimizerRunTests in SQL. I compared the traces from SQL and ES|QL side by side, and the failure is indeed related to how fixed point numbers are handled.

Here is the trace from ES|QL. After the first iteration of SimplifyComparisonsArithmetics, the predicate is transformed into 2[INTEGER] < 4.0[DOUBLE] * float{f}#9. In the second iteration, transforming it into float{f}#9 > 0.5 is considered unsafe, because the right-hand side is a double, not a fixed point number.

[TRACE][o.e.x.e.o.LogicalPlanOptimizer] [testSimplifyComparisionArithmetics_floatDivision] Rule rules.SimplifyComparisonsArithmetics applied
Limit[1000[INTEGER]]                                                         = Limit[1000[INTEGER]]
\_Filter[2[INTEGER] / float{f}#9 < 4[INTEGER]]                               ! \_Filter[2[INTEGER] < 4.0[DOUBLE] * float{f}#9]
  \_EsRelation[types][!alias_integer, boolean{f}#4, byte{f}#5, constant_k..] =   \_EsRelation[types][!alias_integer, boolean{f}#4, byte{f}#5, constant_k..]

However, in SQL, after the first iteration of SimplifyComparisonsArithmetics, the predicate is transformed into one with a different data type: 2[INTEGER] < 4.0[INTEGER] * test.float{f}#18. This makes the second iteration of SimplifyComparisonsArithmetics consider the transformation safe. 4.0[INTEGER] looks weird and confusing; could it be a bug in SQL?

[TRACE][o.e.x.s.o.Optimizer      ] [testSimplifyComparisonArithmeticCommutativeVsNonCommutativeOps] Rule optimizer.OptimizerRules$SimplifyComparisonsArithmetics applied with changes
Project[[test.some.string{f}#6]]                                            = Project[[test.some.string{f}#6]]
\_Filter[2[INTEGER] / test.float{f}#18 < 4[INTEGER]]                        ! \_Filter[2[INTEGER] < 4.0[INTEGER] * test.float{f}#18]
  \_EsRelation[test][date{f}#4, some{f}#5, some.string{f}#6, some.string..] =   \_EsRelation[test][date{f}#4, some{f}#5, some.string{f}#6, some.string..]
...
[TRACE][o.e.x.s.o.Optimizer      ] [testSimplifyComparisonArithmeticCommutativeVsNonCommutativeOps] Rule optimizer.OptimizerRules$SimplifyComparisonsArithmetics applied with changes
Project[[test.some.string{f}#6]]                                            = Project[[test.some.string{f}#6]]
\_Filter[test.float{f}#18 * 4.0[INTEGER] > 2[INTEGER]]                      ! \_Filter[test.float{f}#18 > 0.5[INTEGER]]
  \_EsRelation[test][date{f}#4, some{f}#5, some.string{f}#6, some.string..] =   \_EsRelation[test][date{f}#4, some{f}#5, some.string{f}#6, some.string..]

If the isUnsafe checks are valid, this is working as expected.

not-napoleon · 2024-07-26T15:12:37Z

That 4.0[INTEGER] from the SQL output looks very familiar. I suspect fixing #108388 exposed this behavior. That class cast exception was caused by exactly that type of thing, having a double value with an integer data type. I wonder if SQL is doing the wrong thing here.

fang-xing-esql · 2024-07-26T15:25:25Z

Here is the difference between ES|QL's SimplifyComparisonsArithmetics and QL/SQL's: ES|QL specifies DOUBLE when creating the new Literal, in this case 4.0[DOUBLE], whereas QL/SQL uses the original Literal's type and creates 4.0[INTEGER]. IMO that is wrong.

ES|QL

    final Expression apply() {
        // force float point folding for FlP field
        Literal bcl = operation.dataType().isRationalNumber()
            ? new Literal(bcLiteral.source(), ((Number) bcLiteral.value()).doubleValue(), DataType.DOUBLE)
            : bcLiteral;

QL

    final Expression apply() {
        // force float point folding for FlP field
        Literal bcl = operation.dataType().isRational()
            ? Literal.of(bcLiteral, ((Number) bcLiteral.value()).doubleValue())
            : bcLiteral;
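
The effect of the two snippets can be reproduced with a small toy model. Everything below is a hypothetical, heavily simplified stand-in: `DataType`, `Literal`, and `isUnsafe` are not the real Elasticsearch classes, and the safety check is only assumed to reject literals whose declared type is rational, as the discussion in this thread suggests. The sketch shows that copying the type from the source literal yields 4.0[INTEGER], which slips past a type-based check, while stating DOUBLE explicitly yields 4.0[DOUBLE], which does not.

```java
class LiteralTypeDemo {
    // Hypothetical stand-in for the real DataType; only two types are modeled.
    enum DataType {
        INTEGER, DOUBLE;

        boolean isRational() {
            return this == DOUBLE;
        }
    }

    // Hypothetical stand-in for the real Literal: a value plus a declared type.
    static final class Literal {
        final Object value;
        final DataType dataType;

        Literal(Object value, DataType dataType) {
            this.value = value;
            this.dataType = dataType;
        }

        // Models the behavior described in the thread: the new literal
        // inherits the declared type of the source literal.
        static Literal of(Literal source, Object newValue) {
            return new Literal(newValue, source.dataType);
        }

        @Override
        public String toString() {
            return value + "[" + dataType + "]";
        }
    }

    // Assumed shape of the safety check: it looks only at the declared type.
    static boolean isUnsafe(Literal literal) {
        return literal.dataType.isRational();
    }

    public static void main(String[] args) {
        Literal four = new Literal(4, DataType.INTEGER);
        double asDouble = ((Number) four.value).doubleValue();

        Literal qlStyle = Literal.of(four, asDouble);               // type copied from the source
        Literal esqlStyle = new Literal(asDouble, DataType.DOUBLE); // type stated explicitly

        System.out.println(qlStyle + " unsafe=" + isUnsafe(qlStyle));
        System.out.println(esqlStyle + " unsafe=" + isUnsafe(esqlStyle));
    }
}
```

Under these assumptions the first literal is reported as safe even though its value is a double, which matches the difference between the SQL and ES|QL traces.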

not-napoleon · 2024-07-26T15:29:30Z

Ah, yeah, I remember that change. Using Literal.of there definitely seemed wrong, for exactly that reason. I'm inclined to agree with you that this behavior is correct and SQL is doing the wrong thing, and we should consider fixing this in SQL's version of the optimization. @bpintea or @astefan what do you think?

bpintea · 2024-07-26T17:33:13Z

Yes, the QL code seems incorrect, as it doesn't actually enforce the type: the operation's type doesn't have to be the same as bcLiteral's, and Literal.of() invoked without a type takes the first argument's type.
I wonder what masked this issue and why it didn't surface earlier.

not-napoleon · 2024-07-26T17:48:12Z

I only found it in ES|QL because it was getting a class cast exception trying to reconcile the blocks. I suspect that since we don't have typed blocks in SQL, we never saw the type mismatch.
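
As a rough illustration of how a mislabeled literal can surface as a class cast exception once values are placed into typed storage, consider the sketch below. The `intBlockOf` helper is hypothetical and is not the ES|QL block API; it only assumes that a typed container trusts the declared type and casts the value accordingly.

```java
import java.util.Arrays;

class TypedBlockDemo {
    // Hypothetical stand-in for building an integer-typed block from a literal value.
    // It trusts that the value is an Integer because the declared type said so.
    static int[] intBlockOf(Object literalValue, int positions) {
        int v = (Integer) literalValue; // throws ClassCastException if the value is a Double
        int[] block = new int[positions];
        Arrays.fill(block, v);
        return block;
    }

    public static void main(String[] args) {
        Object mislabeled = 4.0; // a Double value carried under an INTEGER data type
        try {
            intBlockOf(mislabeled, 3);
        } catch (ClassCastException e) {
            System.out.println("caught ClassCastException");
        }
    }
}
```

A code path that never casts based on the declared type would not hit this, which is consistent with the mismatch going unnoticed where values are not stored in typed containers.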

not-napoleon added :Analytics/ES|QL AKA ESQL >test-failure Triaged test failures from CI labels May 10, 2024

elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) needs:risk Requires assignment of a risk label (low, medium, blocker) labels May 10, 2024

not-napoleon mentioned this issue May 10, 2024

[ESQL] Migrate SimplifiyComparisonArithmetics optimization rule #108200

Closed

fang-xing-esql added low-risk An open issue or test failure that is a low risk to future releases and removed needs:risk Requires assignment of a risk label (low, medium, blocker) labels May 22, 2024

alex-spies assigned alex-spies and unassigned alex-spies Jul 1, 2024

astefan assigned fang-xing-esql Jul 24, 2024

fang-xing-esql mentioned this issue Jul 27, 2024

[ES|QL] Modify SimplifyComparisionArithmetics tests #111376

Merged

fang-xing-esql closed this as completed in #111376 Jul 30, 2024
