ESQL: 'ValuesSources are mismatched' stats group by field that is missing in some indices #100438

Closed
craigtaverner opened this issue Oct 6, 2023 · 4 comments · Fixed by #100566
Labels
:Analytics/ES|QL AKA ESQL >bug Team:QL (Deprecated) Meta label for query languages team

Comments

@craigtaverner
Contributor

Similar to the bug reported in #100186, it seems that stats over multiple indices, some of which might be missing the grouping column, generates errors:

FROM logs-* | STATS count=count(*) BY agent.version

Results in:

ValuesSources are mismatched

This happens in benchmarking tests where very little data is generated. The full benchmarks have all data generated, and the error does not occur, so we presume the issue might be either:

  • A mismatch between documents created and index mappings
  • Or missing data in indexes that do have this field mapped

The error trace is:

java.lang.IllegalStateException: ValuesSources are mismatched
	at org.elasticsearch.compute.operator.OrdinalsGroupingOperator.<init>(OrdinalsGroupingOperator.java:102)
	at org.elasticsearch.compute.operator.OrdinalsGroupingOperator.get(OrdinalsGroupingOperator.java:64)
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.lambda-zsh(LocalExecutionPlanner.java:646)
	at java.base/java.util.stream.ReferencePipeline.accept(ReferencePipeline.java:197)
	at java.base/java.util.ArrayList.forEachRemaining(ArrayList.java:1625)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ForEachOps.evaluateSequential(ForEachOps.java:150)
	at java.base/java.util.stream.ForEachOps.evaluateSequential(ForEachOps.java:173)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.operators(LocalExecutionPlanner.java:646)
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.apply(LocalExecutionPlanner.java:730)
	at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.createDrivers(LocalExecutionPlanner.java:773)
	at org.elasticsearch.xpack.esql.plugin.ComputeService.runCompute(ComputeService.java:265)
	at org.elasticsearch.xpack.esql.plugin.ComputeService.lambda(ComputeService.java:432)
	at [email protected]/org.elasticsearch.action.ActionListener.onResponse(ActionListener.java:177)
	at [email protected]/org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:305)
	at org.elasticsearch.xpack.esql.plugin.ComputeService.lambda(ComputeService.java:301)
	at [email protected]/org.elasticsearch.index.shard.IndexShard.ensureShardSearchActive(IndexShard.java:3932)
	at org.elasticsearch.xpack.esql.plugin.ComputeService.acquireSearchContexts(ComputeService.java:299)
	at org.elasticsearch.xpack.esql.plugin.ComputeService.messageReceived(ComputeService.java:430)
	at org.elasticsearch.xpack.esql.plugin.ComputeService.messageReceived(ComputeService.java:422)
	at [email protected]/org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
	at [email protected]/org.elasticsearch.transport.TransportService.doRun(TransportService.java:1020)
	at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext.doRun(ThreadContext.java:983)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at [email protected]/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
	at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext.doRun(ThreadContext.java:983)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1623)
@craigtaverner craigtaverner added >bug Team:QL (Deprecated) Meta label for query languages team :Analytics/ES|QL AKA ESQL labels Oct 6, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@craigtaverner
Contributor Author

We needed to tweak the benchmark tests to include more data to avoid this bug. We increased the dataset time range from 2s to 1min, a 30x increase. This only increased the total test time from about 45min to about 54min, but we've been asked to decrease it again once the bug is fixed.

@craigtaverner craigtaverner changed the title ESQL: ValuesSources are mismatched stats group by field that is missing in some indices ESQL: 'ValuesSources are mismatched' stats group by field that is missing in some indices Oct 9, 2023
@dnhatn
Member

dnhatn commented Oct 10, 2023

I've opened #100566.

elasticsearchmachine pushed a commit that referenced this issue Oct 11, 2023
ValuesSource can be Null instead of Bytes when a shard has no data for a
specific field. This PR relaxes the check for ValueSources in the
OrdinalsGroupingOperator.

We will need to add more tests for OrdinalsGroupingOperator.

Closes #100438
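
The commit message above describes relaxing a consistency check so that a shard with no data for the field (a Null values source) no longer counts as a mismatch. The following is only a minimal, self-contained sketch of that idea. The `Kind` enum, the class name, and both methods are hypothetical stand-ins invented for illustration; this is not the actual OrdinalsGroupingOperator code or the change made in #100566.

```java
import java.util.List;

public class ValuesSourceCheckSketch {
    // Hypothetical stand-in for the kind of values source a shard exposes for a field.
    enum Kind { BYTES, NUMERIC, NULL }

    // Strict check (assumed shape): every shard must agree with the first one
    // on whether its source is BYTES. A NULL source counts as "not BYTES".
    static void strictCheck(List<Kind> shards) {
        if (shards.isEmpty()) {
            return;
        }
        boolean firstIsBytes = shards.get(0) == Kind.BYTES;
        for (Kind kind : shards) {
            if ((kind == Kind.BYTES) != firstIsBytes) {
                throw new IllegalStateException("ValuesSources are mismatched");
            }
        }
    }

    // Relaxed check: shards with a NULL source are skipped, and only the
    // remaining shards have to agree with each other.
    static void relaxedCheck(List<Kind> shards) {
        Boolean expectBytes = null;
        for (Kind kind : shards) {
            if (kind == Kind.NULL) {
                continue; // shard has no data for the field
            }
            boolean isBytes = kind == Kind.BYTES;
            if (expectBytes == null) {
                expectBytes = isBytes;
            } else if (expectBytes != isBytes) {
                throw new IllegalStateException("ValuesSources are mismatched");
            }
        }
    }

    // Helper: reports whether running the check throws IllegalStateException.
    static boolean throwsMismatch(Runnable check) {
        try {
            check.run();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // One shard in the middle has no data for the grouping field.
        List<Kind> someMissing = List.of(Kind.BYTES, Kind.NULL, Kind.BYTES);
        System.out.println("strict throws:  " + throwsMismatch(() -> strictCheck(someMissing)));
        System.out.println("relaxed throws: " + throwsMismatch(() -> relaxedCheck(someMissing)));
    }
}
```

In this sketch a genuine disagreement between shards that do have data (for example BYTES next to NUMERIC) is still rejected by the relaxed check; only the no-data case is tolerated.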
dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Oct 11, 2023