ESQL: Syntax support and operator for count all #99602

Merged
merged 15 commits into from
Sep 30, 2023

Member

@costin costin commented Sep 14, 2023

Introduce support for COUNT(*), along with a dedicated Lucene source for pushing it down instead of reading all the items one by one.

This PR doesn't handle COUNT(*) outside Lucene (on blocks) - I will take care of that (if needed) in a separate PR.
This PR takes advantage of certain stats that can be computed at the Lucene level through a query, even when a filter exists (whether it is specified in the query or outside it) - in this PR, count(*).
When just a count(*) aggregation is encountered (regardless of whether a filter exists or not), the physical optimizer converts it into a dedicated source (LuceneCountSource) that returns the intermediate aggregation state (the count and the seen bool).
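
To make the "intermediate aggregation state" concrete, here is a minimal sketch of a (count, seen) pair and how partial states from several sources could be combined. `CountState`, `merge` and `fromSource` are hypothetical names used only for illustration; they are not the classes in this PR.

```java
// Hypothetical sketch, not the actual ESQL classes.
final class CountState {
    final long count;   // number of matching documents counted so far
    final boolean seen; // whether the single (ungrouped) group exists

    CountState(long count, boolean seen) {
        this.count = count;
        this.seen = seen;
    }

    // State produced by one count source: the count is 0 when no document
    // matched, and `seen` is always true because the "all items" group exists.
    static CountState fromSource(long matchingDocs) {
        return new CountState(matchingDocs, true);
    }

    // Combine two partial states, e.g. from two shards or data partitions.
    static CountState merge(CountState a, CountState b) {
        return new CountState(a.count + b.count, a.seen || b.seen);
    }
}
```

Merging only needs an addition and a logical OR, so the partial states can be combined in any order.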

There are several improvements to be done in the future:

  • implement a similar Min and Max source to take advantage of BKD trees - this means running a sorting query (asc/desc) with limit 1
  • push-down count(field) - count over (filter exists query)
  • push down unique/stats by x as a terms query
  • support aggregations that can be pushed down alongside those that cannot - this means creating multiple sources and doing a union before moving forward.
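
The rule described above (push down only when the stats consist of just a count(*)) could be sketched roughly as follows. `Agg` and `canPushDown` are hypothetical, simplified stand-ins, and treating any grouping as "not pushable" is an assumption based on the future-work list, not the actual optimizer rule.

```java
import java.util.List;

// Hypothetical sketch of the eligibility check, not the real optimizer rule.
final class CountPushdownCheck {

    // Simplified stand-in for an aggregate call such as count(*) or min(emp_no).
    static final class Agg {
        final String function;
        final String argument;

        Agg(String function, String argument) {
            this.function = function;
            this.argument = argument;
        }
    }

    static boolean canPushDown(List<Agg> aggs, List<String> groupings) {
        if (!groupings.isEmpty()) {
            return false; // assumption: "stats ... by x" is not pushed down yet
        }
        if (aggs.size() != 1) {
            return false; // mixing pushable and non-pushable aggs is future work
        }
        Agg agg = aggs.get(0);
        return agg.function.equalsIgnoreCase("count") && agg.argument.equals("*");
    }
}
```

Under these assumptions, `stats c = count(*)` qualifies, while `stats min = min(emp_no), c = count(*)` and `stats c = count(*) by gender` fall back to the regular block-based aggregation.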

@costin costin added the :Analytics/ES|QL AKA ESQL label Sep 14, 2023
@costin costin self-assigned this Sep 14, 2023
Member

@dnhatn dnhatn left a comment

Thanks @costin. I left some comments for the count operator. I can push these changes if you prefer.

// check to not go over limit
var count = Math.min(leafCount, remainingDocs);
totalHits += count;
remainingDocs -= count;
Member

We need to advance the position of the scorer after we use the shortcut.
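
A minimal sketch of the point being made, assuming the per-leaf match counts are already known: after taking the shortcut for a leaf, the cursor has to move to the next leaf, otherwise the same leaf is counted again. `LeafCountCursor` and its fields are hypothetical; the real operator works with a Lucene scorer rather than an array index.

```java
// Hypothetical sketch; the array stands in for per-leaf (segment) match counts.
final class LeafCountCursor {
    private final int[] leafCounts;
    private int position = 0;   // index of the next leaf to process
    private int remainingDocs;  // how many more docs the limit allows
    private long totalHits = 0;

    LeafCountCursor(int[] leafCounts, int limit) {
        this.leafCounts = leafCounts;
        this.remainingDocs = limit;
    }

    boolean isDone() {
        return position >= leafCounts.length || remainingDocs <= 0;
    }

    // Consume one leaf via the shortcut: clamp its count to the limit, then advance.
    void countNextLeaf() {
        if (isDone()) {
            return;
        }
        int count = Math.min(leafCounts[position], remainingDocs); // do not go over the limit
        totalHits += count;
        remainingDocs -= count;
        position++; // without this step the same leaf would be counted repeatedly
    }

    long totalHits() {
        return totalHits;
    }
}
```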

Page page = null;
if (remainingDocs <= 0 || scorer.isDone()) {
pagesEmitted++;
page = new Page(1, IntBlock.newConstantBlockWith(totalHits, 1));
Member

This operator should emit only one page.
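
One way to read this request, as a sketch: keep accumulating across leaves and return the total exactly once, when everything has been counted. `SinglePageCounter` is a hypothetical name, and a boxed `Long` stands in for the single-row page.

```java
// Hypothetical sketch; a Long stands in for the one-row result page.
final class SinglePageCounter {
    private final int[] leafCounts; // stand-in for per-leaf match counts
    private int position = 0;
    private long total = 0;
    private boolean emitted = false;

    SinglePageCounter(int[] leafCounts) {
        this.leafCounts = leafCounts;
    }

    // Returns null while counting is still in progress and after the result
    // has been emitted; returns the total exactly once.
    Long getOutput() {
        if (emitted) {
            return null;
        }
        if (position < leafCounts.length) {
            total += leafCounts[position++];
            if (position < leafCounts.length) {
                return null; // more leaves to go, nothing to emit yet
            }
        }
        emitted = true;
        return total;
    }

    boolean isFinished() {
        return emitted;
    }
}
```

With no leaves at all, the first call still emits a single result with a count of 0.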

@dnhatn dnhatn self-requested a review September 15, 2023 04:37
Use internal aggs when pushing down count
@costin costin marked this pull request as ready for review September 20, 2023 22:22
@costin costin requested a review from dnhatn September 20, 2023 22:22
@elasticsearchmachine elasticsearchmachine added the Team:QL (Deprecated) Meta label for query languages team label Sep 20, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@costin
Member Author

costin commented Sep 20, 2023

Relates #99459

* 1. the count as a long (0 if no doc is seen)
* 2. a bool flag (seen) that's always true meaning that the group (all items) always exists
*/
public class LuceneCountOperator extends LuceneOperator {
Member Author

This class is very similar to LuceneSourceOperator - I tried combining the two but there's not much code reuse and the inner state and semantics are fairly different.

Comment on lines +148 to +154
Weight weight() {
return weight;
}

int position() {
return position;
}
Member Author

Made these accessible for the count source.

Comment on lines -217 to -228
private PhysicalOperation planEsQueryNode(EsQueryExec esQuery, LocalExecutionPlannerContext context) {
if (esQuery.query() == null) {
Member Author

The query check and initialization happen inside the physicalOperatorProviders method.

Comment on lines +232 to +250
EsPhysicalOperationProviders esProvider = (EsPhysicalOperationProviders) physicalOperationProviders;

Function<SearchContext, Query> querySupplier = EsPhysicalOperationProviders.querySupplier(statsQuery.query());

Expression limitExp = statsQuery.limit();
int limit = limitExp != null ? (Integer) limitExp.fold() : NO_LIMIT;
final LuceneOperator.Factory luceneFactory = new LuceneCountOperator.Factory(
esProvider.searchContexts(),
querySupplier,
context.dataPartitioning(),
context.taskConcurrency(),
limit
);

Layout.Builder layout = new Layout.Builder();
layout.append(statsQuery.outputSet());
int instanceCount = Math.max(1, luceneFactory.taskConcurrency());
context.driverParallelism(new DriverParallelism(DriverParallelism.Type.DATA_PARALLELISM, instanceCount));
return PhysicalOperation.fromSource(luceneFactory, layout.build());
Member Author

I've copied the code from esSource for now - this is a candidate for future refactoring.

@astefan
Contributor

astefan commented Sep 21, 2023

This issue is more visible with count(*). For example

FROM employees | stats min = min(emp_no), c = count(*)
FROM employees | stats c = count(*) by gender

Both fail with that NPE; the difference is that these types of queries are more likely than min(123).

@astefan
Contributor

astefan commented Sep 21, 2023

FROM employees | stats min = min(salary) by gender | eval x = min + 1 | stats c = count(*) by gender

This one fails in almost the same way, but with a slightly different error message:

{
    "error": {
        "root_cause": [
            {
                "type": "null_pointer_exception",
                "reason": null
            }
        ],
        "type": "null_pointer_exception",
        "reason": null,
        "suppressed": [
            {
                "type": "task_cancelled_exception",
                "reason": "parent task was cancelled [cancelled]"
            },
            {
                "type": "task_cancelled_exception",
                "reason": "parent task was cancelled [cancelled]"
            }
        ]
    },
    "status": 500
}

@costin
Member Author

costin commented Sep 22, 2023

@astefan

This PR doesn't handle COUNT( * ) outside Lucene (on blocks) - will take care of that (if needed) in a separate PR.

However, from your comments I agree that this is confusing - I've pushed a PR that adds that in, along with some of your tests.
Thanks!

}

@Override
public int intermediateBlockCount() {
return intermediateStateDesc().size();
}

private int blockIndex() {
return countAll ? 0 : channels.get(0);
Member Author

@nik9000 not sure if this is the best way to count things in a block but it seems to be working.
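
For context, a rough sketch of the distinction the `countAll` flag draws, assuming that count(*) only needs the number of positions (rows) in a page while count(field) has to skip nulls in that field's block. `BlockCountSketch` and the `Integer[]` stand-in for a block are hypothetical, not the compute engine's API.

```java
// Hypothetical sketch; an Integer[] with nulls stands in for a block of values.
final class BlockCountSketch {

    // count(*): every position in the page counts, whatever the values are,
    // so any block of the page (for example the first one) gives the answer.
    static long countAll(int positionCount) {
        return positionCount;
    }

    // count(field): only non-null values of that field's block count.
    static long countField(Integer[] values) {
        long count = 0;
        for (Integer value : values) {
            if (value != null) {
                count++;
            }
        }
        return count;
    }
}
```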

@@ -51,33 +57,35 @@ public int intermediateBlockCount() {

@Override
public AddInput prepareProcessPage(SeenGroupIds seenGroupIds, Page page) {
Member Author

@nik9000 likewise here

@@ -508,7 +508,7 @@ from employees | stats c = count(*) by gender | sort gender;
 c:l | gender:s
 33 | F
 57 | M
-10 | null
+#10 | null
Member Author

@nik9000 There's a discrepancy between EsqlActionIT (running in Lucene) and CsvTests: the former doesn't have a null group for gender while the latter does, which is why the same csv-spec fails for one but passes for the other.
I've commented out this line to make EsqlActionIT pass, but it will make CsvTests fail.
Since the count works inside the group, I wonder whether the null group is discarded when querying against Lucene or the underlying filter doesn't work properly (ignores missing values somehow).
Any ideas?
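
For reference, the result the csv-spec expects (a separate group for rows whose gender is null) corresponds to grouping semantics like the following sketch. `NullGroupCount` is a hypothetical helper for illustration only and says nothing about where the Lucene path loses the group.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "count(*) by key" that keeps a null group.
final class NullGroupCount {

    static Map<String, Long> countByGroup(String[] keys) {
        Map<String, Long> counts = new HashMap<>(); // HashMap permits a null key
        for (String key : keys) {
            counts.merge(key, 1L, Long::sum); // rows with a null key form their own group
        }
        return counts;
    }
}
```

If rows with a missing key were filtered out before grouping, the null entry would simply be absent, which matches the symptom described above.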

Member Author

FTR, this was caused by #100109

@@ -517,7 +517,7 @@ from employees | stats c = count(*), min = min(emp_no) by gender | sort gender;
 c:l | min:i | gender:s
 33 | 10002 | F
 57 | 10001 | M
-10 | 10010 | null
+#10 | 10010 | null
Member Author

Same here

@@ -526,5 +526,5 @@ from employees | stats min = min(salary) by gender | eval x = min + 1 | stats c
 c:l | gender:s
 1 | F
 1 | M
-1 | null
+#1 | null
Member Author

And here.

Contributor

@astefan astefan left a comment

LGTM

@costin costin changed the title from "Syntax support and operator for count all" to "ESQL: Syntax support and operator for count all" Sep 30, 2023
@costin costin merged commit f883dd9 into elastic:main Sep 30, 2023
@costin costin deleted the esql/count_all branch September 30, 2023 18:09
piergm pushed a commit to piergm/elasticsearch that referenced this pull request Oct 2, 2023
Introduce physical plan for representing query stats
Use internal aggs when pushing down count
Add support for count all outside Lucene
jakelandis pushed a commit to jakelandis/elasticsearch that referenced this pull request Oct 2, 2023
Introduce physical plan for representing query stats
Use internal aggs when pushing down count
Add support for count all outside Lucene
Labels
:Analytics/ES|QL AKA ESQL >non-issue Team:QL (Deprecated) Meta label for query languages team v8.11.0

4 participants