ESQL: Add memory accountancy and circuit breaking #99173

Closed
6 of 7 tasks
ChrisHegarty opened this issue Sep 4, 2023 · 4 comments
Assignees
Labels
:Analytics/ES|QL AKA ESQL Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@ChrisHegarty
Contributor

ChrisHegarty commented Sep 4, 2023

Currently the ESQL runtime does not use circuit breakers, so it is susceptible to OutOfMemoryErrors when working with large amounts of data or performing memory-intensive operations. Similar to other parts of Elasticsearch, ESQL should circuit break its allocations and track overall memory usage.

The ESQL runtime on data nodes has a bounded queue in its exchange sink. This queue adds a natural bound to the number of outstanding pages (and transitively blocks and vectors) that are in use at any one time, but it is not sufficient when dealing with data that has a large number of columns or bytes refs that are enormous in size. Additionally, the coordinator may need to hold a lot of data in memory when processing output from the data nodes and/or computing the overall result.

ESQL driver pipelines are executed in a single-threaded manner. The general idea is to have a circuit breaker per driver, whereby this breaker would "check out" chunks of memory from its parent breaker. These chunks would be allotted to the code executing in the driver - a thread-local allocation (TLA), if you will. The driver breaker subsequently requests more memory from its parent when this TLA is exhausted. The ESQL runtime will have an overall breaker for all of its allocations on any one particular node, which includes the tasks that it performs as both a data node and a coordinator.
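
To make the "check out chunks from the parent" idea concrete, here is a minimal sketch of how it might look. It is not the Elasticsearch implementation: `ParentBreaker`, `DriverBreaker`, the chunk size, and the use of `IllegalStateException` as the "breaker tripped" signal are all hypothetical names and choices made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical node-level breaker shared by all drivers; must be thread-safe. */
final class ParentBreaker {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    ParentBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserve bytes, or throw (and roll back) if the limit would be exceeded. */
    void reserve(long bytes) {
        long updated = usedBytes.addAndGet(bytes);
        if (updated > limitBytes) {
            usedBytes.addAndGet(-bytes);
            throw new IllegalStateException(
                "breaker tripped: " + updated + " > " + limitBytes + " bytes");
        }
    }

    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long used() {
        return usedBytes.get();
    }
}

/** Hypothetical per-driver breaker; only ever touched by the driver's single thread. */
final class DriverBreaker implements AutoCloseable {
    private final ParentBreaker parent;
    private final long chunkBytes;
    private long reserved; // bytes checked out from the parent so far
    private long used;     // bytes currently handed out to driver code

    DriverBreaker(ParentBreaker parent, long chunkBytes) {
        this.parent = parent;
        this.chunkBytes = chunkBytes;
    }

    /** Account for an allocation; contacts the parent only when the local chunk runs out. */
    void allocate(long bytes) {
        long shortfall = used + bytes - reserved;
        if (shortfall > 0) {
            // Round the request up to whole chunks to amortise trips to the parent.
            long request = ((shortfall + chunkBytes - 1) / chunkBytes) * chunkBytes;
            parent.reserve(request); // may throw; local state is untouched if it does
            reserved += request;
        }
        used += bytes;
    }

    /** Return bytes to the local reservation; the chunk stays checked out for reuse. */
    void free(long bytes) {
        used -= bytes;
    }

    long reserved() {
        return reserved;
    }

    long used() {
        return used;
    }

    /** Hand everything back to the parent when the driver finishes. */
    @Override
    public void close() {
        parent.release(reserved);
        reserved = 0;
        used = 0;
    }
}

final class Demo {
    public static void main(String[] args) {
        ParentBreaker node = new ParentBreaker(1024);
        try (DriverBreaker driver = new DriverBreaker(node, 256)) {
            driver.allocate(100); // first allocation checks out one chunk
            driver.allocate(100); // served from the same chunk, no parent call
            System.out.println("node used while running: " + node.used());
        }
        System.out.println("node used after close: " + node.used());
    }
}
```

In this sketch the parent is only touched when a local chunk is exhausted, so most allocations in the single-threaded driver avoid contended atomic updates. The trade-off is that rounding up to whole chunks can trip the parent slightly earlier than exact accounting would.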

A non-exhaustive list of specific tasks:

@ChrisHegarty ChrisHegarty added Team:QL (Deprecated) Meta label for query languages team :Analytics/ES|QL AKA ESQL labels Sep 4, 2023
@ChrisHegarty ChrisHegarty self-assigned this Sep 4, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine elasticsearchmachine removed the Team:QL (Deprecated) Meta label for query languages team label Jan 2, 2024
@ChrisHegarty
Contributor Author

We're in a very good state with regard to memory tracking, and all the substantial subtasks in this meta issue are complete. Closing.

4 participants