ESQL: Add memory accountancy and circuit breaking #99173
ChrisHegarty
added
Team:QL (Deprecated)
Meta label for query languages team
:Analytics/ES|QL
AKA ESQL
labels
Sep 4, 2023
Pinging @elastic/es-ql (Team:QL)
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)
This was referenced Sep 6, 2023
This was referenced Sep 22, 2023
7 tasks
21 tasks
wchaparro
added
the
Team:Analytics
Meta label for analytical engine team (ESQL/Aggs/Geo)
label
Jan 2, 2024
Pinging @elastic/es-analytics-geo (Team:Analytics)
elasticsearchmachine
removed
the
Team:QL (Deprecated)
Meta label for query languages team
label
Jan 2, 2024
We're in a very good state with regard to memory tracking, and all the substantial subtasks in this meta issue are complete. Closing.
Currently the ESQL runtime does not use circuit breakers, so it is susceptible to OutOfMemoryErrors when working with large amounts of data or performing memory-intensive operations. Like other parts of Elasticsearch, ESQL should circuit break its allocations and track overall memory usage.
The ESQL runtime on data nodes has a bounded queue in its exchange sink. This queue adds a natural bound to the number of outstanding pages (and transitively blocks and vectors) that are in use at any one time, but it is not sufficient when dealing with data that has a large number of columns or bytes refs that are enormous in size. Additionally, the coordinator may need to hold a lot of data in memory when processing output from the data nodes and/or computing the overall result.
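To illustrate why a bound on page count alone does not bound memory, here is a minimal sketch. It does not use the ESQL exchange sink; `Page` and `BoundedSinkDemo` are hypothetical stand-ins, and the queue is a plain `java.util.concurrent.ArrayBlockingQueue`. Both queues hold the same number of pages, yet the bytes they retain differ by the ratio of the column counts.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for a page: a set of equally sized columns. */
final class Page {
    private final byte[][] columns;

    Page(int columnCount, int bytesPerColumn) {
        this.columns = new byte[columnCount][bytesPerColumn];
    }

    /** Sum of the payload bytes held by all columns (object overhead ignored). */
    long sizeInBytes() {
        long total = 0;
        for (byte[] column : columns) {
            total += column.length;
        }
        return total;
    }
}

final class BoundedSinkDemo {
    /** Total payload bytes currently retained by the queue. */
    static long queuedBytes(BlockingQueue<Page> queue) {
        long total = 0;
        for (Page page : queue) {
            total += page.sizeInBytes();
        }
        return total;
    }

    public static void main(String[] args) {
        int capacity = 4; // assumed queue bound, chosen only for illustration
        BlockingQueue<Page> narrow = new ArrayBlockingQueue<>(capacity);
        BlockingQueue<Page> wide = new ArrayBlockingQueue<>(capacity);

        for (int i = 0; i < capacity; i++) {
            narrow.offer(new Page(2, 1024));   // few columns
            wide.offer(new Page(200, 1024));   // many columns
        }

        // The bound works on page count: a further page is rejected.
        System.out.println("extra page accepted: " + narrow.offer(new Page(2, 1024)));

        // The same page count retains very different amounts of memory.
        System.out.println("narrow bytes: " + queuedBytes(narrow));
        System.out.println("wide bytes:   " + queuedBytes(wide));
    }
}
```

The queue limits how many pages are outstanding, not how large each one is, which is the gap that byte-level accounting is meant to close.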
ESQL driver pipelines are executed in a single-threaded manner. The general idea is to have a circuit breaker per driver, whereby this breaker would "check out" chunks of memory from its parent breaker. These chunks would be allotted to the code executing in the driver - a thread-local allocation (TLA), if you will. The driver breaker subsequently requests more memory from its parent when this TLA is exhausted. The ESQL runtime will have an overall breaker for all of its allocations on any one particular node, which includes the tasks that it performs both as a data node and as a coordinator.
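The check-out scheme described above could be sketched roughly as follows. This is not the Elasticsearch `CircuitBreaker` API: `ParentBreaker`, `DriverBreaker`, the chunk size, and the limits are all hypothetical names and values chosen for illustration. The parent is shared across threads and uses an atomic counter; the driver breaker is assumed to be touched by a single thread, so it keeps plain fields and only goes to the parent when its local allotment runs out.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical node-wide breaker shared by all drivers; safe for concurrent use. */
final class ParentBreaker {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    ParentBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserves bytes, or throws if the node-wide limit would be exceeded. */
    void reserve(long bytes) {
        long updated = usedBytes.addAndGet(bytes);
        if (updated > limitBytes) {
            usedBytes.addAndGet(-bytes); // roll back the failed reservation
            throw new IllegalStateException(
                "circuit breaker tripped: " + updated + " > " + limitBytes);
        }
    }

    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long used() {
        return usedBytes.get();
    }
}

/** Hypothetical per-driver breaker; assumed single-threaded, so no synchronization. */
final class DriverBreaker {
    private final ParentBreaker parent;
    private final long chunkBytes;
    private long checkedOut; // bytes currently reserved from the parent
    private long used;       // bytes handed out to code running in this driver

    DriverBreaker(ParentBreaker parent, long chunkBytes) {
        this.parent = parent;
        this.chunkBytes = chunkBytes;
    }

    /** Accounts for an allocation, checking out whole chunks from the parent if needed. */
    void reserve(long bytes) {
        long shortfall = used + bytes - checkedOut;
        if (shortfall > 0) {
            long chunks = (shortfall + chunkBytes - 1) / chunkBytes; // round up
            long request = chunks * chunkBytes;
            parent.reserve(request); // may throw; local state is unchanged if it does
            checkedOut += request;
        }
        used += bytes;
    }

    /** Returns bytes to the local allotment without contacting the parent. */
    void release(long bytes) {
        used -= bytes;
    }

    /** Hands everything back to the parent when the driver finishes. */
    void close() {
        parent.release(checkedOut);
        checkedOut = 0;
        used = 0;
    }
}

final class LocalBreakerDemo {
    public static void main(String[] args) {
        ParentBreaker parent = new ParentBreaker(1024);
        DriverBreaker driver = new DriverBreaker(parent, 256);

        driver.reserve(100); // first allocation checks out one chunk
        driver.reserve(100); // served from the local allotment
        System.out.println("parent used after two small reserves: " + parent.used());

        driver.reserve(100); // local allotment exhausted, a second chunk is checked out
        System.out.println("parent used after third reserve: " + parent.used());

        driver.close();
        System.out.println("parent used after close: " + parent.used());
    }
}
```

The design trade-off in this sketch is that most allocations avoid the shared atomic counter, at the cost of the parent over-counting by up to one chunk per active driver.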
A non-exhaustive list of specific tasks: