Track memory usage for each individual operator #899

Open
alamb opened this issue Aug 16, 2021 · 3 comments
Labels
enhancement New feature or request

Comments

@alamb
Contributor

alamb commented Aug 16, 2021

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
When reviewing a plan, it would be nice to know the amount of memory each individual ExecutionPlan allocated during its execution.

Describe the solution you'd like
Add two new metrics to all operators:

  1. Total memory currently allocated at the time the metric is read
  2. Peak memory allocated (maximum value of memory allocated)

"Allocated" should include both the memory in created record batches and any internal memory (as described in #898 -- hopefully this code would just use the same underlying allocation measurement).
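
As a rough illustration of the two proposed metrics, here is a minimal sketch of a per-operator counter that tracks the current and the peak allocation. `MemoryMetrics`, `grow`, and `shrink` are hypothetical names chosen for this example; they are not existing DataFusion APIs, and the sketch assumes each operator reports its own allocations and frees.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical per-operator memory metrics (illustrative names, not DataFusion API).
#[derive(Debug, Default)]
pub struct MemoryMetrics {
    /// Bytes currently allocated by the operator.
    current: AtomicUsize,
    /// Highest value `current` has reached so far.
    peak: AtomicUsize,
}

impl MemoryMetrics {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record `bytes` of newly allocated memory and raise the peak if needed.
    pub fn grow(&self, bytes: usize) {
        let now = self.current.fetch_add(bytes, Ordering::Relaxed) + bytes;
        self.peak.fetch_max(now, Ordering::Relaxed);
    }

    /// Record `bytes` of released memory.
    /// Assumption: callers never release more than they previously reported.
    pub fn shrink(&self, bytes: usize) {
        self.current.fetch_sub(bytes, Ordering::Relaxed);
    }

    /// Metric 1: memory allocated at the time the metric is read.
    pub fn current(&self) -> usize {
        self.current.load(Ordering::Relaxed)
    }

    /// Metric 2: maximum memory allocated at any point so far.
    pub fn peak(&self) -> usize {
        self.peak.load(Ordering::Relaxed)
    }
}
```

Atomics are used so that several partitions of the same operator could share one counter without a lock.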


Additional context
This could probably follow the same model as #866 (baseline metrics for all operators) once that is implemented.

#898 is for tracking overall memory allocations across all operators in a plan. This issue is for tracking the allocations for each individual operator.

@boaz-codota

This got me interested, so I started looking into it, and I'm not sure how we aim to tackle it.

My first idea was to write a Decorator that implements ExecutionPlan and replaces the allocator in some way, but that seems infeasible: it requires replacing the GlobalAllocator (which is far more intrusive than I imagine DataFusion ever wanting to be), and it will not work well given concurrency (there is no way to know which operator a given thread was allocating for). An example of a GlobalAllocator Decorator can be found in: https://github.com/cuviper/alloc_geiger
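
To make the limitation concrete, here is a minimal sketch of an allocator decorator using the standard `GlobalAlloc` trait and `#[global_allocator]`. `CountingAllocator`, `ALLOCATED`, and `allocated_bytes` are hypothetical names for this example. Note that the hook only receives a `Layout`, so it can count bytes for the whole process but has no information about which operator requested them.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts live bytes.
pub struct CountingAllocator;

/// Process-wide count of bytes currently allocated through `CountingAllocator`.
pub static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Delegate to the system allocator, then count successful allocations.
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// Replaces the allocator for the entire binary, which is why this approach
// is intrusive for a library.
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Bytes currently allocated by the whole process (not per operator).
pub fn allocated_bytes() -> usize {
    ALLOCATED.load(Ordering::Relaxed)
}
```

Because a library cannot set `#[global_allocator]` on behalf of the final binary without affecting every other crate in it, this would have to be opted into by each application.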

The other approach I found (from the article below) is implementing something similar to Servo's malloc_size_of, which can be found in: https://github.com/servo/servo/blob/faf3a183f3755a9986ec4379abadf3523bd8b3c0/components/malloc_size_of/lib.rs
This solution is quite intrusive (from what I can see) and requires manually "registering" every memory allocation so that it adds up to a per-ExecutionPlan sum.
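
A minimal sketch of the trait-based idea follows. It is only loosely inspired by malloc_size_of: the `HeapSize` trait and the `SortState` struct are hypothetical names, and the sizes are estimated from container capacities rather than by asking the allocator, which is an assumption of this example and not how Servo's crate works.

```rust
use std::mem::size_of;

/// Hypothetical trait: report the heap bytes owned by a value.
pub trait HeapSize {
    fn heap_size(&self) -> usize;
}

impl HeapSize for u64 {
    // Plain integers own no heap memory.
    fn heap_size(&self) -> usize {
        0
    }
}

impl HeapSize for String {
    fn heap_size(&self) -> usize {
        self.capacity()
    }
}

impl<T: HeapSize> HeapSize for Vec<T> {
    fn heap_size(&self) -> usize {
        // The vector's own buffer plus whatever each element owns.
        let buffer = self.capacity() * size_of::<T>();
        let nested: usize = self.iter().map(|item| item.heap_size()).sum();
        buffer + nested
    }
}

/// Hypothetical operator state, used only to show how sizes compose.
pub struct SortState {
    pub keys: Vec<u64>,
    pub labels: Vec<String>,
}

impl HeapSize for SortState {
    fn heap_size(&self) -> usize {
        self.keys.heap_size() + self.labels.heap_size()
    }
}
```

The intrusive part is visible here: every type that an operator holds needs its own `HeapSize` implementation, and each operator has to call it at the right moments.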

I'm not sure where to go from here and would love to hear some feedback.

Some references: https://rust-analyzer.github.io/blog/2020/12/04/measuring-memory-usage-in-rust.html

@alamb
Contributor Author

alamb commented Aug 19, 2021

I was kind of imagining we would have to do something like manually registering memory allocations. The malloc_size_of trait is a cool idea.

While it would likely be crazy complicated to do this for all allocations, I think all the built-in DataFusion operators use most of their memory in intermediate RecordBatches and potentially a single large structure (e.g. the hash tables in hash_join and hash_aggregate). If we captured these large sources, I think that would get us most of the value.
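
A minimal sketch of "capture only the large sources", under stated assumptions: `Batch` is a simplified stand-in for a record batch (not arrow's `RecordBatch`), `JoinBuildSide` is a hypothetical operator state, and the hash table estimate ignores the map's internal overhead, so the result is a lower bound rather than an exact figure.

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Simplified stand-in for a record batch: each column is a raw byte buffer.
pub struct Batch {
    pub columns: Vec<Vec<u8>>,
}

impl Batch {
    /// Bytes held by the column buffers of this batch.
    pub fn buffer_bytes(&self) -> usize {
        self.columns.iter().map(|c| c.capacity()).sum()
    }
}

/// Hypothetical build side of a hash join: buffered batches plus one hash table.
pub struct JoinBuildSide {
    pub batches: Vec<Batch>,
    /// Maps a key hash to a row index.
    pub table: HashMap<u64, usize>,
}

impl JoinBuildSide {
    /// Lower-bound estimate covering only the two large sources:
    /// buffered batches and the hash table entries.
    pub fn tracked_bytes(&self) -> usize {
        let batch_bytes: usize = self.batches.iter().map(Batch::buffer_bytes).sum();
        let table_bytes = self.table.capacity() * size_of::<(u64, usize)>();
        batch_bytes + table_bytes
    }
}
```

Small allocations such as temporary vectors are deliberately ignored here; the bet is that they are negligible next to the batches and the table.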

@boaz-codota

Cool, so I dug through the code a bit, and this seems to be a bit out of my league (needs high familiarity with way too many things). Thank you for the response!
