GroupedHashAggregateStream should create smaller spill batches #8003

Closed
milenkovicm opened this issue Oct 31, 2023 · 0 comments · Fixed by #8004

Labels
enhancement New feature or request

Comments

@milenkovicm
Contributor

Is your feature request related to a problem or challenge?

At the moment, GroupedHashAggregateStream spills its state as a single batch. This is not optimal when merging, because the whole spill file is loaded into memory as a single batch.

Describe the solution you'd like

I'd like to split the spill batch into smaller chunks, with a chunk size equal to the default value of the batch size configuration property.
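
To illustrate the idea, here is a minimal sketch of the chunking arithmetic only. It is not the actual change: `chunk_ranges` is a hypothetical helper name, and the sketch deliberately avoids any Arrow or DataFusion types. In a real implementation, each `(offset, length)` pair could presumably be passed to Arrow's `RecordBatch::slice(offset, length)` before the slice is written to the spill file.

```rust
/// Hypothetical helper (not part of DataFusion): returns the
/// `(offset, length)` pairs that cover `num_rows` rows in chunks of
/// at most `batch_size` rows. The last chunk may be shorter.
fn chunk_ranges(num_rows: usize, batch_size: usize) -> Vec<(usize, usize)> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let mut ranges = Vec::new();
    let mut offset = 0;
    while offset < num_rows {
        // Never read past the end of the batch.
        let length = batch_size.min(num_rows - offset);
        ranges.push((offset, length));
        offset += length;
    }
    ranges
}
```

For example, `chunk_ranges(20_000, 8192)` yields three ranges, so a spilled state of 20,000 rows would be written as three smaller batches instead of one. The value 8192 is only an example chunk size here, assumed to stand in for the configured default.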

Describe alternatives you've considered

I have considered setting the batch size to a fixed value or reading it from the configuration property, but I have not done so for now, as it would be a bigger change.

Additional context

Relates to #7858
