[C++] Support hash-join on larger than memory datasets #31769

Open
asfimport opened this issue Apr 28, 2022 · 4 comments

Comments

asfimport commented Apr 28, 2022

The current implementation of the hash-join node queues in memory the hash table, the entire build-side input, and the entire probe-side input (i.e., the entire dataset). As a result, it will run out of memory and crash if the input dataset is larger than the memory available on the system.

By spilling to disk when memory starts to fill up, we can allow the hash-join node to process datasets larger than the available memory on the machine.
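
To illustrate the general idea, here is a minimal sketch of a partitioned (Grace-style) hash join that spills both inputs to disk and then joins one partition at a time. It is not Arrow's implementation or API: the function names (`spilling_hash_join`, `_partition_to_disk`, `_read_spilled`), the tuple row format, and the use of `pickle` files are all assumptions made for illustration. For simplicity it spills everything unconditionally, whereas the proposal above would spill only once memory starts to fill up.

```python
import os
import pickle
import tempfile
from collections import defaultdict


def _partition_to_disk(rows, key_index, num_partitions, directory, prefix):
    """Stream rows into one spill file per hash partition; return the file paths."""
    paths = [os.path.join(directory, f"{prefix}_{i}.bin") for i in range(num_partitions)]
    files = [open(p, "wb") for p in paths]
    try:
        for row in rows:
            # Rows with equal keys always land in the same partition number.
            part = hash(row[key_index]) % num_partitions
            pickle.dump(row, files[part])
    finally:
        for f in files:
            f.close()
    return paths


def _read_spilled(path):
    """Yield rows back from a spill file, one at a time."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def spilling_hash_join(build_rows, probe_rows, build_key=0, probe_key=0, num_partitions=4):
    """Inner join of two row iterables; yields build_row + probe_row tuples.

    Hypothetical helper: only one build-side partition's hash table is held
    in memory at any time, so peak memory is roughly 1/num_partitions of the
    build side (assuming keys are evenly distributed).
    """
    with tempfile.TemporaryDirectory() as tmp:
        build_paths = _partition_to_disk(build_rows, build_key, num_partitions, tmp, "build")
        probe_paths = _partition_to_disk(probe_rows, probe_key, num_partitions, tmp, "probe")
        for bpath, ppath in zip(build_paths, probe_paths):
            # Build the hash table for this partition only.
            table = defaultdict(list)
            for row in _read_spilled(bpath):
                table[row[build_key]].append(row)
            # Probe with the matching partition, streamed from disk.
            for row in _read_spilled(ppath):
                for match in table.get(row[probe_key], ()):
                    yield match + row
```

Because matching keys hash to the same partition on both sides, each build partition only ever needs to be probed by its matching probe partition. A skewed key distribution can still make a single partition too large for memory, which a real implementation would need to handle (for example by repartitioning).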

Reporter: Weston Pace / @westonpace

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-16389. Please see the migration documentation for further details.

Apache Arrow JIRA Bot:
This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

Zane Wilbert Keeler:
dr jim
data sets on proper hdwe sfwr for your info.

z

On Thu, Jul 21, 2022 at 1:04 AM ASF GitHub Bot (Jira) [email protected]

Zane Wilbert Keeler:
dr jim
changing server
hdwe update.
z
