Implement data streaming to reduce memory needs #122

Open
willbradshaw opened this issue Dec 10, 2024 · 0 comments
Labels
enhancement (New feature or request), priority_1, time&cost (Changes to improve the pipeline's runtime and computational cost)

Comments

@willbradshaw
Contributor

We keep running into issues with memory allocation when processing large files. We can keep this at bay by increasing memory allocations, but this is messy and slow, and it still fails when even larger files are input. For most parts of the pipeline, we should be able to solve this by rewriting processes to stream their input files rather than reading them into memory in full at the start.
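
For illustration only, here is a minimal sketch of what "streaming rather than reading in full" could look like for a line-oriented step. It assumes the process is a Python script working on plain or gzipped text; `open_maybe_gzip`, `stream_filter`, and the `keep` predicate are hypothetical names, not functions from this pipeline. Memory use stays roughly constant because only one line is held at a time.

```python
import gzip
from typing import Callable, TextIO


def open_maybe_gzip(path: str, mode: str = "rt") -> TextIO:
    """Open a text file, transparently handling a .gz suffix (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def stream_filter(in_path: str, out_path: str, keep: Callable[[str], bool]) -> int:
    """Copy the lines for which keep(line) is true, one line at a time.

    Unlike `lines = f.readlines()`, this never holds the whole file in memory.
    Returns the number of lines written.
    """
    kept = 0
    with open_maybe_gzip(in_path, "rt") as src, open_maybe_gzip(out_path, "wt") as dst:
        for line in src:  # iterating a file object yields lines lazily
            if keep(line):
                dst.write(line)
                kept += 1
    return kept
```

A call such as `stream_filter("in.tsv.gz", "out.tsv.gz", lambda l: not l.startswith("#"))` would drop comment lines without loading the input. Steps that need the whole dataset at once (sorting, global deduplication) do not fit this pattern directly and would need a different approach.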
