Implement data streaming to reduce memory needs #122

willbradshaw · 2024-12-10T15:48:49Z

We keep running into memory allocation failures when processing large files. We can work around them by increasing memory allocations, but this is messy and slow, and it still fails when even larger files are input. For most parts of the pipeline, we should be able to solve this by rewriting processes to stream their input files rather than reading them into memory in full at the start.
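As a rough illustration of the streaming approach described above, here is a minimal Python sketch. It assumes a line-oriented text input that may be gzip-compressed. The function names (`stream_lines`, `summarize`) and the summary it computes are hypothetical and not taken from the pipeline. The point is only that the process holds one line in memory at a time, so peak memory does not grow with file size.

```python
import gzip
from typing import Iterator, Tuple


def stream_lines(path: str) -> Iterator[str]:
    """Yield lines one at a time instead of loading the whole file.

    Hypothetical helper: treats any path ending in ".gz" as gzip-compressed.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Iterating over the handle reads incrementally through a buffer,
        # unlike handle.read() or handle.readlines(), which load everything.
        for line in handle:
            yield line.rstrip("\n")


def summarize(path: str) -> Tuple[int, int]:
    """Return (number of lines, total characters excluding newlines).

    Only running totals are kept, so memory use stays roughly constant
    regardless of input size.
    """
    n_lines = 0
    n_chars = 0
    for line in stream_lines(path):
        n_lines += 1
        n_chars += len(line)
    return n_lines, n_chars
```

The same pattern should carry over to any process whose output can be computed from running totals or written out record by record. Steps that genuinely need the whole input at once (for example, a global sort) would need a different approach.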

willbradshaw added the enhancement (New feature or request) and priority_2 labels on Dec 10, 2024

willbradshaw added the priority_1 and time&cost (Changes to improve the pipeline's runtime and computational cost) labels and removed the priority_2 label on Dec 17, 2024
