Implement data streaming to reduce memory needs #122
Labels
enhancement
New feature or request
priority_1
time&cost
Changes to improve the pipeline's runtime and computational cost
We keep running into memory allocation failures when processing large files. Increasing memory allocations keeps the problem at bay, but it is messy and slow, and it still fails as soon as even larger files are input. For most parts of the pipeline, we should be able to solve this by rewriting processes to stream their input files rather than reading them into memory in full at the start.
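To illustrate the kind of rewrite meant here, below is a minimal sketch in Python. It assumes a line-oriented text input that may be gzip-compressed. The function names (`open_maybe_gzip`, `stream_records`, `count_matching`) are hypothetical and not taken from the pipeline; the per-record step is a placeholder for whatever a given process actually does.

```python
import gzip
from typing import Iterator, TextIO


def open_maybe_gzip(path: str) -> TextIO:
    """Open a plain-text or gzip-compressed file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def stream_records(path: str) -> Iterator[str]:
    """Yield one line at a time so the whole file is never held in memory."""
    with open_maybe_gzip(path) as handle:
        for line in handle:
            yield line.rstrip("\n")


def count_matching(path: str, needle: str) -> int:
    """Hypothetical per-record step: count the lines that contain `needle`."""
    return sum(1 for record in stream_records(path) if needle in record)
```

With this pattern, peak memory depends on the longest single line rather than on the file size. Steps that genuinely need the full input at once (a global sort, for example) would not fit this approach and would need separate handling.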