
Request: Run flow accumulation in batches for larger DEMs #221

Closed
InsolublePancake opened this issue Jan 17, 2022 · 7 comments

@InsolublePancake

I have a workflow that uses several Whitebox tools (FillDepressions, D8Pointer, D8FlowAccumulation, ExtractStreams, and a few others that process the streams). I have large DEMs that are too big to process in one go due to memory limitations, but which could otherwise be chunked and run quite easily, except that chunking doesn't make sense for the flow accumulation step.
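For reference, here is roughly the pipeline, sketched with the `whitebox` Python frontend (the file paths and the stream threshold are just placeholders):

```python
import whitebox

wbt = whitebox.WhiteboxTools()
wbt.set_working_dir("/data/dem")  # placeholder path

wbt.fill_depressions("dem.tif", "dem_filled.tif")
wbt.d8_pointer("dem_filled.tif", "d8_pointer.tif")
wbt.d8_flow_accumulation("dem_filled.tif", "flow_accum.tif", out_type="cells")
wbt.extract_streams("flow_accum.tif", "streams.tif", threshold=1000.0)
```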

I wonder whether it would be possible to do this by running flow accumulation on one chunk of the DEM, then 'seeding' the adjacent chunk with the accumulation values along its shared edge.
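Purely to illustrate the seeding idea (this is not an existing WBT feature), a numpy sketch for two vertically adjacent chunks, assuming WBT's default D8 pointer codes where 4 = SE, 8 = S, and 16 = SW are the directions that drain into the chunk below:

```python
import numpy as np

def edge_seeds(accum_a, pntr_a):
    """Carry flow accumulation from chunk A's southern edge into chunk B.

    Returns a row of 'seed' values that would be added as initial loads
    along chunk B's northern edge before accumulating chunk B.
    """
    ncols = accum_a.shape[1]
    seeds = np.zeros(ncols)
    edge_accum = accum_a[-1, :]  # accumulation along A's last (southern) row
    edge_pntr = pntr_a[-1, :]    # D8 directions along the same row
    for col in range(ncols):
        d = edge_pntr[col]
        if d == 8:                           # due south -> same column in B
            seeds[col] += edge_accum[col]
        elif d == 4 and col + 1 < ncols:     # southeast -> one column right
            seeds[col + 1] += edge_accum[col]
        elif d == 16 and col > 0:            # southwest -> one column left
            seeds[col - 1] += edge_accum[col]
    return seeds
```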

Any consideration of this would be greatly appreciated. Thanks!

@jfbourdon
Contributor

We also have to run DEMs that are too large, and we deal with that by using Isobasins (on a lower-resolution DEM) to split the DEM in a hydrological way. Using GDAL, we can then crop the original DEM into the chunks we need to pass to WBT, and we reconnect the streams together afterwards. However, when using this method (Isobasins on a lower-resolution DEM), you will need to add a buffer to each chunk, as the basin perimeters might not fit exactly when applied to the higher-resolution DEM. Some cleanup might also be needed.
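As a rough sketch of the cropping step with GDAL's Python bindings (the paths, basin bounds, and buffer width are just example values):

```python
from osgeo import gdal

def crop_chunk(src_dem, dst_chunk, basin_bounds, buffer_m=200.0):
    """Crop src_dem to a basin's bounding box expanded by buffer_m metres."""
    xmin, ymin, xmax, ymax = basin_bounds
    gdal.Translate(
        dst_chunk,
        src_dem,
        projWin=[xmin - buffer_m, ymax + buffer_m,   # upper-left x, y
                 xmax + buffer_m, ymin - buffer_m],  # lower-right x, y
    )

# example call with made-up coordinates
crop_chunk("dem_1m.tif", "chunk_03.tif", (452000, 5230000, 468000, 5246000))
```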

Being able to process rasters too large to fit in memory by letting WBT do the chunking would be great. It could be tricky for some operations like FillDepressions and BreachDepressions, where neighboring values (outside of the chunk) have an effect. I suppose that a buffer on each chunk would be necessary, but even then some differences between a chunked and non-chunked raster could remain. If ever implemented, the buffer size would need to be set via a parameter to ensure reproducibility.

@jblindsay
Owner

What you are asking for is a fundamental change to the flow accumulation approach used in WBT and, at an even lower level, to the way it reads and writes raster data. I'm not convinced that this is a realistic change, as it would impact a far greater surface area than just the flow accumulation tools. There are existing tools and libraries that are geared towards flow accumulation on massive DEMs. The approach that I have adopted is intended to provide good performance for the large proportion of users working with more moderately sized DEMs.

@InsolublePancake
Author

OK, fair enough if it's too difficult to implement. Can you recommend any of the libraries you allude to? We need to generate catchments and drainage lines from a DEM or a pointer dataset. Whitebox looked so promising for us!

@jblindsay
Owner

How large are the DEMs that you are trying to process and what are the memory limits on your system? Also, specifically which tools were you using in WBT for your flow accumulation workflow and which one(s) raised the out-of-memory error? I can certainly try to reduce the memory requirements of these tools further, but ultimately they will always need to read the entirety of the DEM into memory, given the way the tools are designed.

@jfbourdon
Contributor

TauDEM is certainly a possible alternative, but I don't think that you really need to look elsewhere. There are ways to circumvent this memory constraint with WBT. We used WBT to produce 1 m resolution rasters (breached DEMs, flow direction, flow accumulation, etc.) for over 400 000 km². Chunking your source DEM is the solution, or at least it's the solution we chose.

We are doing it by pre-processing a very large (>1000 km²) DEM at a lower resolution (say 5 m) using GDAL, then breaching/filling this coarse DEM before using Isobasins to split it into manageable chunks. We then process these chunks (plus a buffer) at 1 m resolution. Merging everything together afterward is not without some challenge, but it is very doable.
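A rough sketch of that pre-processing with GDAL and the `whitebox` Python frontend (the paths and the target basin size are just examples; pick a size that suits your memory budget):

```python
import whitebox
from osgeo import gdal

# resample the 1 m source DEM to 5 m
gdal.Warp("dem_5m.tif", "dem_1m.tif", xRes=5, yRes=5, resampleAlg="bilinear")

wbt = whitebox.WhiteboxTools()
# breach (or fill) the coarse DEM before delineating basins
wbt.breach_depressions("dem_5m.tif", "dem_5m_breached.tif")
# split into roughly equal-area basins; size is a target cell count
wbt.isobasins("dem_5m_breached.tif", "isobasins_5m.tif", size=4000000)
```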

@InsolublePancake
Author

@jblindsay we recently ran a DEM that was 2 GB compressed and 22 GB uncompressed, and we have larger areas that we would like to model. I trialled chunking the DEM, processing, then recombining, but most of the efficiency gained is lost and it introduces potential for error.
A lot of our DEM in this recent project was no-data because the region included lots of islands and coastline. I suppose that if there were efficiencies to be gained there, you would have already applied them.

@InsolublePancake
Author

@jfbourdon thanks for your suggestions; TauDEM looks promising and I will look into it.
Chunking is the approach I was considering, though it feels a bit Heath Robinson to chunk and recombine like this. Resampling and then running Isobasins is a good idea. I'll have a go with this, thanks.
