Custom aggregation of dynamic branches #1385

wlandau · 2024-11-21T21:40:54Z

Originally discussed in ropensci/tarchetypes#204.

Dynamic branch aggregation in targets is not as efficient as it could be. First, targets downloads and reads every single branch sequentially, then it applies the aggregation method determined by the iteration method of the target. The technique proposed in ropensci/tarchetypes#204 is much faster, but it's a low-level hack. It would be nice to arrive at something that fits well into targets natively.
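To make the two-step pattern above concrete, here is a minimal base R sketch. It is not the targets implementation: the temporary store, `read_branch()`, and `aggregate_branches()` are hypothetical names invented for illustration, and `do.call(rbind, ...)` stands in for whatever combiner the real iteration method uses.

```r
# Hypothetical stand-in for a data store; this is NOT how targets lays out files.
store <- file.path(tempdir(), "branch_store_sketch")
dir.create(store, showWarnings = FALSE)

# Pretend three dynamic branches were each saved as a separate RDS file.
branch_names <- sprintf("branch_%d", 1:3)
for (i in seq_along(branch_names)) {
  branch <- data.frame(id = i, value = i * 10)
  saveRDS(branch, file.path(store, paste0(branch_names[i], ".rds")))
}

# Step 1: read every branch, one at a time (hypothetical helper).
read_branch <- function(name) {
  readRDS(file.path(store, paste0(name, ".rds")))
}
branches <- lapply(branch_names, read_branch)

# Step 2: combine the in-memory branches according to the iteration method.
# rbind is an assumed stand-in for the "vector" combiner.
aggregate_branches <- function(branches, iteration = c("vector", "list")) {
  iteration <- match.arg(iteration)
  switch(
    iteration,
    vector = do.call(rbind, branches),
    list = branches
  )
}

combined <- aggregate_branches(branches, "vector")
```

The cost in this pattern is that every branch is fully loaded into memory before any combining happens, and the combiner never sees where or how the branches are stored. A custom aggregation hook would presumably need access to that storage information, which is where the interface questions come from.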

It's tricky because this seems to break a lot of assumptions around how I designed targets, and it's hard to know where to fit it in at the interface level. If we could assume all files are on disk, then we could add it to tar_format(). But if you have native cloud storage, or if you have CAS storage backed by a database, then it's trickier. For maximum flexibility, we might consider adding custom aggregation to tar_repository_cas(), but then you would need to know how to read every branch, and that information has to be consistent with the choice of storage format. We might have to stick with user-defined wrappers like in ropensci/tarchetypes#204, but it's worth thinking about all these options.


Aariq · 2024-12-09T18:00:28Z

This would be interesting for geotargets, as there are many potential ways to iterate on rasters (e.g. by layer or by tile) and correspondingly different ways to recombine the results.

wlandau added the type: new feature label Nov 21, 2024

wlandau self-assigned this Nov 21, 2024
