This work was done as part of the National Center for Atmospheric Research (NCAR) Summer Internship in Parallel Computational Science (SIParCS).
This repo contains a series of workflow tests comparing methods of accessing cloud-hosted data. Additional material, past presentations, posters, and interactive code tutorials are available here: https://lucassterzinger.com/2021-siparcs-poster/
Tests were performed on the Microsoft Planetary Computer with data hosted on Azure Blob object storage. Both the Planetary Computer instance and the data were located in the EU-West region.
More data for testing may be added in the future.
Format | Preprocess Time | Dataset Open Time | Workflow Time | Storage Needed |
---|---|---|---|---|
netCDF4 (native) | 0 min | 10 min | 46 min | 0 GB |
Zarr | 1 hr 38 min | 30 sec | 4 min 10 sec | 52 GB |
ReferenceMaker (Kerchunk) | 1 hr 25 min | 35 sec | 4 min 30 sec | 416 MB |
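
One way to read the table is to ask how many workflow runs it takes before the one-time preprocessing cost pays for itself. The sketch below is only an illustration built from the timings above. It assumes that preprocessing happens once, that every run reopens the dataset, and that the timings stay constant between runs; none of these assumptions were tested here. The helper names (`to_seconds`, `total_time`, `break_even_runs`) are hypothetical and are not part of this repo.

```python
import math


def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert an hours/minutes/seconds duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Timings copied from the table above (all values in seconds).
timings = {
    "netCDF4 (native)": {
        "preprocess": to_seconds(),
        "open": to_seconds(minutes=10),
        "workflow": to_seconds(minutes=46),
    },
    "Zarr": {
        "preprocess": to_seconds(hours=1, minutes=38),
        "open": to_seconds(seconds=30),
        "workflow": to_seconds(minutes=4, seconds=10),
    },
    "Kerchunk": {
        "preprocess": to_seconds(hours=1, minutes=25),
        "open": to_seconds(seconds=35),
        "workflow": to_seconds(minutes=4, seconds=30),
    },
}


def total_time(fmt, runs):
    """Total seconds for `runs` workflow runs.

    Assumption: preprocessing is paid once, open + workflow on every run.
    """
    t = timings[fmt]
    return t["preprocess"] + runs * (t["open"] + t["workflow"])


def break_even_runs(fmt, baseline="netCDF4 (native)"):
    """Smallest number of runs at which `fmt` is strictly faster overall
    than `baseline`, or None if it never catches up."""
    t, b = timings[fmt], timings[baseline]
    saved_per_run = (b["open"] + b["workflow"]) - (t["open"] + t["workflow"])
    if saved_per_run <= 0:
        return None
    return math.floor(t["preprocess"] / saved_per_run) + 1


if __name__ == "__main__":
    for fmt in ("Zarr", "Kerchunk"):
        print(fmt, "breaks even after", break_even_runs(fmt), "runs")
```

Under these assumptions, both Zarr and Kerchunk are slower than native netCDF4 for a single run but come out ahead from the second run onward.
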
Email: [email protected]
Twitter: @lucassterzinger