I'm confused as to why using the `chunks` argument of `xr.open_mfdataset()` does not give me the chunk sizes I ask for. An example dataset can be obtained using:

```python
import earthaccess
import xarray as xr

auth = earthaccess.login()
results = earthaccess.search_data(
    short_name = 'M2SDNXSLV',
    temporal = ("2024-01-01", "2024-05-31"))
# Could take about 1 minute on a broadband connection
earthaccess.download(results, 'data_raw/MERRA2')
```

This example easily fits into memory, but it is an example for a tutorial I'm working on. A more salient reason for wanting a specific kind of chunking would be, e.g., a desire to calculate long-term trends, so all the elements along the `time` axis should be in the same chunk.

Specifically, when I use `chunks = 'auto'`:

```python
ds = xr.open_mfdataset('./data_raw/MERRA2/*.nc4', chunks = 'auto')
ds['T2MMEAN'].chunksizes
```

It creates a separate chunk for each file (each time step), which is not ideal. If I instead try:

```python
# This doesn't give the desired result
ds = xr.open_mfdataset('./data_raw/MERRA2/*.nc4', chunks = {'time': 122})
# Nor does this
ds = xr.open_mfdataset('./data_raw/MERRA2/*.nc4', chunks = {'lat': 91, 'lon': 144, 'time': 122})
```

I still get chunks that do not have 122 elements along the `time` axis.

```python
# This finally does it
ds.chunk({'time': 122})
```
Replies: 1 comment
The `chunks` argument is applied on a per-file basis, so this is expected. We'd happily merge a PR making this point clear in the documentation:
xarray/xarray/backends/api.py
Lines 861 to 866 in 9237f90
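
To make the per-file behaviour concrete, here is a minimal sketch that imitates it without downloading anything. It is not what `open_mfdataset` does internally line by line: it assumes each file holds a single time step, stands in for each file with a small in-memory dataset, and combines them with `xr.concat`. The grid sizes and the names `per_file`, `combined`, and `rechunked` are made up for illustration, and dask needs to be installed.

```python
import numpy as np
import xarray as xr

# Small hypothetical grid, not the real MERRA-2 shape
n_time, n_lat, n_lon = 8, 4, 6

# Stand-ins for the individual files: one time step per "file".
# Each one is chunked on its own, so a request for n_time elements
# along 'time' can only ever cover the single step that file holds.
per_file = [
    xr.Dataset(
        {"T2MMEAN": (("time", "lat", "lon"), np.random.rand(1, n_lat, n_lon))},
        coords={"time": [t]},
    ).chunk({"time": n_time})
    for t in range(n_time)
]

# Combining keeps the chunk boundaries that each piece already had
combined = xr.concat(per_file, dim="time")
per_file_chunks = combined["T2MMEAN"].chunksizes["time"]

# Rechunking after the combine step can merge chunks across "files"
rechunked = combined.chunk({"time": n_time})
merged_chunks = rechunked["T2MMEAN"].chunksizes["time"]

print("chunks after combining:", per_file_chunks)
print("chunks after .chunk():", merged_chunks)
```

Under those assumptions, the first result should show one chunk per time step and the second a single chunk spanning the whole `time` axis, which matches the question: only calling `.chunk()` on the combined dataset gives the requested layout.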