Create a conventional netcdf using to_netcdf() #7329
-
Hi, I'm having trouble understanding how to write a netCDF4 file with two groups (two sets of variables with different dimensions) using xarray's to_netcdf() in Python.
I would also prefer not to build very large 4D and 5D datasets in memory and then dump them to a file. Instead, I'd like to write one 2D array at a time and, for 3D variables, one level at a time, so the chunk size would be a single 2D (lat, lon) slab. The reason is that I have only 5GB to work with. In the past I've done this in Python by first creating the netCDF container and then writing to it as described above, but this is slow, and I would like to leverage xarray for this and perhaps use its Dask interface.
Replies: 2 comments 6 replies
-
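You should be able to do this using the group and mode="a" arguments to to_netcdf(), like this:

import xarray as xr

ds1 = xr.Dataset({"a": 0})
ds2 = xr.Dataset({"b": 1})
# The first call creates file.nc and writes ds1 into group "A".
ds1.to_netcdf("file.nc", group="A")
# mode="a" appends group "B" to the existing file instead of overwriting it.
ds2.to_netcdf("file.nc", group="B", mode="a")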
Another way would be to use xarray-datatree.
Do you mean you only have 5GB of RAM? Or 5GB of data? Dask is intended to help if it's the former.
Consider using Zarr. It is similar to netCDF and also supports groups, but it is chunked natively, which will help you avoid various performance bottlenecks when you later use Dask on your data.
-
One more thing I would like to ask: I was not able to get xarray to write conventional variable attributes as seen in the dimensions:
How do I set up a time variable that is not a pandas string, for example?
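One possible reading of this question is that the time axis should be stored as numbers with CF-style units, and that variables should carry attributes such as units and long_name. Under that assumption, a sketch could look like this: attributes are attached when the variables are created, and the on-disk time representation is controlled through the encoding argument of to_netcdf(). All names, attribute values, and the reference date are hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2000-01-01", periods=3, freq="D")

# Attributes are passed as the third element of each (dims, data, attrs) tuple.
ds = xr.Dataset(
    {"temp": (("time",), np.array([1.0, 2.0, 3.0]),
              {"units": "K", "long_name": "air temperature"})},
    coords={"time": ("time", times, {"standard_name": "time"})},
)

# The time units and calendar go in the encoding, not in attrs;
# xarray converts the datetimes to numbers when writing.
encoding = {
    "time": {
        "units": "hours since 2000-01-01 00:00:00",
        "calendar": "standard",
        "dtype": "float64",
    }
}

path = os.path.join(tempfile.mkdtemp(), "cf_time.nc")
ds.to_netcdf(path, encoding=encoding)

# decode_times=False shows what was actually stored in the file.
with xr.open_dataset(path, decode_times=False) as raw:
    raw_units = raw["time"].attrs["units"]
    raw_values = raw["time"].values.tolist()
    temp_units = raw["temp"].attrs["units"]
```

Opening the file without decode_times=False should give back datetime values, since xarray decodes the stored units and calendar on read.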