Convert xarray dataset to dask dataframe or delayed objects #1093
Comments
This is a good use case for dask collection duck typing: dask/dask#1068
I'm not sure if I follow how this is a duck typing use case. I'd write this as a method, following your suggestion on SO.
Can you explain why you think this could benefit from collection duck typing?
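For concreteness, here is a minimal sketch of what a method-style helper could look like, built only on `dask.delayed` and `Dataset.to_dataframe`. The name `to_delayed_dataframes` and its parameters are hypothetical illustrations, not an existing xarray API:

```python
import dask
import xarray as xr

def to_delayed_dataframes(dataset: xr.Dataset, dim: str, step: int):
    """Split ``dataset`` along ``dim`` into blocks of ``step`` and lazily
    convert each block to a pandas.DataFrame via dask.delayed."""
    pieces = []
    for start in range(0, dataset.sizes[dim], step):
        # Plain label-free indexing creates a lazy sub-dataset ...
        sub = dataset.isel({dim: slice(start, start + step)})
        # ... and dask.delayed defers the conversion to pandas.
        pieces.append(dask.delayed(sub.to_dataframe)())
    return pieces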
Then we could use xarray's normal indexing operations to create new sub-datasets and wrap them with dask.delayed.
The other component that would help for this is some utility function inside xarray to split a chunked Dataset into sub-datasets, one per dask chunk:

```python
import itertools

def split_by_chunks(dataset):
    # For each dimension, build the list of slices that cover its dask chunks.
    chunk_slices = {}
    for dim, chunks in dataset.chunks.items():
        slices = []
        start = 0
        for chunk in chunks:
            stop = start + chunk
            slices.append(slice(start, stop))
            start = stop
        chunk_slices[dim] = slices
    # Yield every combination of per-dimension slices together with the
    # corresponding sub-dataset.
    for slices in itertools.product(*chunk_slices.values()):
        selection = dict(zip(chunk_slices.keys(), slices))
        yield (selection, dataset[selection])
```
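A minimal usage sketch of the helper above; the example Dataset, its variable and dimension names, and the chunk sizes are made up for illustration:

```python
import dask
import numpy as np
import xarray as xr

# Build a small chunked Dataset and split it with split_by_chunks (defined above).
ds = xr.Dataset(
    {"t": (("x", "y"), np.random.rand(100, 50))}
).chunk({"x": 25, "y": 25})

# Wrap each piece so that it evaluates to an in-memory xarray.Dataset.
delayed_datasets = [
    dask.delayed(piece.compute)() for _, piece in split_by_chunks(ds)
]
print(len(delayed_datasets))  # 8 pieces: 4 chunks along x * 2 along y
```

Each entry computes to a Dataset covering exactly one dask chunk, which is the "delayed objects where each object is a Dataset" shape asked about below.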
I think this was closed by mistake. Is there a way to split up Dataset chunks into dask delayed objects where each object is a Dataset?
In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity. If this issue remains relevant, please comment here or remove the stale label.
It would be great to have a function like dask's to_delayed that takes an xarray Dataset and converts it to pandas DataFrames chunk by chunk.
http://stackoverflow.com/questions/40475884/how-to-convert-an-xarray-dataset-to-pandas-dataframes-inside-a-dask-dataframe
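Following the Stack Overflow suggestion, here is a sketch of how the pieces could be assembled into a dask DataFrame with dask.dataframe.from_delayed. It reuses `ds` and `split_by_chunks` from the comments above, both of which are assumptions of this sketch rather than part of the original post:

```python
import dask
import dask.dataframe as dd

# Convert each sub-dataset to a pandas DataFrame, resetting the index to plain
# columns since dask DataFrames do not handle a MultiIndex well, then stitch
# the lazy frames into one dask DataFrame.
delayed_frames = [
    dask.delayed(lambda piece: piece.to_dataframe().reset_index())(subset)
    for _, subset in split_by_chunks(ds)
]
ddf = dd.from_delayed(delayed_frames)
```

Without an explicit `meta` argument, from_delayed infers the column schema by computing the first partition, which is usually acceptable for a one-off conversion like this.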