Allow setting (or skipping) new indexes in open_dataset #8051
@pydata/xarray any thoughts on which option among those above (top comment) would be best?
I vote for this.
Is there any way for packages to say they support V1 or V2 of an entrypoint? I just realized that if we turned off indexes by default, it would be a big win for open_mfdataset in many cases.
Doesn't help with the problem at this moment, but could we add having
Agreed, adding **kwargs to the standard would help! However, to be honest I find it already a bit confusing how kwargs are handled in
This looks great to me! I agree with adding this into For what it's worth, I think it's OK to require backend developers to update their code more frequently -- we don't need the same level of stability that we need for user-level APIs.
I would love to see this merged so that I can try this out!
Be aware that merging now will break compatibility with any 3rd-party backend, which I believe is not something we should do, even if we think that the transition window can be shorter than usual. In my eyes the easiest way forward would be:
We don't have an easy way to contact all backend developers, unfortunately. Edit: let's discuss in the meeting today.
In the meeting just now we decided to inspect the signature of the backend's We should still change the spec to require
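The signature-inspection idea from the meeting could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code in this PR: accepts_keyword, open_with_backend, LegacyBackend and UpdatedBackend are hypothetical names, and the backends return plain dicts instead of real datasets. The point is that set_indexes is forwarded only when the backend's open_dataset signature can take it, either by name or through **kwargs.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if ``func`` can be called with keyword argument ``name``."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    named_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if param is not None and param.kind in named_kinds:
        return True
    # A **kwargs catch-all accepts any keyword argument.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class LegacyBackend:
    """Hypothetical 3rd-party backend written before ``set_indexes`` existed."""

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        return {"source": filename_or_obj, "indexes": "default"}


class UpdatedBackend:
    """Hypothetical backend that already understands ``set_indexes``."""

    def open_dataset(self, filename_or_obj, *, drop_variables=None, set_indexes=True):
        indexes = "default" if set_indexes else "none"
        return {"source": filename_or_obj, "indexes": indexes}


def open_with_backend(backend, filename_or_obj, *, set_indexes=True, **kwargs):
    """Forward ``set_indexes`` only to backends whose signature accepts it."""
    if accepts_keyword(backend.open_dataset, "set_indexes"):
        kwargs["set_indexes"] = set_indexes
    return backend.open_dataset(filename_or_obj, **kwargs)
```

With this approach an older backend keeps working unchanged (it simply ignores the request to skip indexes), while an updated backend honours set_indexes=False.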
This PR introduces a new boolean parameter set_indexes=True to xr.open_dataset(), which may be used to skip the creation of default (pandas) indexes when opening a dataset.
Currently works with the Zarr backend.
I'll add it to the other Xarray backends as well, but I'd like to get your thoughts about the API first.
xr.open_dataset()? There are already many...
BackendEntrypoint.open_dataset() API?
xr.open_dataset()
set_indexes in the signature in addition to the drop_variables parameter, this is a breaking change for all existing 3rd-party backends. Or should we group set_indexes with the other xarray decoder kwargs? This would feel a bit odd to me as setting indexes is different from decoding data.
Currently 1 and 2 are implemented in this PR, although as I write this comment I think that I would prefer 3. I guess this depends on whether we prefer open_*** vs. xr.open_dataset(engine="***") and unless I missed something there is still no real consensus about that? (e.g., #7496).
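To make the breaking-change concern concrete, here is a small sketch of what happens when a caller unconditionally passes set_indexes to a backend whose open_dataset was written against the older signature. LegacyBackend and call_with_set_indexes are hypothetical names used only for illustration, and the returned dict stands in for a real dataset.

```python
class LegacyBackend:
    """Hypothetical 3rd-party backend with the pre-PR signature."""

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        return {"source": filename_or_obj}


def call_with_set_indexes(backend, filename_or_obj):
    """Call the backend as if ``set_indexes`` were part of the required API."""
    try:
        backend.open_dataset(filename_or_obj, drop_variables=None, set_indexes=False)
    except TypeError as err:
        # Python rejects the unknown keyword before the backend body runs.
        return f"TypeError: {err}"
    return "ok"


print(call_with_set_indexes(LegacyBackend(), "store.zarr"))
```

The call fails with a TypeError about an unexpected keyword argument, which is why requiring set_indexes in the signature would break existing backends until they are updated.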