preprocess (callable(), optional) – If provided, call this function on each dataset prior to concatenation. You can find the file-name from which each dataset was loaded in ds.encoding["source"].
I expected to be able to use ds.encoding["source"] in my preprocess function to retrieve the filename. However, I get a KeyError: 'source' (full traceback under "Relevant log output").
What did you expect to happen?
I expected the doc to be correct, unless I missed something trivial.
Minimal Complete Verifiable Example
import xarray as xr


def preprocess_xarray_no_class(ds):
    filename = ds.encoding["source"]
    # add new filename variable with time dimension
    ds = ds.assign(filename=(("time",), [filename]))
    return ds


# fileset: the input files (its definition is not shown in this report)
ds = xr.open_mfdataset(
    fileset,
    preprocess=preprocess_xarray_no_class,
    engine="h5netcdf",
    concat_characters=True,
    mask_and_scale=True,
    decode_cf=True,
    decode_times=True,
    use_cftime=True,
    parallel=True,
    decode_coords=True,
    compat="equals",
)
MVCE confirmation
Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
Complete example — the example is self-contained, including all data and the text of any traceback.
Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
New issue — a search of GitHub Issues suggests this is not a duplicate.
Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
...
      1 def preprocess_xarray_no_class(ds):
----> 2     filename = ds.encoding["source"]
      3     ds = ds.assign(
      4         filename=(("time",), [filename])
      5     )  # add new filename variable with time dimension

KeyError: 'source'
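Until the cause is pinned down, the KeyError itself can be sidestepped by reading the encoding defensively. The following is only a sketch of a possible workaround, not a fix for the underlying problem: the helper names `source_or_default` and `preprocess_with_fallback` and the placeholder string are made up for illustration, and repeating the name along the whole "time" dimension is an assumption that differs slightly from the example above, which implies one time step per file.

```python
def source_or_default(ds, default="<unknown source>"):
    """Return ds.encoding["source"] when it is set, otherwise a placeholder."""
    # Dataset.encoding is a plain dict, so .get() avoids the KeyError.
    return ds.encoding.get("source", default)


def preprocess_with_fallback(ds):
    """Attach the source name as a variable along "time" without failing."""
    filename = source_or_default(ds)
    # Assumption: repeat the name for every time step, so files holding
    # more than one time step are handled as well.
    n_time = ds.sizes["time"]
    return ds.assign(filename=(("time",), [filename] * n_time))
```

It would be passed as `xr.open_mfdataset(fileset, preprocess=preprocess_with_fallback, ...)`. When "source" is missing, the placeholder ends up in the result, which at least makes the gap visible instead of aborting the load.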
This depends on what fileset is, unfortunately. If it contains a list of strings (filepaths, URLs, or a glob), then yes, it not working might be a bug. If those are fsspec objects, though, we need #8923.
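To tell which of the two cases applies, it may help to look at what fileset actually contains before calling open_mfdataset. A minimal sketch, assuming fileset is either a single path or glob string, or an iterable of entries; `describe_fileset` is a hypothetical helper, not part of xarray or fsspec:

```python
import os


def describe_fileset(fileset):
    """Label each entry as a plain path or by the name of its type."""
    # A single string (for example a glob pattern) is treated as one entry.
    if isinstance(fileset, (str, os.PathLike)):
        fileset = [fileset]
    report = []
    for entry in fileset:
        if isinstance(entry, (str, os.PathLike)):
            kind = "path"
        else:
            # Anything else, such as an already opened file object.
            kind = type(entry).__name__
        report.append((kind, entry))
    return report
```

If every entry is reported as "path", the missing "source" key would point at the possible bug mentioned above. Any other type name suggests that file-like objects are being passed.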
What happened?
Looking at the doc https://docs.xarray.dev/en/stable/generated/xarray.open_mfdataset.html
Anything else we need to know?
No response
Environment
xarray: 2024.6.0
pandas: 2.2.2
numpy: 1.26.4
scipy: 1.13.1
netCDF4: 1.7.1
pydap: None
h5netcdf: 1.3.0
h5py: 3.11.0
zarr: 2.18.2
cftime: 1.6.4
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.3.8
dask: 2024.6.0
distributed: 2024.6.0
matplotlib: 3.9.0
cartopy: None
seaborn: 0.13.2
numbagg: 0.8.1
fsspec: 2024.6.0
cupy: None
pint: None
sparse: None
flox: 0.9.7
numpy_groupies: 0.11.1
setuptools: 70.0.0
pip: 24.0
conda: None
pytest: 8.2.2
mypy: 1.10.0
IPython: 7.34.0
sphinx: None