fill values are not preserved in rechunking. #132

Closed
flamingbear opened this issue Feb 16, 2023 · 2 comments

Comments

@flamingbear
Contributor

This is a follow-up to the previous report, #131.

While diffing source.zarr against target.zarr after a rechunker run, I also noticed that the fill_value is not preserved. The script below is essentially the same as the one in #131, with an additional call to consolidate_metadata.

If you run this script, you will see that the fill_value in "foo/bar/.zarray" changes from "fill_value": 1.0 in the source zarr store to "fill_value": null in the target zarr store.

Thanks,
Matt

import shutil

import zarr
from rechunker import rechunk


def run_create_input_store():
    # Start from a clean output directory so earlier runs cannot interfere.
    shutil.rmtree('testoutput/', ignore_errors=True)
    store = zarr.DirectoryStore('testoutput/source.zarr')
    root = zarr.group(store=store, overwrite=True)
    foo = root.create_group('foo')
    root.attrs['description'] = 'root description'
    foo.attrs['description'] = 'foo description'
    # An array created with ones() gets fill_value 1.0 in its .zarray metadata.
    bar = foo.ones('bar', shape=(10, 10))
    bar[5, 5] = 3
    bar.attrs['description'] = 'foo description'
    zarr.consolidate_metadata(store)


def rechunkit():
    openstore = zarr.open_consolidated('testoutput/source.zarr')
    # Rechunk foo/bar from a single (10, 10) chunk to (5, 5) chunks.
    array_plan = rechunk(openstore, {'foo/bar': (5, 5)},
                         '1MB',
                         'testoutput/target.zarr',
                         temp_store='testoutput/temp.zarr')
    array_plan.execute()
    zarr.consolidate_metadata('testoutput/target.zarr')


if __name__ == '__main__':
    run_create_input_store()
    rechunkit()
    print('Compare the .zmetadata files in both your source.zarr and target.zarr directories')
    print('You will see that the "fill_value" in the source is 1.0 and it is null in the target.')
    source = zarr.open('testoutput/source.zarr')
    target = zarr.open('testoutput/target.zarr')
    # Print the fill_value recorded for foo/bar in the source, then in the target.
    print(source['foo']['bar'].fill_value)
    print(target['foo']['bar'].fill_value)
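
For anyone who wants to see the difference without opening the files by hand, here is a minimal sketch that compares the two .zarray metadata files directly. It uses only the standard library and assumes the stores are zarr v2 directory stores, where each array directory holds a JSON file named .zarray. The helper name diff_zarray_metadata is made up for this illustration and is not part of zarr or rechunker.

```python
import json
from pathlib import Path


def diff_zarray_metadata(source_array_dir, target_array_dir):
    """Return {key: (source_value, target_value)} for each differing .zarray key.

    Hypothetical helper for illustration only. A key that is missing on one
    side is treated as None, so it is indistinguishable from a JSON null.
    """
    source_meta = json.loads((Path(source_array_dir) / ".zarray").read_text())
    target_meta = json.loads((Path(target_array_dir) / ".zarray").read_text())

    differences = {}
    for key in sorted(set(source_meta) | set(target_meta)):
        source_value = source_meta.get(key)
        target_value = target_meta.get(key)
        if source_value != target_value:
            differences[key] = (source_value, target_value)
    return differences
```

After running the script above, calling diff_zarray_metadata('testoutput/source.zarr/foo/bar', 'testoutput/target.zarr/foo/bar') should report the fill_value entry described in this issue. The chunks entry will also show up, which is expected because the array was deliberately rechunked.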
@flamingbear
Contributor Author

Maybe the fill value is written into the data? The grids themselves look the same. I will check my other "real" output.

@flamingbear
Contributor Author

I'm closing this and will open a new issue about the same problem that shows it more clearly.
