Error when writing to disk a dask-backed xarray #219
Comments
Hi, apologies for the delay...
I have tried that branch and the code now works, thank you!
Sorry to reopen already, I actually noticed a problem. There is no error in writing the files and the output looks the same, but what is written is slightly different in the Here is the first chunk of the image, first written from an in-memory array and then from the delayed representation. As you can see the
Thanks for the feedback @LucaMarconato, I wonder whether you are encountering the same issue that was attempted to be fixed in #197. As this branch is currently conflicting with the inclusion of the dask writing code, could you try updating the corresponding location in the writer code to use
I changed the
but the coordinate transformations remain the same as in the screenshot above.
@LucaMarconato could you try #220 which should fix the
I don't know the reason for the compressor being different. I can only assume different default behaviour between
I tried the fix in #197 but it doesn't change the default behaviour in the example code above (when no compressor is specified).
I tried #220 and it fixes the
I have tried both passing
Since the full resolution image (
I think this is now failing due to the bug fixed in #197.
I now get no difference between downscaled chunks.
Thanks, I have tried with #221 but I still get different chunks. Here is the code that I use, updated from above. I also check for numerical approximation errors (it's not the cause).
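The script referred to here is not preserved in this thread. As a rough sketch of what a check for numerical approximation errors might look like, the snippet below separates "bit-for-bit equal" from "equal within a tolerance" for two sequences of pixel values. The helper name `compare_pixels` and the tolerances are illustrative assumptions, not part of the original script; with NumPy arrays the corresponding checks would be `np.array_equal` and `np.allclose`.

```python
import math


def compare_pixels(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two equal-length sequences of pixel values.

    Returns (exactly_equal, close_within_tolerance, max_abs_diff).
    Hypothetical helper, for illustration only.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(a, b))
    exactly_equal = all(x == y for x, y in pairs)
    close = all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol) for x, y in pairs
    )
    max_abs_diff = max((abs(x - y) for x, y in pairs), default=0.0)
    return exactly_equal, close, max_abs_diff


# Floating-point round-off only: not identical, but within tolerance.
print(compare_pixels([0.1 + 0.2, 0.5], [0.3, 0.5]))

# A genuine difference: neither identical nor within tolerance.
print(compare_pixels([0.3, 0.5], [0.3, 0.75]))
```

If the second element of the result is `False`, as in the last call, the mismatch is larger than round-off and has to come from somewhere else.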
Thanks for that...
It's hard to "see" what's going on with the random pixel data, so I tried using a sample generated with...
Then I tweaked your script to be able to use this (not overwrite if full script
Running the script.... the
To visually compare them I opened the lower resolution array in napari...
Then I dragged the dir
I'm not entirely sure why this is happening. In order to dig deeper, we'd need a test that directly compares the resize methods themselves, comparing resize of different shaped arrays etc. Hopefully that's possible, but if they don't get any closer in their output, how much of a problem is that? Anyway, thanks for highlighting these differences. It's good to know, and I'll see if I can understand what's going on a bit better...
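A test of that kind could be structured roughly as follows. This is only a sketch: `downscale_nearest` and `downscale_mean` are stand-in resize functions written for this example (they are not the methods used by the writer code), and `compare_resizers` is a hypothetical harness that feeds the same synthetic image to both and records the largest pixel difference for each input shape.

```python
def downscale_nearest(img, factor=2):
    """Stand-in resizer: keep every `factor`-th pixel, no averaging."""
    return [row[::factor] for row in img[::factor]]


def downscale_mean(img, factor=2):
    """Stand-in resizer: average each factor x factor block.

    Assumes both image dimensions are divisible by `factor`.
    """
    out = []
    for r in range(0, len(img), factor):
        out_row = []
        for c in range(0, len(img[0]), factor):
            block = [img[r + i][c + j] for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def max_abs_diff(a, b):
    """Largest absolute pixel difference between two same-shaped 2D lists."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def compare_resizers(resize_a, resize_b, shapes):
    """Run both resizers on a ramp image of each shape; return max diff per shape."""
    results = {}
    for height, width in shapes:
        # Deterministic ramp image, easier to reason about than random pixels.
        img = [[float(r * width + c) for c in range(width)] for r in range(height)]
        results[(height, width)] = max_abs_diff(resize_a(img), resize_b(img))
    return results


print(compare_resizers(downscale_nearest, downscale_mean, [(4, 4), (8, 6)]))
```

Swapping the two stand-ins for the real in-memory and dask resize paths would show, per shape, whether the outputs differ and by how much.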
Perfect, thanks a lot for the detailed explanation! I have modified our tests so that we compare only the full resolution image, since those small differences should not be a problem for visualization or downstream tasks based on the lower resolution images.
see discussion here ome/ome-zarr-py#219
Dask writing support now released in
The following code example produces an error on the last line.
The code does the following:
The error is the following:
Investigating the cause of the problem, I found the following. In the first `write_image()` call, the `im` array, which is an `np.ndarray`, is converted to a `zarr.core.Array` in `creation.py` (see the bold line in the stack trace). More precisely, first an empty object is initialized and then it is filled with data.

After the instantiation line, the object `z` has an attribute `chunks` which is `(3, 100, 100)`. So far so good.

When instead `write_image()` is called for the second time, the `im` array is a `dask.array.core.Array`, and after the instantiation line above the attribute `z.chunks` is `([3], [100], [100])`. This triggers the error that I am reporting at the line that follows (filling `z` with data).

From a performance point of view, what my code is asking is to take data that is on disk and write it somewhere else on disk, without the need to load the content in memory. So a workaround would be that I modify the `write_to_zarr()` function so that if the `im` data that I want to copy is a `dask.array.core.Array`, I instead manually do a copy at the file level, without using the `ome-zarr-py` APIs.

Any advice on how to proceed?
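To make the chunk mismatch described above concrete: a dask array's `.chunks` attribute lists the block sizes along each axis, for example `((3,), (100,), (100,))`, whereas a zarr array is created with one integer per axis, for example `(3, 100, 100)`. The sketch below shows one way the nested form could be collapsed into the flat one. `flatten_chunks` is a hypothetical helper written for this illustration, not a function from `ome-zarr-py`, `zarr` or `dask`, and whether this is where the fix belongs is an assumption.

```python
def flatten_chunks(chunks):
    """Collapse per-axis block sizes into one chunk length per axis.

    Accepts a flat value such as (3, 100, 100) or a nested, dask-style
    value such as ((3,), (100,), (100,)). Hypothetical helper.
    """
    flat = []
    for axis_chunks in chunks:
        if isinstance(axis_chunks, int):
            # Already a single chunk length for this axis.
            flat.append(axis_chunks)
        else:
            # Several blocks along this axis: take the largest one,
            # since a trailing block may be smaller than the rest.
            flat.append(max(axis_chunks))
    return tuple(flat)


print(flatten_chunks(((3,), (100,), (100,))))   # nested, dask-style
print(flatten_chunks(([3], [100], [100])))      # the value reported above
print(flatten_chunks((3, 100, 100)))            # already flat
```

For an actual dask array the `chunksize` attribute gives the same flat tuple directly, so a helper like this would mainly be useful when only the nested value is at hand.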