byteskip -1 fails #70

tashrifbillah · 2018-11-05T23:53:23Z

When I set byteskip: -1 in the header, the data file should load from the end. Instead, I see the following error message.

May be we need to add a condition to check if byteskip is -1, and then rewrite line 383.

@ihnorton , UKFTractography ran well, generated .vtk file fine. We just need to make this fix to run other Pynrrd dependent CLIs.

In [2]: nrrd.read('EddyCorrect-DWI.nhdr')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-d45a7d091315> in <module>()
----> 1 nrrd.read('EddyCorrect-DWI.nhdr')

~/anaconda3/lib/python3.6/site-packages/nrrd/reader.py in read(filename, custom_field_map)
    427     with open(filename, 'rb') as fh:
    428         header = read_header(fh, custom_field_map)
--> 429         data = read_data(header, fh, filename)
    430 
    431         return data, header

~/anaconda3/lib/python3.6/site-packages/nrrd/reader.py in read_data(header, fh, filename)
    381         # Byte skip is applied AFTER the decompression. Skip first x bytes of the decompressed data and parse it using
    382         # NumPy
--> 383         data = np.frombuffer(decompressed_data[byte_skip:], dtype)
    384 
    385     # Close the file

ValueError: buffer size must be a multiple of element size

@addisonElliott , any idea/commit would be appreciated.

The text was updated successfully, but these errors were encountered:

addisonElliott · 2018-11-06T00:54:28Z

NRRD documentation states the following about byte skip (reference):

If skip is greater than or equal to zero, it tells how many bytes to skip in a data file in order to get to the beginning of the data. As an idiom copied from the MetaImage file format, the value of skip can be -1. This is valid only with raw encoding. The action of this byte skip is to fseek() backwards from the end of the data file, to the beginning of the data. The distance to seek is calculated from the nrrd type and axis sizes. This is a useful trick for getting at binary data in other formats with unknown (or variable) length binary headers, such as DICOM, TIFF, and BMP, but only if the data is uncompressed, and only if the end of the data is contiguous with the end of the file (which can fail to be the case in DICOM and TIFF).

Therefore, line 383 shouldn't be edited because that only occurs for compressed data. Rather, line 353 is more appropriate.

The pseudocode should be something like:

if byte_skip == -1:
    fh.seek(-1 * sizeof(data_type) * np.prod(shape), os.SEEK_END)
else:
    fh.seek(byte_skip, os.SEEK_CUR)

With this in mind, maybe it would be a good idea to check if byte_skip is negative for anything besides raw encoded data and throw an error. Furthermore, even throw an error if the byte skip is negative and not equal to -1, which is undefined. Can even apply this to line skip which can only be >= 0.

This is a good catch, I never noticed this about the NRRD documentation. Pull requests are welcome! (Not sure when I would have time to create a PR on this).

tashrifbillah · 2018-11-06T01:41:31Z

I'll try to work on this PR this week.

Just a heads up, even if data is gzip encoded, it was correctly loaded by Slicer (may be itk/vtk takes care of it) when byteskip is set to -1.

addisonElliott · 2018-11-06T02:24:25Z

Oh okay, wasn't aware of this, then I think we should add support for byteskip = -1 for raw and compressed data. Maybe even do this for ASCII as well?

Slicer uses VTK/ITK to load NRRD's I believe which uses Teem's implementation of NRRD format. Maybe they support byteskip = -1 for compressed data then?

tashrifbillah · 2018-11-07T00:14:17Z

@addisonElliott , I have been working on this for a while. It looks like you have put a comment before _READ_CHUNKSIZE. If I follow that comment and try to read backward chunk by chunk, it probably needs a nontrivial programming. Given that, do you think it's okay to uncompress the whole gzipped data at once?

To avoid seeking from end, we have to set byte_skip for the first chunk:

# whole data - just image data + 1
byte_skip= decompobj.decompress(fh.read( )) - dtype.itemsize * total_data_points +1

and recondition Ln 389.

addisonElliott · 2018-11-07T04:03:43Z

I probably wouldn't revert the changes of decompressing chunk by chunk because this was likely an issue in the past someone had for large NRRD files.

With that in mind, does Slicer/VTK apply the byteskip for compressed data before or after decompressing? I thought it was applied after decompression in which case the chunk size is irrelevant, you just change the index into decompressed data.

In other words, why aren't you doing something like:

if byteskip == -1:
    data = np.fromstring(decompressed_data[-dtype.itemsize * total_data_points + 1:], dtype)

tashrifbillah · 2018-11-07T04:55:38Z

Sorry, if it was not clear, I exactly meant that.
I used the following:

data = np.frombuffer(decompressed_data[-total_data_points*dtype.itemsize: ], dtype)

It works and I'll submit a PR soon.

tashrifbillah · 2018-11-08T15:51:24Z

@addisonElliott , once I am done, shall I run

python -m unittest discover -v nrrd/tests

Before submitting a PR?

addisonElliott · 2018-11-08T17:56:29Z

Preferably, yes. It saves effort that Travis CI has to do and I don't get spammed with messages. I say this from experience where I did 15 more pushes to try and fix my tests...

1. Fixed byte_skip -1 reading to work on compressed data (issue #70) 2. Add error checking for line_skip and byte_skip < -1 3. Added tests for error checking and byte_skip functionality

This was referenced Nov 10, 2018

byteskip -1 fixed #73

Closed

Byteskip minus1 fixing #74

Merged

addisonElliott pushed a commit that referenced this issue Nov 12, 2018

Byteskip minus1 fixing (#74)

3b084ed

1. Fixed byte_skip -1 reading to work on compressed data (issue #70) 2. Add error checking for line_skip and byte_skip < -1 3. Added tests for error checking and byte_skip functionality

addisonElliott closed this as completed Nov 12, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

byteskip -1 fails #70

byteskip -1 fails #70

tashrifbillah commented Nov 5, 2018

addisonElliott commented Nov 6, 2018 •

edited

Loading

tashrifbillah commented Nov 6, 2018

addisonElliott commented Nov 6, 2018

tashrifbillah commented Nov 7, 2018

addisonElliott commented Nov 7, 2018

tashrifbillah commented Nov 7, 2018

tashrifbillah commented Nov 8, 2018

addisonElliott commented Nov 8, 2018

byteskip -1 fails #70

byteskip -1 fails #70

Comments

tashrifbillah commented Nov 5, 2018

addisonElliott commented Nov 6, 2018 • edited Loading

tashrifbillah commented Nov 6, 2018

addisonElliott commented Nov 6, 2018

tashrifbillah commented Nov 7, 2018

addisonElliott commented Nov 7, 2018

tashrifbillah commented Nov 7, 2018

tashrifbillah commented Nov 8, 2018

addisonElliott commented Nov 8, 2018

addisonElliott commented Nov 6, 2018 •

edited

Loading