Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H5Dchunk_iter() does not account for user block when reporting chunk file address #3003

Closed
ajelenak opened this issue May 24, 2023 · 5 comments · Fixed by #4236
Closed

H5Dchunk_iter() does not account for user block when reporting chunk file address #3003

ajelenak opened this issue May 24, 2023 · 5 comments · Fixed by #4236
Assignees
Labels
Component - C Library Core C library issues (usually in the src directory) Priority - 2. Medium ⏹ It would be nice to have this in the next release Type - Bug / Bugfix Please report security issues to [email protected] instead of creating an issue on GitHub
Milestone

Comments

@ajelenak
Copy link
Contributor

ajelenak commented May 24, 2023

Describe the bug

For HDF5 files with user block the H5Dchunk_iter() function does not account for user block size when reporting chunk file address. To correct this one has to add user block size to locate the actual chunk in that file. The function's documentation does not mention this.

The function for contiguous datasets, H5Dget_offset(), does correctly take into account user block size.

Expected behavior

H5Dchunk_iter() should report chunk file addresses corrected for user block size if present. The documentation for the libhdf5 releases with this function should have a note to alert users.

Platform (please complete the following information)

  • HDF5 version: 1.14.0
  • OS: AWS EC2 Linux
  • Compiler and version: N/A
  • Build system: obtained with the conda package manager from the conda-forge channel
  • Any configure options you specified: N/A
  • No MPI or parallel build

Additional context

Sample files attached:

The chunk file addresses from both files will be the same.

@ajelenak ajelenak added Component - C Library Core C library issues (usually in the src directory) Type - Bug / Bugfix Please report security issues to [email protected] instead of creating an issue on GitHub labels May 24, 2023
@ajelenak ajelenak changed the title [BUG] H5Dchunk_iter() does not account for user block when reporting chunk file address H5Dchunk_iter() does not account for user block when reporting chunk file address May 24, 2023
@glennsong09 glennsong09 added the Priority - 2. Medium ⏹ It would be nice to have this in the next release label May 25, 2023
@mkitti
Copy link
Contributor

mkitti commented May 25, 2023

Does this match the behavior of H5Dget_chunk_info?
https://docs.hdfgroup.org/hdf5/v1_12/group___h5_d.html#title12

@ajelenak
Copy link
Contributor Author

Yes, H5Dget_chunk_info() also does not account for user block size.

@mkitti
Copy link
Contributor

mkitti commented May 25, 2023

I think we might be stuck with the current behavior then. I think we should just document this for now.

Alternatively, we could introduce version 2 of the function and also bump the others.

@derobins derobins added this to the 1.14.3 milestone Oct 9, 2023
@derobins derobins modified the milestones: 1.14.3, 1.14.4 Oct 23, 2023
@derobins
Copy link
Member

I think we might be stuck with the current behavior then. I think we should just document this for now.

Alternatively, we could introduce version 2 of the function and also bump the others.

We don't version API calls for semantics. Also, this is a bug and we don't enshrine buggy behavior in the API.

This will be fixed in 1.14.4.

@mkitti
Copy link
Contributor

mkitti commented Mar 15, 2024

As long as the chunk APIs are consistent with each other, I'm fine with the changes. I noticed this during development of H5Dchunk_iter but thought it would be better to be consistent.

derobins added a commit to derobins/hdf5 that referenced this issue Mar 25, 2024
Both H5Dchunk_iter() and H5Dget_chunk_info(_by_coord)() did not take
the size of the user block into account when reporting addresses. Since
the HDFGroup#1 use of these functions is to root around in the file for the raw
data, this is kind of a problem.

Fixes GitHub issue HDFGroup#3003
@derobins derobins linked a pull request Mar 25, 2024 that will close this issue
derobins added a commit that referenced this issue Mar 26, 2024
Both H5Dchunk_iter() and H5Dget_chunk_info(_by_coord)() did not take
the size of the user block into account when reporting addresses. Since
the #1 use of these functions is to root around in the file for the raw
data, this is kind of a problem.

Fixes GitHub issue #3003
lrknox pushed a commit to lrknox/hdf5 that referenced this issue Mar 29, 2024
…#4236)

Both H5Dchunk_iter() and H5Dget_chunk_info(_by_coord)() did not take
the size of the user block into account when reporting addresses. Since
the HDFGroup#1 use of these functions is to root around in the file for the raw
data, this is kind of a problem.

Fixes GitHub issue HDFGroup#3003
lrknox added a commit that referenced this issue Mar 29, 2024
* Take user block into account when returning chunk addresses (#4236)

Both H5Dchunk_iter() and H5Dget_chunk_info(_by_coord)() did not take
the size of the user block into account when reporting addresses. Since
the #1 use of these functions is to root around in the file for the raw
data, this is kind of a problem.

Fixes GitHub issue #3003

* Fix a minor warning in h5test.c (#4242)

* Turn on -Werror for Java in GitHub -Werror workflows (#4243)

* Update Windows CI to not install ninja (#4230)

* Rework Fortran macros to use the proper code. (#4240)

* Correct reference copy for 16 API (#4244)

* Determine MPI LOGICAL during build, used in tests. (#4246)

* Skip userblock test in chunk_info.c for multi-file VFDs (#4249)

* Match generators with real cmake -G output on Windows (#4252)

* Add Julia GitHub Actions. (#4123)

* Re-revert to using autoreconf in autogen.sh (#4253)

We previously tried removing the per-tool invocation of the Autotools
and instead simply invoked autoreconf (PR #1906). This was reverted
when it turned out that the NAG Fortran compiler had trouble with an
undecorated -shared linker flag.

It turns out that this is due to a bug in libtool 2.4.2 and earlier.
Since this version of libtool is over a decade old, we're un-reverting
the change. We've added a release note for anyone who has to build
from source on elderly platforms.

Fixes #1343

* Rewrite H5T__path_find_real for clarity (#4225)

* Move conversion path free logic to helper function

* Add tgz extensions on names (#4255)

* Remove an error check regarding large cache objects (#4254)

* Remove an error check regarding large cache objects

In PR#4231 an assert() call was converted to a normal HDF5 error
check. It turns out that the original assert() was added by a
developer as a way of being alerted that large cache objects
existed instead of as a guard against incorrect behavior, making
it unnecessary in either debug or release builds.

The error check has been removed.

* Update RELEASE.txt

* File format security issues (#4234)

* Add job timeout to cygwin workflow (#4260)

* Replace user-define with user-defined (#4261)

* Improve the CMake clang -fsanitize=memory flags (#4267)

-fsanitize=memory is almost useless without
using -fsanitize-memory-track-origins=2 and we shoud probably add
-fno-optimize-sibling-calls as well.

* Add documentation (H5M) (#4259)

* Add documentation (H5P) (#4262)

* MPI type correction (#4268)

* corrected type for MPI_*_f2c APIs

* fixed return type of callback

* reset compilation flags of logical test program

* Clean up test/cmpd_dtransform.c (#4270)

* Clean up test/cmpd_dtransform.c

* Fix uninitialized memory warning from sanitizers
* FAIL_STACK_ERROR --> TEST_ERROR
* Emit output
* Delete test file when done

* Fix typo

* H5Fdelete() --> remove()

* Fix uninitialized memory issues in packet table (#4271)

* replace deprecated CMAKE_COMPILER_IS_GNU** (#4272)

* Prevent stack overflows in H5E__push_stack (#4264)

* Minor fixes after merge of file format security fixes (#4263)

* Update H5_IS_BUFFER_OVERFLOW to account for 'size' of 0

* Invert a few checks to avoid function call

* CHECK --> CHECK_PTR in tmisc.c (#4274)

* Add release note for CVE-2017-17507 (#4275)

* Update Cygwin installation guide (#4265)

* Addresses configuration fortran testing flags (#4276)

* turn warnings to errors in fortran configure test

* Intel fortran test fix

* Merge julia workflows into standard ci format (#4273)

* Fix range check in H5_addr_overlap (#4278)

When the H5_addr_overlap macro was updated to use H5_RANGE_OVERLAP,
it failed to take into account that H5_RANGE_OVERLAP expects the
range to be inclusive. This lead to an assertion failure in
H5MM_memcpy due to a memcpy operation on overlapping memory.
This has been fixed by subtracting 1 from the calculated high
bound values passed to H5_RANGE_OVERLAP

* Fix potential buffer read overflows in H5PB_read (#4279)

H5PB_read previously did not account for the fact that the size of the
read it's performing could overflow the page buffer pointer, depending
on the calculated offset for the read. This has been fixed by adjusting
the size of the read if it's determined that it would overflow the page.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - C Library Core C library issues (usually in the src directory) Priority - 2. Medium ⏹ It would be nice to have this in the next release Type - Bug / Bugfix Please report security issues to [email protected] instead of creating an issue on GitHub
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants