Merge branch 'develop' into fix_GH-4737
bmribler authored Sep 4, 2024
2 parents f0c507d + 902131f commit ab98633
Showing 116 changed files with 1,008 additions and 1,658 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/clang-format-check.yml
@@ -11,7 +11,7 @@ jobs:
steps:
- uses: actions/[email protected]
- name: Run clang-format style check for C and Java code
uses: DoozyX/clang-format-lint-action@v0.17
uses: DoozyX/clang-format-lint-action@v0.18
with:
source: '.'
extensions: 'c,h,cpp,hpp,java'
4 changes: 2 additions & 2 deletions .github/workflows/clang-format-fix.yml
@@ -21,9 +21,9 @@ jobs:
permissions:
contents: write # In order to allow EndBug/add-and-commit to commit changes
steps:
- uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- uses: actions/checkout@9a9194f87191a7e9055e3e9b95b8cfb13023bb08 # v4.1.7
- name: Fix C and Java formatting issues detected by clang-format
uses: DoozyX/clang-format-lint-action@d3c7f85989e3b6416265a0d12f8b4a8aa8b0c4ff # v0.13
uses: DoozyX/clang-format-lint-action@caa179272c6ee7f1d25dfb503ee0c410c26ebd98 # v0.13
with:
source: '.'
extensions: 'c,h,cpp,hpp,java'
4 changes: 2 additions & 2 deletions .github/workflows/julia-auto.yml
@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Get Sources
uses: actions/[email protected].1
uses: actions/[email protected].7

- name: Install Dependencies
shell: bash
@@ -60,7 +60,7 @@ jobs:
arch: 'x64'

- name: Get julia hdf5 source
uses: actions/[email protected].1
uses: actions/[email protected].7
with:
repository: JuliaIO/HDF5.jl
path: .
4 changes: 2 additions & 2 deletions .github/workflows/julia-cmake.yml
@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Get Sources
uses: actions/[email protected].1
uses: actions/[email protected].7

- name: Install Dependencies
shell: bash
@@ -63,7 +63,7 @@ jobs:
arch: 'x64'

- name: Get julia hdf5 source
uses: actions/[email protected].1
uses: actions/[email protected].7
with:
repository: JuliaIO/HDF5.jl
path: .
2 changes: 1 addition & 1 deletion .github/workflows/publish-branch.yml
@@ -22,7 +22,7 @@ jobs:
steps:
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- name: Get Sources
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@9a9194f87191a7e9055e3e9b95b8cfb13023bb08 # v4.1.7
with:
fetch-depth: 0
ref: '${{ github.head_ref || github.ref_name }}'
2 changes: 1 addition & 1 deletion .github/workflows/publish-release.yml
@@ -26,7 +26,7 @@ jobs:
steps:
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- name: Get Sources
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@9a9194f87191a7e9055e3e9b95b8cfb13023bb08 # v4.1.7
with:
fetch-depth: 0
ref: '${{ github.head_ref || github.ref_name }}'
2 changes: 1 addition & 1 deletion .github/workflows/release-files.yml
@@ -40,7 +40,7 @@ jobs:
steps:
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- name: Get Sources
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@9a9194f87191a7e9055e3e9b95b8cfb13023bb08 # v4.1.7
with:
fetch-depth: 0
ref: '${{ github.head_ref || github.ref_name }}'
4 changes: 2 additions & 2 deletions .github/workflows/scorecard.yml
@@ -32,7 +32,7 @@ jobs:

steps:
- name: "Checkout code"
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
uses: actions/checkout@9a9194f87191a7e9055e3e9b95b8cfb13023bb08 # v4.1.7
with:
persist-credentials: false

@@ -67,6 +67,6 @@ jobs:

# Upload the results to GitHub's code scanning dashboard.
- name: "Upload to code-scanning"
uses: github/codeql-action/upload-sarif@afb54ba388a7dca6ecae48f608c4ff05ff4cc77a # v3.25.15
uses: github/codeql-action/upload-sarif@4dd16135b69a43b6c8efb853346f8437d92d3c93 # v3.26.6
with:
sarif_file: results.sarif
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -9,4 +9,4 @@ authors:
website: 'https://www.hdfgroup.org'
repository-code: 'https://github.com/HDFGroup/hdf5'
url: 'https://www.hdfgroup.org/HDF5/'
repository-artifact: 'https://www.hdfgroup.org/downloads/hdf5/'
repository-artifact: 'https://support.hdfgroup.org/downloads/HDF5'
12 changes: 6 additions & 6 deletions HDF5Examples/README.md
@@ -48,17 +48,17 @@ HDF5 SNAPSHOTS, PREVIOUS RELEASES AND SOURCE CODE
--------------------------------------------
Full Documentation and Programming Resources for this HDF5 can be found at

https://portal.hdfgroup.org/documentation/index.html
https://support.hdfgroup.org/documentation/HDF5/index.html

Periodically development code snapshots are provided at the following URL:
https://gamma.hdfgroup.org/ftp/pub/outgoing/hdf5/snapshots/

https://github.com/HDFGroup/hdf5/releases

Source packages for current and previous releases are located at:
https://portal.hdfgroup.org/downloads/

https://support.hdfgroup.org/releases/hdf5/downloads/

Development code is available at our Github location:

https://github.com/HDFGroup/hdf5.git

2 changes: 1 addition & 1 deletion README.md
@@ -108,7 +108,7 @@ Periodically development code snapshots are provided at the following URL:

Source packages for current and previous releases are located at:

https://portal.hdfgroup.org/Downloads
https://support.hdfgroup.org/downloads/HDF5

Development code is available at our Github location:

2 changes: 1 addition & 1 deletion config/cmake/README.md.cmake.in
@@ -75,6 +75,6 @@ For more information see USING_CMake_Examples.txt in the install folder.
===========================================================================

Documentation for this release can be found at the following URL:
https://portal.hdfgroup.org/documentation/index.html#hdf5
https://support.hdfgroup.org/hdf5/@HDF5_PACKAGE_NAME@-@HDF5_PACKAGE_VERSION@/documentation/doxygen/index.html

Bugs should be reported to [email protected].
6 changes: 3 additions & 3 deletions configure.ac
@@ -3833,10 +3833,10 @@ AC_DEFINE_UNQUOTED([DEFAULT_PLUGINDIR], ["$default_plugindir"],
## for the speed optimization of hard conversions. Soft conversions can
## actually benefit little.
##
AC_MSG_CHECKING([whether exception handling functions is checked during data conversions])
AC_MSG_CHECKING([whether exception handling functions are checked during data conversions])
AC_ARG_ENABLE([dconv-exception],
[AS_HELP_STRING([--enable-dconv-exception],
[if exception handling functions is checked during
[Check exception handling functions during
data conversions [default=yes]])],
[DCONV_EXCEPTION=$enableval], [DCONV_EXCEPTION=yes])

@@ -3857,7 +3857,7 @@ fi
AC_MSG_CHECKING([whether data accuracy is guaranteed during data conversions])
AC_ARG_ENABLE([dconv-accuracy],
[AS_HELP_STRING([--enable-dconv-accuracy],
[if data accuracy is guaranteed during
[Guarantee data accuracy during
data conversions [default=yes]])],
[DATA_ACCURACY=$enableval], [DATA_ACCURACY=yes])

55 changes: 36 additions & 19 deletions doc/parallel-compression.md
@@ -64,9 +64,9 @@ H5Dwrite(..., dxpl_id, ...);
The following are two simple examples of using the parallel
compression feature:

[ph5_filtered_writes.c](https://github.com/HDFGroup/hdf5/blob/develop/HDF5Examples/C/H5PAR/ph5_filtered_writes.c)
[ph5_filtered_writes.c][u1]

[ph5_filtered_writes_no_sel.c](https://github.com/HDFGroup/hdf5/blob/develop/HDF5Examples/C/H5PAR/ph5_filtered_writes_no_sel.c)
[ph5_filtered_writes_no_sel.c][u2]

The former contains simple examples of using the parallel
compression feature to write to compressed datasets, while the
@@ -79,7 +79,7 @@ participate in the collective write call.
## Multi-dataset I/O support

The parallel compression feature is supported when using the
multi-dataset I/O API routines ([H5Dwrite_multi](https://hdfgroup.github.io/hdf5/develop/group___h5_d.html#gaf6213bf3a876c1741810037ff2bb85d8)/[H5Dread_multi](https://hdfgroup.github.io/hdf5/develop/group___h5_d.html#ga8eb1c838aff79a17de385d0707709915)), but the
multi-dataset I/O API routines ([H5Dwrite_multi][u3]/[H5Dread_multi][u4]), but the
following should be kept in mind:

- Parallel writes to filtered datasets **must** still be collective,
@@ -99,7 +99,7 @@ following should be kept in mind:

## Incremental file space allocation support

HDF5's [file space allocation time](https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga85faefca58387bba409b65c470d7d851)
HDF5's [file space allocation time][u5]
is a dataset creation property that can have significant effects
on application performance, especially if the application uses
parallel HDF5. In a serial HDF5 application, the default file space
@@ -118,20 +118,20 @@ While this strategy has worked in the past, it has some noticeable
drawbacks. For one, the larger the chunked dataset being created,
the more noticeable overhead there will be during dataset creation
as all of the data chunks are being allocated in the HDF5 file.
Further, these data chunks will, by default, be [filled](https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga4335bb45b35386daa837b4ff1b9cd4a4)
Further, these data chunks will, by default, be [filled][u6]
with HDF5's default fill data value, leading to extraordinary
dataset creation overhead and resulting in pre-filling large
portions of a dataset that the application might have been planning
to overwrite anyway. Even worse, there will be more initial overhead
from compressing that fill data before writing it out, only to have
it read back in, unfiltered and modified the first time a chunk is
written to. In the past, it was typically suggested that parallel
HDF5 applications should use [H5Pset_fill_time](https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga6bd822266b31f86551a9a1d79601b6a2)
HDF5 applications should use [H5Pset_fill_time][u7]
with a value of `H5D_FILL_TIME_NEVER` in order to disable writing of
the fill value to dataset chunks, but this isn't ideal if the
application actually wishes to make use of fill values.

With [improvements made](https://www.hdfgroup.org/2022/03/parallel-compression-improvements-in-hdf5-1-13-1/)
With [improvements made][u8]
to the parallel compression feature for the HDF5 1.13.1 release,
"incremental" file space allocation is now the default for datasets
created in parallel *only if they have filters applied to them*.
@@ -154,7 +154,7 @@ optimal performance out of the parallel compression feature.

### Begin with a good chunking strategy

[Starting with a good chunking strategy](https://portal.hdfgroup.org/documentation/hdf5-docs/chunking_in_hdf5.html)
[Starting with a good chunking strategy][u9]
will generally have the largest impact on overall application
performance. The different chunking parameters can be difficult
to fine-tune, but it is essential to start with a well-performing
@@ -166,7 +166,7 @@ chosen chunk size becomes a very important factor when compression
is involved, as data chunks have to be completely read and
re-written to perform partial writes to the chunk.

[Improving I/O performance with HDF5 compressed datasets](https://docs.hdfgroup.org/archive/support/HDF5/doc/TechNotes/TechNote-HDF5-ImprovingIOPerformanceCompressedDatasets.pdf)
[Improving I/O performance with HDF5 compressed datasets][u10]
is a useful reference for more information on getting good
performance when using a chunked dataset layout.

@@ -220,14 +220,14 @@ chunks to end up at addresses in the file that do not align
well with the underlying file system, possibly leading to
poor performance. As an example, Lustre performance is generally
good when writes are aligned with the chosen stripe size.
The HDF5 application can use [H5Pset_alignment](https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gab99d5af749aeb3896fd9e3ceb273677a)
The HDF5 application can use [H5Pset_alignment][u11]
to have a bit more control over where objects in the HDF5
file end up. However, do note that setting the alignment
of objects generally wastes space in the file and has the
potential to dramatically increase its resulting size, so
caution should be used when choosing the alignment parameters.

[H5Pset_alignment](https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gab99d5af749aeb3896fd9e3ceb273677a)
[H5Pset_alignment][u11]
has two parameters that control the alignment of objects in
the HDF5 file, the "threshold" value and the alignment
value. The threshold value specifies that any object greater
@@ -264,19 +264,19 @@ in a file, this can create significant amounts of free space
in the file over its lifetime and eventually cause performance
issues.

An HDF5 application can use [H5Pset_file_space_strategy](https://hdfgroup.github.io/hdf5/develop/group___f_c_p_l.html#ga167ff65f392ca3b7f1933b1cee1b9f70)
An HDF5 application can use [H5Pset_file_space_strategy][u12]
with a value of `H5F_FSPACE_STRATEGY_PAGE` to enable the paged
aggregation feature, which can accumulate metadata and raw
data for dataset data chunks into well-aligned, configurably
sized "pages" for better performance. However, note that using
the paged aggregation feature will cause any setting from
[H5Pset_alignment](https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gab99d5af749aeb3896fd9e3ceb273677a)
[H5Pset_alignment][u11]
to be ignored. While an application should be able to get
comparable performance effects by [setting the size of these pages](https://hdfgroup.github.io/hdf5/develop/group___f_c_p_l.html#gad012d7f3c2f1e1999eb1770aae3a4963) to be equal to the value that
would have been set for [H5Pset_alignment](https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gab99d5af749aeb3896fd9e3ceb273677a),
comparable performance effects by [setting the size of these pages][u13]
to be equal to the value that would have been set for [H5Pset_alignment][u11],
this may not necessarily be the case and should be studied.

Note that [H5Pset_file_space_strategy](https://hdfgroup.github.io/hdf5/develop/group___f_c_p_l.html#ga167ff65f392ca3b7f1933b1cee1b9f70)
Note that [H5Pset_file_space_strategy][u12]
has a `persist` parameter. This determines whether or not the
file free space manager should include extra metadata in the
HDF5 file about free space sections in the file. If this
@@ -300,12 +300,12 @@ hid_t file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, fcpl_id, fapl_id);

While the parallel compression feature requires that the HDF5
application set and maintain collective I/O at the application
interface level (via [H5Pset_dxpl_mpio](https://hdfgroup.github.io/hdf5/develop/group___d_x_p_l.html#ga001a22b64f60b815abf5de8b4776f09e)),
interface level (via [H5Pset_dxpl_mpio][u14]),
it does not require that the actual MPI I/O that occurs at
the lowest layers of HDF5 be collective; independent I/O may
perform better depending on the application I/O patterns and
parallel file system performance, among other factors. The
application may use [H5Pset_dxpl_mpio_collective_opt](https://hdfgroup.github.io/hdf5/develop/group___d_x_p_l.html#gacb30d14d1791ec7ff9ee73aa148a51a3)
application may use [H5Pset_dxpl_mpio_collective_opt][u15]
to control this setting and see which I/O method provides the
best performance.

@@ -318,7 +318,7 @@ H5Dwrite(..., dxpl_id, ...);

### Runtime HDF5 Library version

An HDF5 application can use the [H5Pset_libver_bounds](https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gacbe1724e7f70cd17ed687417a1d2a910)
An HDF5 application can use the [H5Pset_libver_bounds][u16]
routine to set the upper and lower bounds on library versions
to use when creating HDF5 objects. For parallel compression
specifically, setting the library version to the latest available
@@ -332,3 +332,20 @@ H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);
hid_t file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);
...
```

[u1]: https://github.com/HDFGroup/hdf5/blob/develop/HDF5Examples/C/H5PAR/ph5_filtered_writes.c
[u2]: https://github.com/HDFGroup/hdf5/blob/develop/HDF5Examples/C/H5PAR/ph5_filtered_writes_no_sel.c
[u3]: https://hdfgroup.github.io/hdf5/develop/group___h5_d.html#gaf6213bf3a876c1741810037ff2bb85d8
[u4]: https://hdfgroup.github.io/hdf5/develop/group___h5_d.html#ga8eb1c838aff79a17de385d0707709915
[u5]: https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga85faefca58387bba409b65c470d7d851
[u6]: https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga4335bb45b35386daa837b4ff1b9cd4a4
[u7]: https://hdfgroup.github.io/hdf5/develop/group___d_c_p_l.html#ga6bd822266b31f86551a9a1d79601b6a2
[u8]: https://support.hdfgroup.org/documentation/HDF5/parallel-compression-improvements-in-hdf5-1-13-1
[u9]: https://support.hdfgroup.org/documentation/HDF5/chunking_in_hdf5.html
[u10]: https://support.hdfgroup.org/documentation/HDF5/technotes/TechNote-HDF5-ImprovingIOPerformanceCompressedDatasets.pdf
[u11]: https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gab99d5af749aeb3896fd9e3ceb273677a
[u12]: https://hdfgroup.github.io/hdf5/develop/group___f_c_p_l.html#ga167ff65f392ca3b7f1933b1cee1b9f70
[u13]: https://hdfgroup.github.io/hdf5/develop/group___f_c_p_l.html#gad012d7f3c2f1e1999eb1770aae3a4963
[u14]: https://hdfgroup.github.io/hdf5/develop/group___d_x_p_l.html#ga001a22b64f60b815abf5de8b4776f09e
[u15]: https://hdfgroup.github.io/hdf5/develop/group___d_x_p_l.html#gacb30d14d1791ec7ff9ee73aa148a51a3
[u16]: https://hdfgroup.github.io/hdf5/develop/group___f_a_p_l.html#gacbe1724e7f70cd17ed687417a1d2a910
