In H5Easy there's an API for reading and writing one element at a time:
HighFive/include/highfive/h5easy_bits/H5Easy_scalar.hpp, lines 66 to 70 in 5f3ded6
HighFive/include/highfive/h5easy_bits/H5Easy_scalar.hpp, lines 120 to 122 in 5f3ded6
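For context, this is roughly how that element-wise API is used (file name, dataset path and values are made up; a minimal sketch based on the `H5Easy::dump`/`H5Easy::load` overloads that take an index):

```cpp
#include <highfive/H5Easy.hpp>

int main() {
    H5Easy::File file("example.h5", H5Easy::File::Overwrite);

    // Write a single element at index {0, 4}. If the dataset does not
    // exist yet (or is too small), it is created/extended to fit.
    H5Easy::dump(file, "/path/to/data", 1.25, {0, 4});

    // Read the same element back.
    double value = H5Easy::load<double>(file, "/path/to/data", {0, 4});
    return value == 1.25 ? 0 : 1;
}
```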
It does this by creating a dataset that can be extended in all directions, and which grows automatically whenever the index of the element being written requires it. (This defeats our ability to spot off-by-one programming errors.)
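To make that concrete, a hypothetical sketch (dataset name invented; the exact creation semantics may differ):

```cpp
#include <highfive/H5Easy.hpp>

int main() {
    H5Easy::File file("grow.h5", H5Easy::File::Overwrite);

    // Intended extent: 10 elements, valid indices 0..9.
    H5Easy::dump(file, "/data", 1.0, {9});

    // Off-by-one write: index 10 is past the intended end, but rather
    // than failing, the dataset is silently extended to 11 elements.
    H5Easy::dump(file, "/data", 2.0, {10});
    return 0;
}
```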
The API for reading/writing one element at a time feels like it would tempt users into writing files that way in a loop, which is a rather serious issue on common HPC hardware (and not great on consumer hardware either).
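For instance, a sketch of the tempting anti-pattern next to the bulk alternative (names and sizes are made up):

```cpp
#include <highfive/H5Easy.hpp>
#include <vector>

int main() {
    H5Easy::File file("loop.h5", H5Easy::File::Overwrite);
    std::vector<double> data(1'000'000, 0.0);

    // Anti-pattern: one HDF5 write (plus a possible dataset extension)
    // per element -- painfully slow, especially on parallel filesystems.
    for (std::size_t i = 0; i < data.size(); ++i) {
        H5Easy::dump(file, "/slow", data[i], {i});
    }

    // Preferred: a single bulk write of the whole container.
    H5Easy::dump(file, "/fast", data);
    return 0;
}
```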
To enable this API it must make a default choice for the chunk size, currently 10^n. That seems very small and risks creating files that can't be read efficiently; picking it reasonably large might instead inflate the size of the file by a factor of 100 or more.

I think it might be fine to allow users to read and write single elements of an existing dataset, i.e. without the automatically growing aspect, together with a warning in the documentation not to use it in a loop. In core we support various selection APIs that are reasonably compact: lists of random points, regular (and general) hyperslabs, and there's a proposal to allow Cartesian products of simple selections along each axis.
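For the non-growing element access, the core selection API already reads quite compactly; a sketch assuming `select.h5` contains a 2D dataset `/data` of at least 6×8 doubles:

```cpp
#include <highfive/H5File.hpp>
#include <highfive/H5DataSet.hpp>
#include <vector>

int main() {
    HighFive::File file("select.h5", HighFive::File::ReadOnly);
    auto dset = file.getDataSet("/data");

    // List of random points: read elements (0,0), (3,7) and (5,2)
    // with a single selection.
    std::vector<double> points;
    dset.select(HighFive::ElementSet({{0, 0}, {3, 7}, {5, 2}})).read(points);

    // Regular hyperslab: read row 2 as a 1x8 block.
    std::vector<std::vector<double>> row;
    dset.select({2, 0}, {1, 8}).read(row);
    return 0;
}
```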