Failure when using array of array of compound with vlen #3246

Closed
johnstairs opened this issue Jul 14, 2023 · 1 comment

Comments

@johnstairs

Describe the bug

When I create a dataset whose type is an array containing an array containing a compound type with a variable-length field, reading the dataset back fails. I get the same error when using h5dump. The datatype (as formatted by h5dump) is:

DATATYPE  H5T_ARRAY { [2] H5T_ARRAY { [3] H5T_COMPOUND {
   H5T_VLEN { H5T_STD_I32LE} "vlen";
} } }

It might be that the data is not written correctly and the file is corrupted. Here is some C++ that shows the issue:

#include <cstdio> // std::remove
#include <iostream>
#include <string>
#include <vector>
#include <array>
#include "H5Cpp.h"

struct StructWithVlen {
    hvl_t vlen; // v-len of NATIVE_INT32
};

int
main(void)
{
    // define HDF5 types
    H5::VarLenType vlen_of_int32_type(H5::PredType::NATIVE_INT32);

    H5::CompType record_with_vlen_type(sizeof(StructWithVlen));
    record_with_vlen_type.insertMember("vlen", HOFFSET(StructWithVlen, vlen), vlen_of_int32_type);

    // array dimensions are passed as hsize_t, which is not the same type as size_t on every platform
    hsize_t       inner_array_size = 3;
    H5::ArrayType inner_array_type(record_with_vlen_type, 1, &inner_array_size);

    hsize_t       outer_array_size = 2;
    H5::ArrayType outer_array_type(inner_array_type, 1, &outer_array_size);
    H5::DataType  dataset_type = outer_array_type;

    // create the HDF5 file
    std::string filename = "vlen-bug.h5";
    std::remove(filename.c_str());

    {
        H5::H5File    file(filename, H5F_ACC_CREAT | H5F_ACC_RDWR);
        H5::DataSpace dataspace;

        auto dataset = file.createDataSet("vlen-bug", dataset_type, dataspace);

        std::vector<int>                             int_vec = {1, 2};
        hvl_t                                        int_vlen{int_vec.size(), int_vec.data()};
        StructWithVlen                               rec_with_vlen{int_vlen};
        std::array<StructWithVlen, 3>                inner_array{rec_with_vlen, rec_with_vlen, rec_with_vlen};
        std::array<std::array<StructWithVlen, 3>, 2> outer_array{inner_array, inner_array};

        dataset.write(&outer_array, dataset_type);
    }

    // now read the file back in
    H5::H5File file(filename, H5F_ACC_RDONLY);

    auto dataset = file.openDataSet("vlen-bug");

    std::array<std::array<StructWithVlen, 3>, 2> outer_array;
    dataset.read(&outer_array, dataset_type); // this fails
}
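
The write buffer in the reproducer is a nested std::array, while the file datatype is a 2-element array of 3-element arrays of a one-member compound. A minimal sketch, assuming a stand-in struct with the same two fields as hvl_t so that it compiles without the HDF5 headers (hvl_like, StructWithVlenLike, and element_offset are hypothetical names, not part of HDF5), checks that such a buffer really is six contiguous records in row-major order:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for HDF5's hvl_t (a length plus a pointer). This is an assumption
// made only so the sketch builds without the HDF5 headers.
struct hvl_like {
    std::size_t len;
    void*       p;
};

// Mirrors StructWithVlen from the reproducer, using the stand-in type.
struct StructWithVlenLike {
    hvl_like vlen;
};

using Inner = std::array<StructWithVlenLike, 3>;
using Outer = std::array<Inner, 2>;

// The build fails here if the compound or the nested arrays carry padding.
static_assert(sizeof(StructWithVlenLike) == sizeof(hvl_like),
              "one-member compound has no padding");
static_assert(sizeof(Outer) == 2 * 3 * sizeof(StructWithVlenLike),
              "nested arrays are contiguous");

// Byte offset of element [i][j] from the start of the buffer.
inline std::size_t element_offset(const Outer& buf, std::size_t i, std::size_t j)
{
    auto base = reinterpret_cast<const unsigned char*>(&buf);
    auto elem = reinterpret_cast<const unsigned char*>(&buf[i][j]);
    return static_cast<std::size_t>(elem - base);
}
```

If these checks hold on the build platform, the in-memory layout of the buffer is unlikely to be the cause. They say nothing about whether the file contents are written correctly.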

On a debug build, the assertion failure on reading is

src/H5HG.c:563: H5HG_read: Assertion `heap->obj[hobj->idx].begin' failed.
Aborted (core dumped)

On a release build, the error stack looks like:

HDF5-DIAG: Error detected in HDF5 (1.15.0) thread 0:
  #000: /workspaces/hdf5/src/H5D.c line 1061 in H5Dread(): can't synchronously read data
    major: Dataset
    minor: Read failed
  #001: /workspaces/hdf5/src/H5D.c line 1008 in H5D__read_api_common(): can't read data
    major: Dataset
    minor: Read failed
  #002: /workspaces/hdf5/src/H5VLcallback.c line 2092 in H5VL_dataset_read_direct(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #003: /workspaces/hdf5/src/H5VLcallback.c line 2048 in H5VL__dataset_read(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #004: /workspaces/hdf5/src/H5VLnative_dataset.c line 362 in H5VL__native_dataset_read(): can't read data
    major: Dataset
    minor: Read failed
  #005: /workspaces/hdf5/src/H5Dio.c line 381 in H5D__read(): can't read data
    major: Dataset
    minor: Read failed
  #006: /workspaces/hdf5/src/H5Dcontig.c line 840 in H5D__contig_read(): contiguous read failed
    major: Dataset
    minor: Read failed
  #007: /workspaces/hdf5/src/H5Dscatgath.c line 533 in H5D__scatgath_read(): datatype conversion failed
    major: Dataset
    minor: Can't convert datatypes
  #008: /workspaces/hdf5/src/H5T.c line 5300 in H5T_convert(): datatype conversion failed
    major: Datatype
    minor: Can't convert datatypes
  #009: /workspaces/hdf5/src/H5Tconv.c line 3570 in H5T__conv_array(): datatype conversion failed
    major: Datatype
    minor: Unable to initialize object
  #010: /workspaces/hdf5/src/H5T.c line 5300 in H5T_convert(): datatype conversion failed
    major: Datatype
    minor: Can't convert datatypes
  #011: /workspaces/hdf5/src/H5Tconv.c line 3570 in H5T__conv_array(): datatype conversion failed
    major: Datatype
    minor: Unable to initialize object
  #012: /workspaces/hdf5/src/H5T.c line 5300 in H5T_convert(): datatype conversion failed
    major: Datatype
    minor: Can't convert datatypes
  #013: /workspaces/hdf5/src/H5Tconv.c line 2582 in H5T__conv_struct_opt(): unable to convert compound datatype member
    major: Datatype
    minor: Unable to initialize object
  #014: /workspaces/hdf5/src/H5T.c line 5300 in H5T_convert(): datatype conversion failed
    major: Datatype
    minor: Can't convert datatypes
  #015: /workspaces/hdf5/src/H5Tconv.c line 3327 in H5T__conv_vlen(): can't read VL data
    major: Datatype
    minor: Read failed
  #016: /workspaces/hdf5/src/H5Tvlen.c line 840 in H5T__vlen_disk_read(): unable to get blob
    major: Datatype
    minor: Can't get value
  #017: /workspaces/hdf5/src/H5VLcallback.c line 7396 in H5VL_blob_get(): blob get failed
    major: Virtual Object Layer
    minor: Can't get value
  #018: /workspaces/hdf5/src/H5VLcallback.c line 7367 in H5VL__blob_get(): blob get callback failed
    major: Virtual Object Layer
    minor: Can't get value
  #019: /workspaces/hdf5/src/H5VLnative_blob.c line 123 in H5VL__native_blob_get(): Expected global heap object size does not match
    major: Virtual Object Layer
    minor: Unable to decode value

Some other observations:

  • I notice that if the v-len length is 0 or if the length of the inner or outer array is 1, there is no failure.
  • Also, if only the last element of the outer array contains non-empty vlens, there is no failure:
        for (auto& struct_with_vlen : outer_array[0]) {
            struct_with_vlen.vlen.len = 0;
            struct_with_vlen.vlen.p = nullptr;
        }
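
The loop above can be generalised to empty any chosen row, which makes it easier to try other combinations when narrowing down the failure. This is only a sketch of how the write buffer could be prepared: hvl_like is a stand-in for hvl_t so the code builds without HDF5, and make_buffer, clear_row, and count_non_empty are hypothetical helper names. It does not predict which combinations fail.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for HDF5's hvl_t (length plus pointer); an assumption for this sketch.
struct hvl_like {
    std::size_t len;
    void*       p;
};

struct StructWithVlenLike {
    hvl_like vlen;
};

using Inner = std::array<StructWithVlenLike, 3>;
using Outer = std::array<Inner, 2>;

// Build the 2 x 3 buffer as in the reproducer: every record points at `data`.
// `data` must outlive the returned buffer.
inline Outer make_buffer(std::vector<int>& data)
{
    StructWithVlenLike rec{hvl_like{data.size(), data.data()}};
    Inner              inner{rec, rec, rec};
    return Outer{inner, inner};
}

// Empty every vlen in one row of the outer array.
inline void clear_row(Outer& buf, std::size_t row)
{
    for (auto& rec : buf.at(row)) {
        rec.vlen.len = 0;
        rec.vlen.p   = nullptr;
    }
}

// Count the records that still carry a non-empty vlen.
inline std::size_t count_non_empty(const Outer& buf)
{
    std::size_t n = 0;
    for (const auto& inner : buf)
        for (const auto& rec : inner)
            if (rec.vlen.len != 0)
                ++n;
    return n;
}
```

Calling clear_row(buf, 0) reproduces the case described above. Calling clear_row(buf, 1) instead prepares the opposite case, where only the first element of the outer array holds non-empty vlens.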

Expected behavior
We should be able to read and write this kind of data without errors.

Platform (please complete the following information)

  • HDF5 version: 1.14.1 and latest on develop branch (26059fc)
  • OS and version: Ubuntu 20.04.1
  • Compiler and version: GCC 10.2.1
  • Build system: cmake version 3.22.2
  • Any configure options you specified: none
  • MPI library and version (parallel HDF5): n/a
@fortnern self-assigned this on Jul 14, 2023
@derobins added the labels "Priority - 2. Medium", "Component - C Library", "Type - Bug / Bugfix", and "UNCONFIRMED" on Jul 20, 2023
@fortnern
Member

Addressed by #4218

@fortnern closed this as completed on Apr 1, 2024