Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Windows ROS3 broken in 1.14.0? #2406

Closed
hmaarrfk opened this issue Jan 15, 2023 · 12 comments · Fixed by #2798
Closed

[BUG] Windows ROS3 broken in 1.14.0? #2406

hmaarrfk opened this issue Jan 15, 2023 · 12 comments · Fixed by #2798
Assignees
Labels
Component - C Library Core C library issues (usually in the src directory) Priority - 0. Blocker ⛔ This MUST be merged for the release to happen Type - Bug / Bugfix Please report security issues to [email protected] instead of creating an issue on GitHub

Comments

@hmaarrfk
Copy link

Describe the bug
I've been trying to build HDF5 1.14.0 for Windows at conda-forge and I am having trouble using the ROS3 driver.

xref: conda-forge/hdf5-feedstock#188

We have a simple test to check the functionality, but it is currently broken in the latest build.

# The test
h5dump --filedriver=ros3 "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"

works on:

  • Windows + HDF5 1.12.2
  • Linux + HDF5 1.12.2
  • Linux + HDF5 1.14.0

but broken on Windows + 1.14.0

It seems that there was some discussion that ROS3 + Windows might not be tested by CIs:
#2101

Expected behavior
For that h5dump to work. It works on windows.

Platform (please complete the following information)

  • HDF5 version 1.14.0
  • OS and version: Windows 10 ish
  • Compiler and version: VS2019
  • Build system (e.g. CMake, Autotools) and version: Cmake
  • Any configure options you specified: HDF5_ENABLE_ROS3_VFD=ON among others
  • MPI library and version (parallel HDF5)

Additional context
Happy to try certain patches if you would like.

@mkitti
Copy link
Contributor

mkitti commented Jan 16, 2023

I see the same error when trying 1.14.0 from MWING-W64 in MSYS2:

$ h5dump --filedriver=ros3 --enable-error-stack "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"
h5dump error: unable to create FAPL for file access
HDF5-DIAG: Error detected in HDF5 (1.14.0) thread 0:
  #000: D:/repos/MINGW-packages/mingw-w64-hdf5/src/hdf5-1.14.0/src/H5P.c line 1486 in H5Pclose(): not a property list
    major: Invalid arguments to routine
    minor: Inappropriate type

@mkitti
Copy link
Contributor

mkitti commented Jan 16, 2023

On the develop branch, I can get a similar error on x86_64 Linux (Ubuntu 22.04). Strangely it fixes itself if I specify --s3-cred="(,,)"

$ ./h5dump --filedriver=ros3 --enable-error-stack "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"
h5dump error: unable to create FAPL for file access
HDF5-DIAG: Error detected in HDF5 (1.15.0) thread 0:
  #000: /home/mkitti/src/hdf5/src/H5P.c line 1486 in H5Pclose(): not a property list
    major: Invalid arguments to routine
    minor: Inappropriate type

$ ./h5dump --filedriver=ros3 --enable-error-stack --s3-cred="(,,)" "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"
HDF5 "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5" {
GROUP "/" {
   ATTRIBUTE "attr1" {
      DATATYPE  H5T_STD_I8BE
      DATASPACE  SIMPLE { ( 10 ) / ( 10 ) }
      DATA {
      (0): 97, 98, 99, 100, 101, 102, 103, 104, 105, 0
      }
   }
...

@mkitti
Copy link
Contributor

mkitti commented Jan 16, 2023

In HDF5 1.12.2, vfd_info.info was initialized to ros3_fa_g when the ROS3 driver was enabled.

hdf5/tools/src/h5dump/h5dump.c

Lines 1380 to 1388 in 3e847e0

if (driver_name_g != NULL) {
h5tools_vfd_info_t vfd_info;
vfd_info.info = NULL;
vfd_info.name = driver_name_g;
if (!HDstrcmp(driver_name_g, drivernames[ROS3_VFD_IDX])) {
#ifdef H5_HAVE_ROS3_VFD
vfd_info.info = (void *)&ros3_fa_g;

On the develop branch and HDF5 1.14.0, vfd_info.info is only set to ros3_fa_g if --s3-cred is used.

hdf5/tools/src/h5dump/h5dump.c

Lines 1244 to 1255 in 5543d6e

case '$':
#ifdef H5_HAVE_ROS3_VFD
if (h5tools_parse_ros3_fapl_tuple(H5_optarg, ',', &ros3_fa_g) < 0) {
error_msg("failed to parse S3 VFD credential info\n");
usage(h5tools_getprogname());
free_handler(hand, argc);
hand = NULL;
h5tools_setstatus(EXIT_FAILURE);
goto done;
}
vfd_info_g.info = &ros3_fa_g;

In the absence of --s3-cred,

  1. vfd_info.info is not initialized
  2. The fapl_id cannot be obtained
  3. The fapl_id cannot be closed

This is in contradiction with previous behavior and the documentation, which suggests that anonymous credentials would be used if the option is absent.

     --s3-cred=<cred>     Supply S3 authentication information to "ros3" vfd.
                          <cred> :: "(<aws-region>,<access-id>,<access-key>)"
                          If absent or <cred> -> "(,,)", no authentication.
                          Has no effect if filedriver is not "ros3".

Rather than failing when vfd_info->info is NULL, perhaps this should default to the anonymous credentials?

hdf5/tools/lib/h5tools.c

Lines 567 to 568 in 5543d6e

if (!vfd_info->info)
H5TOOLS_GOTO_ERROR(FAIL, "Read-only S3 VFD info is invalid");

@hmaarrfk
Copy link
Author

Thank you for looking into this so quickly. Let me know if you need anything specific from my end.

@hmaarrfk
Copy link
Author

Rather than failing when vfd_info->info is NULL, perhaps this should default to the anonymous credentials?

I think this is probably the right thing to do. My guess is that by part of this change:
13985a7#diff-bafed7f2f94716c2099686506de5943bb4b21e24925eed736f5768dabef6cbdcR28

defaults were setup correctly, and this quirk was revealed.

It is likely that for files that don't require authentication, the provided (buggy) authentication creds were just ignored by S3 "hiding" this error.

@hmaarrfk
Copy link
Author

Its really strange. I booted up my windows machine again. and I tried to specify s3-cred:

(hdf5_114) C:\Users\mark>h5dump --enable-error-stack --filedriver=ros3 --s3-cred="(,,)" "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"

(hdf5_114) C:\Users\mark>h5dump --enable-error-stack --filedriver=ros3 "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"
h5dump error: unable to create FAPL for file access
HDF5-DIAG: Error detected in HDF5 (1.14.0) thread 18716:
  #000: C:\Users\mark\mambaforge\conda-bld\hdf5_1673802684638\work\src\H5P.c line 1486 in H5Pclose(): not a property list
    major: Invalid arguments to routine
    minor: Inappropriate type

the output is blank... If I don't specify --s3-cred i get the same error you showed

@mkitti
Copy link
Contributor

mkitti commented Jan 17, 2023

There is more happening on Windows. It segfaults I believe.

@jhendersonHDF
Copy link
Collaborator

Hi All,

This was a slight mistake on my part. In the initial release of HDF5 1.13, there were some changes in the tools to support using File Drivers that are loaded as plugins. With these changes there was a user report about the ROS3 VFD having issues with unauthenticated access, so I made the change noted by @hmaarrfk above, which ended up only initializing the VFD's info when the --s3-cred option is specified. I believe the simplest fix here would be to detect the specified VFD and always initialize the ROS3 VFD's info to ros3_fa_g iff the ROS3 VFD is being used, which should cover both the authenticated and unauthenticated cases. For the time being, specifying --s3-cred="(,,)" as mentioned here should suffice as a workaround, but I'm happy to try and help if there are still issues occurring with the VFD even when that workaround is used.

Rather than failing when vfd_info->info is NULL, perhaps this should default to the anonymous credentials?

Eventually, I would like to refactor the tools code to handle this fallback mechanism so that all of the handling of VFD-specific "stuff" is done in a single place rather than in each separate tool. However, that's a bit more work.

@mkitti
Copy link
Contributor

mkitti commented Jan 17, 2023

Here is the segmentation fault that I see with mingw64. I believe there is a second issue on this platform.

$ h5dump --enable-error-stack --filedriver=ros3 --s3-cred="(,,)" "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"
Segmentation fault

$ gdb h5dump
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-w64-mingw32".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from h5dump...
(No debugging symbols found in h5dump)
(gdb) run --enable-error-stack --filedriver=ros3 --s3-cred="(,,)" "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"
Starting program: D:\msys64\mingw64\bin\h5dump.exe --enable-error-stack --filedriver=ros3 --s3-cred="(,,)" "https://s3.amazonaws.com/hdfgroup/d
ata/hdf5demo/tall.h5"
[New Thread 42344.0xb49c]
[New Thread 42344.0x3fa4]
[New Thread 42344.0x4e54]
[New Thread 42344.0x8b10]
[Thread 42344.0x8b10 exited with code 0]
[New Thread 42344.0x6c50]

Thread 1 received signal SIGSEGV, Segmentation fault.
0x00007ffdbd5cdb22 in strstr () from C:\windows\System32\msvcrt.dll
(gdb) bt
#0  0x00007ffdbd5cdb22 in strstr () from C:\windows\System32\msvcrt.dll
#1  0x00007ff792521568 in ?? ()
#2  0x00007ff792521d67 in ?? ()
#3  0x00007ff79251c400 in ?? ()
#4  0x00007ff7924f9f2a in ?? ()
#5  0x00007ff7924ea534 in ?? ()
#6  0x00007ff7927104e7 in ?? ()
#7  0x00007ff7926f7ac0 in ?? ()
#8  0x00007ff7924d9940 in ?? ()
#9  0x00007ff7924dba8b in ?? ()
#10 0x00007ff7924335c9 in ?? ()
#11 0x00007ff7927a2db4 in ?? ()
#12 0x00007ff7924212ee in ?? ()
#13 0x00007ff792421406 in ?? ()
#14 0x00007ffdbe8e7614 in KERNEL32!BaseThreadInitThunk () from C:\windows\System32\kernel32.dll
#15 0x00007ffdbf5826a1 in ntdll!RtlUserThreadStart () from C:\windows\SYSTEM32\ntdll.dll
#16 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

@mkitti
Copy link
Contributor

mkitti commented Jan 17, 2023

With mingw32, the command succeeds on Windows.

$ h5dump --enable-error-stack --filedriver=ros3 --s3-cred="(,,)" "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5"
HDF5 "https://s3.amazonaws.com/hdfgroup/data/hdf5demo/tall.h5" {
GROUP "/" {
   ATTRIBUTE "attr1" {
      DATATYPE  H5T_STD_I8BE
...

@mkitti
Copy link
Contributor

mkitti commented Jan 17, 2023

I'm starting to think that #2407 fixes the segmentation fault.

@hmaarrfk
Copy link
Author

#2407 fixes the segfault on windows with VS compilers. I've tested in conda-forge/hdf5-feedstock#188

@derobins derobins removed the bug label Mar 3, 2023
@jhendersonHDF jhendersonHDF self-assigned this Mar 6, 2023
@jhendersonHDF jhendersonHDF added Priority - 0. Blocker ⛔ This MUST be merged for the release to happen Component - C Library Core C library issues (usually in the src directory) Type - Bug / Bugfix Please report security issues to [email protected] instead of creating an issue on GitHub labels Mar 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - C Library Core C library issues (usually in the src directory) Priority - 0. Blocker ⛔ This MUST be merged for the release to happen Type - Bug / Bugfix Please report security issues to [email protected] instead of creating an issue on GitHub
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants