Attempt to fix for clang 15 #93
There is this warning on CI:
/home/runner/micromamba/envs/pp_env/lib/python3.12/site-packages/h5py/__init__.py:36: UserWarning: h5py is running against HDF5 1.14.3 when it was built against 1.14.2, this may cause problems
_warn(("h5py is running against HDF5 {0} when it was built against {1}, "
It looks similar to these comments: conda-forge/h5py-feedstock#122 (comment)
AFAICT this is just a warning (not an error).
A bit later in the log it looks like something caused a segfault:
Progress (CPU): 0 / 28Segmentation fault (core dumped)
Traceback (most recent call last):
2.1.2
90ac5fc6064660e6814c86f47b3679ac1050388f
File "/home/runner/work/pp-sketchlib/pp-sketchlib/test/run_test.py", line 29, in <module>
subprocess.run("python ../sketchlib-runner.py sketch -l references.txt -o test_db -s 10000 -k 15,29,4 --cpus 2", shell=True, check=True)
File "/home/runner/micromamba/envs/pp_env/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python ../sketchlib-runner.py sketch -l references.txt -o test_db -s 10000 -k 15,29,4 --cpus 2' returned non-zero exit status 139.
Are we able to debug this offline? Maybe there is a stack trace that would help identify what caused the segfault. |
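For reference when reading the traceback above: exit status 139 is how a POSIX shell reports a child killed by signal 11 (128 + 11, SIGSEGV). A minimal sketch of decoding such a status follows; the helper name `describe_exit_status` is hypothetical and not part of pp-sketchlib.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe an exit status as reported by subprocess or a POSIX shell.

    Hypothetical helper for illustration only.
    """
    if status < 0:
        # Without a shell, subprocess reports death-by-signal as -signum.
        signum = -status
    elif status > 128:
        # POSIX shells report death-by-signal as 128 + signum.
        signum = status - 128
    else:
        return f"exited with code {status}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a known signal number on this platform.
        return f"exited with code {status}"
    return f"killed by {name}"


print(describe_exit_status(139))
```

If a Python-level stack trace is wanted at the point of the crash, the standard library `faulthandler` module (for example `python -X faulthandler ...`) may help, although it only shows Python frames and not the native ones.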
Thanks for looking too. The segfault coincides with the point where the HDF5 file is created, and when building I was getting a different version linked from the one used at runtime, which made me suspicious. Annoyingly, my local build doesn't segfault, so it's going to take me more time to sort this out. (Note to self: I should try running under valgrind in case the segfault is happening but not caught.) |
Given the warning, maybe a first step would be to try pinning |
Appears to be an issue with openmp:
Using single thread the command runs. Trying this in the CI now |
Confirmed: it is openmp. It would still be worth checking the compile-time and runtime versions and the CMakeLists to see if I can fix this in the CI. Otherwise, omitting the multithreaded test would be fine |
Very annoying to debug, but the underlying reason for this appears to be calling
Some interaction/difference between openmp/pthreads and python threads perhaps. And I guess it may have always been segfaulting, but just not caught. Making sure you are in the main thread when checking this seems to work |
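The "only check from the main thread" idea can be sketched in Python as below. This is an illustration under assumptions, not the change made in this PR: the real guard presumably lives in the compiled OpenMP code, and `run_if_main_thread` and the `check` callback are hypothetical names standing in for whatever call was being made.

```python
import threading


def run_if_main_thread(check):
    """Call `check` only when running on the main thread.

    Returns True if `check` was called, False if it was skipped.
    Hypothetical helper for illustration only.
    """
    if threading.current_thread() is threading.main_thread():
        check()
        return True
    return False


calls = []
results = []

# From a worker thread the check is skipped.
worker = threading.Thread(
    target=lambda: results.append(run_if_main_thread(lambda: calls.append("worker")))
)
worker.start()
worker.join()

print(results, calls)
```

Worker threads skip the call entirely, so only one thread ever touches the interpreter-facing check.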
Closes #92