{chem}[foss/2021a] CP2K v9.1 #15146

Closed

branfosj
Member

branfosj commented Mar 19, 2022

(created using eb --new-pr)

Draft while we check the tests and dependencies.

@branfosj
Member Author

Test report by @branfosj
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0211u03a.bear.cluster - Linux RHEL 8.5, x86_64, Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz (cascadelake), Python 3.6.8
See https://gist.github.com/0c28f1d8afcfeb13e83afd7a9e56e597 for a full test report.

@branfosj marked this pull request as draft March 19, 2022 16:56
@branfosj
Member Author

Test report by @branfosj
SUCCESS
Build succeeded for 6 out of 6 (1 easyconfigs in total)
bear-pg0105u36b.bear.cluster - Linux RHEL 8.5, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/34cd135db70f5e2a8db99f74bed7d763 for a full test report.

@boegel
Member

boegel commented Apr 26, 2022

@branfosj Any updates on this?

@boegel added this to the 4.x milestone Apr 26, 2022
@branfosj
Member Author

--------------------------------- Summary --------------------------------
Number of FAILED  tests 18
Number of WRONG   tests 6
Number of CORRECT tests 3604
Total number of   tests 3628
--------------------------------------------------------------------------
Number of LEAKING tests 0
Number of memory  leaks 0
--------------------------------------------------------------------------
GREPME 18 6 3604 0 3628 0

Summary: correct: 3604 / 3628; wrong: 6; failed: 18; 128min
Status: FAILED
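
As a rough sketch (not part of the original report), the one-line summary above can be parsed and sanity-checked with a few lines of Python. The function name `parse_summary` and the regular expression are illustrative assumptions based only on the line format shown here, not on anything the CP2K regtest script provides.

```python
import re

# Assumed format, taken from the "Summary:" line in the report above.
SUMMARY_RE = re.compile(
    r"Summary: correct: (\d+) / (\d+); wrong: (\d+); failed: (\d+)"
)


def parse_summary(line):
    """Return the test counts found in a regtest 'Summary:' line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a recognised summary line: {line!r}")
    correct, total, wrong, failed = (int(g) for g in match.groups())
    return {"correct": correct, "total": total, "wrong": wrong, "failed": failed}


counts = parse_summary(
    "Summary: correct: 3604 / 3628; wrong: 6; failed: 18; 128min"
)
# Tests that are neither correct, wrong, nor failed (expected to be zero here).
unaccounted = counts["total"] - (
    counts["correct"] + counts["wrong"] + counts["failed"]
)
print(counts, "unaccounted:", unaccounted)
```

The three categories add up to the reported total, so every test in this run is accounted for.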

The 18 failed tests look to be SIGSEGVs, with backtraces similar to:

[bear-pg0211u03a:2616826:0:2616850] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffc0e325670)
==== backtrace (tid:2616850) ====
 0 0x00000000000211fe ucs_debug_print_backtrace()  /dev/shm/build-branfosj-admin/branfosj-admin-up/UCX/1.10.0/GCCcore-10.3.0/ucx-1.10.0/src/ucs/debug/debug.c:656
 1 0x0000000000012c20 .annobin_sigaction.c()  sigaction.c:0
 2 0x000000000118bb6f __hfx_load_balance_methods_MOD_estimate_block_cost.constprop.0()  hfx_load_balance_methods.F90:0
 3 0x000000000119441d __hfx_load_balance_methods_MOD_hfx_load_balance()  ???:0
 4 0x0000000000b716ea __hfx_energy_potential_MOD_integrate_four_center._omp_fn.0()  hfx_energy_potential.F90:0
 5 0x000000000001a046 gomp_thread_start()  /dev/shm/build-branfosj-admin/branfosj-admin-up/GCCcore/10.3.0/system-system/gcc-10.3.0/stage3_obj/x86_64-pc-linux-gnu/libgomp/../../../libgomp/team.c:123
 6 0x000000000000817a start_thread()  pthread_create.c:0
 7 0x00000000000fcdc3 __GI___clone()  :0

Of the 6 wrong tests, 4 are just outside tolerance and 2 are well outside it:

  • QS/regtest-gpw-4/H2O-debug-5.inp.out: relative error : 1.52789104e-07 > numerical tolerance = 4e-10
  • QS/regtest-gpw-4/H2O-debug-6.inp.out: relative error : 1.52790209e-07 > numerical tolerance = 4e-10
  • QS/regtest-mp2-grad/H2O_grad_mme.inp.out: relative error : 6.67224302e-09 > numerical tolerance = 6e-09
  • QS/regtest-mp2-grad/H2O_grad_gpw.inp.out: relative error : 7.32319726e-08 > numerical tolerance = 7e-08
  • QS/regtest-mp2-grad/H2O_grad_ri-hfx.inp.out: relative error : 6.40217302e-09 > numerical tolerance = 6e-09
  • Fist/regtest-1-4/multipole_dipole.dbg_f_real.inp.out: relative error : 1.03449783e-14 > numerical tolerance = 1.0E-14
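
To make the split between near misses and large misses concrete, here is a minimal sketch that divides each reported relative error by its tolerance. The numbers are copied from the list above; the helper name `overshoot` and the cut-off factor of 2 are arbitrary assumptions chosen for illustration, not values used by the CP2K regtest script.

```python
# (test output file, relative error, numerical tolerance), as reported above.
results = [
    ("QS/regtest-gpw-4/H2O-debug-5.inp.out", 1.52789104e-07, 4e-10),
    ("QS/regtest-gpw-4/H2O-debug-6.inp.out", 1.52790209e-07, 4e-10),
    ("QS/regtest-mp2-grad/H2O_grad_mme.inp.out", 6.67224302e-09, 6e-09),
    ("QS/regtest-mp2-grad/H2O_grad_gpw.inp.out", 7.32319726e-08, 7e-08),
    ("QS/regtest-mp2-grad/H2O_grad_ri-hfx.inp.out", 6.40217302e-09, 6e-09),
    ("Fist/regtest-1-4/multipole_dipole.dbg_f_real.inp.out", 1.03449783e-14, 1.0e-14),
]

# Hypothetical cut-off: more than twice the tolerance counts as a large miss.
LARGE_MISS_FACTOR = 2.0


def overshoot(rel_error, tolerance):
    """How many times larger than the tolerance the relative error is."""
    return rel_error / tolerance


large = [r for r in results if overshoot(r[1], r[2]) > LARGE_MISS_FACTOR]
near = [r for r in results if overshoot(r[1], r[2]) <= LARGE_MISS_FACTOR]

for name, err, tol in sorted(results, key=lambda r: overshoot(r[1], r[2]), reverse=True):
    print(f"{overshoot(err, tol):8.2f}x  {name}")
```

With these figures the two `regtest-gpw-4` cases exceed their tolerance by a factor of roughly 380, while the other four are within about 12% of it.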

@boegel
Member

boegel commented Apr 27, 2022

Test report by @boegel
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
node3519.doduo.os - Linux RHEL 8.4, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/bb6101b81f21d3c7c11c7e22bb7a6c9b for a full test report.

@alinelena
Contributor

OK, I am doing 9.1 and 2022.1 for 2022a, and I see similar issues with tolerance misses.
The segfaulting tests run fine if I run them by hand, so I suspect something is just wrong with the way the tests are run.

@alinelena
Contributor

see #16007

@branfosj
Member Author

There are PRs for more recent versions, so closing this one.

@branfosj closed this Feb 10, 2023
@branfosj deleted the 20220319135932_new_pr_CP2K91 branch February 10, 2023 18:27