[Bug]: superlu-dist test from E4S Testsuite on E4S 22.11 #183

Open
shahzebsiddiqui opened this issue Jun 6, 2023 · 0 comments
Labels: bug (Something isn't working)

CDASH Build

https://my.cdash.org/test/81709491

Link to buildspec file

https://github.com/buildtesters/buildtest-nersc/blob/devel/buildspecs/e4s/E4S-Testsuite/perlmutter/22.11/superlu-dist.yml

Please describe the issue?

@wspear this test is failing, but it looks like it just invokes the Spack smoke test. Should we run it through the spack test buildspecs instead and drop it from the E4S Testsuite buildspecs in buildtest? We already have a superlu test there (see https://github.com/buildtesters/buildtest-nersc/blob/devel/buildspecs/e4s/spack_test/perlmutter/22.11/superlu.yml) but not one for superlu-dist.

The superlu-dist from the gcc Spack environment is not built with CUDA support, so we should probably do a module load cpu before running it. The GTL error appears to come from having the gpu module loaded, which requests GPU support from MPICH even though the binary is not linked against the GTL library.
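
A minimal sketch of that workaround, assuming the gpu module is what sets MPICH_GPU_SUPPORT_ENABLED=1 (the MPIDI_CRAY_init message in the log below points that way). The function names gpu_mpi_requested and prepare_cpu_mpi_env are hypothetical and not part of buildtest or Spack; the "cpu" module name is taken from the suggestion above.

```shell
#!/usr/bin/env bash
# Sketch only: avoid launching a CPU-only MPI binary while GPU-aware
# Cray MPICH is requested. Function names are hypothetical.

# Returns 0 when GPU-aware MPI is requested in the current environment.
gpu_mpi_requested() {
    [ "${MPICH_GPU_SUPPORT_ENABLED:-0}" = "1" ]
}

# Switch to a CPU environment before the test commands run.
prepare_cpu_mpi_env() {
    # Assumption: the site provides a "cpu" module, as proposed in this issue.
    if command -v module >/dev/null 2>&1; then
        module load cpu
    fi
    # Fallback: if the variable is still set (or there is no module
    # command), clear it so MPICH does not look for the GTL library.
    if gpu_mpi_requested; then
        unset MPICH_GPU_SUPPORT_ENABLED
    fi
}
```

Calling prepare_cpu_mpi_env at the top of the buildspec's run section, before the test is invoked, should be enough to try this out. Whether clearing the variable alone is sufficient on Perlmutter is untested here; loading the cpu module is the safer route.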

Relevant log output

superlu-dist %gcc: okaupsq
Running /global/cfs/cdirs/m3503/buildtest/runs/perlmutter_scheduled_test/2023-06-01/perlmutter.slurm.regular/superlu-dist/superlu-dist_e4s_testsuite_22.11/6d8192a4/stage/testsuite/validation_tests/superlu-dist
Skipping load: Environment already setup
==> Error: TestFailure: 1 tests failed.


Command exited with status 139:
    '/usr/bin/srun' '-n' '4' '/global/common/software/spackecp/perlmutter/e4s-22.11/83104/spack/opt/spack/linux-sles15-zen3/gcc-11.2.0/superlu-dist-7.2.0-okaupsqmcqqwph5bocy5dlhmij34c4y3/lib/EXAMPLE/pddrive' '-r' '2' '-c' '2' 'g20.rua'
[CRAYBLAS_WARNING] Application linked against multiple cray-libsci libraries
MPICH ERROR [Rank 0] [job id 9831746.0] [Thu Jun  1 19:33:48 2023] [nid002349] - Abort(-1) (rank 0 in comm 0): MPIDI_CRAY_init: GPU_SUPPORT_ENABLED is requested, but GTL library is not linked
 (Other MPI error)

aborting job:
MPIDI_CRAY_init: GPU_SUPPORT_ENABLED is requested, but GTL library is not linked

srun: error: nid002349: task 0: Exited with exit code 255
srun: Terminating StepId=9831746.0
srun: error: nid002349: tasks 1-3: Segmentation fault



1 error found in test log:
     20    
     21    aborting job:
     22    MPIDI_CRAY_init: GPU_SUPPORT_ENABLED is requested, but GTL library is not linked
     23    
     24    srun: error: nid002349: task 0: Exited with exit code 255
     25    srun: Terminating StepId=9831746.0
  >> 26    srun: error: nid002349: tasks 1-3: Segmentation fault
     27    
     28      File "/global/common/software/spackecp/perlmutter/e4s-22.11/default/spack/bin/spack", line 100, in <module>



/global/common/software/spackecp/perlmutter/e4s-22.11/83104/spack/lib/spack/spack/build_environment.py:1086, in _setup_pkg_and_run:
       1083        tb_string = traceback.format_exc()
       1084
       1085        # build up some context from the offending package so we can
  >>   1086        # show that, too.
       1087        package_context = get_package_context(tb)
       1088
       1089        logfile = None

See test log for details:
  /global/homes/b/bdtest/.spack/test/xq4h4sghsfgnroykbcji4rr6hpqfol7f/superlu-dist-7.2.0-okaupsq-test-out.txt

==> Error: 1 test(s) in the suite failed.

Run failed

shahzebsiddiqui added the bug label Jun 6, 2023