Run directory on NHC_COLAB_2 cluster: /lustre/hurricanes/florence_2018_eb4f92f0-a1d8-427e-be70-f05d9886789b/setup/ensemble.dir/runs
3 out of 11 runs failed with errors like this:
+ pushd /lustre/hurricanes/florence_2018_eb4f92f0-a1d8-427e-be70-f05d9886789b/setup/ensemble.dir/runs/vortex_4_variable_korobov_3
/lustre/hurricanes/florence_2018_eb4f92f0-a1d8-427e-be70-f05d9886789b/setup/ensemble.dir/runs/vortex_4_variable_korobov_3 /contrib/Fariborz.Daneshvar/home/ondemand-storm-workflow/singularity/scripts
+ mkdir -p outputs
+ mpirun -np 36 singularity exec --bind /lustre /lustre/singularity_images//solve.sif pschism_PAHM_TVD-VL 4
--------------------------------------------------------------------------
A call to mkdir was unable to create the desired directory:
Directory: /lustre
Error: File exists
Please check to ensure you have adequate permissions to perform
the desired operation.
--------------------------------------------------------------------------
[sorooshmani-nhccolab2-00006-1-0014:12853] [[30827,1],17] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12853] [[30827,1],17] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12853] [[30827,1],17] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12898] [[30827,1],1] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12898] [[30827,1],1] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12898] [[30827,1],1] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12889] [[30827,1],10] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12889] [[30827,1],10] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12889] [[30827,1],10] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12920] [[30827,1],28] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12920] [[30827,1],28] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12920] [[30827,1],28] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12893] [[30827,1],19] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12893] [[30827,1],19] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12893] [[30827,1],19] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12869] [[30827,1],20] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12869] [[30827,1],20] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12869] [[30827,1],20] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12910] [[30827,1],27] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12910] [[30827,1],27] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12910] [[30827,1],27] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12887] [[30827,1],33] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12887] [[30827,1],33] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12887] [[30827,1],33] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12871] [[30827,1],34] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12871] [[30827,1],34] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12871] [[30827,1],34] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_session_dir failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[sorooshmani-nhccolab2-00006-1-0014:12874] [[30827,1],24] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12874] [[30827,1],24] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12874] [[30827,1],24] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12882] [[30827,1],26] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12882] [[30827,1],26] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12882] [[30827,1],26] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12865] [[30827,1],32] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12865] [[30827,1],32] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12865] [[30827,1],32] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12904] [[30827,1],4] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12904] [[30827,1],4] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12904] [[30827,1],4] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12908] [[30827,1],5] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12908] [[30827,1],5] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12908] [[30827,1],5] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:13089] [[30827,1],15] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:13089] [[30827,1],15] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:13089] [[30827,1],15] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12943] [[30827,1],9] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12943] [[30827,1],9] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12943] [[30827,1],9] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12906] [[30827,1],12] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12906] [[30827,1],12] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12906] [[30827,1],12] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12911] [[30827,1],7] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12911] [[30827,1],7] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12911] [[30827,1],7] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:13035] [[30827,1],21] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:13035] [[30827,1],21] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:13035] [[30827,1],21] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12948] [[30827,1],23] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12948] [[30827,1],23] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12948] [[30827,1],23] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12876] [[30827,1],3] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12876] [[30827,1],3] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12876] [[30827,1],3] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12866] [[30827,1],18] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12866] [[30827,1],18] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12866] [[30827,1],18] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12875] [[30827,1],30] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12875] [[30827,1],30] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12875] [[30827,1],30] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12873] [[30827,1],0] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12873] [[30827,1],0] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12873] [[30827,1],0] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12945] [[30827,1],6] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12945] [[30827,1],6] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12945] [[30827,1],6] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12870] [[30827,1],8] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12870] [[30827,1],8] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12870] [[30827,1],8] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12885] [[30827,1],35] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12885] [[30827,1],35] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12885] [[30827,1],35] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12914] [[30827,1],22] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12914] [[30827,1],22] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12914] [[30827,1],22] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12872] [[30827,1],31] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12872] [[30827,1],31] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12872] [[30827,1],31] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12877] [[30827,1],11] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12877] [[30827,1],11] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12877] [[30827,1],11] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12884] [[30827,1],13] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12884] [[30827,1],13] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12884] [[30827,1],13] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12912] [[30827,1],29] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12912] [[30827,1],29] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12912] [[30827,1],29] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12909] [[30827,1],14] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12909] [[30827,1],14] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12909] [[30827,1],14] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12883] [[30827,1],25] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12883] [[30827,1],25] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12883] [[30827,1],25] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
[sorooshmani-nhccolab2-00006-1-0014:12907] [[30827,1],16] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12907] [[30827,1],16] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12907] [[30827,1],16] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_ess_init failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_mpi_init: ompi_rte_init failed
--> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12898] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[sorooshmani-nhccolab2-00006-1-0014:12889] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12853] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12920] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12893] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12910] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12887] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12871] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12869] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12908] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[sorooshmani-nhccolab2-00006-1-0014:12874] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12882] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12865] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:13089] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12904] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12911] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12906] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12876] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[sorooshmani-nhccolab2-00006-1-0014:12943] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:13035] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[sorooshmani-nhccolab2-00006-1-0014:12948] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12945] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[sorooshmani-nhccolab2-00006-1-0014:12866] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[sorooshmani-nhccolab2-00006-1-0014:12875] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12885] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[sorooshmani-nhccolab2-00006-1-0014:12870] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12914] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12884] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12872] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12877] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12912] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12883] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12909] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12907] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12873] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[sorooshmani-nhccolab2-00006-1-0014:12967] [[30827,1],2] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
[sorooshmani-nhccolab2-00006-1-0014:12967] [[30827,1],2] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 346
[sorooshmani-nhccolab2-00006-1-0014:12967] [[30827,1],2] ORTE_ERROR_LOG: Error in file ../../../../../../orte/mca/ess/pmi/ess_pmi_module.c at line 487
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[sorooshmani-nhccolab2-00006-1-0014:12967] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[30827,1],18]
Exit code: 1
--------------------------------------------------------------------------
[sorooshmani-nhccolab2-00006-1-0014:12124] 35 more processes have sent help message help-opal-util.txt / mkdir-failed
[sorooshmani-nhccolab2-00006-1-0014:12124] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[sorooshmani-nhccolab2-00006-1-0014:12124] 35 more processes have sent help message help-orte-runtime.txt / orte_init:startup:internal-failure
[sorooshmani-nhccolab2-00006-1-0014:12124] 35 more processes have sent help message help-orte-runtime / orte_init:startup:internal-failure
[sorooshmani-nhccolab2-00006-1-0014:12124] 35 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:internal-failure
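The log above repeats the same three ORTE_ERROR_LOG lines once per MPI process, which makes it hard to see at a glance whether all 36 ranks failed at the same place. The following is a minimal triage sketch, not part of the workflow: it assumes the log has been saved as text, and the function name `summarize` and the regular expression are illustrative choices based only on the line format shown above.

```python
import re
from collections import Counter

# Matches lines of the form seen in the log above, e.g.
# [host:12853] [[30827,1],17] ORTE_ERROR_LOG: Error in file ../../../orte/util/session_dir.c at line 107
ORTE_LINE = re.compile(
    r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\] "
    r"\[\[(?P<job>\d+),(?P<vpid>\d+)\],(?P<rank>\d+)\] "
    r"ORTE_ERROR_LOG: Error in file (?P<file>\S+) at line (?P<line>\d+)$"
)


def summarize(log_text):
    """Return (sorted list of ranks that logged an ORTE error,
    Counter keyed by (source file base name, line number))."""
    ranks = set()
    sites = Counter()
    for raw in log_text.splitlines():
        match = ORTE_LINE.match(raw.strip())
        if match is None:
            continue  # banner text, MPI_Init messages, shell trace, etc.
        ranks.add(int(match.group("rank")))
        # Keep only the base name so differing "../" prefixes collapse.
        name = match.group("file").rsplit("/", 1)[-1]
        sites[(name, int(match.group("line")))] += 1
    return sorted(ranks), sites
```

Calling `summarize(open(path).read())` on a saved copy of the log (where `path` is wherever the file was stored) gives the set of affected ranks and a count per call site, which can then be compared between the failed and the successful runs.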