
Adding ESPResSo test PR #144

Merged: 19 commits into EESSI:main on Jun 11, 2024
Conversation

@satishskamath (Collaborator)

ESPResSo P3M weak-scaling and strong-scaling test.

@satishskamath (Collaborator Author)

Currently this PR includes files such as the job script and the benchmarking file, which will be removed later. Right now they are here for reference.

sub-dirs as part of the package. Sanity checking for the weak scaling is already added. The next step is to extract the performance timing and log it.
Comment on lines 119 to 133
print("Executing sanity checks...\n")
if (np.all([np.allclose(energy, ref_energy, atol=atol_energy, rtol=rtol_energy),
np.allclose(p_scalar, np.trace(ref_pressure) / 3.,
atol=atol_pressure, rtol=rtol_pressure),
np.allclose(p_tensor, ref_pressure, atol=atol_pressure, rtol=rtol_pressure),
np.allclose(forces, 0., atol=atol_forces, rtol=rtol_forces),
np.allclose(np.median(np.abs(forces)), 0., atol=atol_abs_forces, rtol=rtol_abs_forces)])):
print("Final convergence met with tolerances: \n\
energy: ", atol_energy, "\n\
p_scalar: ", atol_pressure, "\n\
p_tensor: ", atol_pressure, "\n\
forces: ", atol_forces, "\n\
abs_forces: ", atol_abs_forces, "\n")
else:
print("At least one parameter did not meet the tolerance, see the log above.\n")
@jngrad

The sanity checks have actually already executed at lines 111 to 116, and they will interrupt the Python interpreter with an exception if any check fails, so I would suspect the else branch is unreachable. I would also recommend against re-expressing the assertions as np.allclose in the conditional to avoid redundancy and prevent the risk that the assertions and conditional diverge over time, for example due to changes to tolerance values.

@satishskamath (Collaborator Author)

Hi @jngrad,

I have used the same tolerance values in both the assertions and the conditional, so the values should not diverge between them.

Do these assertions also end the Python execution? In that case, I will move the conditional above your original assertions so that execution can also reach the else part of the code.

@jngrad

I have used the same tolerance values in both the assertions and the conditional, so the values should not diverge between them.

They don't diverge today. But they might in a month's time if multiple people contribute to this file, or if you forget that the conditional block must exactly mirror the assertion block above it.

Do these assertions also end the Python execution?

np.testing.assert_allclose() raises an AssertionError which halts the Python interpreter with a non-zero exit code.

In that case, I will move the conditional above your original assertions so that execution can also reach the else part of the code.

Is the else branch truly needed? np.testing.assert_allclose() already generates a clear error message:

Traceback (most recent call last):
  File "/work/jgrad/espresso/src/madelung.py", line 116, in <module>
    np.testing.assert_allclose(np.median(np.abs(forces)), 0., atol=atol_abs_forces, rtol=rtol_abs_forces)
  File "/tikhome/jgrad/.local/lib/python3.10/site-packages/numpy/testing/_private/utils.py", line 1527, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/tikhome/jgrad/.local/lib/python3.10/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=0, atol=2e-06

Mismatched elements: 1 / 1 (100%)
Max absolute difference: 2.1e-06
Max relative difference: inf
 x: array(2.1e-06)
 y: array(0.)

I do not totally understand the purpose of this if/else statement. If you need to log the tolerances to stdout, you can do so independently of the success or failure of the assertions, since they are constants. If you need to report the success or failure of the assertions, that is already done by numpy.

@satishskamath (Collaborator Author)

Yeah, the assertion reports a failure in the manner you have pointed out, but how can I extract a success? Or would printing a message right below it suffice, since it exits the program anyway?

@jngrad

You can print("Success") or check if Python returned exit code 0.

Comment on lines 149 to 152
if pathlib.Path(args.output).is_file():
    header = ""
with open(args.output, "a") as f:
    f.write(header + report)

This write operation is superfluous if the ReFrame runner captures stdout.
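
For illustration, a sketch of what relying on the captured stdout could look like on the ReFrame side, assuming the benchmark script simply prints its timing (the regex, the performance-variable name and the "Performance:" output format are assumptions for this sketch, not necessarily what this PR ends up using):

import reframe.utility.sanity as sn
from reframe.core.builtins import performance_function

# Inside the ReFrame test class for this benchmark:
@performance_function('s/step')
def time_per_step(self):
    # ReFrame keeps the job's stdout in the stage directory, so the timing
    # can be scraped from there instead of from a separate CSV file.
    return sn.extractsingle(r'Performance:\s*(?P<perf>\S+)\s*s/step',
                            self.stdout, 'perf', float)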

@satishskamath (Collaborator Author)

Yes, I have to remove this; it is part of my TODO. :)

@boegel (Contributor) commented May 23, 2024

I guess the scripts_Espresso.tar.gz file was included by mistake?

@laraPPr (Collaborator) commented May 24, 2024

First run on hortense:

[       OK ] ( 1/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3d35aba @hortense:cpu_rome_256gb+default

[       OK ] ( 2/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /8e2461b7 @hortense:cpu_rome_256gb+default

[       OK ] ( 3/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_cpn_2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /6d1b46d4 @hortense:cpu_rome_256gb+default

[       OK ] ( 4/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_cpn_4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /e8e809a5 @hortense:cpu_rome_256gb+default

[       OK ] ( 5/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /486d66e0 @hortense:cpu_rome_256gb+default

[       OK ] ( 6/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /35169ab5 @hortense:cpu_rome_256gb+default

[     FAIL ] ( 7/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /78fa4a41 @hortense:cpu_rome_256gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_78fa4a41'

[       OK ] ( 8/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /10ad715b @hortense:cpu_rome_256gb+default

[     FAIL ] ( 9/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /dd36273c @hortense:cpu_rome_256gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_dd36273c'

[       OK ] (10/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /93ab1ac0 @hortense:cpu_rome_256gb+default

[     FAIL ] (11/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /076517db @hortense:cpu_rome_256gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_076517db'

[     FAIL ] (12/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /1ec07d06 @hortense:cpu_rome_256gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_1ec07d06'

[     FAIL ] (13/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /a160edb4 @hortense:cpu_rome_256gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_a160edb4'

@laraPPr (Collaborator) commented May 24, 2024

scale=8_nodes and scale=16_nodes hit the time limit.
scale=1_node: Some of the step tasks have been OOM Killed.
scale=2_nodes: Some of the step tasks have been OOM Killed.
scale=4_nodes: Some of the step tasks have been OOM Killed.

@laraPPr (Collaborator) commented May 24, 2024

I now ran it on another partition with some more memory:

All the tests that do not request one or multiple nodes passed.

The 1_node and 2_nodes scales failed with errors similar to those on the other partition.

The 4_nodes test passed here in just under 30 minutes.

Now waiting on the 8_nodes and 16_nodes tests.

@laraPPr (Collaborator) commented May 24, 2024

The 8_nodes and 16_nodes scales hit the time limit:

[       OK ] ( 1/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /486d66e0 @hortense:cpu_rome_512gb+default

[       OK ] ( 2/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_cpn_2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /6d1b46d4 @hortense:cpu_rome_512gb+default

[       OK ] ( 3/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_cpn_4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /e8e809a5 @hortense:cpu_rome_512gb+default



[       OK ] ( 4/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /8e2461b7 @hortense:cpu_rome_512gb+default

[       OK ] ( 5/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3d35aba @hortense:cpu_rome_512gb+default

[       OK ] ( 6/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /10ad715b @hortense:cpu_rome_512gb+default

[       OK ] ( 7/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /35169ab5 @hortense:cpu_rome_512gb+default

[       OK ] ( 8/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /93ab1ac0 @hortense:cpu_rome_512gb+default

[     FAIL ] ( 9/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /dd36273c @hortense:cpu_rome_512gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_512gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_dd36273c'

[     FAIL ] (10/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /78fa4a41 @hortense:cpu_rome_512gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_512gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_78fa4a41'

[       OK ] (11/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /076517db @hortense:cpu_rome_512gb+default

[     FAIL ] (12/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /a160edb4 @hortense:cpu_rome_512gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_512gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_a160edb4'

[     FAIL ] (13/13) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /1ec07d06 @hortense:cpu_rome_512gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_512gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_1ec07d06'

@satishskamath (Collaborator Author)

I guess the scripts_Espresso.tar.gz file was included by mistake?

Yes, it was a mistake which I intend to correct. :)

@run_after('init')
def set_mem(self):
    """ Setting an extra job option of memory. """
    self.extra_resources = {'memory': {'size': '50GB'}}
Collaborator

Please use

def req_memory_per_node(test: rfm.RegressionTest, app_mem_req):
to request a certain amount of memory.

Also, I assume the memory requirement isn't a fixed 50GB, but depends on the scale at which this is run (i.e. number of tasks)? Or doesn't it? If it does, please define an approximate function to compute the memory requirement as a function of task count. It's fine if it is somewhat conservative (i.e. asks for too much), but be aware that the test will be skipped on systems where insufficient memory is available (so don't over-do it).
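
As a rough sketch of the kind of hook being suggested here (the hook name and signature are taken from the comment above; the per-task memory estimate and the unit handling are assumptions for illustration, not the values eventually merged):

from eessi.testsuite import hooks

@run_after('setup')
def request_memory(self):
    """Request memory proportional to the number of tasks per node rather than
    a fixed 50GB. The 0.9 GB per task is an illustrative estimate of the
    benchmark's footprint; app_mem_req is assumed here to be expressed in MiB."""
    mem_per_task_gb = 0.9
    app_mem_req = self.num_tasks_per_node * mem_per_task_gb * 1024
    hooks.req_memory_per_node(test=self, app_mem_req=app_mem_req)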

@satishskamath (Collaborator Author)

This was the problem causing the OOM and I have fixed it. I have yet to push it to the PR, as I am making some more incremental changes. I am also observing some crashes on zen4, such as:
https://gitlab.com/eessi/support/-/issues/37#note_1927317164

satishskamath changed the title from "Adding ESPResSo test PR" to "[WIP] Adding ESPResSo test PR" on Jun 4, 2024
@satishskamath (Collaborator Author) commented Jun 4, 2024

@laraPPr, can you test this PR on your local system again? The 16-node test takes way too long for this PR to run. I had a discussion with @jngrad and we have a strategy to overcome this limitation, but that will be part of a subsequent PR to this test. Other than that, the tests should be manageable.

I have also re-requested a review for this PR from @casparvl and @jngrad.

@satishskamath (Collaborator Author)

{EESSI 2023.06} [satishk@tcn3 projects]$ reframe -C test-suite/config/surf_snellius.py -c test-suite/eessi/testsuite/tests/apps/espresso/espresso.py -t "^1_node|^2_node|^4_nodes|8_nodes" --system="snellius:rome" -r
  [ReFrame Setup]
    version:           4.3.3
    command:           '/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/ReFrame/4.3.3/bin/reframe -C test-suite/config/surf_snellius.py -c test-suite/eessi/testsuite/tests/apps/espresso/espresso.py -t ^1_node|^2_node|^4_nodes|8_nodes --system=snellius:rome -r'
    launched by:       [email protected]
    working directory: '/gpfs/home5/satishk/projects'
    settings files:    '<builtin>', 'test-suite/config/surf_snellius.py'
    check search path: '/gpfs/home5/satishk/projects/test-suite/eessi/testsuite/tests/apps/espresso/espresso.py'
    stage directory:   '/scratch-shared/satishk/reframe_output/staging'
    output directory:  '/home/satishk/reframe_runs/output'
    log files:         '/home/satishk/reframe_runs/logs/reframe_20240603_173654.log'

  [==========] Running 8 check(s)
  [==========] Started on Mon Jun  3 17:36:56 2024

  [----------] start processing checks
  [ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /962e9d54 @snellius:rome+default
  [ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /a160edb4 @snellius:rome+default
  [ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /132b4077 @snellius:rome+default
  [ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /076517db @snellius:rome+default
  [ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /d65ed4f6 @snellius:rome+default
  [ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /dd36273c @snellius:rome+default
  [ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /64b695bd @snellius:rome+default
  [ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /78fa4a41 @snellius:rome+default
  [       OK ] (1/8) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /64b695bd @snellius:rome+default
  [       OK ] (2/8) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /78fa4a41 @snellius:rome+default
  [       OK ] (3/8) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /d65ed4f6 @snellius:rome+default
  [       OK ] (4/8) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /dd36273c @snellius:rome+default
  [       OK ] (5/8) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /132b4077 @snellius:rome+default
  [       OK ] (6/8) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /076517db @snellius:rome+default
  [       OK ] (7/8) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /962e9d54 @snellius:rome+default
  [       OK ] (8/8) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /a160edb4 @snellius:rome+default
  [----------] all spawned checks have finished

  [  PASSED  ] Ran 8/8 test case(s) from 8 check(s) (0 failure(s), 0 skipped, 0 aborted)
  [==========] Finished on Mon Jun  3 21:17:35 2024

Latest result on Snellius rome partition.

@laraPPr (Collaborator) commented Jun 7, 2024

{EESSI 2023.06} [vsc46128@login55 test-suite]$ reframe --config-file config/vsc_hortense.py --checkpath eessi/testsuite/tests/apps -R --name  ESPResSo --system hortense:cpu_rome_256gb --run --performance-report

[ReFrame Setup]

  version:           4.3.3

  command:           '/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/ReFrame/4.3.3/bin/reframe --config-file config/vsc_hortense.py --checkpath eessi/testsuite/tests/apps -R --name ESPResSo --system hortense:cpu_rome_256gb --run --performance-report'

  launched by:       [email protected]

  working directory: '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite'

  settings files:    '<builtin>', 'config/vsc_hortense.py'

  check search path: (R) '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/eessi/testsuite/tests/apps'

  stage directory:   '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage'

  output directory:  '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/output'

  log files:         '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/logs/reframe_20240607_170321.log'



WARNING: skipping test file '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/eessi/testsuite/tests/apps/QuantumESPRESSO.py': module not found error: eessi/testsuite/tests/apps/QuantumESPRESSO.py:32: No module named 'hpctestlib.sciapps.qespresso'

from hpctestlib.sciapps.qespresso.benchmarks import QEspressoPWCheck

 (rerun with '-v' for more information)

[==========] Running 26 check(s)

[==========] Started on Fri Jun  7 17:03:53 2024 



[----------] start processing checks

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /319c602b @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /1ec07d06 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /962e9d54 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /a160edb4 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /132b4077 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /076517db @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /d65ed4f6 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /dd36273c @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /64b695bd @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /78fa4a41 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c5e7adeb @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /35169ab5 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /4f790477 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /93ab1ac0 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c421b8bf @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /10ad715b @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_4nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3994b39 @hortense:cpu_rome_256gb+default

[     FAIL ] ( 1/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /319c602b @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_319c602b'

[     FAIL ] ( 2/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /1ec07d06 @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_1ec07d06'

[     FAIL ] ( 3/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /962e9d54 @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_962e9d54'

[     FAIL ] ( 4/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /a160edb4 @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_a160edb4'

[     FAIL ] ( 5/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /132b4077 @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_132b4077'

[     FAIL ] ( 6/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /076517db @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_076517db'

[     FAIL ] ( 7/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /d65ed4f6 @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_d65ed4f6'

[     FAIL ] ( 8/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /dd36273c @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_dd36273c'

[     FAIL ] ( 9/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /64b695bd @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_64b695bd'

[     FAIL ] (10/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /78fa4a41 @hortense:cpu_rome_256gb+default

==> test failed during 'run': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_78fa4a41'

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_4nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b112c2ad @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_2nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0527d10c @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_2nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0f5bb625 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0283eda2 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /486d66e0 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /65a0a3fa @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /8e2461b7 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c94f54f2 @hortense:cpu_rome_256gb+default

[ RUN      ] EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3d35aba @hortense:cpu_rome_256gb+default

[       OK ] (11/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c94f54f2 @hortense:cpu_rome_256gb+default

P: perf: 0.0861 s/step (r:0, l:None, u:None)

[       OK ] (12/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3d35aba @hortense:cpu_rome_256gb+default

P: perf: 0.08919 s/step (r:0, l:None, u:None)

[       OK ] (13/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /8e2461b7 @hortense:cpu_rome_256gb+default

P: perf: 0.182 s/step (r:0, l:None, u:None)

[       OK ] (14/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /65a0a3fa @hortense:cpu_rome_256gb+default

P: perf: 0.1893 s/step (r:0, l:None, u:None)

[       OK ] (15/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_4nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3994b39 @hortense:cpu_rome_256gb+default

P: perf: 0.2036 s/step (r:0, l:None, u:None)

[       OK ] (16/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_4nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b112c2ad @hortense:cpu_rome_256gb+default

P: perf: 0.2158 s/step (r:0, l:None, u:None)

[       OK ] (17/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_2nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0527d10c @hortense:cpu_rome_256gb+default

P: perf: 0.1589 s/step (r:0, l:None, u:None)

[       OK ] (18/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_2nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0f5bb625 @hortense:cpu_rome_256gb+default

P: perf: 0.1579 s/step (r:0, l:None, u:None)

[       OK ] (19/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /486d66e0 @hortense:cpu_rome_256gb+default

P: perf: 0.2701 s/step (r:0, l:None, u:None)

[       OK ] (20/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0283eda2 @hortense:cpu_rome_256gb+default

P: perf: 0.2705 s/step (r:0, l:None, u:None)

[       OK ] (21/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c5e7adeb @hortense:cpu_rome_256gb+default

P: perf: 0.3853 s/step (r:0, l:None, u:None)

[       OK ] (22/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /35169ab5 @hortense:cpu_rome_256gb+default

P: perf: 0.3888 s/step (r:0, l:None, u:None)

[       OK ] (23/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /10ad715b @hortense:cpu_rome_256gb+default

P: perf: 1.855 s/step (r:0, l:None, u:None)

[       OK ] (24/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c421b8bf @hortense:cpu_rome_256gb+default

P: perf: 2.038 s/step (r:0, l:None, u:None)

[       OK ] (25/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /4f790477 @hortense:cpu_rome_256gb+default

P: perf: 3.573 s/step (r:0, l:None, u:None)

[       OK ] (26/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /93ab1ac0 @hortense:cpu_rome_256gb+default

P: perf: 4.39 s/step (r:0, l:None, u:None)

[----------] all spawned checks have finished



[  FAILED  ] Ran 26/26 test case(s) from 26 check(s) (10 failure(s), 0 skipped, 0 aborted)

[==========] Finished on Fri Jun  7 17:14:24 2024 

====================================================================================================

SUMMARY OF FAILURES

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_319c602b

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /319c602b -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_1ec07d06

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /1ec07d06 -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_962e9d54

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /962e9d54 -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_a160edb4

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /a160edb4 -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_132b4077

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /132b4077 -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_076517db

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /076517db -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_d65ed4f6

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /d65ed4f6 -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_dd36273c

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /dd36273c -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_64b695bd

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /64b695bd -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------

FAILURE INFO for EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m (run: 1/1)

  * Description: 

  * System partition: hortense:cpu_rome_256gb

  * Environment: default

  * Stage directory: /dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_78fa4a41

  * Node list: 

  * Job type: batch job (id=None)

  * Dependencies (conceptual): []

  * Dependencies (actual): []

  * Maintainers: []

  * Failing phase: run

  * Rerun with '-n /78fa4a41 -p default --system hortense:cpu_rome_256gb -r'

  * Reason: spawned process error: command 'sbatch rfm_job.sh' failed with exit code 1:

--- stdout ---

--- stdout ---

--- stderr ---

sbatch: error: Batch job submission failed: Memory required by task is not available



--- stderr ---

----------------------------------------------------------------------------------------------------



====================================================================================================

PERFORMANCE REPORT

----------------------------------------------------------------------------------------------------

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c5e7adeb @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 64

  num_tasks_per_node: 64

  performance:

    - perf: 0.3853 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /35169ab5 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 64

  num_tasks_per_node: 64

  performance:

    - perf: 0.3888 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /4f790477 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 32

  num_tasks_per_node: 32

  performance:

    - perf: 3.573 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /93ab1ac0 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 32

  num_tasks_per_node: 32

  performance:

    - perf: 4.39 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c421b8bf @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 16

  num_tasks_per_node: 16

  performance:

    - perf: 2.038 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /10ad715b @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 16

  num_tasks_per_node: 16

  performance:

    - perf: 1.855 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_4nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3994b39 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 4

  num_tasks_per_node: 1

  performance:

    - perf: 0.2036 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_4nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b112c2ad @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 4

  num_tasks_per_node: 1

  performance:

    - perf: 0.2158 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_2nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0527d10c @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 2

  num_tasks_per_node: 1

  performance:

    - perf: 0.1589 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_2nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0f5bb625 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 2

  num_tasks_per_node: 1

  performance:

    - perf: 0.1579 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0283eda2 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 4

  num_tasks_per_node: 4

  performance:

    - perf: 0.2705 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /486d66e0 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 4

  num_tasks_per_node: 4

  performance:

    - perf: 0.2701 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /65a0a3fa @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 2

  num_tasks_per_node: 2

  performance:

    - perf: 0.1893 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /8e2461b7 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 2

  num_tasks_per_node: 2

  performance:

    - perf: 0.182 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c94f54f2 @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 1

  num_tasks_per_node: 1

  performance:

    - perf: 0.0861 s/step (r: 0 s/step l: -inf% u: +inf%)

[EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3d35aba @hortense:cpu_rome_256gb:default]

  num_cpus_per_task: 1

  num_tasks: 1

  num_tasks_per_node: 1

  performance:

    - perf: 0.08919 s/step (r: 0 s/step l: -inf% u: +inf%)

----------------------------------------------------------------------------------------------------

Log file(s) saved in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/logs/reframe_20240607_170321.log'

@laraPPr (Collaborator) commented Jun 8, 2024

Everything except the 16_nodes test (which I cancelled) passes with this fix: #151

[       OK ] ( 1/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_2nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0527d10c @hortense:cpu_rome_256gb+default

P: perf: 0.1725 s/step (r:0, l:None, u:None)

[       OK ] ( 2/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_2nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0f5bb625 @hortense:cpu_rome_256gb+default

P: perf: 0.1783 s/step (r:0, l:None, u:None)

[       OK ] ( 3/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_4nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3994b39 @hortense:cpu_rome_256gb+default

P: perf: 0.256 s/step (r:0, l:None, u:None)

[       OK ] ( 4/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1cpn_4nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b112c2ad @hortense:cpu_rome_256gb+default

P: perf: 0.2219 s/step (r:0, l:None, u:None)

[       OK ] ( 5/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /65a0a3fa @hortense:cpu_rome_256gb+default

P: perf: 0.1844 s/step (r:0, l:None, u:None)

[       OK ] ( 6/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /486d66e0 @hortense:cpu_rome_256gb+default

P: perf: 0.2576 s/step (r:0, l:None, u:None)

[       OK ] ( 7/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_cores %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /0283eda2 @hortense:cpu_rome_256gb+default

P: perf: 0.2555 s/step (r:0, l:None, u:None)

[       OK ] ( 8/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_cores %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /8e2461b7 @hortense:cpu_rome_256gb+default

P: perf: 0.1806 s/step (r:0, l:None, u:None)

[       OK ] ( 9/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c94f54f2 @hortense:cpu_rome_256gb+default

P: perf: 0.09656 s/step (r:0, l:None, u:None)

[       OK ] (10/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_core %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /b3d35aba @hortense:cpu_rome_256gb+default

P: perf: 0.0927 s/step (r:0, l:None, u:None)

[       OK ] (11/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c5e7adeb @hortense:cpu_rome_256gb+default

P: perf: 0.3988 s/step (r:0, l:None, u:None)

[       OK ] (12/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /10ad715b @hortense:cpu_rome_256gb+default

P: perf: 2.115 s/step (r:0, l:None, u:None)

[       OK ] (13/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_8_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /c421b8bf @hortense:cpu_rome_256gb+default

P: perf: 2.149 s/step (r:0, l:None, u:None)

[       OK ] (14/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /93ab1ac0 @hortense:cpu_rome_256gb+default

P: perf: 3.568 s/step (r:0, l:None, u:None)

[       OK ] (15/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_4_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /4f790477 @hortense:cpu_rome_256gb+default

P: perf: 4.133 s/step (r:0, l:None, u:None)

[       OK ] (16/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_2_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /35169ab5 @hortense:cpu_rome_256gb+default

P: perf: 8.833 s/step (r:0, l:None, u:None)

[     FAIL ] (17/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /319c602b @hortense:cpu_rome_256gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_319c602b'

[     FAIL ] (18/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=16_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /1ec07d06 @hortense:cpu_rome_256gb+default

==> test failed during 'sanity': test staged in '/dodrio/scratch/projects/gadminforever/vsc46128/test-suite/stage/hortense/cpu_rome_256gb/default/EESSI_ESPRESSO_P3M_IONIC_CRYSTALS_1ec07d06'

[       OK ] (19/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /132b4077 @hortense:cpu_rome_256gb+default

P: perf: 0.4142 s/step (r:0, l:None, u:None)

[       OK ] (20/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /d65ed4f6 @hortense:cpu_rome_256gb+default

P: perf: 8.222 s/step (r:0, l:None, u:None)

[       OK ] (21/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /64b695bd @hortense:cpu_rome_256gb+default

P: perf: 8.171 s/step (r:0, l:None, u:None)

[       OK ] (22/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=1_node %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /78fa4a41 @hortense:cpu_rome_256gb+default

P: perf: 7.568 s/step (r:0, l:None, u:None)

[       OK ] (23/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=2_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /dd36273c @hortense:cpu_rome_256gb+default

P: perf: 8.242 s/step (r:0, l:None, u:None)

[       OK ] (24/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=4_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /076517db @hortense:cpu_rome_256gb+default

P: perf: 0.4094 s/step (r:0, l:None, u:None)

[       OK ] (25/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.1-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /a160edb4 @hortense:cpu_rome_256gb+default

P: perf: 9.291 s/step (r:0, l:None, u:None)

[       OK ] (26/26) EESSI_ESPRESSO_P3M_IONIC_CRYSTALS %scale=8_nodes %module_name=ESPResSo/4.2.2-foss-2023a %device_type=cpu %benchmark_info=mpi.ionic_crystals.p3m /962e9d54 @hortense:cpu_rome_256gb+default

P: perf: 9.29 s/step (r:0, l:None, u:None)

[----------] all spawned checks have finished

[  FAILED  ] Ran 26/26 test case(s) from 26 check(s) (2 failure(s), 0 skipped, 0 aborted)

[==========] Finished on Sat Jun  8 11:29:32 2024 

@casparvl (Collaborator)

Just for logging this here: I encountered

     JOBID PARTITION                      NAME     USER    STATE       TIME TIME_LIMI  NODES   PRIORITY           START_TIME NODELIST(REASON)
  28104668       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00     16   67572127                  N/A (MaxMemPerLimit)
  28104667       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00     16   67572127                  N/A (MaxMemPerLimit)
  28104670       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00      8   67567384                  N/A (MaxMemPerLimit)
  28104669       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00      8   67567384                  N/A (MaxMemPerLimit)
  28104672       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00      4   67565012                  N/A (MaxMemPerLimit)
  28104671       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00      4   67565012                  N/A (MaxMemPerLimit)
  28104674       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00      2   67563826                  N/A (MaxMemPerLimit)
  28104673       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00      2   67563826                  N/A (MaxMemPerLimit)
  28104676       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00      1   67563233                  N/A (MaxMemPerLimit)
  28104675       cpu rfm_EESSI_ESPRESSO_P3M_IO eucaspar  PENDING       0:00   5:00:00      1   67563233                  N/A (MaxMemPerLimit)

on Vega. That's why Satish made the change in ef21ed5 to 1) use the hook for this (which will skip the test if insufficient memory is available) and 2) reduce the requested memory per task to 0.9 GB.

I've just resubmitted the tests and can confirm that I no longer get the MaxMemPerLimit reason; all jobs are running now. So looking good so far.

Satish Kamath added 3 commits on June 10, 2024 at 16:25
@boegel (Contributor) previously requested changes on Jun 11, 2024 and left a comment:

Do we need to include eessi/testsuite/tests/apps/espresso/src/job.sh?

I don't see it being used at all by the test...

Maybe it's better to have a README file with some info on how to run this manually (using EESSI, not Spack ;) )

removing the statement from madelung that puts benchmark.csv as path within the output parameter.
@casparvl (Collaborator)

Ok, retested one final time. The bigger tests haven't completed yet, but the tests that have finished look good. Good enough for me.

casparvl dismissed boegel’s stale review on June 11, 2024 at 20:32:

Comments were taken into account.

casparvl merged commit 9d51709 into EESSI:main on Jun 11, 2024. 10 checks passed.