Fix bugs in OpenACC directives and revise machine files #6116

Merged: amametjanov merged 1 commit into master from hkang/ocean/pm-gpu-fix on Jan 26, 2024

Contributor

@hyungyukang hyungyukang commented Dec 10, 2023

Fix bugs in OpenACC directives:

  • Change an openacc directive in mpas_ocn_diagnostics.F to avoid
    getting stuck on pm-gpu.

Fixes #6127

[BFB]


github-actions bot commented Dec 10, 2023

PR Preview Action v1.4.6
🚀 Deployed preview to https://E3SM-Project.github.io/E3SM/pr-preview/pr-6116/
on branch gh-pages at 2023-12-23 02:53 UTC

@hyungyukang hyungyukang added the BFB (PR leaves answers BFB) and OpenACC labels Dec 10, 2023
@ndkeen
Contributor

ndkeen commented Dec 11, 2023

I'm seeing changes here that will affect other cases. We may need to find a way to make changes that only affect MPAS. If MPAS needs a flag, then maybe it should only be applied to those sources. If you need -acc, maybe only use that flag for the sources that require it. And please don't change the GPU bindings unless we all agree on those changes, as they impact all cases.

@hyungyukang
Contributor Author

> I'm seeing changes here that will affect other cases. We may need to find a way to make changes that only affect MPAS. If MPAS needs a flag, then maybe it should only be applied to those sources. If you need -acc, maybe only use that flag for the sources that require it. And please don't change the GPU bindings unless we all agree on those changes, as they impact all cases.

@ndkeen, thanks for the guidance. I'm going to revert some of the changes I made. It seems the best way to add -acc for nvidiagpu in MPAS is to copy and modify Depends.pgigpu.cmake. What do you think?

@hyungyukang hyungyukang requested a review from ndkeen December 11, 2023 14:07
Contributor

@grnydawn grnydawn left a comment

I also think we may need to restrict the impact of these changes to MPAS. Please see the comments in my review. I would prefer the GPU binding suggested in this PR over "none", but that change may need to be coordinated with other developers who could be affected by it.

@@ -107,6 +107,7 @@ list(APPEND MPAS_ADD_ACC_FLAGS
${CMAKE_BINARY_DIR}/core_ocean/mode_forward/mpas_ocn_time_integration_rk4.f90
${CMAKE_BINARY_DIR}/core_ocean/mode_forward/mpas_ocn_time_integration_si.f90
${CMAKE_BINARY_DIR}/core_ocean/mode_forward/mpas_ocn_time_integration_split.f90
${CMAKE_BINARY_DIR}/core_ocean/mode_forward/mpas_ocn_time_integration_split_ab2.f90
Contributor

I am not sure how CIME uses Depends files, but as I understand it, the changes made here need to go in the "Depends.pm-gpu.nvidiagpu.cmake" file rather than this file, because I assume nvidiagpu is the compiler used for OpenACC compilation on PM-GPU.

@@ -2,9 +2,13 @@ list(APPEND REDUCE_OPT_LIST
homme/src/share/derivative_mod_base.F90
)

# add accelerator/gpu flags for MPAS files
Contributor

According to @ndkeen's comment, this setting might be placed in the Depends file, not in this cmake file, in order to limit the scope of impact to specified MPAS files.

string(APPEND CPPDEFS " -DFORTRANUNDERSCORE -DNO_R16 -DCPRNVIDIA")
string(APPEND CPPDEFS_DEBUG " -DYAKL_DEBUG")
if (compile_threaded)
string(APPEND CMAKE_C_FLAGS " -mp")
endif()
string(APPEND CMAKE_C_FLAGS_RELEASE " -O2")
string(APPEND CMAKE_C_FLAGS_DEBUG " -g")
-string(APPEND CMAKE_Fortran_FLAGS " -i4 -Mstack_arrays -Mextend -byteswapio -Mflushz -Kieee -DHAVE_IEEE_ARITHMETIC -Mallocatable=03 -DNO_R16 -traceback")
+string(APPEND CMAKE_Fortran_FLAGS " -i4 -Mstack_arrays -Mextend -byteswapio -Mflushz -Kieee -DHAVE_IEEE_ARITHMETIC -Mallocatable=03 -DNO_R16 -traceback -acc")
Contributor

According to @ndkeen's comment, this setting might be placed in the Depends file, not in this cmake file, in order to limit the scope of impact to specified MPAS files.
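
For illustration, the approach the reviewers suggest (adding -acc only to selected MPAS sources instead of the global CMAKE_Fortran_FLAGS) could look roughly like the sketch below. This is a sketch under assumptions, not the change that was merged: the list name MPAS_ADD_ACC_FLAGS and the file paths are taken from the diff hunk earlier in this review, while the use of plain CMake `set_property` is an assumption; the actual Depends files may use a project-specific helper instead.

```cmake
# Hypothetical sketch: scope -acc to specific MPAS ocean sources.
# MPAS_ADD_ACC_FLAGS and the paths mirror the diff hunk in this PR;
# the set_property call is generic CMake and is an assumption about
# how the per-file flag could be applied.
list(APPEND MPAS_ADD_ACC_FLAGS
  ${CMAKE_BINARY_DIR}/core_ocean/mode_forward/mpas_ocn_time_integration_split.f90
  ${CMAKE_BINARY_DIR}/core_ocean/mode_forward/mpas_ocn_time_integration_split_ab2.f90
)

# Append -acc to each listed source only; all other sources keep
# the unmodified global Fortran flags.
foreach(ITEM IN LISTS MPAS_ADD_ACC_FLAGS)
  set_property(SOURCE ${ITEM} APPEND_STRING PROPERTY COMPILE_FLAGS " -acc")
endforeach()
```

One caveat with this pattern: by default CMake source-file properties are only visible to targets defined in the same directory scope, so the loop would have to run in the directory whose targets compile these files.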

Change an openacc directive in mpas_ocn_diagnostics.F to avoid
getting stuck on pm-gpu.
@amametjanov amametjanov force-pushed the hkang/ocean/pm-gpu-fix branch from b88b0d2 to 95d64d2 Compare December 23, 2023 02:52
@amametjanov
Member

Rebased the branch onto the latest master and checked that this PR still fixes the hang on PM-GPU: https://my.cdash.org/viewTest.php?buildid=2460796

@amametjanov amametjanov requested review from grnydawn and removed request for amametjanov December 23, 2023 03:03
@hyungyukang
Contributor Author

hyungyukang commented Dec 23, 2023

@amametjanov , thanks for rebasing this PR!

@grnydawn, some of the changes I made in this PR duplicated #6103 (merged), except for the one in mpas_ocn_diagnostics.F, so they disappeared when the PR was rebased.

@ndkeen, this PR fixes #6127.

Contributor

@ndkeen ndkeen left a comment

This now needs #6166 in order to build, but once that is in, this fix allows the test to run.

amametjanov added a commit that referenced this pull request Jan 24, 2024
Fix bugs in OpenACC directives:
- Change an openacc directive in mpas_ocn_diagnostics.F to avoid
getting stuck on pm-gpu.

Fixes #6127

[BFB]
Contributor

@grnydawn grnydawn left a comment

@amametjanov, the changes that Hyun made in "mpas_ocn_diagnostics.F" look good to me.

However, my test case (compset: GMPAS-IAF, res: T62_EC30to60E2r2) on Hyun's branch (hkang/ocean/pm-gpu-fix) fails while building SPIO with the following error message:

"/global/u2/y/youngsun/repos/github/E3SM/externals/scorpio/src/clib/pioc.c", line 1282: error: too many arguments in invocation of macro "adios2_init"
          ios->adiosH = adios2_init(ios->adios_comm, adios2_debug_mode_on);

Since this error is not related to the changes in the MPAS code, I believe this PR is ready to be merged.

@amametjanov amametjanov merged commit 4f0d0ca into master Jan 26, 2024
3 checks passed
@amametjanov amametjanov deleted the hkang/ocean/pm-gpu-fix branch January 26, 2024 02:53
Successfully merging this pull request may close these issues.

Fails with nvidiagpu compiler and SMS_Ld1.T62_oEC60to30v3.CMPASO-NYF.pm-gpu_nvidiagpu
4 participants