CRTM compile problem when building from tag v5.25.2 #7

Open
gmao-msienkie opened this issue Oct 16, 2019 · 6 comments

@gmao-msienkie

I accidentally deleted part of a prior build with v5.25.0, so I figured I should just delete it all and start again with the newer v5.25.2 release. When executing parallel_build, I ran into a problem compiling the NCEP CRTM:

catastrophic error: **Internal compiler error: internal abort** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.
compilation aborted for /gpfsm/dnb04/projects/p72/msienkie/DAS_checkout/GEOSadas/src/Shared/@NCEP_Shared/NCEP_crtm/ADA_Module.f90 (code 1)
make[2]: *** [src/Shared/@NCEP_Shared/NCEP_crtm/CMakeFiles/NCEP_crtm.dir/ADA_Module.f90.o] Error 1
make[1]: *** [src/Shared/@NCEP_Shared/NCEP_crtm/CMakeFiles/NCEP_crtm.dir/all] Error 2

I checked out the v5.25.0 tag again to try that, and there was no problem in compiling the CRTM.

@mathomp4
Member

Yeah. This has actually been here since the CVS days. It's a bug involving Intel 18+ and ADA_Module.F90. The worst thing is that if you recloned 5.25.2, it'll build for you. It's an intermittent error that can be worked around by any of the following:

  1. Change the optimization level of that file to -O0, -O1, or -O3. Anything but -O2.
  2. Perhaps (per @aoloso) add -no-vec to the file's compile options.
  3. Change TMPDIR in your environment.
  4. Build on a different node.

All of these can help build it.

The first one is guaranteed to let you build, but it will most likely be non-zero-diff.

The second one is a new one that I've heard helps, but again, might be non-zero-diff. And to test it, I need to trigger the ADA_Module problem (I rarely hit it).
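For reference, the first two workarounds could be sketched in CMake roughly as follows. This is only a sketch under assumptions, not something tested in this thread: it assumes it sits in the CMakeLists.txt that defines the NCEP_crtm target, that the source is listed there as ADA_Module.f90 (the name shown in the build log), that CMake is 3.11 or newer (needed for the COMPILE_OPTIONS source property), and that the compiler honors the last -O option on its command line. As noted above, either change is likely non-zero-diff.

```cmake
# Sketch only: per-file override for the source that triggers the internal
# compiler error. Assumed to live next to the NCEP_crtm target definition,
# since source-file properties are by default only seen by targets created
# in the same directory.
if (CMAKE_Fortran_COMPILER_ID MATCHES "Intel" AND NOT CMAKE_BUILD_TYPE MATCHES Debug)
   # Workaround 1: anything but -O2. Per-source options are appended after
   # the global Fortran flags, so this is assumed to take precedence.
   set_source_files_properties (ADA_Module.f90 PROPERTIES COMPILE_OPTIONS "-O1")

   # Workaround 2 (alternative, reported secondhand): disable vectorization
   # for this file instead of changing the optimization level.
   # set_source_files_properties (ADA_Module.f90 PROPERTIES COMPILE_OPTIONS "-no-vec")
endif ()
```

Scoping the change to one file keeps the rest of the library at its usual flags, which limits how much of the build could differ from a stock one.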

The third and fourth don't change the compilation flags, so they stay zero-diff, but they do not always work. That makes them rather bad workarounds, though they can help in a pinch.

This means we have an issue that (truly) only @rtodling or others in GEOSadas land can help solve. Any permanent fix will probably mean a non-zero-diff change to a code I have no idea how to test.

@gmao-msienkie
Author

I was having no luck with my TMPDIR set to something on dnb31 (same as my NOBACKUP), but when I set it to a directory on dnb04 it compiled. Weird. I wonder if it has something to do with quota, because I've occasionally been having issues with too many files in my $NOBACKUP. Then again, maybe not. It's a mystery.

@gmao-msienkie
Author

At last week's GSI meeting there was some talk about bringing in a new version of the CRTM. Sounds like that would be a good time to bring in any compile changes (possibly optimization changes) that would allow the library to be reliably built under the current build scheme.

@mathomp4
Member

Do you know if this file is changed in the new version? If so, maybe we could escape. Or, as you say, we could move away from using different compile options and fix it that way.

@bena-nasa
Collaborator

bena-nasa commented Dec 14, 2020

And it is back. I've just hit this multiple times now with version geos/v1.0.5 using Intel 19.1.3.304 with the debug build. I did confirm the debugging flags were still being passed when choosing the DEBUG build type in cmake. So apparently this can happen even at -O0.

@mathomp4
Member

As reported by @bena-nasa, this is still with us. Testing shows that -check all,noarg_temp_created seems to be the culprit in our Intel Debug flags. If we do something a la FMS:

# This 'resets' the Intel DEBUG flags for FMS. The stock debug flags use
# 'all,noarg_temp_created' which seem to be too aggressive for FMS/MOM6. This
# moves them back to the 'bounds,uninit' GEOS used to build with.
if (CMAKE_Fortran_COMPILER_ID MATCHES "Intel" AND CMAKE_BUILD_TYPE MATCHES Debug)
   string(REPLACE "all,noarg_temp_created" "bounds,uninit" _tmp "${GEOS_Fortran_FLAGS_DEBUG}")
   set (CMAKE_Fortran_FLAGS_DEBUG "${_tmp}")
endif ()

I think it would work around it.
