Evp kernel / #160 (Closed)
wants to merge 32 commits into from
Conversation

@mhrib (Contributor) commented Jul 5, 2018

A number of items:

  1. New vectorized EVP kernel (#ifdef DMI_EVP)
     1b) OpenMP issues??
  2. Revised EVP parameters according to Kimmritz, Danilov and Losch, 2015 (#ifdef DMI_REVP)
  3. Splitting of the namelist - maybe not important? (#ifdef DMI_nml)
  4. DMI specific (#ifdef DMI)
  5. Allocatable version for the horizontal parameters (no #ifdef, just changed)

All from: https://github.com/mhrib/CICE/tree/evp_kernel


ad 1) EVP kernel

New EVP kernels are implemented on the full grid in order to avoid expensive halo updates and keep everything in memory. They are vectorized and suited for a single node using OpenMP threads.

CPPDEF:
#ifdef DMI_EVP

Affected files:
./cicecore/cicedynB/dynamics/evp_kernel1d.F90 (The core)
./cicecore/cicedynB/dynamics/ice_dyn_evp.F90 (switch between CICE/new kernels)
./cicecore/cicedynB/dynamics/ice_dyn_shared.F90 (for namelist)
./cicecore/cicedynB/general/ice_init.F90 (for namelist)

Namelist:
&dynamics_nml
evp_kernel_ver = 0
...
0: CICE (default), 1: kernel_v1, 2: kernel_v2 (calculates distances inline)

The good:
A) The EVP kernel is working and implemented in CICE. Kernel 0 (CICE) or kernel 1+2 is configurable via the namelist variable "evp_kernel_ver" (integer).
B) The kernel runs threaded on a single node using the following calls in ice_dyn_evp.F90 (a conceptual sketch of the packing follows this list):

  • evp_copyin(...): [nx_block,ny_block,nblocks] --> 1D vectors
  • evp_kernel()
  • evp_copyout(...): 1D vectors --> [nx_block,ny_block,nblocks]

C) Works fine, with binary-identical results for (uvel,vvel) compared to the CICE EVP version on "normal" grids when using IEEE flags.
D) Seems to work fine (normal grids) with both CICE and EVP-kernel threads ... after commenting out 10 OMP blocks (see ad 1b below).
E) Works for "normal" circular grids using a vectorized halo update (updated after the June 2018 WebEx).
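For illustration only, here is a minimal sketch of the gather/scatter idea behind evp_copyin/evp_copyout (this is NOT the actual kernel code; the array sizes, the logical "icemask" and the packing loops are simplified assumptions): active points of the blocked [nx_block,ny_block,nblocks] arrays are packed into contiguous 1D vectors, the EVP subcycling then runs on those vectors on a single node, and (uvel,vvel) are scattered back afterwards.

program evp_pack_sketch
  implicit none
  integer, parameter :: nx_block = 10, ny_block = 10, nblocks = 4
  real(kind=8) :: uvel(nx_block,ny_block,nblocks)       ! blocked field
  logical      :: icemask(nx_block,ny_block,nblocks)    ! .true. on active points
  real(kind=8), allocatable :: uvel1d(:)                ! contiguous work vector
  integer :: i, j, ib, n, nactive

  uvel = 0.0d0                 ! hypothetical initialisation
  icemask = .true.

  nactive = count(icemask)     ! size of the 1D vector
  allocate(uvel1d(nactive))

  ! "evp_copyin": gather the blocked field into the 1D vector
  n = 0
  do ib = 1, nblocks
    do j = 1, ny_block
      do i = 1, nx_block
        if (icemask(i,j,ib)) then
          n = n + 1
          uvel1d(n) = uvel(i,j,ib)
        end if
      end do
    end do
  end do

  ! "evp_kernel": the EVP subcycling would loop over uvel1d(1:nactive) here,
  ! vectorized and OpenMP-threaded on one node, with no halo updates needed

  ! "evp_copyout": scatter the 1D vector back to the blocked array
  n = 0
  do ib = 1, nblocks
    do j = 1, ny_block
      do i = 1, nx_block
        if (icemask(i,j,ib)) then
          n = n + 1
          uvel(i,j,ib) = uvel1d(n)
        end if
      end do
    end do
  end do

  print *, 'packed/unpacked', nactive, 'active points'
end program evp_pack_sketch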

To do and nice to have:
A) Issues with tri-pole grids. Presumably a correct halo update can fix this?
B) Issues with CICE threading: 10 out of 135 OMP blocks (in 33 files) had to be commented out. See ad 1b below for the specific files.
C) Reduce the number of variables in the "evp_copyin" routine for v1+v2. This would reduce memory usage, and a speedup can be expected. Easy to do.
D) "Logistics" so that CICE can run on several nodes, with a single node used for the EVP kernel only.
E) Kernel v3 using real*4 internally, to reduce memory and increase performance...
F) "basal_stress" has not been vectorized yet (the call between stress and stepu in ice_dyn_evp.F90). Straightforward to do.


ad 1b) OMP thread issues:

1 : xx OMP : ./cicecore/cicedynB/analysis/ice_diagnostics.F90 (Comment out last two OMP blocks)
2 : xx OMP : ./cicecore/cicedynB/analysis/ice_history.F90 (Comment out first OMP block , NOTE: OMP PRIVATE: ,qn,ns, --> ,qn,sn,)
4 : xx OMP : ./cicecore/cicedynB/dynamics/ice_dyn_eap.F90 (Comment out OMP blocks for stress+stepu)
5 : xx OMP : ./cicecore/cicedynB/dynamics/ice_dyn_evp.F90 (Comment out OMP blocks for stress+stepu)
7 : xx OMP : ./cicecore/cicedynB/dynamics/ice_transport_driver.F90 (Comment out OMP block #1,4)
8 : xx OMP : ./cicecore/cicedynB/dynamics/ice_transport_remap.F90 (Comment out second OMP block)
11 : xx OMP : ./cicecore/cicedynB/general/ice_init.F90 (Comment out first OMP block)
23 : xx OMP : ./cicecore/drivers/cice/CICE_RunMod.F90 (Comment out last OMP block)

I have inserted the marker text "MHRI: CHECK THIS OMP" in front of the affected OMP directives, so they can be located with:

grep -r "MHRI: CHECK THIS OMP" ./ | grep -v git

NOTE: I have not checked these very carefully, so perhaps some of the OMP blocks are OK? Tested with the Cray, Intel and GNU compilers, and they crash; sometimes a re-submit made the run work - maybe a race condition (a generic illustration of this kind of race follows below).
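As a generic illustration of the kind of race suspected above (NOT the actual CICE loops; the loop body is invented), consider the ice_history.F90 case where the PRIVATE list contains "ns" instead of "sn": if a loop temporary is missing from the PRIVATE clause it is shared between all threads, and the results become timing dependent.

program omp_private_race_sketch
  implicit none
  integer, parameter :: n = 1000
  real(kind=8) :: a(n), b(n)
  real(kind=8) :: tmp
  integer :: i

  b = 1.0d0

  ! BUGGY: "tmp" is missing from the PRIVATE list and is therefore shared;
  ! this is the kind of error a typo like ",qn,ns," instead of ",qn,sn," produces
  !$OMP PARALLEL DO PRIVATE(i)
  do i = 1, n
    tmp  = 2.0d0*b(i)   ! another thread may overwrite tmp ...
    a(i) = tmp + b(i)   ! ... before it is used here
  end do
  !$OMP END PARALLEL DO

  ! FIXED: every loop temporary is declared private
  !$OMP PARALLEL DO PRIVATE(i,tmp)
  do i = 1, n
    tmp  = 2.0d0*b(i)
    a(i) = tmp + b(i)
  end do
  !$OMP END PARALLEL DO

  print *, 'a(1) =', a(1)
end program omp_private_race_sketch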


ad 2) Revised EVP parameters (Till)

... according to Kimmritz, Danilov and Losch, 2015 (#ifdef DMI_REVP)

CPPDEF:
#ifdef DMI_REVP

Affected files:
./cicecore/cicedynB/dynamics/ice_dyn_evp.F90
./cicecore/cicedynB/dynamics/ice_dyn_shared.F90 (for namelist)
./cicecore/cicedynB/general/ice_init.F90 (for namelist)

Namelist:
&dynamics_nml
arlx1i = 1._dbl_kind/300.0_dbl_kind
brlx = 300.0_dbl_kind
(the values above are the defaults)
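For reference, a rough sketch of the revised EVP iteration of Kimmritz, Danilov and Losch (2015), with the notation simplified here and the implicit treatment of the drag and Coriolis terms ignored; in the namelist above, brlx corresponds to beta and arlx1i to 1/alpha, so the defaults give alpha = beta = 300:

\sigma^{p+1} = \sigma^{p} + \frac{1}{\alpha}\left[ \sigma\!\left(u^{p}\right) - \sigma^{p} \right]

\beta\left( u^{p+1} - u^{p} \right) + \left( u^{p+1} - u^{n} \right) = \frac{\Delta t}{m}\left( \nabla\cdot\sigma^{p+1} + R \right)

Here p is the subcycling (pseudo-time) index, n the outer time level, m the ice mass per unit area, and R collects the remaining momentum terms; the scheme converges for sufficiently large alpha and beta relative to the number of subcycles.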


ad 3) Splitting of namelist

  • Maybe not important if the order of the namelist groups is correct? (See the sketch at the end of this section.)

CPPDEF:
#ifdef DMI_nml

Affected files:
./cicecore/cicedynB/analysis/ice_history_pond.F90
./cicecore/cicedynB/analysis/ice_history_bgc.F90
./cicecore/cicedynB/analysis/ice_history_drag.F90
./cicecore/cicedynB/analysis/ice_history_mechred.F90
./cicecore/cicedynB/analysis/ice_history.F90
./cicecore/cicedynB/infrastructure/ice_domain.F90
./cicecore/shared/ice_fileunits.F90
./cicecore/shared/ice_init_column.F90
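On the ordering question: in standard Fortran a namelist READ searches forward from the current file position for the requested "&group", so if the groups appear out of order relative to the reads, a REWIND is needed in between. A minimal sketch of that behaviour (the file name 'ice_in' and the variables are just examples, not the CICE reading code):

program namelist_order_sketch
  implicit none
  integer :: nprocs, max_blocks, evp_kernel_ver
  integer :: nu, ios
  namelist /domain_nml/   nprocs, max_blocks
  namelist /dynamics_nml/ evp_kernel_ver

  nprocs = -1 ; max_blocks = -1 ; evp_kernel_ver = 0   ! defaults

  open(newunit=nu, file='ice_in', status='old', action='read')

  ! Reading dynamics_nml first only works if it appears before domain_nml in
  ! the file, OR if we rewind before the next read (as done here), because the
  ! namelist READ scans forward from the current position for the group name.
  read(nu, nml=dynamics_nml, iostat=ios)
  rewind(nu)
  read(nu, nml=domain_nml, iostat=ios)
  close(nu)

  print *, 'evp_kernel_ver =', evp_kernel_ver, '  nprocs =', nprocs
end program namelist_order_sketch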


ad 4) DMI specific

Nice to have, but not needed by others.

CPPDEFS:
#ifdef DMI


ad 5) Allocatable version

Allocatable version for the horizontal parameters.

The same change as in this pull request, in a cleaner version, is found here:
#157

CPPDEFS:
I have just changed the code, so there are no CPP flags, but there are quite a few "subroutine alloc_xxx" routines (a sketch of the pattern follows below).
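A minimal sketch of the alloc_xxx pattern (module and variable names here are illustrative, not the actual CICE routines): the previously fixed-size module arrays become allocatable and are allocated by a small alloc_xxx routine once the grid and block sizes are known from the namelist.

module ice_state_sketch
  implicit none
  ! previously: real(kind=8) :: aice(nx_block,ny_block,max_blocks)
  real(kind=8), allocatable :: aice(:,:,:)
contains
  subroutine alloc_state(nx_block, ny_block, max_blocks)
    ! called after domain_nml has been read and the sizes are known
    integer, intent(in) :: nx_block, ny_block, max_blocks
    allocate(aice(nx_block, ny_block, max_blocks))
    aice = 0.0d0
  end subroutine alloc_state
end module ice_state_sketch

program alloc_demo
  use ice_state_sketch
  implicit none
  call alloc_state(10, 10, 4)      ! hypothetical sizes
  print *, 'aice allocated with shape', shape(aice)
end program alloc_demo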

Namelist:
&domain_nml
nprocs = -1
max_blocks = -1 ! max number of blocks per processor
block_size_x = -1 ! size of block in first horiz dimension
block_size_y = -1 ! size of block in second horiz dimension
nx_global = -1 ! NXGLOB, i-axis size
ny_global = -1 ! NYGLOB

CICE stops with an ERROR if any of these namelist settings is < 1, except for max_blocks. If max_blocks is < 1, it is automatically estimated as:
max_blocks=( ((nx_global-1)/block_size_x + 1) * &
((ny_global-1)/block_size_y + 1) ) / nprocs
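A worked example of the integer arithmetic (hypothetical values, not from this PR): with nx_global=360, ny_global=240, block_size_x=36, block_size_y=24 and nprocs=16, this gives max_blocks = ( ((360-1)/36 + 1) * ((240-1)/24 + 1) ) / 16 = (10 * 10) / 16 = 6 blocks per processor.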

Affected files:
cicecore/shared/ice_domain_size.F90
cicecore/shared/ice_arrays_column.F90
cicecore/drivers/cice/CICE_InitMod.F90
cicecore/drivers/hadgem3/CICE_InitMod.F90
cicecore/drivers/cesm/CICE_InitMod.F90
cicecore/cicedynB/infrastructure/ice_grid.F90
cicecore/cicedynB/infrastructure/ice_restoring.F90
cicecore/cicedynB/general/ice_flux.F90
cicecore/cicedynB/general/ice_state.F90
cicecore/cicedynB/general/ice_forcing.F90
cicecore/cicedynB/general/ice_forcing_bgc.F90
cicecore/cicedynB/general/ice_flux_bgc.F90



  • Developer(s):
    Mads Hvid Ribergaard, DMI
    Jacob Weismann Poulsen, DMI
    Till Rasmussen, DMI

  • Please suggest code Pull Request reviewers in the column at right.

  • Are the code changes bit for bit, different at roundoff level, or more substantial?

  • Is the documentation being updated with this PR? (Y/N)
    If not, does the documentation need to be updated separately at a later time? (Y/N)
    Note: "Documentation" includes information on the wiki and .rst files in doc/source/,
    which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.

  • Other Relevant Details:
    Email/WebEx us if you have any questions...

@dabail10 (Contributor) commented Jul 6, 2018

Thanks for all the work here. I have a modified version of the boundary updates for the tripole grids. I am happy to share this with you; I'm not sure whether it will help with the above. What tool do you use for tracking down the OMP issues? I know we've had a few for a while, and it would be good to fix them. My email is [email protected].

@mhrib (Contributor, Author) commented Jul 6, 2018 via email

@mhrib (Contributor, Author) commented Jul 6, 2018 via email

@apcraig (Contributor) commented Oct 5, 2018

Just FYI, this is being separated into multiple PRs, but being kept open as a reference. See #194, #184

@apcraig (Contributor) commented Nov 1, 2018

I will close this now. @mhrib, if you are able to keep your evp_kernel branch around for a while in case we need to reference it, that would be helpful. That just means not manually deleting the branch just yet. Thanks.

@apcraig apcraig closed this Nov 1, 2018
@mhrib mhrib deleted the evp_kernel branch November 8, 2023 13:45