non-BFB normalVelocity in wetting/drying test case for MPAS-O when using more than 9 procs #5902

gcapodag · 2023-08-29T14:50:38Z

normalVelocity is not bit-for-bit (BFB) identical to a serial run when running in parallel on 10 or more processes (I tested 11, 12 and 13).
This has been observed in the Compass test case called: ocean/drying_slope/1km/single_layer/ramp.
The results are instead BFB when running with 2,3,...,9 processes.
See also companion issue on Compass Github: MPAS-Dev/compass#686


gcapodag · 2023-09-01T00:05:29Z

With dt = 30 s, the results are BFB for a run duration of 31 min 30 s; one time step later (run duration 32 min) they become non-BFB. I saved the intermediate normalVelocityProvis solutions during that last time step (from 31:30 to 32:00) at the first three stages of RK4 and compared the serial and 10-proc runs using ncdiff. The saved fields are exactly the same (in both single and double precision). However, printing from the code after the first stage of RK4 shows that the provis velocity is not the same, because the tendency is not the same. At the first stage the tendency and diagnostics are computed from the old solution. Also, this does not appear to have anything to do with wetting and drying, since wettingVelocityFactor is zero when the provis solution is advanced.
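For readers unfamiliar with the term, here is a minimal Python sketch of what a BFB comparison of two fields means. It is not how ncdiff works and it is not part of MPAS-O or Compass; the helper names `bits` and `first_bitwise_mismatch` are hypothetical, and the fields are assumed to be already loaded as plain lists of double-precision values.

```python
import struct


def bits(x):
    """Return the raw IEEE-754 double-precision bit pattern of x as an int."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def first_bitwise_mismatch(serial, parallel):
    """Return the index of the first entry whose bits differ, or None if BFB.

    Hypothetical helper: both arguments are sequences of Python floats,
    e.g. a field from the serial run and the same field from a parallel run.
    """
    if len(serial) != len(parallel):
        raise ValueError("fields must have the same length")
    for i, (s, p) in enumerate(zip(serial, parallel)):
        if bits(s) != bits(p):
            return i
    return None


# A difference in the last bit is enough to make a field non-BFB.
serial_field = [1.0, 0.1 + 0.2, 2.0]
parallel_field = [1.0, 0.3, 2.0]
mismatch = first_bitwise_mismatch(serial_field, parallel_field)
```

Comparing bit patterns instead of using `==` is a deliberate choice in this sketch: it also flags `-0.0` versus `0.0`, which compare equal as numbers but are not bit-for-bit identical.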

xylar · 2023-09-01T07:35:34Z

Insufficient halo updates on that first provisional solution?

gcapodag · 2023-09-01T07:36:32Z

@xylar After opening countless matryoshkas I finally found the problem: a halo update on layerThickEdgeFlux seems to be missing in the code. After I added this halo update right before computing ocn_time_integrator_rk4_compute_vel_tends in RK4, the runs with more than 9 procs are finally BFB with respect to the serial 12 hr run.
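To illustrate the general mechanism, here is a toy Python sketch of how a missing halo update on an edge field can make a decomposed run differ from a serial one. It is not MPAS-O code and it does not try to explain why the problem only appeared with 10 or more processes. The 1D periodic mesh, the stencil, and all names (`edge_flux`, `serial_tendency`, `decomposed_tendency`) are assumptions made for illustration, and the "halo" is simplified to every entry a rank does not own.

```python
def edge_flux(h, e):
    """Toy edge flux: mean of the two thickness values adjacent to edge e."""
    n = len(h)
    return 0.5 * (h[(e - 1) % n] + h[e])


def serial_tendency(h):
    """Reference result: flux on every edge, then a neighbor-edge stencil."""
    n = len(h)
    flux = [edge_flux(h, e) for e in range(n)]
    return [flux[(e + 1) % n] - flux[(e - 1) % n] for e in range(n)]


def decomposed_tendency(h_old, h_new, nranks, halo_update):
    """Same computation split over nranks simulated ranks.

    Each rank recomputes the flux only on the edges it owns. Every other
    entry of its local array plays the role of a halo and still holds the
    stale flux from h_old unless halo_update is True.
    """
    n = len(h_new)
    if n % nranks != 0:
        raise ValueError("toy sketch assumes n is divisible by nranks")
    size = n // nranks
    owned = [range(r * size, (r + 1) * size) for r in range(nranks)]

    stale = [edge_flux(h_old, e) for e in range(n)]
    local = [list(stale) for _ in range(nranks)]
    for r in range(nranks):
        for e in owned[r]:
            local[r][e] = edge_flux(h_new, e)

    if halo_update:
        # Copy each non-owned entry from the rank that owns it.
        for r in range(nranks):
            for q in range(nranks):
                if q != r:
                    for e in owned[q]:
                        local[r][e] = local[q][e]

    tend = [0.0] * n
    for r in range(nranks):
        for e in owned[r]:
            tend[e] = local[r][(e + 1) % n] - local[r][(e - 1) % n]
    return tend


h_old = [1.0] * 8
h_new = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0, 29.0]

reference = serial_tendency(h_new)
with_update = decomposed_tendency(h_old, h_new, nranks=2, halo_update=True)
without_update = decomposed_tendency(h_old, h_new, nranks=2, halo_update=False)
```

In this sketch `with_update` matches `reference` exactly, while `without_update` differs on the owned edges next to a partition boundary, because the stencil there reads a stale halo value.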

xylar · 2023-09-01T07:39:41Z

That's wonderful! Not a fun debugging process, I'm sure.

cbegeman mentioned this issue on Sep 5, 2023 in MPAS-Dev/compass#695, "Add decomp test case to the drying slope test group" (merged).
