Assess impact of single vs. double precision on performance #224

Open
billsacks opened this issue Jan 25, 2018 · 3 comments
Labels
investigation (Needs to be verified and more investigation into what's going on.); performance (idea or PR to improve performance, e.g. throughput, memory); priority: low (Background task that doesn't need to be done right away.)

Comments

@billsacks

billsacks commented Jan 25, 2018

In an effort to get across-the-board speedups, we may want to assess the impact of using single vs. double precision. If there is a big performance impact, we may want to consider using single precision at least in certain performance hot-spot areas of the code, if the impact on model results is minimal.

Mariana suggested this in November 2015. From doing a bit of reading, my sense is that the performance impact is very processor-dependent (which makes sense).

One area where we would likely see an across-the-board gain from a switch to single precision is memory use: half the memory use (in areas of the code like the history file) means greater cache efficiency, and it feels to me like a lot of our troubles in CLM are due to cache inefficiencies.
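To make the memory argument concrete, here is a minimal sketch using only the Python standard library. It is an illustration, not CTSM code: the field size `n` and the variable names are hypothetical, and it only shows the storage footprint of 8-byte versus 4-byte reals, not any actual cache behavior in the model.

```python
from array import array

n = 100_000  # hypothetical number of values in one model field

# Same number of elements, stored as 8-byte and 4-byte reals.
field_r8 = array("d", [0.0]) * n  # double precision (r8)
field_r4 = array("f", [0.0]) * n  # single precision (r4)

bytes_r8 = field_r8.itemsize * len(field_r8)
bytes_r4 = field_r4.itemsize * len(field_r4)

# The r4 field needs half the bytes, so twice as many values fit per cache line.
print(bytes_r8, bytes_r4, bytes_r8 / bytes_r4)
```

Whether this turns into a measurable speedup depends on how much of the run time is actually bound by memory traffic, which would have to be measured in the model itself.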

billsacks added the support and investigation labels, then removed the support label, on Jan 25, 2018
@billsacks

@barlage @swensosc and others: CanopyFluxes could be a good place to try switching to single precision, since it is a big culprit in terms of total run time. For example, all variables used inside the iteration loop could be switched to single precision. Even if we needed to copy in and copy out, the benefits of single precision might outweigh the copy-in / copy-out time. We could also be sure to use non-pointer variables (#235) in this part of the code.
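As a rough illustration of the copy-in / iterate / copy-out idea, the sketch below runs the same iteration once in double precision and once with the loop state rounded to single precision after every step. Everything here is an assumption made for illustration: the fixed-point iteration `x = cos(x)` is only a stand-in for an iterative solver and has nothing to do with the actual CanopyFluxes physics, and `to_r4` and `iterate` are hypothetical helpers. Python will not show any speed difference; the point is only how one might check that the effect on results is small.

```python
import math
import struct


def to_r4(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def iterate(x0, n_iter, single):
    """Fixed-point iteration x = cos(x); a stand-in for an iterative solver."""
    x = to_r4(x0) if single else x0  # copy in (demote to r4 if requested)
    for _ in range(n_iter):
        x = math.cos(x)
        if single:
            x = to_r4(x)  # keep the loop state in single precision
    return x  # copy out (promoting r4 back to r8 is exact)


x_r8 = iterate(1.0, 100, single=False)
x_r4 = iterate(1.0, 100, single=True)

# Difference between the two answers; expected to be on the order of
# single-precision roundoff for this well-conditioned toy problem.
print(abs(x_r8 - x_r4))
```

A real assessment would need the same kind of comparison on model output, since a less well-conditioned iteration could amplify the roundoff or change the number of iterations needed to converge.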

@billsacks

Long-term, @mvertens and I think it could make sense to:

  1. Use r8 for variables where having double precision is critical for numerical reasons

  2. Use r4 for variables where performance is critical and numerics are not an issue

  3. Use something generic like rk (for "real kind") for all other variables. Then rk could be switched between r4 and r8 with a single change somewhere.
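The three-kind scheme above can be sketched as follows. This is a Python analogue only, using `array` typecodes in place of Fortran kind parameters; in the model this would presumably be a kind parameter defined in one place. The names `R8`, `R4`, `RK`, `allocate`, and the example fields are all hypothetical.

```python
from array import array

R8 = "d"  # double precision: numerically critical variables
R4 = "f"  # single precision: performance-critical variables
RK = R8   # generic "real kind": change this one line to switch everything else


def allocate(kind, n):
    """Return a zero-filled array of n reals of the given kind."""
    return array(kind, [0.0]) * n


# Hypothetical examples of each category.
balance_check = allocate(R8, 10)  # always double
flux_scratch = allocate(R4, 10)   # always single
everything_else = allocate(RK, 10)  # follows the RK switch

print(balance_check.itemsize, flux_scratch.itemsize, everything_else.itemsize)
```

The design point is that only the variables explicitly pinned to `R8` or `R4` need case-by-case review; the rest can be flipped together to test the performance and answer changes.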

billsacks added the priority: low label on Jun 20, 2019
@billsacks

Based on John Dennis's presentation in today's SEWG meeting, I'm not very optimistic about the prospects for improved performance with single precision. For one thing, it sounds like most of the benefit comes from doubling your vector length, which depends on having vectorized code, so it may not make sense to even try to tackle this until we have made progress towards better vectorization (see also #627). We could still potentially get some speedups from the greater cache friendliness of single precision.

So I'm labeling this low priority.
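For reference, the "doubling your vector length" point is simple arithmetic: a SIMD register of a fixed width holds twice as many 4-byte values as 8-byte values. The sketch below just does that division; the helper name `lanes` is hypothetical, and the register widths listed are the common 128-, 256-, and 512-bit sizes rather than anything specific to the machines CTSM runs on.

```python
def lanes(register_bits, element_bytes):
    """Number of elements of the given size that fit in one SIMD register."""
    return register_bits // (element_bytes * 8)


for bits in (128, 256, 512):
    # columns: register width, r8 lanes, r4 lanes
    print(bits, lanes(bits, 8), lanes(bits, 4))
# prints:
# 128 2 4
# 256 4 8
# 512 8 16
```

This gain is only available to loops the compiler actually vectorizes, which is why the vectorization work would need to come first.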

samsrabin added the performance label on Feb 8, 2024
samsrabin pushed a commit to samsrabin/CTSM that referenced this issue May 3, 2024
Allow adding a day to streams' file name date specifier

### Description of changes

This is needed for the new presaero 24-hour cplhist files, where the first file name has a date stamp of 0001-01-02.

This can be leveraged by changing the file specifier for a stream – such as the presaero stream – to have a filename_advance_days="1" attribute like this:

```xml
<file first_year="$DATM_YR_START" last_year="$DATM_YR_END" filename_advance_days="1">$DATM_CPLHIST_DIR/$DATM_CPLHIST_CASE.cpl.ha2x1d.%ymd.nc</file>
```

### Specific notes

Contributors other than yourself, if any: @olyson 

CDEPS Issues Fixed (include github issue #): none

Are there dependencies on other component PRs (if so list): No

Are changes expected to change answers (bfb, different to roundoff, more substantial): No

Any User Interface Changes (namelist or namelist defaults changes): None

Testing performed (e.g. aux_cdeps, CESM prealpha, etc): Manual testing

Hashes used for testing: CTSM at tag ctsm5.1.dev120, with its corresponding externals
samsrabin pushed a commit to samsrabin/CTSM that referenced this issue May 3, 2024
Update stream definitions for new coupler history file format

### Description of changes

Modify stream_definition_datm.xml to generate a streams file (datm.streams.xml) with the new coupler history file format.

### Specific notes

Changes to accommodate new coupler history file names.
Change offset for solar stream from 2700 to -900 to accommodate changes due to time stamps.
These changes work in conjunction with CDEPS PR ESCOMP#224 and CDEPS PR ESCOMP#222 .
Note that I did not change the file names for ndep, or remove that stream. See ESCOMP/CDEPS#230

Contributors other than yourself, if any: @billsacks 

CDEPS Issues Fixed (include github issue #):  N/A

Are there dependencies on other component PRs (if so list):  No

Are changes expected to change answers (bfb, different to roundoff, more substantial):  Yes, in coupler history mode.

Any User Interface Changes (namelist or namelist defaults changes): No

Testing performed (e.g. aux_cdeps, CESM prealpha, etc):  I have conducted a pair of cases, an F-case to generate coupler history files, and an I-case to read those files, using the new file name convention, and compared the forcing output variables from clm history files between the two cases.  @billsacks and I reviewed these differences and found them to be acceptable.

@billsacks ran SMS_D_Ld1.ne30pg3_t061.I1850Clm50BgcSpinup.cheyenne_intel.clm-cplhist in the context of ESCOMP#1999

Hashes used for testing:  N/A