Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpgfunction cleanup #9307

Closed
wants to merge 2 commits into from
Closed

cpgfunction cleanup #9307

wants to merge 2 commits into from

Conversation

mitchute
Copy link
Collaborator

@mitchute mitchute commented Mar 2, 2022

Pull request overview

  • Removes OpenMP from cpgfunction
  • cpgfunction CMake cleanups

Pull Request Author

Add to this list or remove from it as applicable. This is a simple templated set of guidelines.

  • Title of PR should be user-synopsis style (clearly understandable in a standalone changelog context)
  • Label the PR with at least one of: Defect, Refactoring, NewFeature, Performance, and/or DoNoPublish
  • Pull requests that impact EnergyPlus code must also include unit tests to cover enhancement or defect repair
  • Author should provide a "walkthrough" of relevant code changes using a GitHub code review comment process
  • If any diffs are expected, author must demonstrate they are justified using plots and descriptions
  • If changes fix a defect, the fix should be demonstrated in plots and descriptions
  • If any defect files are updated to a more recent version, upload new versions here or on DevSupport
  • If IDD requires transition, transition source, rules, ExpandObjects, and IDFs must be updated, and add IDDChange label
  • If structural output changes, add to output rules file and add OutputChange label
  • If adding/removing any LaTeX docs or figures, update that document's CMakeLists file dependencies

Reviewer

This will not be exhaustively relevant to every PR.

  • Perform a Code Review on GitHub
  • If branch is behind develop, merge develop and build locally to check for side effects of the merge
  • If defect, verify by running develop branch and reproducing defect, then running PR and reproducing fix
  • If feature, test running new feature, try creative ways to break it
  • CI status: all green or justified
  • Check that performance is not impacted (CI Linux results include performance check)
  • Run Unit Test(s) locally
  • Check any new function arguments for performance impacts
  • Verify IDF naming conventions and styles, memos and notes and defaults
  • If new idf included, locally check the err file and other outputs

@mitchute mitchute added the Refactoring Includes code changes that don't change the functionality of the program, just perform refactoring label Mar 2, 2022
@mitchute mitchute self-assigned this Mar 2, 2022
@mitchute
Copy link
Collaborator Author

mitchute commented Mar 2, 2022

@Myoldmopar there's a lot more that could be done in there... let me know how far to take this.

@mbadams5
Copy link
Contributor

mbadams5 commented Mar 3, 2022

@mitchute, I don't particularly care either way, but why couldn't you wrap the OpenMP functions with #ifdef _OPENMP like this?

#ifdef _OPENMP
#pragma omp parallel default(none) shared(files, number_files, fileCount, schema, outputType, outputTypeStr, output_directory, convertHVACTemplate)
{
#pragma omp for
for (int i = 0; i < number_files; ++i) {
bool successful = processInput(files[i], schema, outputType, output_directory, outputTypeStr, convertHVACTemplate);
#pragma omp atomic
++fileCount;
if (successful) {
displayMessage("Input file converted to {} successfully | {}/{} | {}", outputTypeStr, fileCount, number_files, files[i]);
} else {
displayMessage("Input file conversion failed: | {}/{} | {}", fileCount, number_files, files[i]);
}
}
}
#else

Then I only link the libraries if they are found

find_package(OpenMP)
if(OPENMP_FOUND)
target_link_libraries(ConvertInputFormat PRIVATE OpenMP::OpenMP_CXX)
endif()

@jmarrec
Copy link
Contributor

jmarrec commented Mar 3, 2022

I'm a bit confused: if there is a gain by using openMP (is there?) then why remove it? I'd agree with @mbadams5 and only enable the blocks if openMP was found.

The alternative is to actually start requiring OpenMP, which isn't such a big deal really (it's not like it's hard to install on unix, it's preinstalled with MSVC, and the github runners already have it). Then we could do the call to to find_package(OpenMP COMPONENTS CXX REQUIRED) in the root CMakeLists.txt and always do set(CMAKE_INSTALL_OPENMP_LIBRARIES TRUE)

@shorowit
Copy link
Contributor

shorowit commented Mar 3, 2022

@jmarrec There’s a good summary about openmp in EnergyPlus here from @amirroth. It was introduced in EnergyPlus in one version and then quickly removed.

Full disclosure: We (the BEopt dev team) were the ones that complained that “… slowdowns were observed when multiple EnergyPlus simulations were run in parallel”. My recollection is that the slowdowns were quite severe.

@mbadams5
Copy link
Contributor

mbadams5 commented Mar 3, 2022

@shorowit The issue you and amir were describing predates the conversion to C++. And yes, I saw the same issues with the old OpenMP implementation in Fortran for the interior radiant heat exchange (it basically always used 100% of all threads requested regardless of the actual work being down within interior radiant heat exchange). I think there was something wrong with the implementation of OpenMP code in EnergyPlus causing this issue.

Regardless, I think OpenMP can have uses within EnergyPlus. I know LBNL explored some of them under the EP10X project and I do think it probably provides value within cpgfunction here.
What I would do is what I suggested and use OpenMP if it is found, otherwise use normal sequential code.

@jmarrec The only issue really stopping us from just requiring OpenMP is Apple's clang compiler. It is not built with OpenMP support, even though LLVM and Clang built from source or homebrew does have OpenMP support. GCC and MSVC both support OpenMP currently on Linux and Windows.

@jmarrec
Copy link
Contributor

jmarrec commented Mar 3, 2022

@shorowit thanks for the context. @mbadams5 I don't think there's any problem on mac though, if you use AppleClang and as long as it's v7+ (which is very old) you just brew install libomp, and cmake takes care of adding the right flags set(OMP_FLAG_AppleClang "-Xclang -fopenmp").

https://gitlab.kitware.com/cmake/cmake/blob/master/Modules/FindOpenMP.cmake#L107 and https://gitlab.kitware.com/cmake/cmake/blob/master/Modules/FindOpenMP.cmake#L280-287

I just tested again, with https://github.com/jmarrec/CppBenchmarks/blob/main/bench_OpenMP.cpp, and Apple Clang 13.0 on a M1 mac, and it worked fine.

Screen Shot 2022-03-03 at 14 22 37

@mbadams5
Copy link
Contributor

mbadams5 commented Mar 3, 2022

@jmarrec I should have been more specific. Yes, you can either use brew install llvm or brew install libomp to use either compiled LLVM or OpenMP with Apple Clang. However, the latter requires the user to have libomp installed. What I was meaning is that the default Apple Clang does not include OpenMP enabled. Thus would be an issue for users installing a released version of EnergyPlus, obviously a user can compiled EnergyPlus themselves and enable OpenMP with Apple Clang.

@jmarrec
Copy link
Contributor

jmarrec commented Mar 3, 2022

Ok I see. It would be a packaging issue then: actually installing the libomp with EnergyPlus in the correct location and setting rpaths correctly.

@mitchute
Copy link
Collaborator Author

mitchute commented Mar 3, 2022

@jmarrec @mbadams5 I'm happy to do whatever you recommend here. I opened the PR mainly so we could have this conversation and to get your recommendations on how to proceed since this was causing a hang-up elsewhere.

@jmarrec
Copy link
Contributor

jmarrec commented Mar 3, 2022

@mbadams5 so for ConvertInputFormat... currently the code at

find_package(OpenMP)
if(OPENMP_FOUND)
target_link_libraries(ConvertInputFormat PRIVATE OpenMP::OpenMP_CXX)
endif()
is not wrapped in a per platform call. That means all is well right now only because the macOS github runners do not have libomp installed, otherwise we'd be in trouble, wouldn't we?

The same is true on Linux, where it's actually found on Github runners, so a clean linux machine will not be able to run it.
same for energyplus, due to cpgfunctionEP.

Here's a test with a clean docker

docker run -it ubuntu:focal bash

Inside the container:

apt update && apt install wget
curl -L -O https://github.com/NREL/EnergyPlus/releases/download/v22.1.0-IOFreeze/EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64.tar.gz
tar xfz EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64.tar.gz
cd EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64/
root@054f2656b7ff:/EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64# ldd energyplus
	linux-vdso.so.1 (0x00007ffe127f1000)
	libenergyplusapi.so.22.1.0 => /EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64/./libenergyplusapi.so.22.1.0 (0x00007f32fc63f000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f32fc45b000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f32fc440000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f32fc24e000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f32fc22b000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f32fc225000)
-	libgomp.so.1 => not found
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f32fc0d4000)
-	libX11.so.6 => not found
	libpython3.8.so.1.0 => /EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64/./libpython3.8.so.1.0 (0x00007f32fbd39000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f3301173000)
	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f32fbd34000)


root@054f2656b7ff:/EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64# ldd ConvertInputFormat
	linux-vdso.so.1 (0x00007fffe32ea000)
-	libgomp.so.1 => not found
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd7c0ea7000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fd7c0cc5000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd7c0b76000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fd7c0b5b000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd7c0969000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fd7c1429000)
root@054f2656b7ff:/EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64# ./energyplus --help
./energyplus: error while loading shared libraries: libgomp.so.1: cannot open shared object file: No such file or directory

root@054f2656b7ff:/EnergyPlus-22.1.0-4fa778ec59-Linux-Ubuntu20.04-x86_64# ./ConvertInputFormat --help
./ConvertInputFormat: error while loading shared libraries: libgomp.so.1: cannot open shared object file: No such file or director

@jmarrec
Copy link
Contributor

jmarrec commented Mar 3, 2022

Note: I made a test bed at https://github.com/jmarrec/test-cxx-OpenMP-installer to see if I could correctly ship the OpenMP libs on mac/Linux (we know windows is fine via InstallRequiredSystemLibraries and CMAKE_INSTALL_OPENMP_LIBRARIES), currently it's failing.

@mitchute mitchute added this to the EnergyPlus 22.2 milestone Mar 14, 2022
@j-c-cook
Copy link
Contributor

I'm responding to @mbadams5 from #9307 (comment).

@shorowit The issue you and amir were describing predates the conversion to C++. And yes, I saw the same issues with the old OpenMP implementation in Fortran for the interior radiant heat exchange (it basically always used 100% of all threads requested regardless of the actual work being down within interior radiant heat exchange). I think there was something wrong with the implementation of OpenMP code in EnergyPlus causing this issue.

Would there be any way to test if the issues with OpenMP persist, and that cpgfunction contains those issues?

Regardless, I think OpenMP can have uses within EnergyPlus. I know LBNL explored some of them under the EP10X project and I do think it probably provides value within cpgfunction here. What I would do is what I suggested and use OpenMP if it is found, otherwise use normal sequential code.

I agree that OpenMP include guards seems to be a good solution. https://stackoverflow.com/a/8447025

I found in my implementation of cpgfunctionEP for EnergyPlus that what I really need to write (or find written) is cross-compiler scalable vectorized LU decomposition for dense square matrices (C or C++). For example, the OpenBLAS implementation of gesv (also np.linalg.solve) blows Eigens decomposition out of the water. The problem is that OpenBLAS is in Fortran (hence cpgfunction->cpgfunctionEP). My current understanding is that the best way to inform the compiler on where to vectorize is with OpenMP. Though I'm not sure if I achieve vectorization on my g++ Ubuntu 9.4.0 compiler that it would apply to say a windows compiler, or if if see great speed increases on my i9 that it would also apply to those hip AMD Rizen chips.

@mitchute mitchute marked this pull request as draft March 21, 2022 21:29
@nrel-bot
Copy link

@mitchute @Myoldmopar it has been 28 days since this pull request was last updated.

@hongtz68
Copy link
Contributor

OpenMP was tried during the 10X project but with limited gains due to the current code structure of EnergyPlus. Another try in future may be worth it after all code refactoring.

@mitchute
Copy link
Collaborator Author

@hongtz68 I think what you have said is true in general for EnergyPlus, but not for this specific third-party tool. OpenMP could be used here effectively to speed up this part of the simulation. However... there are better algorithms that have since come out that would have meaningful speedups without any sort of parallelization on our end. What Jack said above is still true, though. We do need to find, and/or start using some robust, off-the-shelf LA libraries rather than trying to roll our own solutions every time we need something (how many LU decomp or tri-diagonal solvers do we have littered throughout E+ at this point?). Figuring out that issue would go a long way in helping overcome these problems.

@j-c-cook
Copy link
Contributor

The new development of the equivalent borehole method (EBM) by Prieto and Cimmino (2021) is likely the better algorithm @mitchute is mentioning. I have considered enhancing cpgfunctionEP to contain an EBM solver (this effort would be nontrivial), but the calculation would still require solving a system of equations. The current LU decomposition implemented in cpgfunctionEP is painfully slow. (Section 2.5 of Cook (2021, pgs. 55-61) gives the current status of pygfunction, cpgfunction and cpgfunctionEP.) The option for multi-threading the calculation of the segment-to-segment response factors would be substantial even with an EBM solver enhancement.

In recent months I have considered modifying cpgfunctionEP to be dependent on Eigen for all linear algebra functionality, given that the current hand rolled LU decomposition is significantly slower than Eigen's implementation. Eigen is currently not being utilized by cpgfunctionEP due to an unresolved segmentation fault previously thrown by the Linux CI machine while performing LU decomposition (see #8708 (comment)). I am currently uncertain as to what pure C++ library should be utilized by cpgfunctionEP for use in EnergyPlus. I personally like the API BLAS and LAPACK have to offer, but am unaware of any optimized libraries like openBLAS that do not require a fortran compiler.

--
Prieto, C. and M. Cimmino (2021). Thermal interactions in large irregular fields of geothermal boreholes: the method of equivalent boreholes, Journal of Building Performance Simulation, 14:4, 446-460, DOI: https://doi.org/10.1080/19401493.2021.1968953

Cook, J.C. (2021). Development of computer programs for fast computation of g-functions and automated ground heat exchanger design. M.S. Thesis, Oklahoma State University. https://hdl.handle.net/11244/335489

@nrel-bot-3
Copy link

@mitchute @Myoldmopar it has been 28 days since this pull request was last updated.

@mitchute
Copy link
Collaborator Author

This needs to happen but I can't tackle it at this point. There's no sense in keeping this open indefinitely, so I'm closing this one and I'll probably open a different PR later if I get around to it.

@mitchute mitchute closed this Jun 16, 2022
@mitchute mitchute deleted the cpgfunction_cleanup branch June 16, 2022 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Refactoring Includes code changes that don't change the functionality of the program, just perform refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants