-
Notifications
You must be signed in to change notification settings - Fork 392
Refactoring Passes
We have been working in the C++ version of EnergyPlus for almost ten years now, but the code still has a lot of "Fortran" in it. There have been a few targeted refactoring efforts focused on specific areas (e.g., plant) and there will be more, but a few sweeps through the code to remove some old idioms and replace with new and better ones will create a better environment in which to perform those future efforts and establish new patterns that new and updated code can mimic. I estimate that we need to do about 12-15 passes of various "sizes". The way I would characterize the size of a pass is:
- small: can be done by one person in two months or less.
- medium: can be done by two people in one iteration or less.
- large: requires more than two people and/or more than one iteration.
Here are the passes that I am thinking of along with my effort estimates, which may be completely off. I've tried to order them "topologically", i.e., the ones that don't depend on other passes first. These are actually not "refactoring" per se, they are more "re-idiomizing" or "re-patterning". Real refactoring, e.g., reorganizing objects to leverage inheritance and virtual functions, collecting related data items into struct
s, converting groups scalar variables into array variables indexed by enum
, etc. is instance-specific and probably cannot be done in a pass. Those are probably better done in a targeted for specific modules that can clearly benefit from such transformations.
-
Enum. Medium. This pass is already underway and about 60% completed. It consists of three sub-passes. The first converts collections of
constexpr int
constants intoenum class
enumerations with a standard format in which the first item isInvalid = -1
and the last item isNum
. This part may require some thinking and rejiggering (technical term) as in several places the code exhibits an undue reliance on the actual values of enumerations rather than on their symbolic names. The second sub-pass creates aconstexpr std::array<std::string_view, static_cast<int>(ec::Num)> = ecNamesUC = {};
array of upper-cased names for eachenum class ec
that appears in the IDF and replaces theif-else
ladders used to convert the enumeration strings to their values with calls tostatic_cast<ec>getEnumerationValue(str, ecNamesUC);
. A goal of this sub-pass is to eliminate case-insensitive comparisons to constant strings. The third sub-pass replaces theSELECT_CASE_var
if-else
ladders that use these enumerations withswitch
/case
statements. -
Namespace. Small. This pass is intended to reverse the proliferation of namespaces in EnergyPlus and shorten
namespace
names. The original Fortran to C++ translation automatically creatednamespace X
for everyX.hh
file, but there is no fundamental one-to-one relationship between files and namespaces. There is also no fundamental relationship betweennamespace
names and file names. Logically, there are not 300+ namespaces in EnergyPlus. The number is probably more like 20 or 30. And because there are few logical namespaces, individual namespace names can be short. The C++ standard template library namespace is calledstd
, notStandardTemplateLibrary
.HeatPumpWaterToWaterCOOLING
is not a goodnamespace
name. Shortening these names will shorten lines, make code more readable, and make it easier for us to stop usingusing
. Look here for a longer discussion. -
Optional. Small. This pass would eliminate optional parameters and replace them with default parameters. A modification that would make this pass into a medium or even large pass would be to pass all reference variables by pointer rather than reference which--among other things--would allow us to use
nullptr
as a default value. See here for a longer discussion of why reference parameters are evil. -
EPVectorClass. Small. This pass consists of two sub-components which can probably be done together. The first replaces all instances of
ObjexxFCL::Array1D
for item types other thanReal64
andbool
withEPVector
, which is a much lighter template. This change will add bounds-checking to debug builds, slowing down debug-build simulations significantly. To negate this, we will need to establish the pattern of usingauto & x = state.dataX->X(XNum);
followed byx.Field1
,x.Field2
, etc. rather than constantly doingstate.dataX->X(XNum).Field1
,state.dataX->X(XNum).Field2
all over the place (you know what I'm talking about). This will also make the code more readable. This part may actually make this into a medium pass rather than a small one. -
EPVectorRealBool. Small. This pass complements the EPVectorClass pass and eliminates uses of
Array1D_bool
andArray1D<Real64>
. This is split into a separate pass because it will require hand rolling loops for some of the arithmetic and boolean operators implemented for these templates. We may want to implement the boolean operations in EPVector if they are not implemented already. -
EPVEctorXD. Medium. This pass would replace all
Array2D
,Array3D
, etc. with combinations of nested EPVector andstd::array
as appropriate. This may seem quite expensive but can be amortized by collecting multipleArrayXD<Real64>
that have similar indexing structures and making them into a singleArrayXD<struct MultiReal64>
, which is probably a good thing to do anyway. This may be the final vestige of ObjexxFCL. - std::vector. Large. This may be a pipe dream, but at some point we need to consider moving away from 1-based indexing.
-
epJSON. Large. This one is simple conceptually, but will probably take some time to actually implement. The IDF input processor interface, i.e.,
getObjectItem
, has been internally re-plumbed to access the epJSON infrastructure. This layer should be peeled away and objects should be constructed using the epJSON interface directly. There are already one or two examples of this. -
Initialization. Large. This pass will unlock a number of other optimizations and simplifications. This intent is to remove the ubiquitous
if (state.dataX->GetInputFlag) { GetXInput(state); state.dataX->GetInputFlag = false; }
from the prologues of about half the functions in EnergyPlus and the only slightly less ubiquitousif (CompIndex == 0) { CompIndex = FindItemInList(CompName, state.dataX->XList); if (CompIndex == 0) { ShowFatalError(""); } }
that typically follows it. The idea is to replace this code with two-pass initialization at startup. The first pass reads and instantiates all objects from the input file, setting all fields that do not reference other objects. The second pass resolves object references and sets object indices--one focus of this second initialization pass is to eliminate case-insensitive comparisons for user-defined names and references, a second focus is to eliminate reference strings in objects. Once these two passes are implemented, the scattered initialization code can be deleted and object name parameters can be removed from internal functions.