The OSTRICH input file (ostIn)
Introduction
Comments
Case sensitivity
Basic configuration
File pairs
Extra files
Extra directories
Real-valued parameters
Integer parameters
Combinatorial parameters
Tied parameters
Special parameters and preemption
Initial parameters
Parameter correction
Observations
Tied response variables
Type conversion
Search algorithms
Constraints
This section summarizes the input file of the OSTRICH program. On case-sensitive Linux systems, the input file must be named ostIn.txt. On Windows systems the file can also be named OstIn.txt. OSTRICH is a command-line tool, and when launched it looks for ostIn.txt in the working directory (i.e. the directory from which OSTRICH is launched). If this file does not exist or contains syntax errors, OSTRICH reports an error message and closes. Windows users will experience this as a brief flash of the console window as it opens and then rapidly closes; the open-close sequence may happen so fast that all a user notices is a brief flicker on the monitor. This does not mean that OSTRICH is installed incorrectly. It means that a valid input file was not created before running OSTRICH. The output file named “OstErrors0.txt” will have details on why OSTRICH failed to run.
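Because a failed start-up may only show as a console window that flashes and closes, a small pre-flight check run from the working directory can make the cause visible. The following is a minimal sketch in Python; the helper name `preflight_check` is hypothetical and not part of OSTRICH, and it relies only on the two file names mentioned above. It assumes that a non-empty OstErrors0.txt indicates a reported problem.

```python
from pathlib import Path


def preflight_check(workdir="."):
    """Return a list of human-readable problems found in workdir.

    Hypothetical helper (not part of OSTRICH): it only checks for the
    file names mentioned in the text above.
    """
    workdir = Path(workdir)
    problems = []

    # OSTRICH looks for ostIn.txt in the directory it is launched from.
    if not (workdir / "ostIn.txt").is_file():
        problems.append("ostIn.txt not found in " + str(workdir))

    # Assumption: a non-empty OstErrors0.txt describes why a run failed.
    err_file = workdir / "OstErrors0.txt"
    if err_file.is_file():
        text = err_file.read_text(errors="replace").strip()
        if text:
            problems.append("OstErrors0.txt reports: " + text)

    return problems
```

Calling `preflight_check()` before launching OSTRICH returns an empty list when nothing suspicious is found.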
For OSTRICH to work with a given modeling program, the modeling program must meet the following requirements:
- The modeling program must use a text-based input/output file format. OSTRICH can also work with modeling programs that use the MS Access or NetCDF file formats, but users will need to configure an additional section of the OSTRICH input file. This section is described in Section 2.17 (Type Conversions).
- The modeling program must be able to run without prompting for user intervention. This means, for example, that the modeling program cannot prompt the user to enter the name of an input file and the modeling program must not pause for user input at the end of a simulation.
- The output of the modeling program must be in a consistent format that can be reliably parsed. OSTRICH can also work with modeling programs that sometimes fail to write consistently formatted output. In such cases users should configure the optional “OnObsError” feature described in Section 2.3 (Basic Configuration).
OSTRICH utilizes a text-based input file format which specifies that configuration variables be organized on a line-by-line basis using loosely human-readable syntax. Users typically prepare the OSTRICH input file using a text editor like Notepad, Wordpad, VIM, or Emacs. For some sections (e.g. observations and response variables) it may also be helpful to use a spreadsheet program like Excel or Calc and then copy the desired cells from the spreadsheet to the text-based input file. With a few exceptions (which will be explicitly noted in the following text) the basic format for a line of input in the ostIn.txt file is:
<variable> <value>
Where <variable> is the name of the configuration variable (e.g. ProgramType) and <value> is the user-selected value for the variable (e.g. ParticleSwarm). The whitespace separating <variable> and <value> can be any number of spaces or tab characters. Inside ostIn.txt, the OSTRICH configuration variables are organized into groups and each group is described below in its own section.
Although the list of ostIn.txt configuration groups is rather extensive, most of the groups do not need to be specified, as they are initialized within OSTRICH to reasonable defaults if the user does not set a value for them. Furthermore, many of the configuration groups relate to optional features within OSTRICH and may not be used in a given run of the program. In fact, the only groups that must be configured by the user are: Basic Configuration, File Pairs, and Parameters. You must also include an Observations group if calibrating using OSTRICH’s internal weighted least squares objective function. Otherwise, if using OSTRICH’s general-purpose constrained optimization platform (GCOP), you must include a Response Variables group, a Costs group, and a Constraints group. Sections 2.3 through 2.24 discuss the particular syntax and purpose of the various groups that may be included in the ostIn.txt file.
Comment lines in the OSTRICH input files have the '#' symbol as the first character. These lines are ignored and allow the user to make the input file more readable and to disable configuration parameters or observations without completely deleting the corresponding lines. A sample comment line is given below in Listing 1. More examples can be found in the demonstration files distributed with the OSTRICH program and these are described in Section 5.
#
# These are some example comment lines. It’s a good
# idea to include comments in the input file to
# describe the intent of your configuration
# choices.
#
Example 1: Comment Lines
Variable names and group tags in the OSTRICH input file are case sensitive; e.g. using beginfilepairs instead of BeginFilePairs will result in a parsing error. Meanwhile, values of variables are case insensitive; e.g. GENETICALGORITHM, geneticalgorithm, and GeneticAlgorithm will all correctly select the genetic algorithm ProgramType.
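The line format, comment rule, and case rules described above can be sketched in a few lines of Python. This is an illustrative reader, not OSTRICH's actual parser: the function names are hypothetical, leading whitespace before '#' is tolerated as an assumption, and group tags and the documented exceptions to the `<variable> <value>` format are not handled.

```python
def parse_config_line(line):
    """Split one ostIn.txt line into (variable, value), or return None.

    Sketch only: group tags and the exceptions to the
    `<variable> <value>` format are not handled here.
    """
    stripped = line.strip()
    # Blank lines and comment lines ('#' first) carry no setting.
    # Assumption: leading whitespace before '#' is tolerated.
    if not stripped or stripped.startswith("#"):
        return None
    # The separator is any run of spaces or tabs; split only once so that
    # values containing spaces (e.g. quoted paths) stay in one piece.
    parts = stripped.split(None, 1)
    variable = parts[0]                      # case sensitive: keep as typed
    value = parts[1].strip() if len(parts) > 1 else ""
    return variable, value


def same_value(a, b):
    """Values are case insensitive, so compare them case-folded."""
    return a.casefold() == b.casefold()
```

For example, `same_value("GENETICALGORITHM", "GeneticAlgorithm")` is true, while the variable name returned by `parse_config_line` must match the documented spelling exactly.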
The “Basic Configuration” variables describe the modeling program that is to be optimized or calibrated and identify the optimization (or regression) algorithm that OSTRICH should use. In addition, there are a number of optional basic configuration variables that affect various aspects of the OSTRICH program. Listing 2 summarizes the syntax for the variables that make up the basic configuration group. The third column of text is enclosed in brackets (i.e. “[ ]”) and provides the default setting for each variable. Only the first two columns (i.e. variable and desired value) should be included in an actual input file. An example of a syntactically correct configuration of the basic group is given in Listing 4. Users can “cut-and-paste” Listing 4 and edit as needed for their particular problem. Listing 3 and Listing 5 provide lists of possible values for the TelescopingStrategy and ProgramType variables, respectively. Users interested in the details of the telescoping strategies are referred to the publication by Matott et al. (2013).
# essential variables
ProgramType see_listing_5 [Levenberg-Marquardt]
ModelExecutable name_of_model [no default]
ModelSubdir name_of_subdir [.]
ObjectiveFunction wsse/gcop [wsse]
# useful optional variables
PreserveBestModel name_of_script [no default]
PreserveModelOutput yes/no/name_of_script [no]
OstrichWarmStart yes/no [no]
NumDigitsOfPrecision val_from_1_to_32 [6]
TelescopingStrategy see_listing_3 [none]
RandomSeed value [randomly assigned]
OnObsError quit/value [quit]
# experimental or less common optional variables
CheckSensitivities yes/no [no]
SuperMUSE yes/no [no]
OstrichCaching yes/no [no]
BoxCoxTransformation value [1.00]
ModelOutputRedirectionFile filename [OstExeOut.txt]
Example 2: Basic Configuration Groups
Values for the “ModelExecutable” and “PreserveBestModel” variables can be fully qualified paths or relative paths and should reference an executable file, batch file, or script file. If a path or file name contains spaces, the value should be enclosed in double quotes (i.e. " ").
Note: The basic configuration group is the only group in the OSTRICH input file that does not have corresponding “Begin…” and “End…” group tags. As such, these variables can be placed anywhere within the input file. However, since these are the first variables processed by OSTRICH, a good convention is to place them at the beginning of the file and avoid mixing them in with the other groups.
ProgramType: This variable tells OSTRICH which algorithm should be used to perform the optimization or calibration.
Options for ProgramType | Algorithm Description |
---|---|
GeneticAlgorithm | See Table 1 (RGA) |
BinaryGeneticAlgorithm | See Table 1 (BGA) |
ShuffledComplexEvolution | See Table 1 (SCE) |
BisectionAlgorithm | See Table 1 (BIS) |
SamplingAlgorithm | See Table 1 (BBBC) |
ParticleSwarm | See Table 1 (PSO) |
APPSO | See Table 1 (APPSO) |
PSO-GML | See Table 1 (PSO-GML) |
SimulatedAnnealing | See Table 1 (CSA) |
DiscreteSimulatedAnnealing | See Table 1 (DSA) |
VanderbiltSimulatedAnnealing | See Table 1 (VSA) |
Levenberg-Marquardt | See Table 1 (GML) |
GML-MS | See Table 1 (MSGML) |
Powell | See Table 1 (POWL) |
Steepest-Descent | See Table 1 (STPDSC) |
Fletcher-Reeves | See Table 1 (FLRV) |
RegressionStatistics | Compute regression statistics |
Jacobian | Compute Jacobian matrix |
Hessian | Compute Hessian matrix |
Gradient | Compute Gradient information |
ModelEvaluation | Process InitParams group |
GridAlgorithm | See Table 1 (GRID) |
DDS | See Table 1 (DDS) |
DDSAU | See Table 1 (DDS-AU) |
ParallelDDS | See Table 1 (PDDS) |
DiscreteDDS | See Table 1 (DDDS) |
GLUE | See Table 1 (GLUE) |
RejectionSampler | See Table 1 (RJSMP) |
MetropolisSampler | See Table 1 (MCMC) |
SMOOTH | See Table 1 (SMOOTH) |
PADDS | See Table 1 (PADDS) |
ParaPADDS | See Table 1 (ParaPADDS) |
BEERS | See Table 1 (BEERS) |
Table 1: Supported Values for the Program Type Option
ModelExecutable: Specifies the model executable or driver program or script. If the executable is in the same directory as the working directory from which the program is executed, then the path information may be omitted.
ModelSubdir: When running in parallel, users must specify a working subdirectory to prevent parallel runs from clobbering each other’s input and output files. If set to any value other than ’.’ (i.e. the default), the value of ModelSubdir will cause OSTRICH to create unique subdirectories for the model runs of each parallel processor. The subdirectory names are created by concatenating the ModelSubdir value with each processor’s MPI id number.
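The naming rule can be illustrated with a short Python sketch. The function name `run_directory` is hypothetical, and the sketch assumes the rank is appended directly with no separator, as the description of concatenation suggests.

```python
def run_directory(model_subdir, mpi_rank):
    """Return the working directory name for one MPI process.

    Sketch of the naming rule described above; the '.' special case
    means "no dedicated subdirectory".
    """
    if model_subdir == ".":
        return "."
    # Concatenate the ModelSubdir value with the processor's MPI id
    # (assumed: no separator between the two).
    return f"{model_subdir}{mpi_rank}"
```

Under this assumption, a setting of `ModelSubdir mod` on three processors would give the directories `mod0`, `mod1`, and `mod2`.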
ObjectiveFunction: The objective function to be optimized, either WSSE (weighted sum of squared error) calibration or GCOP (General-purpose Constrained Optimization Platform).
PreserveBestModel: A user-supplied script or executable that is run by OSTRICH every time a new best parameter set is discovered.
PreserveModelOutput: If set to "yes", OSTRICH will make copies of the files associated with each model run, and the preserved files will be stored in directories named "runNNN", where NNN is a counter that is incremented after each model run. For example, the files for the first model run will be copied into a directory named run1, the files from the second run will be copied into a directory named run2, and so on. Alternatively, users can provide the name of a script or executable. This script will be run after the completion of each model run and can be used, for example, to filter results so that only model runs deemed important by the user are preserved (e.g. non-dominated solutions in a multi-objective context). Note that the preservation script provided by the user must take care of creating directories and copying any files that are to be saved. OSTRICH will pass the following arguments to the user-defined script:
- rank – The zero-based processor id of the processor that invoked the script.
- trial – For multi-start algorithms (i.e. DDSAU, GML-MS, and PSO-GML) the trial argument indicates which multi-start trial is currently underway. For all other algorithms the trial argument is set to 0.
- counter – The current count of model runs completed for the given rank and trial.
- objective function category (ofcat) – A text string that categorizes the objective function value associated with the completed model run. User-defined model preservation scripts may wish to take different actions depending on the ofcat setting. Possible values are given below:
  - best – For single-objective algorithms, an ofcat value of “best” indicates that the completed model run is the best solution obtained so far.
  - behavioral – For uncertainty-based algorithms, an ofcat value of “behavioral” indicates that the completed model run is a behavioral solution.
  - non-behavioral – For uncertainty-based algorithms, an ofcat value of “non-behavioral” indicates that the completed model run is a non-behavioral solution.
  - dominated – For multi-objective algorithms, an ofcat value of “dominated” indicates that the completed model run is a dominated solution.
  - non-dominated – For multi-objective algorithms, an ofcat value of “non-dominated” indicates that the completed model run is a non-dominated solution based on the model runs that have completed so far. Note that a non-dominated solution may become dominated later in a search.
  - other – An ofcat value of “other” indicates that the completed model run does not fit into any of the previously listed categories. For example, in a single-objective algorithm this would indicate that the completed model run is not the best solution obtained so far.
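A user-defined preservation script along these lines could be written in Python. The sketch below is only an illustration: it assumes the four arguments arrive as positional command-line arguments in the order listed above (rank, trial, counter, ofcat), and the file name `model_output.txt`, the `preserved` directory, and the choice of categories to keep are all hypothetical.

```python
import shutil
import sys
from pathlib import Path

# ofcat values (from the list above) that this example chooses to keep.
KEEP = {"best", "non-dominated", "behavioral"}

# Hypothetical list of model output files worth archiving.
FILES_TO_SAVE = ["model_output.txt"]


def preserve(rank, trial, counter, ofcat, src_dir=".", dest_root="preserved"):
    """Copy selected files for one completed run; return the target dir or None."""
    if ofcat not in KEEP:
        return None
    # The script itself must create the destination directory.
    target = Path(dest_root) / f"rank{rank}_trial{trial}_run{counter}"
    target.mkdir(parents=True, exist_ok=True)
    for name in FILES_TO_SAVE:
        src = Path(src_dir) / name
        if src.is_file():
            shutil.copy2(src, target / name)
    return target


if __name__ == "__main__" and len(sys.argv) == 5:
    # Assumed argument order: rank trial counter ofcat
    preserve(*sys.argv[1:5])
```

Filtering on ofcat in this way keeps only the runs of interest, which matters when a search performs thousands of model evaluations.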
OstrichWarmStart: If set to "yes" OSTRICH will read the contents of any previously created "OstModel" output files and use the entries therein to restart an optimization or calibration exercise.
NumDigitsOfPrecision: This specifies the precision of values written to OSTRICH output files.
TelescopingStrategy: If selected, this optional setting will cause parameter bounds to become increasingly smaller as an optimization or calibration proceeds. Options for the telescoping strategy are:
- none
- convex-power
- convex
- linear
- concave
- delayed-concave
RandomSeed: This variable can be used to control the random seed OSTRICH uses when generating random numbers.
OnObsError: This variable controls how OSTRICH behaves when a model fails to generate all of the expected output for a WSSE calibration. If set to "quit", OSTRICH will abort if it ever fails to parse an observation from user-specified output files. If set to a value, OSTRICH will use the value as a placeholder observation value if it can't read a given observation from model output.
CheckSensitivities: If this variable is set to "yes", OSTRICH will perform a pre-calibration step to calculate parameter sensitivities (i.e. changes in simulated equivalent observations with respect to changes in parameters).
SuperMUSE: If set to "yes", OSTRICH will interface with the EPA SuperMUSE tasker-client approach to parallel computing.
OstrichCaching: If set to "yes", OSTRICH will examine "OstModel" output files prior to running a given model configuration to see if the associated parameter set has already been evaluated.
BoxCoxTransformation: If set to a value other than "1", OSTRICH will apply a Box-Cox power transformation on each calibration residual. The user-supplied value is used as the exponent for the transformation.
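For orientation, the standard one-parameter Box-Cox transform can be sketched in Python as follows. How OSTRICH applies the transform internally is not spelled out here, so this sketch assumes it is applied to the observed and simulated values before differencing; the function names are hypothetical. Under that assumption an exponent of 1 only shifts both values by the same constant and leaves the residual unchanged, which is consistent with 1.00 being the "no transformation" default.

```python
import math


def box_cox(y, lam):
    """Standard one-parameter Box-Cox transform of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)          # limiting case as lam -> 0
    return (y ** lam - 1.0) / lam


def transformed_residual(observed, simulated, lam):
    """Residual computed on transformed values (an assumed convention)."""
    return box_cox(observed, lam) - box_cox(simulated, lam)
```

Exponents below 1 compress large values, which reduces the influence of large observations on the weighted sum of squared errors.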
ModelOutputRedirectionFile: This variable allows users to override the default name (i.e. OstExeOut.txt) of the file where OSTRICH will redirect model output that would normally be displayed on a console screen (i.e. stderr and stdout). Bit buckets (e.g. /dev/null or NUL) are supported, making it possible to discard console output entirely.
# essential variables
ProgramType ParticleSwarm
ModelExecutable "C:\My Folder\My_Model.exe"
ModelSubdir mod
ObjectiveFunction GCOP
# useful optional variables
PreserveBestModel "C:\My Folder\Save_Best.bat"
PreserveModelOutput no
OstrichWarmStart yes
NumDigitsOfPrecision 8
TelescopingStrategy none
RandomSeed 100
OnObsError quit
# experimental or less common optional variables
CheckSensitivities yes
SuperMUSE no
OstrichCaching no
BoxCoxTransformation 1.00
ModelOutputRedirectionFile ModelOutput.stdout
Example 3: Example of a Syntactically Correct Basic Configuration Group. Variables set to default values could be omitted or commented out.
A file pair consists of a template file and a corresponding model input file. The contents of the template file should be identical to the paired model input file except that the values of optimization (or calibration) parameters are replaced with unique parameter names defined in the Parameters section. During optimization, OSTRICH uses the template files to create syntactically correct model input files in preparation for running the model at different parameter values. Section (@@) describes this process in detail. The general syntax for the File Pair group is given in Listing 6 along with a concrete example.
BeginFilePairs
<template1><sep><input1>
<template2><sep><input2>
.
.
.
<templateN><sep><inputN>
EndFilePairs
As shown in Listing 6, BeginFilePairs and EndFilePairs are parsing tags that wrap a list of file name pairs such that <template1> ... <templateN> are the names of the template files corresponding to the <input1> ... <inputN> model input files, and <sep> is a separator that tells OSTRICH where one filename ends and the next begins. Valid file name separators are the semi-colon character ’;’ and the TAB character. Spaces are not valid separator characters because OSTRICH allows spaces within file names.
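The separator rule and the template mechanism can be sketched in Python. Both functions are hypothetical illustrations rather than OSTRICH's implementation: `fill_template` assumes plain text substitution of parameter names, and the default number format is an arbitrary choice for the example.

```python
def parse_file_pair(line):
    """Split a File Pairs line into (template, model_input)."""
    # Only ';' and TAB separate the names; spaces may be part of a name.
    for sep in (";", "\t"):
        if sep in line:
            template, model_input = line.split(sep, 1)
            return template.strip(), model_input.strip()
    raise ValueError("no ';' or TAB separator found: " + repr(line))


def fill_template(template_text, values, fmt="{:.6g}"):
    """Replace each parameter name in the template text with its value.

    Assumption: substitution is a plain text replacement of the name.
    """
    # Replace longer names first so that e.g. 'par_k10' is not clobbered
    # by a replacement of 'par_k1'.
    for name in sorted(values, key=len, reverse=True):
        template_text = template_text.replace(name, fmt.format(values[name]))
    return template_text
```

The sketch also shows why parameter names must be unique: a name that appears inside another word of the template would be replaced as well, so distinctive names are the safer choice.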
Extra files are model input files not used by OSTRICH, but required for proper execution of the model. In parallel environments, OSTRICH needs to know about these extra input files so that it can copy them to each processor’s working directory (see ModelSubdir in Section 2.3, above). Sharing a working directory among parallel processors is not recommended because it can result in multiple processors trying to write to the same file at the same time. The general syntax for the Extra Files group is given in Listing 7 along with a concrete example.
As shown in Listing 7, BeginExtraFiles and EndExtraFiles are parsing tags that wrap a list of extra model input files. Extra files must be identified if the model is to be executed in a dynamically generated subdirectory (as specified by the ModelSubdir variable), so that OSTRICH knows to copy them to the subdirectory. For serial algorithms, creation of a dynamic subdirectory is unnecessary and specification of the extra files section is optional. However, this section is required when running a parallel algorithm, to avoid the aforementioned processor I/O conflicts.
Extra directories are directories containing model input files not used by OSTRICH, but required for proper execution of the model. In parallel environments, OSTRICH needs to know about these extra directories so that it can copy them (and all files and subdirectories contained within) into each processor’s working directory (as specified by the ModelSubdir variable). Sharing a working directory among parallel processors is not recommended because it can result in multiple processors trying to write to the same file in the same directory at the same time. Listing 8 contains the general syntax and a concrete example of the Extra Directories group. As shown in Listing 8, BeginExtraDirs and EndExtraDirs are parsing tags that wrap a list of extra model input directories.
This configuration group describes the parameters to be calibrated or optimized. Parameter configuration variables include names, initial values, lower and upper bounds, input, output and internal transformations, and (optionally) fixed format printing codes. Parameters in this section are real and continuously varying. Listing 9 provides the general format for the parameters group and Listing 10 gives a concrete example.
In Listing 9, BeginParams and EndParams are parsing tags that wrap a list of N model parameters made up of the following variables:
- name: The name of the parameter. Parameter names must be unique and correspond identically to the names used in the template file(s) (see Section 2.4).
- init: Initial value of the parameter, in units specified by the txIn variable. Alternatively, the keywords “random” or “extract” may be used instead of specifying a value. OSTRICH will assign a randomly generated initial value if the “random” keyword is used. OSTRICH will extract the initial value from existing model input files if the “extract” keyword is used.
- lwr: Lower bound (i.e. minimum value) of the parameter, in units specified by the txIn variable.
- upr: Upper bound (i.e. maximum value) of the parameter, in units specified by the txIn variable.
- txIn, txOst, and txOut: These specify the type of transformation units that OSTRICH should use. Transformations allow the user to take advantage of any linear relationships that exist between a transformed parameter value (e.g. log10 or loge) and the underlying model. Three kinds of transformations are provided so that the user can work with input and output transformations that are different from the internal transformation. Typically, the user will request no input and output transformation (so that input and output values are in the native units of the parameter), while instructing OSTRICH to perform a transformation internally. This approach allows the algorithm to take advantage of a transformed relationship without requiring manual conversion of input and output values. However, it should be noted that some statistical output is reported in terms of txOst units, regardless of the value of txOut; namely (a) parameter variance-covariance, (b) observation influence, (c) parameter sensitivity, (d) model linearity, and (e) matrices. OSTRICH supports the following transformation values:
  - none: no transformation.
  - log10: log base 10 transformation.
  - ln: natural logarithm transformation.
- fmt: A format code that OSTRICH will use when writing model input files. This is provided so that OSTRICH can support modeling programs which expect fixed format inputs (i.e. when values in the input file are expected to take up an exact number of characters). For example, many programs written in legacy FORTRAN (e.g. F77) expect fixed format. Use a fmt value of “free” if using a modeling program that is not bound by fixed format requirements. Otherwise, use a format code of “Fw.d” for decimal values (e.g. 3.4567) where “w” is the total number of characters and “d” is the number of characters following the decimal. For example, to represent the value of Pi to 6 decimal places you would use a format code of F8.6, resulting in a value of “3.141593”. Use a format code of “Ew.d” or “Dw.d” for scientific notation, where “w” is the total number of characters and “d” is the number of digits following the decimal point. For example, applying a format code of E10.3 to the value of 1/12 would result in “ 8.333E-02”. For fixed decimal notation “w” should be at least equal to “d”+2 and for fixed scientific notation “w” should be at least equal to “d”+7.
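The effect of the fmt codes can be reproduced with Python's own format specifications, which is a convenient way to check a code before a long run. The helper `apply_fmt` below is a hypothetical sketch, not OSTRICH code; its handling of “free” (Python's default float representation) and of the D exponent letter are assumptions.

```python
import re


def apply_fmt(value, fmt):
    """Format value per an OSTRICH-style fmt code (free, Fw.d, Ew.d, Dw.d)."""
    if fmt.lower() == "free":
        # Assumption: "free" just means an unpadded representation.
        return repr(float(value))
    match = re.fullmatch(r"([FfEeDd])(\d+)\.(\d+)", fmt)
    if match is None:
        raise ValueError("unsupported format code: " + fmt)
    kind = match.group(1).upper()
    width, digits = int(match.group(2)), int(match.group(3))
    if kind == "F":
        text = f"{value:{width}.{digits}f}"
    else:
        text = f"{value:{width}.{digits}E}"
        if kind == "D":
            # Assumption: D differs from E only in the exponent letter.
            text = text.replace("E", "D")
    if len(text) > width:
        raise ValueError(f"{value!r} does not fit in {width} characters")
    return text
```

With this sketch, `apply_fmt(3.141592653589793, "F8.6")` gives `3.141593` and `apply_fmt(1/12, "E10.3")` gives ` 8.333E-02` (with one leading space), matching the two examples in the text. The final length check mirrors the advice on minimum widths: a value that needs more than “w” characters would break a fixed-format input file.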
This configuration group describes those parameters to be calibrated or optimized which can take on only integer values. Like their real-parameter counterparts, integer parameter configuration variables include names, initial values, and lower and upper bounds. However, format codes and unit transformations are not supported for integer parameters. Listing 11 provides the general syntax and a concrete example of the integer parameters group.
This configuration group describes those parameters to be calibrated or optimized which can take on a discrete set of values, which can be in the form of real, integer or string (text) values. Like integer and real parameters, combinatorial parameter configuration variables include names and initial values; but instead of lower and upper bounds, the user must supply a complete list of the discrete values that may be assigned to the parameter. Furthermore, format codes and unit transformations are not supported for combinatorial parameters. Listing 12 provides the general syntax of the combinatorial parameters group.
In Listing 12, the “type” field should be either “real”, “integer”, or “string” and should correspond to the type of values in the subsequent combinatorial list. Furthermore, the “N1” through “NM” values specify the number of entries in the combinatorial list, which is generically represented in Listing 12 as $v_{m,n}$ for the n-th discrete value that can be taken on by the m-th parameter. Listing 13 provides a concrete example of the combinatorial parameters group.
Tied parameters are parameters which are computed as a function of integer, real or combinatorial parameter values. They may also be functions of other tied parameters. In general form:

$$X_{tied} = f_{tied}(X_1, X_2, \ldots, X_n, c_1, c_2, \ldots, c_m)$$

Where $X_{tied}$ is the tied parameter value, which is a function of n non-tied parameters ($X_1, X_2, \ldots, X_n$) and a set of m coefficients ($c_1, c_2, \ldots, c_m$), which depend on the functional form of $f_{tied}()$. Tied parameter configuration variables include: the name of the tied parameter; a list of the names of tied or non-tied parameters used in the computation of the tied-parameter value; a specification of the functional form of $f_{tied}()$; and a list of coefficients used in the evaluation of $f_{tied}()$. Listing 14 provides the general syntax for the tied parameters group.
In Listing 14, BeginTiedParams and EndTiedParams are parsing tags that wrap a list of tied model parameters made up of the following variables:
- name: The name of the tied parameter. Parameter names must be unique and correspond identically to the corresponding name used in the template file(s).
- np: The number of non-tied parameters used in the calculation of the tied parameter value. Valid values for np depend on the choice of functional relationship, specified in the type field.
- pname1 … pnamenp: A list of parameter names that are used in the computation of the tied parameter.
- type: The type of functional relationship, $f_{tied}()$, between the tied parameter and the list of named parameters (i.e. pname1 … pnamenp). Valid values for type are:
  - linear: Selects a linear relationship for $f_{tied}()$. If this choice is selected, the value of np must be either 1 or 2.
  - exp: Selects an exponential relationship for $f_{tied}()$. If this choice is selected, the value of np must be 1.
  - log: Selects a log relationship for $f_{tied}()$. If selected, the value of np must be 1.
  - dist: The tied parameter is the distance between two (x,y) coordinates, where these coordinates are parameters of the optimization/calibration. If selected, the value of np must be 4 and the ordering of parameter names should correspond to (x1,y1),(x2,y2).
  - wsum: The tied parameter is the weighted sum of the listed parameters.
  - ratio: The tied parameter is the ratio of a linear combination of parameters. If selected, the value of np must be 2 or 3.
  - constant: The tied parameter is a constant. If selected, the value of np must be 0.
- type_data: Depending on the choice of type, the syntax of this field varies, as described below. The syntax for type_data includes a format specifier (see the description of the fmt variable in Section 2.7).
If type = “linear” and np = “1”: The functional relationship is linear and has the form:

$$X_{tied} = (c_1 \times X) + c_0$$

Where $X_{tied}$ is the tied-parameter value, $c_0$ and $c_1$ are coefficients, $X$ is the non-tied parameter value, and type_data should be replaced with the following syntax:

If type = “linear” and np = “2”: The functional relationship has the form:

$$X_{tied} = (c_3 \times X_1 \times X_2) + (c_2 \times X_2) + (c_1 \times X_1) + c_0$$

Where $X_{tied}$ is the tied-parameter value, $c_0$, $c_1$, $c_2$, and $c_3$ are coefficients, $X_1$ and $X_2$ are the non-tied parameter values, and type_data should be replaced with the following syntax:

If type = “exp”: The functional relationship has the form:

$$X_{tied} = c_2 \times b^{(c_1 \times X)} + c_0$$

Where $X_{tied}$ is the tied-parameter value, $c_0$, $c_1$ and $c_2$ are coefficients, $b$ is the exponent base, $X$ is the non-tied parameter value, and type_data should be replaced with:

Where base can be a numerical value, or “exp” if the natural base is to be used.

If type = “log”: The functional relationship has the form:

$$X_{tied} = c_3 \times \log_a(c_2 \times X + c_1) + c_0$$

Where $X_{tied}$ is the tied-parameter value, $c_0$, $c_1$, $c_2$ and $c_3$ are coefficients, $a$ is the logarithm base, $X$ is the non-tied parameter, and type_data should be replaced with the following syntax:

Where base can be a numerical value, or “ln” if the natural logarithm is to be used.

If type = “dist”: The type_data field should contain the desired fmt specification.

If type = “wsum”: The type_data field should list the values of each weight, using the same ordering as the named list of parameters, followed by the desired fmt specification.

If type = “ratio” and np = “2”: The functional relationship has the form:

$$X_{tied} = \frac{c_3 \times X_1 + c_2}{c_1 \times X_2 + c_0}$$

Where $X_{tied}$ is the tied-parameter value, $c_3$, $c_2$, $c_1$ and $c_0$ are coefficients, $X_1$ and $X_2$ are non-tied parameters, and type_data should be replaced with the following syntax:

If type = “ratio” and np = “3”: The functional relationship has the form:

$$X_{tied} = \frac{n_7 X_1 X_2 X_3 + n_6 X_1 X_2 + n_5 X_1 X_3 + n_4 X_2 X_3 + n_3 X_1 + n_2 X_2 + n_1 X_3 + n_0}{d_7 X_1 X_2 X_3 + d_6 X_1 X_2 + d_5 X_1 X_3 + d_4 X_2 X_3 + d_3 X_1 + d_2 X_2 + d_1 X_3 + d_0}$$

Where $X_{tied}$ is the tied-parameter value, $n_7 \ldots n_0$ and $d_7 \ldots d_0$ are coefficients, $X_1 \ldots X_3$ are non-tied parameters, and type_data should be replaced with the following syntax:

n7 n6 n5 n4 n3 n2 n1 n0 d7 d6 d5 d4 d3 d2 d1 d0 fmt

If np = “0”: The tied parameter is assigned a constant value. No type field is required and the type_data field must contain the parameter value followed by a format specifier (fmt).

Listing 15 provides concrete examples of the different tied parameter types.
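The functional forms above translate directly into code, which can be handy for checking a tied-parameter definition by hand before a run. The Python sketch below is illustrative only: the function names and argument order are hypothetical and do not reflect the order of fields in type_data, and the “dist” form assumes Euclidean distance.

```python
import math


def tied_linear1(x, c1, c0):
    """linear, np = 1: Xtied = c1*X + c0"""
    return c1 * x + c0


def tied_linear2(x1, x2, c3, c2, c1, c0):
    """linear, np = 2: Xtied = c3*X1*X2 + c2*X2 + c1*X1 + c0"""
    return c3 * x1 * x2 + c2 * x2 + c1 * x1 + c0


def tied_exp(x, base, c2, c1, c0):
    """exp: Xtied = c2 * b**(c1*X) + c0, with base "exp" meaning e."""
    b = math.e if base == "exp" else float(base)
    return c2 * b ** (c1 * x) + c0


def tied_log(x, base, c3, c2, c1, c0):
    """log: Xtied = c3 * log_a(c2*X + c1) + c0, with base "ln" meaning e."""
    arg = c2 * x + c1
    if base == "ln":
        return c3 * math.log(arg) + c0
    return c3 * math.log(arg, float(base)) + c0


def tied_dist(x1, y1, x2, y2):
    """dist: distance between (x1, y1) and (x2, y2); Euclidean is assumed."""
    return math.hypot(x2 - x1, y2 - y1)


def tied_wsum(xs, weights):
    """wsum: weighted sum, weights in the same order as the parameters."""
    return sum(w * x for w, x in zip(weights, xs))


def tied_ratio2(x1, x2, c3, c2, c1, c0):
    """ratio, np = 2: Xtied = (c3*X1 + c2) / (c1*X2 + c0)"""
    return (c3 * x1 + c2) / (c1 * x2 + c0)
```

For instance, `tied_linear1(2.0, 3.0, 1.0)` evaluates to 7.0 and `tied_dist(0, 0, 3, 4)` to 5.0.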
Certain models are capable of monitoring the progress of a simulation and aborting further processing if some threshold cost or constraint is exceeded. OSTRICH provides the “SpecialParams” group to support such models. Special parameters are cost and constraint thresholds that are tracked by selected algorithms in OSTRICH (see the relevant column in Table 1, above) and written to input files using the same template mechanism as regular calibration/optimization parameters. In this way OSTRICH can pass the most up-to-date threshold values on to the pre-emptive model. Pre-emption is described in detail by Razavi et al. (2010). The general syntax for the SpecialParams group is given below in Listing 16 and a concrete example is given in Listing 17.
In Listing 16, BeginSpecialParams and EndSpecialParams are parsing tags that wrap a list of model pre-emption parameters made up of the following variables:
name: The name of the pre-emption parameter. Parameter names must be unique and correspond identically to the corresponding name used in the template file(s).
init: The initial value of the pre-emption parameter. This should be set to a value that will NOT trigger pre-emption.
type: The type of pre-emption parameter. This should be set to either “BestCost” or “BestConstraint” depending on the nature of pre-emption (i.e. model pre-emption based on exceeding the cost function or model pre-emption based on violation of a constraint threshold).
con_type: For “BestConstraint” pre-emption parameters the “con_type” value should be either “upper” or “lower”. Set the value to “upper” if the model should pre-empt when its internally computed constraint exceeds the value of the constraint specified by “con_name”. Set the value to “lower” if the model should pre-empt when its internally computed constraint is less than the value of the constraint specified by “con_name”. For “BestCost” pre-emption parameters, the “con_type” and “con_name” fields are ignored and should be set to “n/a”.
con_name: The name of the constraint whose violation should trigger pre-emption. Constraints are defined in the Constraints group which, in turn, requires specification of a Response Variables group (see Sections 2.15 and 2.24, below).
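On the model side, “BestCost” pre-emption amounts to comparing a running cost against the threshold that OSTRICH wrote into the input file. The Python sketch below shows the idea for a sum-of-squared-errors cost; the function name and the step-wise `simulate_step` callback are hypothetical, and a real model would read `best_cost` from its own input file.

```python
def run_with_preemption(observed, simulate_step, best_cost):
    """Accumulate squared error step by step; abort once best_cost is exceeded.

    Returns (cost_so_far, preempted).
    """
    cost = 0.0
    for step, obs in enumerate(observed):
        sim = simulate_step(step)
        cost += (obs - sim) ** 2
        # The running sum can only grow, so exceeding the threshold now
        # means this run cannot beat the best solution found so far.
        if cost > best_cost:
            return cost, True
    return cost, False
```

This also illustrates why the init value should not trigger pre-emption: with a very large initial threshold (for example `float("inf")` in this sketch) the first model run always completes.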
As indicated in Table 1, users of certain algorithms can optionally seed some or all of the initial search entries with predefined parameter sets. This allows the user to incorporate prior information (such as previous optimization results or expert judgement) into the optimization, and may enhance the efficiency and/or effectiveness of the algorithm. To use this option, insert an “InitParams” group, which uses the general syntax given in Listing 18.
Where “BeginInitParams” and “EndInitParams” are parsing tags that wrap a list of initial parameters, n is the number of parameters, m is the number of entries in the initial parameters group, and $p_{i,j}$ is the j-th initial value of the i-th parameter (ordered according to the order of the parameters section(s)). A concrete example of the “InitParams” group is given in Listing 19.
The “ParameterCorrection” group and corresponding “Corrections” sub-group allow users to interface OSTRICH with an external program or script that adjusts a candidate parameter set after it has been calculated by an OSTRICH search algorithm but before it has been evaluated. These corrections allow users to incorporate expert judgment or other information into the search procedure while still using one of the algorithms already implemented within OSTRICH. As an example, consider an optimization problem that seeks to install a well in an optimal location for extracting contaminated groundwater. Parameter correction can be used to adjust candidate well locations if they are found to be outside the boundaries of the contaminated plume. To use this option, insert a “ParameterCorrection” group, which uses the general syntax given in Listing 20 and which includes a “Corrections” sub-group.
Where “BeginParameterCorrection” and “EndParameterCorrection” are parsing tags that wrap the configuration variables of the “ParameterCorrection” group and “BeginCorrections” and “EndCorrections” are parsing tags that wrap the “Corrections” sub-group. Configuration variables are described below:
name_of_exe: The name (including path, if desired) of the external correction program or script that implements user-defined parameter corrections.
tpl_name: The name of the template file that mimics the input file used by the external parameter correction program (i.e. “name_of_exe”). The template file must contain the names of all parameters that are to be subjected to possible correction by the external program.
inp_name: The name of the input file read by the “name_of_exe” program. OSTRICH will create this file by replacing the parameter names listed in the “tpl_name” template file with the actual candidate values under consideration by the search algorithm.
name: The name of a correctable parameter listed in the template file (i.e. “tpl_name”). Each correctable parameter must be included in the Corrections sub-group.
outfile: The name of the file that will be created by the external correction program and which will contain the possibly corrected value of the parameter specified by the corresponding “name” field.
keyword: A keyword that is searched for within “outfile” prior to extracting the possibly corrected value of the parameter specified by the corresponding “name” field. If no keyword search is desired, set the value of this variable to “OST_NULL”.
line: The line number to advance to within “outfile” prior to extracting the possibly corrected value of the parameter specified by the corresponding “name” field. If “keyword” is set to “OST_NULL” the line number is relative to the beginning of the file, otherwise the line number is relative to the first line containing the specified keyword. A line number of “0” indicates the same line as the keyword, a line number of “1” indicates the first line after the keyword, a line number of “2” indicates the second line after the keyword, and so on.
col: The column number within the specified line of the “outfile” that will contain the possibly corrected value of the parameter specified by the corresponding “name” field. A column number of “1” indicates the first column, a column number of “2” indicates the second column, and so on, where each column is separated by the separator character given in the “sep” field.
sep: A character that separates each column. This variable should be enclosed in single quotes (e.g. ' ' for space-separated, ',' for comma-separated, etc.).
A concrete example of the “ParameterCorrection” group and accompanying “Corrections” sub-group is given in Listing 21.
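To make the well-location example more tangible, the following Python sketch shows what an external correction script could look like. Everything in it is an assumption made for illustration: the parameter names (X_WELL, Y_WELL), the bounding box standing in for the plume boundary, and the simple “NAME value” layout of the input and output files. OSTRICH only requires that the script read “inp_name” and write the file(s) named in the “outfile” fields.

```python
# Hypothetical correction script. File layout, parameter names, and the
# bounding box are illustrative assumptions, not part of OSTRICH.
import sys

# Assumed rectangular approximation of the plume: name -> (min, max)
PLUME_BOUNDS = {"X_WELL": (100.0, 200.0), "Y_WELL": (50.0, 150.0)}


def clamp_to_plume(params, bounds=PLUME_BOUNDS):
    """Move each coordinate to the nearest value inside its bounds."""
    corrected = {}
    for name, value in params.items():
        lower, upper = bounds[name]
        corrected[name] = min(max(value, lower), upper)
    return corrected


def read_params(path):
    """Read 'NAME value' lines from the input file generated by OSTRICH."""
    params = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) == 2:
                params[fields[0]] = float(fields[1])
    return params


def write_params(path, params):
    """Write one space-separated 'NAME value' line per parameter."""
    with open(path, "w") as f:
        for name, value in params.items():
            f.write(f"{name} {value:.6f}\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical): python correct_wells.py wells.inp wells.out
    write_params(sys.argv[2], clamp_to_plume(read_params(sys.argv[1])))
```

With this assumed output layout, the Corrections entry for X_WELL would use the keyword X_WELL, a line value of 0 (same line as the keyword), a col value of 2, and a space separator.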
For calibration problems that use the internal OSTRICH weighted sum of squared errors (WSSE) objective function, the Observations group is used to list the observation names, values, and weights, along with parsing instructions for reading simulated equivalent observations from model output files. The general syntax for the Observations group is given in Listing 22 and a concrete example is given in Listing 23.
In Listing 22 and Listing 23, BeginObservations and EndObservations are parsing tags that wrap a list of observations, which are made up of the following variables:
name: The name of the observation. Each observation should have a unique name.
value: The field-measured value of the observation.
wgt: The weight assigned to the observation. See Hill (1998) and Hill and Tiedeman (2007) for guidelines to assigning observation weights.
file: The model output file where the simulated value of the observation will be stored following execution of the modeling program.
sep: This variable is a filename separator (i.e. a tab or semi-colon). See also the File Pairs section (Section 2.4).
key, line, col, and tok: These variables tell OSTRICH how to extract model-simulated observation values from the model output file. First, OSTRICH positions the output file parser at the first line in file containing key(word). If OSTRICH should begin parsing at the beginning of the file, then the value of key should be OST_NULL. Next, the parser uses the line and col values to locate the position of the desired observation value. This value is then extracted and converted to a double precision number. The parsing process is repeated until all observation values are read. The line variable tells OSTRICH how many lines must be skipped, starting from the line containing key, before the line containing the desired observation value is reached. Therefore, if the observation value is on the same line as key, then line should be equal to 0; if the observation value is on the line immediately following key, then line should be equal to 1, and so on. The col variable tells OSTRICH which column in the line contains the desired observation value, where column numbering begins at 1, and the tok variable specifies the column separator. Note that values for the tok variable should be enclosed in single quotes (e.g. ',' for a comma token). Furthermore, providing a whitespace token (e.g. ' ') will cause any sequence of space or TAB characters to be treated as a single column separator token. Figure 1 illustrates the parse procedure using an example observation list (Listing 23) and model output file (Figure 2).
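The key/line/col/tok procedure can be mimicked in a few lines of Python, which may help when checking parsing instructions against a model output file before running OSTRICH. This is a minimal sketch, not OSTRICH's actual parser: the function name is hypothetical, and it assumes that with OST_NULL the line offset is counted from the first line of the file (so a value of 0 selects the first line).

```python
def extract_value(text, key, line, col, tok):
    """Locate one value in model output text and return it as a float.

    key  -- keyword to search for, or "OST_NULL" to start at the top
    line -- number of lines to skip past the starting line (0 = same line)
    col  -- 1-based column index within the target line
    tok  -- column separator; a whitespace token matches any run of
            spaces and tabs
    """
    lines = text.splitlines()
    if key == "OST_NULL":
        start = 0  # assumption: offsets are counted from the first line
    else:
        matches = [i for i, content in enumerate(lines) if key in content]
        if not matches:
            raise ValueError(f"keyword {key!r} not found")
        start = matches[0]  # first line containing the keyword
    target = lines[start + line]
    if tok.isspace():
        columns = target.split()      # collapses runs of spaces/tabs
    else:
        columns = target.split(tok)
    return float(columns[col - 1])


# Made-up model output used only to exercise the function
sample = "header\nHEADS at t=10\nMW_01  12.5\t13.0\nFLOW,3.2,4.8\n"
head = extract_value(sample, "HEADS", 1, 2, " ")
flow = extract_value(sample, "FLOW", 0, 3, ",")
```

Here `head` is read from the line after the HEADS keyword, and `flow` from the third comma-separated column of the line containing FLOW.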
aug: Setting the value of the aug (i.e. augmented output) variable to yes will cause OSTRICH to include the simulated values of the selected observation(s) in the OstModel output file (see Section 4.5). This can be useful, for example, when assembling samples for a predictive uncertainty analysis.
grp: Use the grp variable to partition observations into meaningful groups (e.g. high- vs. low-flow observations, groundwater head vs. flow observations, nitrate vs. trichloroethylene concentrations, etc.). When performing multi-criteria calibration, OSTRICH will compute a separate WSSE objective for each unique observation group.
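The effect of the grp variable can be sketched as a per-group weighted sum of squared errors. The Python below is illustrative only. In particular, it assumes the weight multiplies the residual before squaring; this section does not state the exact weighting convention, so check it against the OSTRICH objective function definition before relying on the numbers.

```python
from collections import defaultdict


def wsse_by_group(observations):
    """Compute one weighted sum of squared errors per observation group.

    observations -- iterable of (measured, simulated, weight, group) tuples
    """
    totals = defaultdict(float)
    for measured, simulated, weight, group in observations:
        residual = measured - simulated
        # Assumed convention: the weight scales the residual before squaring.
        totals[group] += (weight * residual) ** 2
    return dict(totals)


# Made-up observations split into two groups
obs = [
    (10.0, 9.0, 2.0, "head"),
    (5.0, 5.5, 1.0, "head"),
    (3.0, 1.0, 0.5, "flow"),
]
objectives = wsse_by_group(obs)
```

Each key of `objectives` corresponds to one calibration objective in a multi-criteria run.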
When performing optimization (as opposed to calibration), this group specifies the response variables that OSTRICH should read from model output files prior to evaluating costs and constraints. The syntax is very similar to the observations group used in model calibration, and includes variable name, output file name (from which the value of the variable is read), and parsing instructions for retrieving the value of the variable from the given model output file. The Constraints and GCOP sections (see below) build upon the Response and Tied Response Variable groups by associating response variables with a constraint or cost variable. The general syntax for the “ResponseVars” group is given in Listing 24 and a concrete example is given in Listing 25.
Where BeginResponseVars and EndResponseVars are parsing tags that wrap a list of response variables, which are made up of the following variables:
name: The name of the response variable. Each response variable should have a unique name.
file: The model output file where the simulated value of the response variable will be stored following execution of the modeling program.
sep: This variable is a filename separator (i.e. a tab or semi-colon). See also the File Pairs section (Section 2.4).
key, line, col, and tok: These variables tell OSTRICH how to extract model simulated response variable values from the model output file. The parsing procedure is identical to that used in extracting Observation group data (see Section 2.14 for details).
aug: Setting the value of the aug (i.e. augmented output) variable to yes will cause OSTRICH to include the simulated values of the selected response variable(s) in the OstModel output file (see Section 4.5). For multi-objective problems, there should be a one-to-one correspondence between cost functions (see Section 2.23) and augmented response variables.
This group specifies “tied” response variables, i.e. variables whose values are computed by OSTRICH as functions of one or more response variables and/or parameters. The general syntax for the “TiedRespVars” group is given in Listing 26 and a concrete example is given in Listing 27.
In Listing 26 and Listing 27, BeginTiedRespVars and EndTiedRespVars are parsing tags that wrap a list of tied response variables. The fields in this group are identical to those of the Tied Parameters group (see Section 2.10), except that fewer functional relationships are supported and the list of non-tied items (used in the calculation of the tied response variable) may contain parameters, response variables, and/or other tied response variables.
name: The name of the tied response variable. Each tied response variable should have a unique name.
np: The number of parameters, response variables and/or other tied response variables used in the calculation of the named tied response variable. Valid values for np depend on the choice of functional relationship, specified in the type field.
pname1 … pnamenp: A list of the names of parameters, response variables, and other tied response variables that are used in the computation of the named tied response variable.
type: The type of functional relationship, ftied(), between the tied response variable and the list of non-tied variables (i.e. pname1 … pnamenp). Valid values for type are:
linear: Selects a linear relationship for ftied(). If this choice is selected, the value of np must be either 1 or 2.
wsum: The tied response variable is the weighted sum of the listed non-tied variables.
type_data: Depending on the choice of type, the syntax of this field varies, as described below.
If type = “linear” and np = “1”: The functional relationship is linear and has the form:
Ytied = (c1 × Y) + c0
Where Ytied is the tied response variable, c0 and c1 are coefficients, Y is the non-tied variable, and type_data should be replaced with the following syntax:
If type = “linear” and np = “2”: The functional relationship has the form:
Ytied = (c3 × Y1 × Y2) + (c2 × Y2) + (c1 × Y1) + c0
Where Ytied is the tied response variable, c0, c1, c2, and c3 are coefficients, Y1 and Y2 are the non-tied variables, and type_data should be replaced with the following syntax:
If type = “wsum”: The type_data field should list the values of each weight, using the same ordering as the named list of non-tied variables.
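The three supported relationships can be written out as small Python functions, which is handy for checking coefficients by hand. The function names are hypothetical, and coefficients are passed by keyword so that the sketch makes no claim about the order in which they appear in the type_data field.

```python
def tied_linear_1(y, c0, c1):
    """type = linear, np = 1:  Ytied = (c1 * Y) + c0"""
    return c1 * y + c0


def tied_linear_2(y1, y2, c0, c1, c2, c3):
    """type = linear, np = 2:
    Ytied = (c3 * Y1 * Y2) + (c2 * Y2) + (c1 * Y1) + c0
    """
    return c3 * y1 * y2 + c2 * y2 + c1 * y1 + c0


def tied_wsum(values, weights):
    """type = wsum: weighted sum, with weights listed in the same order
    as the non-tied variables."""
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per non-tied variable")
    return sum(w * y for w, y in zip(weights, values))


# Made-up values used only to exercise the functions
single = tied_linear_1(2.0, c0=1.0, c1=3.0)
paired = tied_linear_2(2.0, 3.0, c0=1.0, c1=2.0, c2=3.0, c3=4.0)
summed = tied_wsum([1.0, 2.0, 3.0], [0.5, 0.25, 2.0])
```

Note that the np = 2 form includes a cross term (c3 × Y1 × Y2), so it is only linear in each variable separately; setting c3 to zero gives an ordinary two-variable linear combination.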
Models that generate input or output files in MS Access or NetCDF format can be interfaced with OSTRICH via specification of a corresponding “TypeConversion” group. Outputs specified in the TypeConversion group are extracted into text-based files that can then be processed by the Observations (see Section 2.14) or ResponseVars (see Section 2.15) groups. As such, incorporating these types of output data into OSTRICH is a two-step process that requires entries in the TypeConversion group and corresponding entries in the ResponseVars or Observations group. Inputs specified in the TypeConversion group provide a mapping between parameters (see Sections 2.7 through 2.10) and corresponding non-text input files. This mapping allows OSTRICH to adjust parameter values in these non-text input files in lieu of the template file mechanism described in Section 2.4. Listing 28 provides the general syntax for filling out the TypeConversion group in the ostIn.txt input file. Listing 29 provides a concrete example for converting MS Access files. Listing 30 provides a concrete example for converting NetCDF files.
Where “BeginTypeConversion” and “EndTypeConversion” are parsing tags that wrap a list of conversion instructions for converting the inputs and outputs of a given file that uses a non-text format. Except where noted, each entry consists of the following fields:
type: This variable specifies the file format to be converted. Supported values are “NetCDF” (for .netcdf files) and “Access” (for MS Access databases).
fname: The formatted file containing the data to be converted (e.g. MyAccessDbase.mdb or MyNetCDF.ncd). Outputs read from this file will be written to a text-based file. The text-based file will have the same file name prefix as fname but will be given a “.txt” extension (e.g. MyAccessDbase.txt or MyNetCDF.txt). File names for this field must not contain any spaces.
rw: This variable specifies the conversion to be performed. Supported values are “Read” and “Write”. A “Read” conversion will extract data from the formatted file and write the result to a text-based file that can be processed by the Observations or ResponseVars groups. A “Write” conversion instructs OSTRICH to adjust the contents of the formatted file according to the value of the named parameter.
table: The name of the MS Access table or NetCDF array in the formatted file that contains the desired input or output.
keycol (Access only): The column in the MS Access table that contains an index key suitable for uniquely identifying the database entry for the desired input or output (e.g. OBS_ID). This field should be provided if the type field is “Access” but should be omitted if the type field is “NetCDF”.
key (Access only): A unique index key for the desired input or output. This key will be searched for in the corresponding keycol column (e.g. MW_01) to locate the tuple containing the desired observation or parameter value. This field should be provided if the type field is “Access” but should be omitted if the type field is “NetCDF”.
col: This field identifies the column in the Access database table or the array position in the NetCDF array that contains the actual value of the corresponding parameter, response variable, or observation.
name: This field specifies the name of an OSTRICH parameter, response variable, or observation that corresponds to the previously listed file format conversion information. The name field must reference an observation or response variable if the rw field is set to “Read”. Conversely, the name field must reference a parameter or tied parameter if the rw field is set to “Write”.
Each algorithm has its own configuration group, wherein the user can specify the values for various algorithm control variables. Additional optional configuration variables and groups (i.e. Warm Start, Pre-Emption, Parameter Correction, a List of Initial Parameters, Math and Stats, and Line Search) may also be available for a given algorithm, as indicated in Table 1. Please see the search algorithm page for more information about each algorithm, including the syntax required for each within the ostIn file.
In the Constraints group, the user supplies information about the various constraints that are to be placed on a general constrained optimization problem. Any number and combination of constraints are supported. As shown in Listing 61, the configuration syntax for constraints consists of: constraint name, constraint type, cost factor, lower and upper limits, and the name of the relevant response (or tied-response) variable.
Where BeginConstraints and EndConstraints are parsing tags that wrap a list of general constraints made up of the following variables:
name: A unique name for the constraint.
type: The type of constraint – the only supported value is “general”.
CF: A cost factor that is multiplied by the amount of constraint violation. This converts a constraint violation into a penalty cost.
lwr: The lower constraint limit (gmin). If the actual constraint value (g) is less than gmin, a penalty of P = CF × (gmin − g), i.e. CF times the amount of violation, will be added to PTOTAL.
upr: The upper constraint limit (gmax). If the actual constraint value (g) is greater than gmax, a penalty of P = CF × (g − gmax), i.e. CF times the amount of violation, will be added to PTOTAL.
resp: The name of the response variable (tied or non-tied) used to evaluate the constraint.
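As a rough illustration of how a general constraint turns into a penalty, the Python sketch below applies the rule that CF is multiplied by the amount of constraint violation. The function names are hypothetical. The sketch assumes the violation is measured as a positive distance outside the [lwr, upr] interval, and it does not cover how OSTRICH combines PTOTAL with the cost function.

```python
def constraint_penalty(g, lwr, upr, cf):
    """Penalty for a single general constraint.

    g   -- actual constraint value read from the response variable
    lwr -- lower constraint limit (gmin)
    upr -- upper constraint limit (gmax)
    cf  -- cost factor applied to the amount of violation
    """
    if g < lwr:
        return cf * (lwr - g)   # violated from below
    if g > upr:
        return cf * (g - upr)   # violated from above
    return 0.0                  # constraint satisfied, no penalty


def total_penalty(constraints):
    """Sum penalties over (g, lwr, upr, cf) tuples to obtain PTOTAL."""
    return sum(constraint_penalty(*c) for c in constraints)


# Made-up constraint values: one below range, one above, one satisfied
p_total = total_penalty([
    (4.0, 5.0, 10.0, 2.0),
    (12.0, 5.0, 10.0, 2.0),
    (7.0, 5.0, 10.0, 2.0),
])
```

A constraint that only needs one bound can be expressed by choosing the other limit so that it is never reached.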