Skip to content

Commit

Permalink
Feature 1905 ens_ctrl (#1955)
Browse files Browse the repository at this point in the history
Co-authored-by: j-opatz <[email protected]>
  • Loading branch information
JohnHalleyGotway and j-opatz authored Nov 3, 2021
1 parent 8c446ff commit d60924c
Show file tree
Hide file tree
Showing 13 changed files with 394 additions and 273 deletions.
19 changes: 10 additions & 9 deletions met/docs/Users_Guide/ensemble-stat.rst
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,7 @@ The usage statement for the Ensemble Stat tool is shown below:
[-grid_obs file]
[-point_obs file]
[-ens_mean file]
[-ctrl file]
[-obs_valid_beg time]
[-obs_valid_end time]
[-outdir path]
Expand All @@ -92,23 +93,23 @@ Optional arguments for ensemble_stat

4. To produce ensemble statistics using gridded observations, use the **-grid_obs file** option to specify a gridded observation file. This option may be used multiple times if your observations are in several files.

5. To produce ensemble statistics using point observations, use the **-point_obs file** option to specify a NetCDF point observation file. This option may be used multiple times if your observations are in several files.

5. To produce ensemble statistics using point observations, use the **-point_obs file** to specify a NetCDF point observation file. This option may be used multiple times if your observations are in several files.
6. To override the simple ensemble mean value of the input ensemble members for the ECNT, SSVAR, and ORANK line types, the **-ens_mean file** option specifies an ensemble mean model data file. This option replaces the **-ssvar_mean file** option from earlier versions of MET.

7. The **-ctrl file** option specifies an ensemble control member data file. The control member is included in the computation of the ensemble mean but excluded from the spread.

6. To override the simple ensemble mean value of the input ensemble members for the ECNT, SSVAR, and ORANK line types, the **-ens_mean file** specifies an ensemble mean model data file. This option replaces the **-ssvar_mean file** from earlier versions of MET.
8. To filter point observations by time, use **-obs_valid_beg time** in YYYYMMDD[_HH[MMSS]] format to set the beginning of the matching observation time window.

7. To filter point observations by time, use **-obs_valid_beg time** in YYYYMMDD[_HH[MMSS]] format to set the beginning of the matching observation time window.
9. As above, use **-obs_valid_end time** in YYYYMMDD[_HH[MMSS]] format to set the end of the matching observation time window.

8. As above, use **-obs_valid_end time** in YYYYMMDD[_HH[MMSS]] format to set the end of the matching observation time window.
10. Specify the **-outdir path** option to override the default output directory (./).

9. Specify the **-outdir path** option to override the default output directory (./).
11. The **-log** file outputs log messages to the specified file.

10. The **-log** file outputs log messages to the specified file.
12. The **-v level** option indicates the desired level of verbosity. The value of "level" will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity will increase the amount of logging.

11. The **-v level** option indicates the desired level of verbosity. The value of "level" will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity will increase the amount of logging.

12. The **-compress level** option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of "level" will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 will make no compression for the NetCDF output. Lower number is for fast compression and higher number is for better compression.
13. The **-compress level** option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of "level" will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 will make no compression for the NetCDF output. Lower number is for fast compression and higher number is for better compression.

An example of the ensemble_stat calling sequence is shown below:

Expand Down
2 changes: 1 addition & 1 deletion met/docs/Users_Guide/reformat_grid.rst
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ Required arguments for the pcp_combine
Optional arguments for pcp_combine
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

3. The **-field string** option defines the data to be extracted from the input files. Use this option when processing fields other than **APCP** or non-GRIB files. This option may be used multiple times and output will be created for each.
3. The **-field string** option defines the data to be extracted from the input files. Use this option when processing fields other than **APCP** or non-GRIB files. It can be used multiple times and output will be created for each. In general, the field string should include the **name** and **level** of the requested data and be enclosed in single quotes. It is processed as an inline configuration file and may also include data filtering, censoring, and conversion options. For example, use **-field ‘name=”ACPCP”; level=”A6”; convert(x)=x/25.4;’** to read 6-hourly accumulated convective precipitation from a GRIB file and convert from millimeters to inches.

4. The **-name list** option is a comma-separated list of output variable names which override the default choices. If specified, the number of names must match the number of variables to be written to the output file.

Expand Down
53 changes: 53 additions & 0 deletions met/src/basic/vx_util/num_array.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1157,6 +1157,59 @@ double NumArray::wmean_fisher(const NumArray &wgt) const

}


////////////////////////////////////////////////////////////////////////


double NumArray::variance(int skip_index) const

{

if(n() == 0) return ( bad_data_double );

int j, count;
double s, s_sq, var;

s = s_sq = 0.0;
count = 0;

for(j=0; j<n(); j++) {
if(is_bad_data(e[j]) || j == skip_index) continue;
s += e[j];
s_sq += e[j]*e[j];
count++;
}

// Check for slightly negative precision error
if(count > 1) {
var = (s_sq - s*s/(double) count)/((double) (count - 1));
if(is_eq(var, 0.0)) var = 0.0;
}
else {
var = bad_data_double;
}

return(var);

}


////////////////////////////////////////////////////////////////////////


double NumArray::stdev(int skip_index) const

{

double v = variance(skip_index);

if ( !is_bad_data(v) ) v = sqrt(v);

return(v);

}


////////////////////////////////////////////////////////////////////////

//
Expand Down
4 changes: 4 additions & 0 deletions met/src/basic/vx_util/num_array.h
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@
#include <iostream>

#include "concat_string.h"
#include "is_bad_data.h"


////////////////////////////////////////////////////////////////////////
Expand Down Expand Up @@ -107,6 +108,9 @@ class NumArray {
double mean_sqrt() const;
double mean_fisher() const;

double variance(int skip_index = bad_data_int) const;
double stdev(int skip_index = bad_data_int) const;

double wmean(const NumArray &) const;
double wmean_sqrt(const NumArray &) const;
double wmean_fisher(const NumArray &) const;
Expand Down
60 changes: 45 additions & 15 deletions met/src/libcode/vx_statistics/pair_data_ensemble.cc
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,7 @@ void PairDataEnsemble::clear() {

n_ens = 0;
n_pair = 0;
ctrl_index = bad_data_int;
skip_const = false;
skip_ba.clear();

Expand All @@ -121,6 +122,7 @@ void PairDataEnsemble::clear() {

esum_na.clear();
esumsq_na.clear();
esumn_na.clear();

mn_na.clear();
mn_oerr_na.clear();
Expand Down Expand Up @@ -170,6 +172,7 @@ void PairDataEnsemble::extend(int n) {
var_plus_oerr_na.extend (n);
esum_na.extend (n);
esumsq_na.extend (n);
esumn_na.extend (n);
mn_na.extend (n);
mn_oerr_na.extend (n);

Expand Down Expand Up @@ -222,6 +225,7 @@ void PairDataEnsemble::assign(const PairDataEnsemble &pd) {
ign_na = pd.ign_na;
pit_na = pd.pit_na;
n_pair = pd.n_pair;
ctrl_index = pd.ctrl_index;
skip_const = pd.skip_const;
skip_ba = pd.skip_ba;

Expand All @@ -231,6 +235,7 @@ void PairDataEnsemble::assign(const PairDataEnsemble &pd) {

esum_na = pd.esum_na;
esumsq_na = pd.esumsq_na;
esumn_na = pd.esumn_na;

mn_na = pd.mn_na;
mn_oerr_na = pd.mn_oerr_na;
Expand Down Expand Up @@ -290,12 +295,14 @@ void PairDataEnsemble::add_ens_var_sums(int i_obs, double v) {
if(i_obs >= esum_na.n()) {
esum_na.add(0.0);
esumsq_na.add(0.0);
esumn_na.add(0.0);
}

// Track sums of the raw ensemble member values
if(!is_bad_data(v)) {
esum_na.inc(i_obs, v);
esumsq_na.inc(i_obs, v*v);
esumn_na.inc(i_obs, 1);
}

return;
Expand Down Expand Up @@ -416,17 +423,17 @@ void PairDataEnsemble::compute_pair_vals(const gsl_rng *rng_ptr) {
else {

// Compute the variance of the unperturbed ensemble members
var_unperturbed = compute_variance(esum_na[i], esumsq_na[i], v_na[i]);
var_unperturbed = compute_variance(esum_na[i], esumsq_na[i], esumn_na[i]);
var_na.add(var_unperturbed);

// Process the observation error information.
ObsErrorEntry * e = (has_obs_error() ? obs_error_entry[i] : 0);
if(e) {

// Compute perturbed ensemble mean and variance
cur_ens.compute_mean_variance(mean, var_perturbed);
mn_oerr_na.add(mean);
var_oerr_na.add(var_perturbed);
// Compute perturbed ensemble mean and variance
// Exclude the control member from the variance
mn_oerr_na.add(cur_ens.mean());
var_oerr_na.add(cur_ens.variance(ctrl_index));

// Compute the variance plus observation error variance.
var_plus_oerr_na.add(var_unperturbed +
Expand Down Expand Up @@ -467,8 +474,12 @@ void PairDataEnsemble::compute_pair_vals(const gsl_rng *rng_ptr) {
crps_emp_na.add(compute_crps_emp(o_na[i], cur_ens));
crpscl_emp_na.add(compute_crps_emp(o_na[i], cur_clm));

// Ensemble mean and standard deviation
// Exclude the control member from the standard deviation
mean = cur_ens.mean();
stdev = cur_ens.stdev(ctrl_index);

// Store Gaussian CRPS stats
cur_ens.compute_mean_stdev(mean, stdev);
crps_gaus_na.add(compute_crps_gaus(o_na[i], mean, stdev));
crpscl_gaus_na.add(compute_crps_gaus(o_na[i], cmn_na[i], csd_na[i]));
ign_na.add(compute_ens_ign(o_na[i], mean, stdev));
Expand Down Expand Up @@ -613,7 +624,7 @@ struct ssvar_bin_comp {

void PairDataEnsemble::compute_ssvar() {
int i, j;
double mn, var;
double var;
ssvar_bin_map bins;
NumArray cur;

Expand Down Expand Up @@ -644,21 +655,22 @@ void PairDataEnsemble::compute_ssvar() {
if(skip_ba[i]) continue;

// Store ensemble values for the current observation
for(j=0, cur.erase(); j<n_ens; j++) cur.add(e_na[j][i]);

// Compute standard deviation
cur.compute_mean_variance(mn, var);
// Exclude the control member from the variance
for(j=0, cur.erase(); j<n_ens; j++) {
if(j == ctrl_index) continue;
cur.add(e_na[j][i]);
}

// Build a variance point
ens_ssvar_pt pt;
pt.var = var;
pt.var = cur.variance();
pt.f = mn_na[i];
pt.o = o_na[i];
pt.w = wgt_na[i];

// Determine the bin for the current point and add it to the list
// Bins are defined starting at 0 and are left-closed, right-open
j = floor(var/ssvar_bin_size);
j = floor(pt.var/ssvar_bin_size);
string ssvar_min = str_format("%.5e", j*ssvar_bin_size).contents();
if( !bins.count(ssvar_min) ){
ssvar_pt_list pts;
Expand Down Expand Up @@ -801,7 +813,7 @@ PairDataEnsemble PairDataEnsemble::subset_pairs_obs_thresh(const SingleThresh &o
//
// Exclude from subset:
// sid_sa, lat_na, lon_na, x_na, y_na, vld_ta, lvl_ta, elv_ta,
// o_qc_sa, esum_na, esumsq_na
// o_qc_sa, esum_na, esumsq_na, esumn_na

pd.wgt_na.add(wgt_na[i]);
pd.o_na.add(o_na[i]);
Expand Down Expand Up @@ -1667,7 +1679,10 @@ void VxPairDataEnsemble::add_ens(int member, bool mn, Grid &gr) {
else {

// Track unperturbed ensemble variance sums
pd[i][j][k].add_ens_var_sums(l, fcst_v);
// Exclude the control member from the variance
if(member != pd[i][j][k].ctrl_index) {
pd[i][j][k].add_ens_var_sums(l, fcst_v);
}

// Apply observation error perturbation, if requested
if(obs_error_info->flag) {
Expand Down Expand Up @@ -1781,6 +1796,21 @@ void VxPairDataEnsemble::calc_obs_summary() {

////////////////////////////////////////////////////////////////////////

void VxPairDataEnsemble::set_ctrl_index(int index) {

for(int i=0; i < n_msg_typ; i++){
for(int j=0; j < n_mask; j++){
for(int k=0; k < n_interp; k++){
pd[i][j][k].ctrl_index = index;
}
}
}

return;
}

////////////////////////////////////////////////////////////////////////

void VxPairDataEnsemble::set_skip_const(bool tf) {

for(int i=0; i < n_msg_typ; i++){
Expand Down
4 changes: 4 additions & 0 deletions met/src/libcode/vx_statistics/pair_data_ensemble.h
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,7 @@ class PairDataEnsemble : public PairBase {

int n_ens; // Number of ensemble members
int n_pair; // Number of valid pairs, n_obs - sum(skip_ba)
int ctrl_index; // Index of the control member
bool skip_const; // Skip cases where the observation and
// all ensemble members are constant
BoolArray skip_ba; // Flag for each observation [n_obs]
Expand All @@ -106,6 +107,7 @@ class PairDataEnsemble : public PairBase {

NumArray esum_na; // Sum of unperturbed ensemble values [n_obs]
NumArray esumsq_na; // Sum of unperturbed ensemble squared values [n_obs]
NumArray esumn_na; // Count of ensemble values [n_obs]

NumArray mn_na; // Ensemble mean value [n_obs]
NumArray mn_oerr_na; // Mean of perturbed members [n_obs]
Expand Down Expand Up @@ -293,6 +295,8 @@ class VxPairDataEnsemble {

void calc_obs_summary();

void set_ctrl_index(int);

void set_skip_const(bool);
};

Expand Down
Loading

0 comments on commit d60924c

Please sign in to comment.