Feature #2279 point_weight_flag (#2993)
* Per #2887, update NumArray::vals() to return a reference to the vector rather than a pointer to doubles.

* Per #2887, switch over the whole ContingencyTable class hierarchy from storing integer counts to storing double-precision weights.

* Add ContingencyTable::is_integer() member function to check whether the table contains all integers.

* Per #2887, update parse_stat_line.cc to get it to compile after changing PCT to store thresholds in a std::vector.

* Per #2887, update PCTInfo::clear() logic.

* Per #2887, update ctc_by_row() logic to create reproducible results with the develop branch.

* Per #2887, update logic of define_prob_bins() to add a final >=1.0 threshold if needed. While ==0.1 works fine, I found that ==0.05 did not because the last >=1.0 threshold was missing, likely due to floating-point precision issues. This change should fix that problem.

* Per #2887, update roc_auc() function to match the develop branch

* Per #2887, fix bug in computation of far()

* Per #2887, replaced all ==0 integer equality checks with calls to is_eq() instead and fix a couple of equations to snuff out diffs in some CTS statistics.

* Per #2887, address some of the 34 SonarQube code smells flagged for this PR. Note that the compute_ci.h/.cc changes are necessary and good since we should be computing CI's using doubles instead of integer counts.

* Per #2887, update run_sonarqube.sh to specify the target CXX standard as 11. The hope is that that will limit the findings to only those features available in the C++11 standard.

* Per #2887, update to SonarQube version 6.1.0.4477 released on 6/27/2024.

* Per #2887, updating build_met_sonarqube.sh to specify --std=c++11 since c++17 is used by default

* Per #2887, swap in a much simpler implementation of the ORSS statistic to match the equation listed in the MET User's Guide.

* Per #2887, update grid_stat and library code to actually apply the grid_weight_flag settings to the computation of contingency table counts and statistics.

* Per #2887, fix the handling of bad data in the ORSS equation.

* Per #2887, add Npairs member to the ContingencyTable class, eliminate the n() accessor function, and carefully replace references to n() with n_pairs() for the integer number of matched pairs or total() with the double-precision sum of the weights.

* Per #2887, reset Npairs = 0 for ContingencyTable::zero_out()

* Per #2883, need to call set_n_pairs() in a few spots to set ECLV TOTAL column correctly ci-run-unit

* Per #2887, call set_n_pairs() when aggregating PCT data in Series-Analysis ci-run-unit

* Per #2887, update stat_analysis to parse the TOTAL column for the PCT and MCTC line types.

* Per #2882, call set_n_pairs() after set_size() ci-run-unit

* Per #2887, reconfigure existing Ensemble-Stat unit test to request probabilistic output to see that it's impacted by the grid_weight_flag setting.

* Per #2887, update Ensemble-Stat test to provide climo stdev data

* Per #2887, add grid_weight_flag to the list of config options for Grid-Stat and Ensemble-Stat.

* Per #2887, disable FHO output if grid_weight_flag != NONE.

* Per #2887, revise the existing unit_grid_weight.xml unit tests for Grid-Stat to write CTC/CTS/MCTC/MCTS output and for the DESC column to be populated to indicate the type of grid weighting that was applied.

* Per #2279, add the MaskSID struct to store information about station id names and corresponding weights.

* Per #2279, add new PointWeightType enumeration along with code to parse it.

* Per #2279, adding point_weight_flag option to all Point-Stat and Ensemble-Stat config files and tweaking whitespace.

* Per #2279, add point_weight_flag to the Point-Stat and Ensemble-Stat config class. Also remove the unneeded wgt_dp argument for the add_point_obs() functions. Plan to add logic to set the point weights only AFTER all the observations have been collected for each verification task.

* Per #2279, use the default_weight constant instead of the literal 1.0 value.

* Per #2279, add stubs for actually applying the point_weight_flag settings.

* Per #2279, fix PairBase to actually set point weight values parsed from station id masks.

* Per #2279, trying to fix 2 SonarQube bugs

* Per #2279, fix a couple bugs parsing the SID weights and add a new unit_point_weight.xml unit test to run Point-Stat on scalar and probability inputs weighting the stations by their elevation. Still need to add Ensemble-Stat calls.

* Per #2279, fix small bug ci-run-unit

* Per #2279, add ensemble_stat calls to unit_point_weight.xml

* Per #2279, add documentation about the point_weight_flag configuration option.

* Per #2279, working on debug and warning messages.

* Per #2279, tweak the user's guide

* Per #2279, switch MaskSID::sid_list from a vector of pairs to a simpler map named sid_map.

* Per #2279, fix the madis2nc call to parse_sid_mask()

* Per #2279, move MaskSID from vx_config over into dedicated vx_util/mask_sid.h and .cc to be consistent with mask_poly.h. I note that the members of the MaskSID struct were not being initialized properly. So making it a complete class was the right solution.

* Per #2279, another change to make it compile.

* Per #2279, more tweaks to get it to compile.
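
The central change described above, a contingency table that accumulates double-precision weights while tracking the integer number of matched pairs separately, can be sketched in a few lines. This is an illustrative Python sketch, not the MET C++ implementation: the class name `WeightedTable2x2`, the tolerance in `is_eq()`, and the use of NaN in place of MET's bad-data value are all assumptions made for the example.

```python
import math


def is_eq(a: float, b: float, tol: float = 1e-5) -> bool:
    """Tolerance-based equality; stands in for MET's is_eq() (tolerance assumed)."""
    return abs(a - b) <= tol


class WeightedTable2x2:
    """Hypothetical 2x2 contingency table that stores weights instead of counts."""

    def __init__(self) -> None:
        # cells[row][col]: row 0/1 = forecast yes/no, col 0/1 = observed yes/no
        self.cells = [[0.0, 0.0], [0.0, 0.0]]
        self.n_pairs = 0  # integer number of matched pairs

    def add(self, fcst_event: bool, obs_event: bool, weight: float = 1.0) -> None:
        """Add one matched pair with the given weight."""
        self.cells[0 if fcst_event else 1][0 if obs_event else 1] += weight
        self.n_pairs += 1

    def total(self) -> float:
        """Double-precision sum of the weights (not the number of pairs)."""
        return sum(sum(row) for row in self.cells)

    def is_integer(self) -> bool:
        """True if every cell holds a whole number, as in an unweighted table."""
        return all(is_eq(v, round(v)) for row in self.cells for v in row)

    def far(self) -> float:
        """False alarm ratio: false alarms divided by all forecast events."""
        fcst_yes = self.cells[0][0] + self.cells[0][1]
        if is_eq(fcst_yes, 0.0):  # tolerance check instead of == 0
            return math.nan       # NaN stands in for bad data here
        return self.cells[0][1] / fcst_yes
```

With unit weights, `total()` equals `n_pairs` and `is_integer()` is true, which corresponds to the unweighted case; with non-unit weights the two totals diverge, which is why the commit separates n_pairs() from total().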
JohnHalleyGotway authored Oct 16, 2024
1 parent c5cd28d commit 7b73439
Showing 126 changed files with 3,848 additions and 550 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/testing.yml
@@ -176,7 +176,7 @@ jobs:
- jobid: 'job1'
tests: 'ascii2nc_indy pb2nc_indy tc_dland tc_pairs tc_stat plot_tc tc_rmw rmw_analysis tc_diag tc_gen'
- jobid: 'job2'
tests: 'met_test_scripts mode_multivar mode_graphics mtd regrid airnow gsi_tools netcdf modis series_analysis wwmca_regrid gen_vx_mask grid_weight interp_shape grid_diag grib_tables lidar2nc shift_data_plane trmm2nc aeronet wwmca_plot ioda2nc gaussian'
tests: 'met_test_scripts mode_multivar mode_graphics mtd regrid airnow gsi_tools netcdf modis series_analysis wwmca_regrid gen_vx_mask interp_shape grid_diag grib_tables lidar2nc shift_data_plane trmm2nc aeronet wwmca_plot ioda2nc gaussian'
fail-fast: false
steps:
- uses: actions/checkout@v4
@@ -310,6 +310,8 @@ jobs:
tests: 'ensemble_stat stat_analysis_es'
- jobid: 'job5'
tests: 'ugrid'
- jobid: 'job6'
tests: 'grid_weight point_weight'
fail-fast: false
steps:
- uses: actions/checkout@v4
4 changes: 4 additions & 0 deletions data/config/ConfigConstants
@@ -137,6 +137,10 @@ NONE = 1;
COS_LAT = 2;
AREA = 3;

// Point weight flag settings
NONE = 1;
SID = 2;

// Duplicate flag settings
NONE = 1;
UNIQUE = 2;
8 changes: 5 additions & 3 deletions data/config/EnsembleStatConfig_default
@@ -266,8 +266,10 @@ rng = {

////////////////////////////////////////////////////////////////////////////////

grid_weight_flag = NONE;
output_prefix = "";
version = "V12.0.0";
grid_weight_flag = NONE;
point_weight_flag = NONE;

output_prefix = "";
version = "V12.0.0";

////////////////////////////////////////////////////////////////////////////////
7 changes: 4 additions & 3 deletions data/config/GridStatConfig_default
@@ -271,8 +271,9 @@ nc_pairs_flag = {
////////////////////////////////////////////////////////////////////////////////

grid_weight_flag = NONE;
tmp_dir = "/tmp";
output_prefix = "";
version = "V12.0.0";

tmp_dir = "/tmp";
output_prefix = "";
version = "V12.0.0";

////////////////////////////////////////////////////////////////////////////////
8 changes: 5 additions & 3 deletions data/config/PointStatConfig_default
@@ -305,8 +305,10 @@ output_flag = {

////////////////////////////////////////////////////////////////////////////////

tmp_dir = "/tmp";
output_prefix = "";
version = "V12.0.0";
point_weight_flag = NONE;

tmp_dir = "/tmp";
output_prefix = "";
version = "V12.0.0";

////////////////////////////////////////////////////////////////////////////////
44 changes: 35 additions & 9 deletions docs/Users_Guide/config_options.rst
@@ -1847,15 +1847,18 @@ in the following ways:
embedded within another quoted string. Any such embedded quotes must
be escaped using a preceding backslash character.

* The "sid" entry is an array of strings which define groups of
observation station ID's over which to compute statistics. Each entry
in the array is either a filename of a comma-separated list.

* For a filename, the strings are whitespace-separated. The first
string is the mask "name" and the remaining strings are the station
* The "sid" entry is an array of strings which define groups of observation station
ID's over which to compute statistics. Each station ID string can be followed by an
optional numeric weight enclosed in parentheses and used by the "point_weight_flag"
configuration option. Each entry in the "sid" array is either a filename or a
comma-separated list.

* For an ASCII filename, the strings contained within it are whitespace-separated.
The first string is the mask "name" and the remaining strings are the station
ID's to be used.
* For a comma-separated list, optionally use a colon to specify a name.
For "MY_LIST:SID1,SID2", name = MY_LIST and values = SID1 and SID2.
For "MY_LIST:SID1(WGT1),SID2(WGT2)", name = MY_LIST which consists of
two station ID's (SID1 and SID2) and optional numeric weights (WGT1 and WGT2).
* For a comma-separated list of length one with no name specified, the
mask "name" and value are both set to the single station ID string.
For "SID1", name = SID1 and value = SID1.
@@ -1865,6 +1868,7 @@ in the following ways:
For "SID1,SID2", name = MASK_SID and values = SID1 and SID2.
* The "name" of the station ID mask is written to the VX_MASK column
of the MET output files.

* The "llpnt" entry is either a single dictionary or an array of
dictionaries. Each dictionary contains three entries, the "name" for
the masking region, "lat_thresh", and "lon_thresh". The latitude and
@@ -2353,8 +2357,9 @@ NBRCTC), partial sums (SL1L2, SAL1L2, VL1L2, and VAL1L2), and statistics
It is meant to account for grid box area distortion and is often applied
to global Lat/Lon grids. It is only applied for grid-to-grid verification
in Grid-Stat and Ensemble-Stat and is not applied for grid-to-point
verification. It can only be defined once at the highest level of config
file context and applies to all verification tasks for that run.
verification, which is controlled by the "point_weight_flag" option.
It can only be defined once at the highest level of config file context
and applies to all verification tasks for that run.

Three grid weighting options are currently supported:

@@ -2391,6 +2396,27 @@ versions of MET.
grid_weight_flag = NONE;
point_weight_flag
-----------------

The "point_weight_flag" is similar to the "grid_weight_flag", described above,
but applies to grid-to-point verification in Point-Stat and Ensemble-Stat.
It is not applied for grid-to-grid verification which is controlled by the
"grid_weight_flag" option. It can only be defined once at the highest level
of config file context and applies to all verification tasks for that run.

While only one point weighting option is currently supported, additional
methods are planned for future versions:

* NONE to disable point weighting using a constant weight of 1.0 (default).

* SID to use the weights defined by the station ID masking configuration option,
"mask.sid".

.. code-block:: none
point_weight_flag = NONE;
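
When "point_weight_flag" is set to SID, the weights come from the optional "(WGT)" suffix on each station ID in "mask.sid". The following Python sketch illustrates how a comma-separated "sid" entry could be split into a mask name and a map from station ID to weight, following the naming rules documented for the "sid" entry. The function name `parse_sid_list`, the regular expression, and the error handling are assumptions made for illustration; this is not MET's parse_sid_mask() and it does not cover the filename form.

```python
import re
from typing import Dict, Tuple

DEFAULT_WEIGHT = 1.0  # constant weight used when no "(WGT)" suffix is given

# A station ID optionally followed by a numeric weight in parentheses.
_SID_PATTERN = re.compile(r"^(?P<sid>[^()\s]+)(?:\((?P<wgt>[^()]+)\))?$")


def parse_sid_list(entry: str) -> Tuple[str, Dict[str, float]]:
    """Split 'NAME:SID1(WGT1),SID2' into (mask name, {station ID: weight})."""
    name, sep, body = entry.partition(":")
    if not sep:  # no colon, so no explicit mask name was given
        name, body = "", entry

    sid_map: Dict[str, float] = {}
    for token in body.split(","):
        match = _SID_PATTERN.match(token.strip())
        if match is None:
            raise ValueError(f"cannot parse station ID token: {token!r}")
        wgt = match.group("wgt")
        sid_map[match.group("sid")] = float(wgt) if wgt is not None else DEFAULT_WEIGHT

    if not name:
        # One unnamed station: the name is the station ID itself.
        # Several unnamed stations: the name defaults to MASK_SID.
        name = next(iter(sid_map)) if len(sid_map) == 1 else "MASK_SID"
    return name, sid_map
```

For example, `parse_sid_list("MY_LIST:SID1(2.5),SID2")` yields the name `MY_LIST` with SID1 weighted 2.5 and SID2 falling back to the default weight of 1.0.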
hss_ec_value
------------

14 changes: 13 additions & 1 deletion docs/Users_Guide/ensemble-stat.rst
@@ -182,13 +182,25 @@ ____________________
obs_perc_value = 50;
message_type_group_map = [...];
grid_weight_flag = NONE;
point_weight_flag = NONE;
output_prefix = "";
version = "VN.N";
The configuration options listed above are common to many MET tools and are described in :numref:`config_options`.

Note that the **HIRA** interpolation method is only supported in Ensemble-Stat.
.. note::

The **HIRA** interpolation method is only supported in Ensemble-Stat.

.. note::

The "grid_weight_flag" and "point_weight_flag" options described in
:numref:`config_options` define how matched pairs are weighted for
grid-to-grid and grid-to-point verification in Ensemble-Stat. These
weights currently only apply to the computation of probabilistic
outputs (PCT, PSTD, PJC, and PRC) but no other Ensemble-Stat output
line types.
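
To illustrate how pair weights could enter a probabilistic table, the Python sketch below accumulates weights rather than counts into probability bins. It is a simplified stand-in for a PCT-style table and not the Ensemble-Stat implementation: the function names, the bin-index rounding, and the tuple layout of `pairs` are assumptions. The threshold helper builds thresholds by repeated addition, where floating-point drift can miss the exact value 1.0, so it always appends a final 1.0 threshold.

```python
import math
from typing import Iterable, List, Tuple


def define_prob_bins(width: float) -> List[float]:
    """Thresholds 0, width, 2*width, ... always ending in exactly 1.0."""
    thresholds = []
    value = 0.0
    # Repeated addition drifts slightly, so stop once value is close to 1.0.
    while value < 1.0 and not math.isclose(value, 1.0):
        thresholds.append(value)
        value += width
    thresholds.append(1.0)  # guarantee the final threshold is present
    return thresholds


def weighted_pct(
    pairs: Iterable[Tuple[float, bool, float]], width: float = 0.1
) -> List[List[float]]:
    """Sum weights per probability bin as [observed yes, observed no].

    Each pair is (forecast probability, observed event, weight).
    """
    n_bins = len(define_prob_bins(width)) - 1
    table = [[0.0, 0.0] for _ in range(n_bins)]
    for prob, obs_event, weight in pairs:
        # Small epsilon guards against 0.3 / 0.1 evaluating just below 3.
        index = min(int(prob / width + 1e-9), n_bins - 1)
        table[index][0 if obs_event else 1] += weight
    return table
```

With all weights equal to 1.0 the table reduces to ordinary counts; other weights change the row totals without changing the number of matched pairs.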

_____________________

1 change: 1 addition & 0 deletions docs/Users_Guide/point-stat.rst
@@ -362,6 +362,7 @@ ________________________
obs_summary = NONE;
obs_perc_value = 50;
message_type_group_map = [...];
point_weight_flag = NONE;
tmp_dir = "/tmp";
output_prefix = "";
version = "VN.N";
8 changes: 5 additions & 3 deletions internal/test_unit/config/EnsembleStatConfig
@@ -226,8 +226,10 @@ rng = {

////////////////////////////////////////////////////////////////////////////////

grid_weight_flag = NONE;
output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";
grid_weight_flag = NONE;
point_weight_flag = NONE;

output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";

////////////////////////////////////////////////////////////////////////////////
8 changes: 5 additions & 3 deletions internal/test_unit/config/EnsembleStatConfig_MASK_SID
@@ -218,8 +218,10 @@ rng = {

////////////////////////////////////////////////////////////////////////////////

grid_weight_flag = NONE;
output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";
grid_weight_flag = NONE;
point_weight_flag = NONE;

output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";

////////////////////////////////////////////////////////////////////////////////
8 changes: 5 additions & 3 deletions internal/test_unit/config/EnsembleStatConfig_climo
@@ -248,8 +248,10 @@ rng = {

////////////////////////////////////////////////////////////////////////////////

grid_weight_flag = NONE;
output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";
grid_weight_flag = NONE;
point_weight_flag = NONE;

output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";

////////////////////////////////////////////////////////////////////////////////
8 changes: 5 additions & 3 deletions internal/test_unit/config/EnsembleStatConfig_grid_weight
@@ -240,8 +240,10 @@ rng = {

////////////////////////////////////////////////////////////////////////////////

grid_weight_flag = ${GRID_WEIGHT};
output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";
grid_weight_flag = ${GRID_WEIGHT};
point_weight_flag = NONE;

output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";

////////////////////////////////////////////////////////////////////////////////
8 changes: 5 additions & 3 deletions internal/test_unit/config/EnsembleStatConfig_one_cdf_bin
@@ -232,8 +232,10 @@ rng = {

////////////////////////////////////////////////////////////////////////////////

grid_weight_flag = NONE;
output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";
grid_weight_flag = NONE;
point_weight_flag = NONE;

output_prefix = "${OUTPUT_PREFIX}";
version = "V12.0.0";

////////////////////////////////////////////////////////////////////////////////