Feature #2395 TOTAL_DIR #2892

JohnHalleyGotway · 2024-05-16T23:04:52Z

Expected Differences

Do these changes introduce new tools, command line arguments, or configuration file options? [No]

If yes, please describe:
Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [Yes]

If yes, please describe:

This PR adds the TOTAL_DIR column to the VL1L2, VAL1L2, and VCNT line types. It adds this column BEFORE all the stats previously added for #2395. Since these new stats have not been included in a public release, adding this in the current "middle" of the line type is fine. It also updates Stat-Analysis to use those counts when aggregating the stats from those line types that begin with DIR_.

Pull Request Testing

Describe testing already performed for these changes:

The code for this feature branch is compiled as the met_test user in seneca:/d1/projects/MET/MET_pull_requests/met-12.0.0/beta5/MET-feature_2395_TOTAL_DIR.

Tested aggregating VL1L2 lines:

bin/stat_analysis -lookin out/point_stat/point_stat_360000L_20070331_120000V_vl1l2.txt -vx_mask DTC165,DTC166 -interp_mthd NEAREST -job aggregate -line_type VL1L2 -dump_row dump_vl1l2.txt

Input:

TOTAL_DIR  DIR_ME  DIR_MAE    DIR_MSE
685 -1.75317 47.69438 4272.77589
2494 -4.9744  28.90333 1909.85557

Output:

TOTAL_DIR  DIR_ME  DIR_MAE    DIR_MSE
3179 -4.2803 32.95236 2419.00952

And these are the correct weighted averages.

Tested printing summaries of these columns.

bin/stat_analysis -lookin out/point_stat/point_stat_360000L_20070331_120000V.stat -vx_mask DTC165,DTC166 -interp_mthd NEAREST -job summary -line_type VCNT -column TOTAL,TOTAL_DIR,DIR_ME

And it's parse the correct values from the correct columns.

Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

The code for this feature branch is compiled as the met_test user in seneca:/d1/projects/MET/MET_pull_requests/met-12.0.0/beta5/MET-feature_2395_TOTAL_DIR.

Please review code changes, doc updates, and test as you see fit.

Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]

Updated the tables to describe the output line types.

Do these changes include sufficient testing updates? [Yes]

I added to new jobs to an existing Stat-Analysis config file to demonstrate the processing of VL1L2 lines.
This generates 2 new output files:

stat_analysis_ps/CONFIG_POINT_STAT_agg_vl1l2.stat
stat_analysis_ps/CONFIG_POINT_STAT_agg_stat_vl1l2_to_vcnt.stat

Will this PR result in changes to the MET test suite? [Yes]

If yes, describe the new output and/or changes to the existing output:

Differences are flagged in all VCNT, VL1L2, and VAL1L2 output lines.

I downloaded and inspected the diff and log artifacts from the GHA run.

Confirmed there are no more warnings about 0-vectors.
Diffs exist in 82 output files:

48 .stat files and 34 .txt files.
79 Point-Stat, 12 Grid-Stat, and 1 Stat-Analysis

I ran the commands listed below to confirm that all these diffs are limited to the VCNT, VL1L2, and VAL1L2 line types:

rm -f diff_line_types
for OUTPUT in `find ./ -type f | grep OUTPUT`; do
  TRUTH=`echo $OUTPUT | sed 's/OUTPUT/TRUTH/g'`;
  diff -b $TRUTH $OUTPUT | egrep "V12.0.0" | awk '{print $25}' | sort -u >> diff_line_types
done
cat diff_line_types | sort -u

Here's the result:

VAL1L2
VCNT
VL1L2

Will this PR result in changes to existing METplus Use Cases? [Yes]

If yes, create a new Update Truth METplus issue to describe them.

Differences are flagged in all VCNT, VL1L2, and VAL1L2 output lines.
See this dtcenter/METplus#2592 issue to update the truth data.

Do these changes introduce new SonarQube findings? [Yes]

If yes, please describe:

The SonarQube quality gate flag failed in this run with 0 new bugs but 7 new code smells.

I did modify the code to address 4 of the 7 ones flagged (Global variables should be const. and Add the "static" storage specifier to this declaration).

But chose not to address the other 3 (Extract the assignment from this expression. and Refactor this function to reduce its Cognitive Complexity from XX to the 25 allowed.)

Please complete this pull request review by [Tues 5/21/24].

Pull Request Checklist

See the METplus Workflow for details.

Review the source issue metadata (required labels, projects, and milestone).
Complete the PR definition above.
Ensure the PR title matches the feature or bugfix branch name.
Define the PR metadata, as permissions allow.
Select: Reviewer(s) and Development issue
Select: Milestone as the version that will include these changes
Select: Coordinated METplus-X.Y Support project for bugfix releases or MET-X.Y.Z Development project for official releases
After submitting the PR, select the ⚙️ icon in the Development section of the right hand sidebar. Search for the issue that this PR will close and select it, if it is not already selected.
After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
Close the linked issue and delete your feature or bugfix branch from GitHub.

… superceded by the new dcount and dacount VL1L2Info members to keep track of the number of valid wind direction vectors.

…types and update the header column tables.

…in the VL1L2, VAL1L2, and VCNT line types.

…d use the values to aggregate results when needed.

…' to satisfy the finding that 'Global variables should be const.' Should probably switch from 'char char *' to strings eventually. But for now, I'm just making up for some SonarQube technical debt.

…_header_columns_V12.0.txt file

… of VL1L2 lines.

georgemccabe

The code changes look good to add an additional line to the vector line types. I see that the SonarQube scan failed but this was addressed in the PR notes.

I see that the test run fails during the difference step, but the errors seem to imply that the test function failed.

COMPARING ref_config_lead_00/point_stat/AFWAv3.4_Noahv3.3/point_stat_AFWAv3.4_Noahv3.3_F000_WINDS_000000L_20110902_000000V.stat
file1: /data/output/met_test_truth/ref_config_lead_00/point_stat/AFWAv3.4_Noahv3.3/point_stat_AFWAv3.4_Noahv3.3_F000_WINDS_000000L_20110902_000000V.stat
file2: /data/output/met_test_output/ref_config_lead_00/point_stat/AFWAv3.4_Noahv3.3/point_stat_AFWAv3.4_Noahv3.3_F000_WINDS_000000L_20110902_000000V.stat
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 1 did not have 100 elements
In addition: Warning message:
In system(paste(strCmd, "2>&1"), intern = T) :
running command '/usr/bin/diff /data/output/met_test_truth/ref_config_lead_00/point_stat/AFWAv3.4_Noahv3.3/point_stat_AFWAv3.4_Noahv3.3_F000_WINDS_000000L_20110902_000000V.stat /data/output/met_test_output/ref_config_lead_00/point_stat/AFWAv3.4_Noahv3.3/point_stat_AFWAv3.4_Noahv3.3_F000_WINDS_000000L_20110902_000000V.stat 2>&1' had status 1
ERROR: compareStat() failed

Does this message indicate that there were differences in the output? If so, it may be good to improve the messaging so it is more clear that there are differences found instead implying that something failed in the diff logic.

…king 'DIR_', check 'DIR_ME', 'DIR_MAE', and 'DIR_MSE' to avoid an false positive match for the 'DIR_ERR' column which is computed from the vector partial sums rather than the individual direction differences.

JohnHalleyGotway · 2024-05-20T15:48:19Z

Thanks for taking a look @georgemccabe. I did just find/address a "false positive" match in the is_vector_dir_stat() function.

As for the diffs, yes, the rather brittle diff logic fails when the number of columns differ. Rather than enhancing the existing Rscript diff logic, I'd rather switch over to the Python-based one used by METplus soon.

JohnHalleyGotway · 2024-05-21T17:04:31Z

@bikegeek, FYI, I documented the METplus-Analysis impacts with dtcenter/METdataio#307 and dtcenter/METcalcpy#384.

georgemccabe

These changes look good. I see differences in the test output that is expected. I expected some METplus use cases to produce differences after this is merged into develop. This will likely require an update to the truth data. An Update Truth METplus GitHub issue should be created to describe the expected differences.

JohnHalleyGotway added 4 commits May 16, 2024 23:01

Per #2395, remove the n_dir_undef and n_dira_undef variables that are…

0365952

… superceded by the new dcount and dacount VL1L2Info members to keep track of the number of valid wind direction vectors.

Per #2395, add TOTAL_DIR columns to the VL1L2, VAL1L2, and VCNT line …

7f9fe09

…types and update the header column tables.

Per #2395, update the User's Guide to list the new TOTAL_DIR columns …

7358ae6

…in the VL1L2, VAL1L2, and VCNT line types.

Per #2395, update stat_analysis to parse the new TOTAL_DIR columns an…

31e56a0

…d use the values to aggregate results when needed.

JohnHalleyGotway added this to the MET 12.0.0 milestone May 16, 2024

JohnHalleyGotway linked an issue May 16, 2024 that may be closed by this pull request

Add new wind direction verification statistics for RMSE, Bias, and MAE #2395

Closed

24 tasks

JohnHalleyGotway added 2 commits May 17, 2024 16:06

Per #2395, for SonarQube change 'const char *' to 'const char * const…

35d4373

…' to satisfy the finding that 'Global variables should be const.' Should probably switch from 'char char *' to strings eventually. But for now, I'm just making up for some SonarQube technical debt.

Per #2395, fix typo in placement of the DIR_ME column name in the met…

a69e258

…_header_columns_V12.0.txt file

JohnHalleyGotway requested review from bikegeek and georgemccabe May 17, 2024 19:31

Per #2395, add 2 new Stat-Analysis jobs to demonstrate the processing…

588e4db

… of VL1L2 lines.

JohnHalleyGotway mentioned this pull request May 17, 2024

Update Truth: For dtcenter/MET#2892 dtcenter/METplus#2592

Closed

11 tasks

georgemccabe reviewed May 20, 2024

View reviewed changes

Per #2395, update logic of is_vector_dir_stat(). Instead of just chec…

406afe4

…king 'DIR_', check 'DIR_ME', 'DIR_MAE', and 'DIR_MSE' to avoid an false positive match for the 'DIR_ERR' column which is computed from the vector partial sums rather than the individual direction differences.

JohnHalleyGotway mentioned this pull request May 21, 2024

Enhance METdataio schema to load the TOTAL_DIR column added to the VL1L2, VAL1L2, and VCNT line types during the MET 12.0.0 beta5 development cycle dtcenter/METdataio#307

Closed

21 tasks

georgemccabe approved these changes May 21, 2024

View reviewed changes

JohnHalleyGotway removed the request for review from bikegeek May 22, 2024 17:27

JohnHalleyGotway merged commit 889f1b2 into develop May 22, 2024
36 of 38 checks passed

github-actions bot mentioned this pull request May 22, 2024

Update develop-ref after dtcenter/MET#2892 dtcenter/METplus#2598

Merged

JohnHalleyGotway linked an issue May 22, 2024 that may be closed by this pull request

Update Truth: For dtcenter/MET#2892 dtcenter/METplus#2592

Closed

11 tasks

github-actions bot mentioned this pull request May 23, 2024

Update develop-ref after dtcenter/MET#2892 and dtcenter/MET#2899 dtcenter/METplus#2600

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature #2395 TOTAL_DIR #2892

Feature #2395 TOTAL_DIR #2892

JohnHalleyGotway commented May 16, 2024 •

edited

Loading

georgemccabe left a comment

JohnHalleyGotway commented May 20, 2024

JohnHalleyGotway commented May 21, 2024

georgemccabe left a comment

Feature #2395 TOTAL_DIR #2892

Feature #2395 TOTAL_DIR #2892

Conversation

JohnHalleyGotway commented May 16, 2024 • edited Loading

Expected Differences

Pull Request Testing

Pull Request Checklist

georgemccabe left a comment

Choose a reason for hiding this comment

JohnHalleyGotway commented May 20, 2024

JohnHalleyGotway commented May 21, 2024

georgemccabe left a comment

Choose a reason for hiding this comment

JohnHalleyGotway commented May 16, 2024 •

edited

Loading