
Correct HRA summary statistics table #1083

Merged

Conversation

phargogh
Member

Description

This PR remains a work in progress until natcap/pygeoprocessing#271 is resolved and released.

Fixes #1080

The user's guide will need an update for this change.

Checklist

  • Updated HISTORY.rst (if these changes are user-facing)
  • Updated the user's guide (if needed)
  • Tested the affected models' UIs (if relevant)

phargogh and others added 25 commits September 10, 2022 15:29
Adding this field seems right because it means we can make sure all 4 of
the fields add up to 100%. RE:natcap#1080
Mostly just division by zero or missing keys. RE:natcap#1080
…080-hra-summary-statistcs-table-incorrect

Conflicts:
	HISTORY.rst
…080-hra-summary-statistcs-table-incorrect

Conflicts:
	HISTORY.rst
…ithub.com:phargogh/invest into bugfix/1080-hra-summary-statistcs-table-incorrect
@phargogh phargogh requested a review from davemfish December 14, 2022 04:00
@phargogh phargogh marked this pull request as ready for review December 14, 2022 04:00
@phargogh
Member Author

Tests are currently failing pending #1134

Contributor

@davemfish davemfish left a comment


Hey @phargogh, I had a couple of questions in the test to try to clarify why the expected values are what they are. Thanks for taking a look!

HISTORY.rst Outdated
  standard error emitted from an invest model.
* Added a new "Save as" dialog window to handle different save options, and
  allow the option to use relative paths in a JSON datastack
  (`#1088 <https://github.com/natcap/invest/issues/1088>`_)
Contributor


Somehow these bullets got duplicated here.

Member Author


Huh, wonder how that happened! Patched in 395a757

reclassified_count = max(stats['R_N_ANY'], 1)
subregion_stats[fieldname] = reduce_func([
    subregion_stats[fieldname],
    stats_under_feature[opname.lower()]])
Contributor


This kind of confused me at first... why are we calling the reducing functions like min and max here? Don't we already have those stats in raster_stats? But I guess we're allowing for multiple features with the same name to be treated as the same subregion, and so we would need to further accumulate those stats here. Is that right?

Member Author


Yes, the intent is to support multiple features with the same name being treated as the same subregion.

I'm not sure I understand what you mean by accumulating them further... since subregion_stats_by_name is a collections.defaultdict(), we're initializing the values the first time a subregion is identified, so this reduce_func accumulation should be the only accumulation that's needed.

The only other sort of accumulation that should be needed would be if no subregion names are provided, in which case we default to all regions having the name "Total Region". In this case, the same reduce_func accumulation applies, we're just accumulating over all the features' stats.

I think we're handling these cases as expected in the tests, too, under test_summary_stats(), but I could absolutely be missing something.

Could you elaborate a little on the further accumulation?

Contributor

@davemfish davemfish Dec 15, 2022


I'm not sure I understand what you mean by accumulating them further

I was referring to the zonal_stats operation as the first accumulation/reduction -- across pixels under a feature. So I was trying to process why we would still need to be calling a reducing function after we already found the min, max, etc from zonal_stats. But you've answered that...it's the need to reduce (or accumulate, to re-use my poorly chosen term) across features with the same name.
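
For anyone following along, here is a minimal sketch of that two-stage reduction with made-up stats and hypothetical names (raster_stats, reduce_funcs), not the actual model code: zonal statistics reduce pixels to per-feature stats, and a second pass reduces across features that share a subregion name.

import collections

# Hypothetical per-feature stats (e.g. as returned by a zonal statistics call);
# two features share the subregion name "North", so their stats need a second
# reduction into a single subregion record.
raster_stats = {
    1: {'name': 'North', 'min': 0.2, 'max': 0.9, 'sum': 12.0, 'count': 30},
    2: {'name': 'North', 'min': 0.1, 'max': 0.7, 'sum': 8.0, 'count': 20},
    3: {'name': 'South', 'min': 0.3, 'max': 0.6, 'sum': 5.0, 'count': 10},
}

# Initial values that any real stat will immediately replace or absorb.
subregion_stats_by_name = collections.defaultdict(
    lambda: {'min': float('inf'), 'max': float('-inf'), 'sum': 0.0, 'count': 0})

reduce_funcs = {'min': min, 'max': max, 'sum': sum, 'count': sum}

for stats_under_feature in raster_stats.values():
    subregion_stats = subregion_stats_by_name[stats_under_feature['name']]
    for opname, reduce_func in reduce_funcs.items():
        # Fold this feature's stat into the subregion's running stat.
        subregion_stats[opname] = reduce_func(
            [subregion_stats[opname], stats_under_feature[opname]])

for name, stats in subregion_stats_by_name.items():
    print(name, stats['min'], stats['max'], stats['sum'] / stats['count'])
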

self.workspace_dir, 'summary_classes.tif')
}
pygeoprocessing.numpy_array_to_raster(
numpy.array([[2, 3, 2, 3]], dtype=numpy.int8),
Contributor


Are these values in the array directly related to the expected values of the summary stats? If so, is there a way -- either a variable or just a comment -- to make that connection?

Member Author

@phargogh phargogh Dec 14, 2022


Yes, absolutely. In edaa069 I reworked the test so that the variable names are clearer and the expected values are derived from the arrays that produce them. I hope that'll help make the connection... I think this should make the test easier to maintain too, but let me know what you think!

Contributor


Thanks that's very helpful!

'R_%HIGH': 50.0,
'R_%MEDIUM': 50.0,
'R_%LOW': 0,
'R_%NONE': 0,
Contributor


I'm a bit confused by these numbers. Somehow the percentages from all stressors are different from the percentages from one stressor, even though in this test there is only one stressor, right?

Member Author

@phargogh phargogh Dec 14, 2022


Yes, if we were running the model with only one habitat/stressor pair, we would expect the cumulative risk to match the pairwise risk. In this test, I'm putting in different values for the cumulative risk classes to exercise the summary table construction only -- these values are definitely not real or reasonable model outputs!

I've added a note to clarify this (and improved variable names a bit) in 43a37ec.

Contributor


Gotcha, thanks for adding that clarification in the test

I wasn't being clear about which classification arrays were affecting
which records in the tests.  This commit changes that by renaming a few
array variables, clarifying which values are derived from which arrays,
and adding a helper function to clearly derive values from arrays.

RE:natcap#1080
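
A hypothetical helper along those lines (not the actual test code) could derive the expected percentages straight from the classification array. Assuming class codes 0/1/2/3 correspond to NONE/LOW/MEDIUM/HIGH, an array like [[2, 3, 2, 3]] gives 50% MEDIUM and 50% HIGH; a real helper would also need to exclude nodata pixels from the denominator.

import numpy

def percent_of_pixels(array, class_value):
    """Percent of pixels in ``array`` equal to ``class_value``."""
    return numpy.count_nonzero(array == class_value) / array.size * 100

classes = numpy.array([[2, 3, 2, 3]], dtype=numpy.int8)
assert percent_of_pixels(classes, 3) == 50.0  # R_%HIGH
assert percent_of_pixels(classes, 2) == 50.0  # R_%MEDIUM
assert percent_of_pixels(classes, 1) == 0.0   # R_%LOW
assert percent_of_pixels(classes, 0) == 0.0   # R_%NONE
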
@phargogh phargogh requested a review from davemfish December 14, 2022 23:08
Contributor

@davemfish davemfish left a comment


Some unrelated test is still failing simply because the Actions here are not checking out the latest commit on main, and triggering a re-run doesn't seem to make that happen. I'm confident all these will pass on the merge commit, so I'm going to merge.

@davemfish davemfish merged commit 99dc889 into natcap:main Dec 15, 2022