Proposal: optionally return unique pixel counts under each polygon #270

Closed
phargogh opened this issue Aug 3, 2022 · 1 comment · Fixed by #271
Labels: enhancement (New feature or request), in progress (Working on it!)

phargogh commented Aug 3, 2022

The InVEST HRA model has a couple of columns in an output table (R_%LOW, R_%MED, R_%HIGH in SUMMARY_STATISTICS.csv) that are derived from the pixel counts of each classification in a raster that has pixel values in the set {0, 1, 2, 3}. As it stands, the only way to get accurate reports of this information is to basically reimplement zonal_statistics, adding in the counting. These outputs cannot be derived from the current output of zonal_statistics.

It would be very handy if pygeoprocessing.zonal_statistics could optionally report the count of each unique pixel value under each polygon. This only makes sense for discrete (integer) rasters: on a floating-point raster nearly every pixel can hold a distinct value, so recording the counts could easily exhaust available memory. For that reason it is best kept as an optional parameter that defaults to False.
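To illustrate the memory concern, here is a minimal sketch using only the standard library. It is not pygeoprocessing code; the pixel lists are synthetic stand-ins for raster data, and all names are made up for the illustration. It compares how many distinct keys a tally needs for a classified raster versus a continuous one.

```python
import random
from collections import Counter

random.seed(0)
n_pixels = 100_000

# Classified raster, as in the HRA case: values drawn from {0, 1, 2, 3}.
classified_pixels = [random.randint(0, 3) for _ in range(n_pixels)]
# Continuous raster: almost every pixel holds a distinct float.
continuous_pixels = [random.random() for _ in range(n_pixels)]

classified_counts = Counter(classified_pixels)
continuous_counts = Counter(continuous_pixels)

print(len(classified_counts))  # at most 4 keys
print(len(continuous_counts))  # close to n_pixels keys
```

The tally for the continuous case grows with the number of pixels, which is why the counting would presumably be opt-in.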

I propose modifying the function signature of pygeoprocessing.zonal_statistics to include a new parameter, include_pixel_counts:

def zonal_statistics(
        base_raster_path_band, aggregate_vector_path,
        aggregate_layer_name=None, ignore_nodata=True,
        polygons_might_overlap=True, include_pixel_counts=False,
        working_dir=None):

Using the HRA example, FID 60 covers 11, 62, and 4 pixels with values 1, 2, and 3 respectively, so its return value would be:

{60: {
    'min': 1,
    'max': 3,
    'sum': 147,
    'count': 77,
    'nodata_count': 0,
    'pixel_counts': {
        1: 11,
        2: 62,
        3: 4,
    }
}}
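As a rough sketch of how percentage columns could be derived from such counts: the helper below is hypothetical, and the issue does not say how HRA maps pixel values to its LOW/MED/HIGH columns or which denominator it uses, so the example only computes the share of each value among the counted pixels.

```python
def percent_by_value(value_counts):
    """Return {pixel value: percent of all counted pixels}."""
    total = sum(value_counts.values())
    if total == 0:
        # No pixels under the polygon: report 0% rather than divide by zero.
        return {value: 0.0 for value in value_counts}
    return {value: 100.0 * n / total for value, n in value_counts.items()}


# Counts for FID 60, taken from the example return value above.
fid_60_counts = {1: 11, 2: 62, 3: 4}
percents = percent_by_value(fid_60_counts)

for value, percent in sorted(percents.items()):
    print(f"value {value}: {percent:.1f}%")
```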
phargogh self-assigned this Aug 3, 2022

phargogh commented

We discussed this on a software team call and approved the feature without the need for a DD, on the understanding that the pixel_counts field should be renamed to value_counts.

Future: consider the real meaning of statistics in zonal_statistics

We had an interesting conversation about the meaning of the term "statistics" in zonal_statistics: a value_counts dictionary isn't really a statistic so much as a report. ArcGIS has a zonal_histogram that produces this same kind of output, and we might want to do the same in the future.

Future: consider a zonal_reduce with arbitrary reduction operator

We also talked about possibly letting callers provide a custom function to a zonal_statistics-style function. This could be implemented later as a zonal_reduce or similar, with zonal_statistics and zonal_histogram being aliases for specific reduce callables.
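A toy sketch of that idea, to make the shape concrete. Everything here is hypothetical: the function names and signatures are invented for the illustration, zones are given as a flat list of zone IDs aligned with a flat list of pixel values, and nodata handling, rasterization, and blockwise reads are all ignored.

```python
from collections import Counter, defaultdict


def zonal_reduce(zone_ids, pixel_values, reduce_func):
    """Group pixel values by zone ID and apply reduce_func to each group."""
    pixels_by_zone = defaultdict(list)
    for zone_id, value in zip(zone_ids, pixel_values):
        pixels_by_zone[zone_id].append(value)
    return {
        zone_id: reduce_func(values)
        for zone_id, values in pixels_by_zone.items()
    }


def _basic_stats(values):
    """Reduce callable producing a subset of the zonal_statistics fields."""
    return {
        'min': min(values),
        'max': max(values),
        'sum': sum(values),
        'count': len(values),
    }


def zonal_statistics_sketch(zone_ids, pixel_values):
    """Alias for zonal_reduce with the basic-statistics callable."""
    return zonal_reduce(zone_ids, pixel_values, _basic_stats)


def zonal_histogram_sketch(zone_ids, pixel_values):
    """Alias for zonal_reduce with a value-counting callable."""
    return zonal_reduce(
        zone_ids, pixel_values, lambda values: dict(Counter(values)))


zone_ids = [60, 60, 60, 61, 61]
pixel_values = [1, 2, 2, 3, 3]
stats = zonal_statistics_sketch(zone_ids, pixel_values)
histogram = zonal_histogram_sketch(zone_ids, pixel_values)
```

Collecting every pixel into a per-zone list keeps the sketch short but would not scale to large rasters; a real implementation would more likely accumulate results block by block.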

phargogh added the "in progress" and "enhancement" labels Aug 10, 2022
phargogh added a commit to phargogh/pygeoprocessing that referenced this issue Aug 10, 2022
phargogh added a commit to phargogh/pygeoprocessing that referenced this issue Aug 11, 2022
phargogh added a commit to phargogh/pygeoprocessing that referenced this issue Sep 9, 2022
phargogh added a commit to phargogh/pygeoprocessing that referenced this issue Sep 14, 2022
A `Counter` object allows for succinct tallying of landuse codes, so it
really makes sense to use in this case.

RE:natcap#270
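
For context on the commit message above, a minimal sketch of tallying with `collections.Counter`. This is not the code from the commit; the helper name and the nested lists standing in for raster blocks are assumptions made for the illustration.

```python
from collections import Counter


def tally_blocks(blocks):
    """Accumulate value counts across an iterable of pixel blocks."""
    value_counts = Counter()
    for block in blocks:
        # update() adds to existing tallies instead of replacing them.
        value_counts.update(block)
    return value_counts


# Three small blocks of landuse codes read one after another.
blocks = [[1, 2, 2], [2, 3], [1]]
counts = tally_blocks(blocks)
print(dict(counts))
```

A `Counter` also returns 0 for codes it has never seen, which avoids key checks when reporting.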
phargogh added a commit to phargogh/pygeoprocessing that referenced this issue Sep 16, 2022
I'm trying to clarify the intent of the function and the created dict
... "sample" isn't a good descriptor for what's going on here.

RE:natcap#270