add new statistics methods and band names in ImageData object #427

Merged
vincentsarago merged 5 commits into rio-tiler-v3 from betterStatistics
Oct 12, 2021

@vincentsarago vincentsarago commented Sep 28, 2021

This PR brings some important changes to the statistics methods in rio-tiler:

  • deprecate the stats method
  • add a statistics method
  • add a new BandStatistics model (replaces ImageStatistics)
  • add band_names to the ImageData class (statistics can be computed for an expression or for indexes, so we need a way to forward the band information from the ImageData object to the statistics method)
  • the statistics method can only return stats for the full dataset (no bounds or feature)

To Do

  • validate ☝️ changes
  • update base classes (deprecate stats and add statistics)
  • update tests
  • update documentation (will do in another PR)
  • start docs for v2 to v3 (will do in another PR)

"""
kwargs = {**self._kwargs, **kwargs}

data = self.preview(max_size=max_size, **kwargs)

the statistics method will only get stats for the full dataset (using preview) but we can set max_size to control the resolution the user wants

)

return {
f"{data.band_names[ix]}": BandStatistics(**stats[ix])

@vincentsarago vincentsarago Sep 28, 2021

we use band_names to keep track of the bands we return

e.g.

with COGReader("cog.tif") as cog:
    print(cog.statistics(expression="b1*2,b1"))

>> {
    'b1*2': BandStatistics(...),
    'b1': BandStatistics(...),
}
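The keying idea can be sketched in plain Python, independent of rio-tiler. This is only an illustration: `key_stats_by_band` and the placeholder stats dicts are hypothetical names, and the real method returns BandStatistics models rather than dicts.

```python
from typing import Dict, List


def key_stats_by_band(band_names: List[str], stats: List[dict]) -> Dict[str, dict]:
    """Pair each per-band stats dict with its band name (hypothetical helper)."""
    if len(band_names) != len(stats):
        raise ValueError("band_names and stats must have the same length")
    return {band_names[ix]: stats[ix] for ix in range(len(stats))}


# For an expression, the band names are the comma-separated blocks.
expression = "b1*2,b1"
band_names = expression.split(",")

# Placeholder values standing in for real per-band statistics.
stats = [{"min": 2.0, "max": 20.0}, {"min": 1.0, "max": 10.0}]

result = key_stats_by_band(band_names, stats)
print(list(result))
```

Because the keys come from band_names, the result stays readable whether the bands were selected by index or produced by an expression.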

class Config:
"""Config for model."""

extra = "allow"

we allow extra for percentile_{} values
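The percentile field names depend on which percentiles the caller asks for, so they cannot all be declared up front on the model. A rough sketch of how such dynamic keys could be built follows; `percentile_fields` is a hypothetical helper, and the nearest-rank method used here is an assumption for illustration, not a claim about how rio-tiler computes percentiles.

```python
from typing import Dict, Sequence


def percentile_fields(values: Sequence[float], percentiles: Sequence[int]) -> Dict[str, float]:
    """Build `percentile_{p}` entries with the nearest-rank method (an assumption)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    fields = {}
    for p in percentiles:
        # Integer ceiling of p% of the sample size, clamped to at least rank 1.
        rank = max(1, (p * len(ordered) + 99) // 100)
        fields[f"percentile_{p}"] = ordered[rank - 1]
    return fields


# The keys change with the requested percentiles, hence the need for extra fields.
extra = percentile_fields(range(1, 101), [2, 98])
print(extra)
```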

histogram = [
[out_dict[x] for x in h_keys],
h_keys,
]

if the data is set as categorical, the histogram is set to the count of each value.
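A minimal standard-library sketch of that categorical histogram, mirroring the `[counts, keys]` layout of the snippet above. `categorical_histogram` and its `categories` argument are hypothetical names for illustration; the actual implementation lives in rio_tiler/utils.py and works on arrays.

```python
from collections import Counter
from typing import Iterable, List, Optional


def categorical_histogram(
    values: Iterable[int], categories: Optional[List[int]] = None
) -> List[list]:
    """Return [counts, keys], where counts[i] is the number of occurrences of keys[i]."""
    counts = Counter(values)
    # Without explicit categories, use the distinct values found in the data.
    keys = sorted(counts) if categories is None else list(categories)
    return [[counts.get(k, 0) for k in keys], keys]


histogram = categorical_histogram([1, 1, 2, 5, 5, 5])
print(histogram)

# Categories absent from the data get a count of zero.
print(categorical_histogram([1, 1, 2], categories=[1, 2, 3]))
```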

def statistics(
self,
bands: Union[Sequence[str], str] = None,
indexes: int = 1,

This makes it clear that only the first internal band is used by default. We still allow indexes (e.g. if you have band files with multiple internal bands, you can select the one you want). This was already possible, but adding it to the function definition makes it explicit.

@@ -368,7 +397,7 @@ def parse_expression(self, expression: str) -> Tuple:
return tuple(set(re.findall(_re, expression)))

def info( # type: ignore
- self, assets: Union[Sequence[str], str] = None, *args, **kwargs: Any
+ self, assets: Union[Sequence[str], str] = None, **kwargs: Any

args is not part of the method definition

@@ -527,6 +600,7 @@ def _reader(asset: str, *args: Any, **kwargs: Any) -> ImageData:
if expression:
blocks = expression.split(",")
output.data = apply_expression(blocks, assets, output.data)
+ output.band_names = blocks

Here we set band_names to the expression blocks. This loses any information about indexes or expressions within assets, e.g. if you pass expression="red/blue", indexes=2 or expression="red/blue", asset_expression=b1*5, band_names will be set to red/blue.

expression will be mostly used for STAC with one band so I think for now it's ok

@vincentsarago vincentsarago marked this pull request as ready for review October 4, 2021 06:48
@vincentsarago

This is ready for review.

Main changes to review are:

  • new statistics class and method (matching the ones we use in TiTiler). The statistics method only returns stats for the whole dataset (using the preview method internally to reduce data transfer). In the documentation I'll add notes explaining how to extend the class and create custom statistics methods.

  • add band_names to ImageData. Names are string values of the form:

    • COGReader: ["1", "2", "3"...] (string version of the internal rasterio index names)
    • MultiBaseReader: ["{asset}_1", "{asset}_2", "{asset}_3"] (asset name + string version of the internal rasterio index names)
    • MultiBandReader: ["{band1}", "{band2}", "{band3}"]. Note we could have added the index name, but by definition MultiBandReader should be used for per-file band data, so there should be only 1 index.
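The three naming conventions above could be sketched roughly as follows. The helper names are hypothetical and exist only to make the patterns concrete; they are not rio-tiler functions.

```python
from typing import Dict, List, Sequence


def cog_band_names(indexes: Sequence[int]) -> List[str]:
    """COGReader style: string version of each internal index."""
    return [str(ix) for ix in indexes]


def multibase_band_names(asset_indexes: Dict[str, Sequence[int]]) -> List[str]:
    """MultiBaseReader style: asset name + "_" + internal index."""
    return [f"{asset}_{ix}" for asset, ixs in asset_indexes.items() for ix in ixs]


def multiband_band_names(bands: Sequence[str]) -> List[str]:
    """MultiBandReader style: the band names themselves, one index per file."""
    return list(bands)


print(cog_band_names([1, 2, 3]))
print(multibase_band_names({"red": [1], "nir": [1, 2]}))
print(multiband_band_names(["B01", "B02"]))
```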

cc @kylebarron @geospatial-jeff

@vincentsarago vincentsarago merged commit cc50276 into rio-tiler-v3 Oct 12, 2021
@vincentsarago vincentsarago deleted the betterStatistics branch October 12, 2021 15:32
vincentsarago added a commit that referenced this pull request Oct 19, 2021
* switch to morecantile3 (#418)

* switch to morecantile3

* remove python 3.6

* make sure to use rio-cogeo2.3.1 in tests

* 🤦

* use rio-cogeo from github

* update fixtures and tests

* update changelog

* deprecate metadata methods (#425)

* zxy -> xyz in SpatialMixin.tile_exists method (#419)

* No max size (#422)

* remove default max_size for part and feature

* ignore type, failing in python 3.9

* use rio-cogeo alpha

* Use RIO_TILER_MAX_THREADS instead of MAX_THREADS (#432)

* MAX_THREADS to RIO_TILER_MAX_THREADS

* Update env variable in docs

* Add to changelog

* Update CHANGES.md

Co-authored-by: Vincent Sarago <[email protected]>

* allow non-earth dataset (#429)

* allow non-earth dataset

* fix stac

* metadata/info returns `geographic_bounds`

* add test for non earth object tile reading

* add notebook

* update changelog

* update docs

* update

* revert and remove min/max zoom in __init__

* edit changes

* Use httpx (#431)

* Replace requests with httpx

* Use httpx instead of requests

* Use httpx in notebooks

* Add tox to dev requirements

* Revert change and remove unused import

* update changelog

Co-authored-by: Vincent Sarago <[email protected]>

* add new statistics methods and band names in ImageData object (#427)

* add new statistics methods and band names in ImageData object

* update base classes and tests

* MultiBandReader.statistics should use self.preview as COGReader

* update tests

* start migration docs

* update docs

* remove `band_expression` option in MultiBandReader (#437)

* Asset expression indexes (#438)

* change asset_expression type and add asset_indexes

* update changelog

* moar docs

* Allow float tile_buffer (#405)

* Add tests of the expected behaviour

* Add `tile_buffer` `float` support in `COGReader.tile()`

* :facepalm

Co-authored-by: Bernhard Stadlbauer <[email protected]>
Co-authored-by: vincentsarago <[email protected]>

* update docs and allow backward compat for indexes in MultiBaseReader

* Range colormap (#439)

* add range colormap support

* update docs

* update types

* s/range/intervals/g

* remove deprecated code and update docs (#440)

* remove deprecated code and update docs

* update changelog

Co-authored-by: Rodrigo Almeida <[email protected]>
Co-authored-by: bstadlbauer <[email protected]>
Co-authored-by: Bernhard Stadlbauer <[email protected]>