-
Notifications
You must be signed in to change notification settings - Fork 361
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rasters FeatureExtraction #3117
Rasters FeatureExtraction #3117
Conversation
raster/src/main/scala/geotrellis/raster/FeatureExtraction.scala
Outdated
Show resolved
Hide resolved
56deb7e
to
aba9c53
Compare
raster/src/main/scala/geotrellis/raster/FeatureExtraction.scala
Outdated
Show resolved
Hide resolved
430debf
to
763a10b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The core interface looks good to my relatively untrained eye. This gets at the core desired functionality.
Don't forget additions to the module hierarchy document and the CHANGELOG for this feature.
raster/src/main/scala/geotrellis/raster/FeatureExtraction.scala
Outdated
Show resolved
Hide resolved
fe68872
to
44bbfb1
Compare
44bbfb1
to
fa54bf4
Compare
Signed-off-by: Grigory Pomadchin <[email protected]>
The Rasterizer is defined on Geometry, so any type variance on the feature we're using to mask the raster is pretty much wasted
Array of features from a sizable can produce very high load on heap, likely blowing out a spark job if it is used in that context. The expectation is that this feature will be used in context of RDD map/flatMap where Iterator is very benficial or the first operation will be a filter or foldLeft on the generated features.
This is important to decide how to capture or ignore pixels on the border of the geometry. Also geom and raster switch palces. Since operation is conceptually on raster it takes the first parameter place, that of self.
Previous type-class seemed too wide open for interpretation. This implementation narrows down the meaning for greater clarity.
f71733f
to
ba91ade
Compare
@pomadchin this PR is ready for re-review. This is the result of our conversation last night. |
|
||
|
||
/** Type class to convert a raster into features of cell geometries */ | ||
trait CellFeatures[R, D] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
R
is unconstrained because I want to be able to do this to RasterSource
which doesn't have much in common with Raster[Tile]
trait CellFeatures[R, D] { | ||
|
||
/** Describe cell grid of the source raster */ | ||
def cellGrid(raster: R): GridExtent[Long] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This very clearly is crying out to be a type class of its own. RasterLike
or something like that. That could be a meaningful filter on the Methods
as well. If something can't be described with a GridExtent[*]
we don't want to be talking about cell features.
IMO we can deal with that later. This method can be removed without impacting subclasses that may spring up in the meantime.
/** Produce a CellFeatures instance given functions to retrieve a values and describe the cell grid. | ||
* This function is suitable for use with rasters that fit entirely in memory. | ||
*/ | ||
def make[R, D](getGrid: R => GridExtent[Int], cellValue: (R, Int, Int) => D): CellFeatures[R, D] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cellValue
is Function3
and is not specialized, causing boxing for col/row here. Elsewhere in the code we would make a specific trait. However since the main purpose of this interface is to turn a single double cell value into a nested class with a whole geometry ... its fine, we'll let the boxing slide here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is itself a slow operation, would like to see someday how bad is boxing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I can't approve it due to github limitations, but merging it in. 💯
/** Produce a CellFeatures instance given functions to retrieve a values and describe the cell grid. | ||
* This function is suitable for use with rasters that fit entirely in memory. | ||
*/ | ||
def make[R, D](getGrid: R => GridExtent[Int], cellValue: (R, Int, Int) => D): CellFeatures[R, D] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is itself a slow operation, would like to see someday how bad is boxing here.
Rasterizer.foreachCellByGeometry(geom, grid.toRasterExtent) { case (col, row) => mask.set(col, row, 1) } | ||
for { | ||
row <- Iterator.range(0, grid.rows) | ||
col <- Iterator.range(0, grid.cols) if mask.get(col, row) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch 👍
Overview
This PR adds a type class to extract
PointFeatures
fromRasters
by a passedGeometry
Checklist
docs/CHANGELOG.rst
updated, if necessaryDemo
Notes
The user is able to "easily" provide new instances for
CellFeatures
type class for in-memory rasters by usingCellFeatures.make
and providing:getGrid: R => GridExtent[Int]
,cellValue: (R, Int, Int) => D
R
is unbounded so any kind of crazy thing that should be treated as Raster will pass mustard here.Closes #2942