[RFC] - Geo Hash Grid Aggregation on Geo Shape Field #193

Closed
navneet1v opened this issue Dec 5, 2022 · 5 comments

Labels: aggregations, feature, geospatial, v2.9.0

Comments

@navneet1v
Collaborator

navneet1v commented Dec 5, 2022

The purpose of this RFC (request for comments) is to gather community feedback on a proposal to let OpenSearch users run GeoHash Grid aggregations on the geo_shape field type. This RFC is a continuation of #84.

Geo Hash Grid Aggregation

GeoHash Grid is a multi-bucket aggregation that groups geo_shape values into buckets, where each bucket represents a cell in a grid. Each cell is labeled with a geohash of user-definable precision. The aggregation works much like the GeoHash Grid aggregation on geo_point fields, with one key difference: a geo_point (unless the field holds multiple points) falls into exactly one bucket, whereas a geo_shape is counted in every geohash grid cell that it intersects.
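For readers unfamiliar with geohashes, here is a minimal Python sketch of the standard geohash point encoding. It is an illustration only, not the OpenSearch implementation, and the function name `geohash_encode` is hypothetical. It shows why a single point always lands in exactly one cell at a given precision.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lon, precision=5):
    """Encode a (lat, lon) point as a geohash string of length `precision`."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0        # accumulates the current 5-bit group
    bit_count = 0
    use_lon = True  # geohash interleaves bits, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)

# A point has exactly one geohash per precision, so it falls into one bucket.
print(geohash_encode(0.5, 100.5, precision=3))
print(geohash_encode(0.5, 100.5, precision=5))
```

The precision-3 hash is a prefix of the precision-5 hash, because each extra character only subdivides the same cell further. A shape has an extent, so it can overlap several such cells at once.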

Aggregation Schema

{
  "aggs/aggregations": {
    "<user-provided-aggregation-name>": {
      "geohash_grid": {
        "field": "<field-name-on-which-aggregation-will-be-performed>",
        "precision" : 10, // default value is 5
        "bounds": {  // optional object
          "top_left": <Point representing the top left corner>,
          "bottom_right": <Point representing the bottom right corner>
        }
      }
    }
  }
}

Input Parameters:

  • precision: The string length of the geohash used as the bucket key in the output. The default value is 5.
  • bounds: An optional object that restricts the region over which the aggregation runs. Only shapes that intersect the bounds, or lie completely within them, are included in the aggregation output.
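To give a feel for what a given precision means on the map, the sketch below computes the size of a geohash cell in degrees. It relies only on the general geohash layout (5 bits per character, with bits alternating between longitude and latitude, longitude first). The helper name `geohash_cell_size` is hypothetical and not part of any OpenSearch API.

```python
def geohash_cell_size(precision):
    """Return (width_deg, height_deg) of a geohash cell at the given precision."""
    total_bits = 5 * precision
    lon_bits = (total_bits + 1) // 2  # longitude gets the extra bit when odd
    lat_bits = total_bits // 2
    width = 360.0 / (1 << lon_bits)
    height = 180.0 / (1 << lat_bits)
    return width, height

# Print the cell size for a few precisions, including the default of 5.
for p in range(1, 6):
    w, h = geohash_cell_size(p)
    print(f"precision={p}: {w} deg lon x {h} deg lat")
```

Each extra character makes a cell 32 times smaller in area, so a large shape aggregated at high precision can intersect a very large number of cells.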

Example

Index Creation

PUT /example
{
    "mappings": {
        "properties": {
            "location": {
                "type": "geo_shape"
            }
            }
        }
    }
}

Index Data

POST /example/_bulk?refresh
{"index":{"_id":191}}
{"name": "NEMO Science Museum","location": {"type": "envelope","coordinates": [ [100.0, 1.0], [101.0, 0.0] ]}}
{"index":{"_id":219}}
{"name": "NEMO Science Museum","location": {"type": "envelope","coordinates": [ [100.0, 1.0], [106.0, 0.0] ]}}

Run Aggregation

POST /example/_search?size=0
{
  "aggregations": {
    "myaggregation": {
      "geohash_grid": {
        "field": "location",
        "precision": 3
      }
    }
  }
}

Output of Aggregation

{
....
  "aggregations": {
        "myaggregation": {
            "buckets": [
                {
                    "key": "w0p",
                    "doc_count": 2
                },
                {
                    "key": "w25",
                    "doc_count": 1
                },
                {
                    "key": "w24",
                    "doc_count": 1
                },
                {
                    "key": "w21",
                    "doc_count": 1
                },
                {
                    "key": "w20",
                    "doc_count": 1
                }
            ]
        }
    }
}
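The bucket counts above can be checked by hand. The following Python sketch takes the two envelopes from the example, finds the precision-3 geohash cells each one overlaps, and counts each document once per cell. It is a simplified illustration, not the OpenSearch implementation: the helper names are hypothetical, only axis-aligned envelopes are handled, dateline crossing is ignored, and an edge lying exactly on a cell boundary is assigned by floor division (an assumption that happens to match the output shown).

```python
from collections import Counter

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def grid_bits(precision):
    """Number of longitude and latitude bits in a geohash of this length."""
    total = 5 * precision
    return (total + 1) // 2, total // 2

def cell_key(lon_idx, lat_idx, precision):
    """Interleave grid indices (longitude bit first) and base32-encode them."""
    lon_bits, lat_bits = grid_bits(precision)
    value = 0
    for i in range(5 * precision):
        if i % 2 == 0:
            lon_bits -= 1
            bit = (lon_idx >> lon_bits) & 1
        else:
            lat_bits -= 1
            bit = (lat_idx >> lat_bits) & 1
        value = (value << 1) | bit
    return "".join(
        BASE32[(value >> (5 * (precision - 1 - k))) & 31] for k in range(precision)
    )

def cells_for_envelope(min_lon, max_lat, max_lon, min_lat, precision):
    """Geohash cells overlapped by an envelope given as top-left, bottom-right."""
    lon_bits, lat_bits = grid_bits(precision)
    width = 360.0 / (1 << lon_bits)
    height = 180.0 / (1 << lat_bits)
    max_x, max_y = (1 << lon_bits) - 1, (1 << lat_bits) - 1
    x0 = min(int((min_lon + 180.0) // width), max_x)
    x1 = min(int((max_lon + 180.0) // width), max_x)
    y0 = min(int((min_lat + 90.0) // height), max_y)
    y1 = min(int((max_lat + 90.0) // height), max_y)
    return {
        cell_key(x, y, precision)
        for x in range(x0, x1 + 1)
        for y in range(y0, y1 + 1)
    }

# The two documents indexed in the example: (min_lon, max_lat, max_lon, min_lat).
docs = {
    "191": (100.0, 1.0, 101.0, 0.0),
    "219": (100.0, 1.0, 106.0, 0.0),
}

counts = Counter()
for envelope in docs.values():
    counts.update(cells_for_envelope(*envelope, precision=3))

for key, doc_count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(key, doc_count)
```

Document 191 fits inside a single cell, while document 219 spans five cells, so the doc_count values add up to 6 even though only 2 documents were indexed.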

References

  1. RFC for Aggregations: [RFC] - Implement Aggregations on Geo Shape Field #84
  2. [Feature] : Geo Hash Grid Aggregation on Geo Shape field #191

navneet1v self-assigned this Dec 5, 2022
navneet1v added the aggregations label Dec 5, 2022
navneet1v added the v2.9.0 label Jul 7, 2023

@navneet1v
Collaborator Author

navneet1v commented Jul 7, 2023

The feature has been pushed to the 2.x branch of OpenSearch core.

@ohltyler
Member

@navneet1v is this part of 2.9? If not, can it be re-labeled with the target version?

@heemin32
Collaborator

heemin32 commented Jul 14, 2023

Only the backend work will be released in 2.9. The frontend work is not yet complete.

@navneet1v
Collaborator Author

@ohltyler I will close this issue once 2.9 is released. There is frontend work, but we will track it separately.

@navneet1v
Collaborator Author

Closing this RFC, as the code has been merged to the 2.9 branch.
