[BUG] Breaking regression 2.8 -> 2.9 for geo_shape #10795

Closed
simlu opened this issue Oct 20, 2023 · 5 comments

Labels: bug, Geospatial, untriaged

simlu commented Oct 20, 2023

Describe the bug
A geo_shape field can no longer hold multiple polygons.

To Reproduce
Steps to reproduce the behavior:

  1. Create the following files
run.sh
#!/usr/bin/env bash

# Drop the index left over from a previous run (a 404 on the first run is harmless).
printf 'Reset\n'
curl 'opensearch:9200/entity' -X DELETE
printf '\nCreate\n'
curl 'opensearch:9200/entity' -X PUT -d @mapping.json -H "Content-Type: application/json"
printf '\nInsert\n'
curl 'opensearch:9200/_bulk' -X POST --data-binary @data.json -H "Content-Type: application/x-ndjson"
# Make the inserted document visible to search before querying.
printf '\nRefresh\n'
curl 'opensearch:9200/entity/_refresh'
# URLs containing '?' are quoted so the shell never treats them as glob patterns.
printf '\nQuery1\n'
curl 'opensearch:9200/entity/_search?pretty' -d @query1.json -H "Content-Type: application/json" > result1.json
printf '\nQuery2\n'
curl 'opensearch:9200/entity/_search?pretty' -d @query2.json -H "Content-Type: application/json" > result2.json
printf '\nQuery3\n'
curl 'opensearch:9200/entity/_search?pretty' -d @query3.json -H "Content-Type: application/json" > result3.json
printf '\n'
mapping.json
{
  "mappings": {
    "dynamic": "false",
    "properties": {
      "id": {
        "type": "keyword"
      },
      "polygons": {
        "type": "geo_shape"
      }
    }
  }
}
data.json
{"update":{"_index":"entity","_id":"e3bb575d-30b9-4793-a92a-91652ffb6e9e"}}
{"doc":{"id":"e3bb575d-30b9-4793-a92a-91652ffb6e9e","polygons":[{"type":"Polygon","coordinates":[[[109,9],[109,8],[110,8],[110,9],[109,9]]]},{"type":"Polygon","coordinates":[[[109,11],[109,10],[110,10],[110,11],[109,11]]]}]},"doc_as_upsert":true}
query1.json
{
  "query": {
    "bool": {
      "filter": {
        "geo_shape": {
          "polygons": {
            "shape": {
              "coordinates": [109.5,10.5],
              "type": "point"
            },
            "relation": "intersects"
          }
        }
      }
    }
  }
}
query2.json
{
  "query": {
    "bool": {
      "filter": {
        "geo_shape": {
          "polygons": {
            "shape": {
              "coordinates": [109.5,8.5],
              "type": "point"
            },
            "relation": "intersects"
          }
        }
      }
    }
  }
}

query3.json
{
  "query": {
    "bool": {
      "filter": {
        "geo_shape": {
          "polygons": {
            "shape": {
              "coordinates": [109.5,9.5],
              "type": "point"
            },
            "relation": "intersects"
          }
        }
      }
    }
  }
}
  2. Ensure OpenSearch is running and execute run.sh.
  3. Inspect result1.json, result2.json and result3.json
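
For reference, the hits each query should produce can be worked out offline from data.json and the three query points. The following is only a minimal sketch: the helper name `point_in_ring` is made up for illustration, and it uses a plain planar ray-casting test on (lon, lat) pairs rather than whatever geometry logic OpenSearch applies internally.

```python
# Planar ray-casting test: is a (lon, lat) point inside a closed ring?
# Hypothetical helper for illustration only; it ignores holes and treats
# coordinates as flat x/y, which is adequate for these small boxes.
def point_in_ring(point, ring):
    x, y = point
    inside = False
    # Walk consecutive vertex pairs; the ring is closed (first == last).
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Outer rings of the two polygons from data.json.
polygons = [
    [[109, 9], [109, 8], [110, 8], [110, 9], [109, 9]],
    [[109, 11], [109, 10], [110, 10], [110, 11], [109, 11]],
]

# Query points from query1.json, query2.json and query3.json.
queries = {
    "query1": (109.5, 10.5),
    "query2": (109.5, 8.5),
    "query3": (109.5, 9.5),
}

# A document should match when the point falls inside any of its polygons.
expected = {
    name: any(point_in_ring(point, ring) for ring in polygons)
    for name, point in queries.items()
}
print(expected)
```

Under these assumptions, query1 and query2 should each return the document, while query3 (a point in the gap between the two polygons) should return no hits.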

Expected behavior
A geo_shape field holding multiple polygons is indexed and can be queried, and the three queries return the expected results. This works in 2.8 but fails in 2.9 with the error "DocValuesField \"polygons\" appears more than once in this document (only one value is allowed per field)"

Plugins
None

Host/Environment (please complete the following information):

  • OS: Official docker image opensearchproject/opensearch:2.9.0
  • Version 2.9 introduces the issue

Additional context
This possibly wasn't supposed to work to begin with, but it did for a very long time and people are relying on it (we were!). Consider marking this as a breaking change, or at the very least documenting the behaviour change in the changelogs.

I suspect this is related to the following changelog note in 2.9: "Geospatial tools add support for three types of aggregations using geoshapes data: geo_bounds, geo_hash, and geo_tile."

Looking forward to your feedback on this ticket, L~

simlu added the bug and untriaged labels on Oct 20, 2023

pierre-mallet commented Oct 23, 2023

We have the same problem in our solution. Furthermore, we don't see how to do a clean rolling upgrade, because:

1/ If the translog is not flushed properly, then when the nodes restart on 2.9+, replaying the translog generates errors that seem to cause data loss (we have some examples of documents disappearing).

2/ During migration the index mapping is updated with doc_values set to "true" (the default value), and it is no longer possible to update the mapping to set doc_values to false:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Mapper for [shape] conflicts with existing mapping:\n[mapper [shape] has different [doc_values] values]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Mapper for [shape] conflicts with existing mapping:\n[mapper [shape] has different [doc_values] values]"
  },
  "status": 400
}

3/ We can't set doc_values to false before the upgrade, since that generates this error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "Mapping definition for [shape] has unsupported parameters:  [doc_values : false]"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "Failed to parse mapping [_doc]: Mapping definition for [shape] has unsupported parameters:  [doc_values : false]",
    "caused_by" : {
      "type" : "mapper_parsing_exception",
      "reason" : "Mapping definition for [shape] has unsupported parameters:  [doc_values : false]"
    }
  },
  "status" : 400
}

Any hints on how to migrate to 2.9+ without needing to reset our indices?

@simlu If you can recreate your indices from scratch, adding doc_values = false to the mapping does the job: you can still store multiple geo_shape values and filter on them. But you will of course not be able to aggregate on this field.
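
As a rough illustration of that suggestion, the sketch below builds the mapping from the original report with doc_values disabled on the geo_shape field. The function name `mapping_without_doc_values` is hypothetical. It assumes the index is created fresh on 2.9+, since the errors above show that an existing mapping cannot be changed and that pre-2.9 versions reject the parameter.

```python
import json


# Hypothetical helper: returns the mapping from the report with
# doc_values turned off on the geo_shape field. Assumes a brand-new
# index on 2.9+; this does not help with indices that already exist.
def mapping_without_doc_values(field="polygons"):
    return {
        "mappings": {
            "dynamic": "false",
            "properties": {
                "id": {"type": "keyword"},
                # Disabling doc_values gives up aggregations on this field.
                field: {"type": "geo_shape", "doc_values": False},
            },
        }
    }


# Print JSON that could be saved as mapping.json for the run.sh script.
print(json.dumps(mapping_without_doc_values(), indent=2))
```

The printed JSON could replace mapping.json in the reproduction above to check whether the multi-polygon document indexes again. As noted in the comment, aggregations on the field would not be available with this mapping.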

dblock (Member) commented Oct 24, 2023

That was added via #8301 I believe, cc: @heemin32.

heemin32 (Contributor) commented

@navneet1v Could you take a look?

shwetathareja (Member) commented

Duplicate of #10958.

dblock (Member) commented Oct 30, 2023

Let's close as dup.

dblock closed this as completed on Oct 30, 2023