Misbehaviour when using missing filter on fields which have the same name as _type #7962

Closed
mcuelenaere opened this issue Oct 2, 2014 · 3 comments

@mcuelenaere

When using the missing filter on a field which has the same name as _type, it seems to get transformed to a match_all filter:

/tmp $ cat t
#!/bin/sh

curl http://$HOST:9200/
curl -XDELETE http://$HOST:9200/foo
curl -XPOST http://$HOST:9200/foo -d '{}'
curl -XPOST http://$HOST:9200/foo/bar/_mapping -d '{
    "dynamic": "strict",
    "_index": {
        "enabled": true
    },
    "_id": {
        "index": "not_analyzed",
        "indexed": false,
        "store": true
    },
    "properties": {
        "foo": {
            "type": "string",
            "index": "not_analyzed"
        },
        "bar": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}
'
curl -XPOST http://$HOST:9200/foo/bar/123 -d '{"foo": "abc", "bar": "def"}'
curl -XPOST http://$HOST:9200/foo/bar/456 -d '{"foo": "abc", "bar": "def"}'

sleep 1

curl -XGET http://$HOST:9200/foo/bar/123/_explain?pretty -d '
{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {
                            "missing": {"field": "bar"}
                        }
                    ]
                }
            }
        }
    }
}'

curl -XGET http://$HOST:9200/foo/bar/123/_explain?pretty -d '
{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {
                            "missing": {"field": "foo"}
                        }
                    ]
                }
            }
        }
    }
}'

curl -XGET http://$HOST:9200/foo/bar/123/_explain?pretty -d '
{
    "query": {
        "filtered": {
            "filter": {
                "missing": {"field": "bar"}
            }
        }
    }
}'
/tmp $ HOST=xxx ./t
{
  "status" : 200,
  "name" : "xxx",
  "version" : {
    "number" : "1.3.4",
    "build_hash" : "a70f3ccb52200f8f2c87e9c370c6597448eb3e45",
    "build_timestamp" : "2014-09-30T09:07:17Z",
    "build_snapshot" : false,
    "lucene_version" : "4.9"
  },
  "tagline" : "You Know, for Search"
}
{"acknowledged":true}
{"acknowledged":true}
{"acknowledged":true}
{"_index":"foo","_type":"bar","_id":"123","_version":1,"created":true}
{"_index":"foo","_type":"bar","_id":"456","_version":1,"created":true}
{
  "_index" : "foo",
  "_type" : "bar",
  "_id" : "123",
  "matched" : true,
  "explanation" : {
    "value" : 1.0,
    "description" : "ConstantScore(BooleanFilter(+*:*)), product of:",
    "details" : [ {
      "value" : 1.0,
      "description" : "boost"
    }, {
      "value" : 1.0,
      "description" : "queryNorm"
    } ]
  }
}
{
  "_index" : "foo",
  "_type" : "bar",
  "_id" : "123",
  "matched" : false,
  "explanation" : {
    "value" : 0.0,
    "description" : "ConstantScore(BooleanFilter(+cache(NotFilter(cache(BooleanFilter(_field_names:foo)))))) doesn't match id 0"
  }
}
{
  "_index" : "foo",
  "_type" : "bar",
  "_id" : "123",
  "matched" : true,
  "explanation" : {
    "value" : 1.0,
    "description" : "ConstantScore(cache(_type:bar)), product of:",
    "details" : [ {
      "value" : 1.0,
      "description" : "boost"
    }, {
      "value" : 1.0,
      "description" : "queryNorm"
    } ]
  }
}
@clintongormley
Contributor

Hi @mcuelenaere

Thanks for reporting this. A simpler recreation follows:

DELETE /foo

POST /foo

POST /foo/bar/_mapping
{
    "properties": {
        "foo": {
            "type": "string",
            "index": "not_analyzed"
        },
        "bar": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}

POST /foo/bar/123
{"foo": "abc", "bar": "def"}

POST /foo/bar/456
{"foo": "abc", "bar": "def"}

This explain:

GET /foo/bar/123/_explain?pretty
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "missing": {
                "field": "foo"
              }
            }
          ]
        }
      }
    }
  }
}

Returns:

{
   "_index": "foo",
   "_type": "bar",
   "_id": "123",
   "matched": false,
   "explanation": {
      "value": 0,
      "description": "ConstantScore(BooleanFilter(+cache(NotFilter(cache(BooleanFilter(_field_names:foo)))))) doesn't match id 0"
   }
}

While this explain:

GET /foo/bar/123/_explain?pretty
{
  "query": {
    "filtered": {
      "filter": {
        "missing": {
          "field": "bar"
        }
      }
    }
  }
}

Incorrectly returns:

{
   "_index": "foo",
   "_type": "bar",
   "_id": "123",
   "matched": true,
   "explanation": {
      "value": 1,
      "description": "ConstantScore(cache(_type:bar)), product of:",
      "details": [
         {
            "value": 1,
            "description": "boost"
         },
         {
            "value": 1,
            "description": "queryNorm"
         }
      ]
   }
}
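
For comparison, the expected result can be worked out by hand. The sketch below, in Python, is only an illustration: `is_missing` is a hypothetical helper that approximates the missing filter as "the field is absent or null" and ignores any further options the filter accepts.

```python
# Document 123 as indexed in the recreation above.
doc = {"foo": "abc", "bar": "def"}


def is_missing(document, field):
    """Approximate `missing`: true when the field is absent or null."""
    return document.get(field) is None


# Both fields hold values, so neither filter should match document 123.
print(is_missing(doc, "foo"))  # False, consistent with the first explain
print(is_missing(doc, "bar"))  # False, yet the second explain reports matched: true
```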

@jpountz
Contributor

jpountz commented Oct 14, 2014

For reference, this is an old bug: I could reproduce it with 1.1, and it probably also reproduces on earlier versions.

Since we don't index anything for inner nodes of a JSON object, when we get an exists/missing filter on a field f and f is mapped as an object, it is internally translated to f.* so that it can match anything that would be under f. The issue here is that when we try to look up an object mapper that has bar as a path, we get the root object mapper, since bar is the name of the type (which is weird to me; I need to check whether we have things that rely upon it).

So in the end we try to match on bar.*, which doesn't match any field, so we return a match_all filter.
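
A toy model of the lookup described above may make the failure easier to follow. This is only a sketch in Python, not the actual Elasticsearch code: `OBJECT_MAPPERS`, `INDEXED_FIELDS`, and `resolve_missing_filter` are hypothetical names, and the dictionary merely imitates the behaviour where the type name resolves to the root object mapper.

```python
import fnmatch

# Hypothetical stand-in for the mapping of type "bar" in index "foo".
TYPE_NAME = "bar"
INDEXED_FIELDS = ["foo", "bar"]  # leaf fields from the mapping

# Assumption: object mappers are looked up by path, and the root object
# mapper is found under the type name, as described in the comment above.
OBJECT_MAPPERS = {TYPE_NAME: "root object mapper"}


def resolve_missing_filter(field):
    """Describe the filter that a `missing` on `field` is turned into."""
    pattern = field
    if field in OBJECT_MAPPERS:
        # The field looks like an object, so expand it to everything under it.
        pattern = field + ".*"
    matching = [f for f in INDEXED_FIELDS if fnmatch.fnmatchcase(f, pattern)]
    if not matching:
        # No field can exist, so every document counts as missing it.
        return "match_all"
    return "not(exists(" + ", ".join(matching) + "))"


print(resolve_missing_filter("foo"))  # not(exists(foo))
print(resolve_missing_filter("bar"))  # match_all
```

In this model, "foo" resolves to a real leaf field, while "bar" is expanded to "bar.*", matches nothing, and collapses to match_all.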

@jpountz
Contributor

jpountz commented Nov 5, 2014

I'm closing this in favor of #4081. The issue here is that Elasticsearch tried to interpret the first part of the field name as a type.

@jpountz jpountz closed this as completed Nov 5, 2014