Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

VerificationException: When using multiple indices, the mappings must be identical. #46

Closed
horacimacias opened this issue May 10, 2019 · 9 comments
Labels
enhancement New feature or request

Comments

@horacimacias
Copy link

I'm trying to run SQL queries on an index alias (e.g. select field from myindex-*) and my indexes are daily (so I have myindex-2019-04-01, etc.).

For some of the indexes, additional fields have been added and dynamic field mapping has updated the mapping, resulting on myindex-2019-04-01 having less fields compared to myindex-2019-04-02 for example.

When running SQL queries, elasticsearch is complaining with:

VerificationException: When using multiple indices, the mappings must be identical.

even though the SQL query is not using the fields that are different between the indexes.
Is there any way to relax the strictness of all mappings having to be identical?

if I run:

select myfield from myindex*

and the mapping for myfield is the same on all indexes matching myindex*, do we really want to enforce the whole mapping is identical? I'm not seeing the benefit of that.
even if some indexes do not contain myfield, wouldn't it be better to just return an empty value for such column instead of not allowing the request at all?

@horacimacias
Copy link
Author

Here are some references for issues/fixes on the xpack version of elasticsearch:

elastic/elasticsearch#31611
elastic/elasticsearch#34718

@abbashus
Copy link
Contributor

Thanks for bringing this up. This was a design choice to support common use case of log analytics where the indexes have similar prefixes and underlying mappings are same. We have noted this enhancement and will be out with a fix soon.

@abbashus abbashus added the enhancement New feature or request label May 10, 2019
@linuzvipin
Copy link

Hi , Im also facing the same issue , any idea for rollout a fix for the same.
{
"error": {
"reason": "When using multiple indices, the mappings must be identical.",
"details": "VerificationException[When using multiple indices, the mappings must be identical.]",
"type": "VerificationException"
},
"status": 503
}

zhongnansu added a commit to zhongnansu/sql that referenced this issue Jun 5, 2019
@zhongnansu zhongnansu mentioned this issue Jun 5, 2019
zhongnansu added a commit to zhongnansu/sql that referenced this issue Jun 6, 2019
@abbashus
Copy link
Contributor

abbashus commented Jun 6, 2019

Hi @horacimacias,

and the mapping for myfield is the same on all indexes matching myindex*, do we really want to enforce the whole mapping is identical? I'm not seeing the benefit of that.

Correct. We will still be enforcing the mappings for fields mentioned as part of SELECT clause.

even if some indexes do not contain myfield, wouldn't it be better to just return an empty value for such column instead of not allowing the request at all?

This is already the case.

@horacimacias
Copy link
Author

Perhaps I'm missing something but I'm getting different results on a 0.9.0 environment.

Here's what I did to try with clean values. These indexes didn't exist before.

Create two documents, one on each index, both having a common field and one having an extra field:

curl -X PUT "https://localhost:9200/myindex1/_doc/1" -H 'Content-Type: application/json' -d'{"field" : "test",}'
curl -X PUT "https://localhost:9200/myindex2/_doc/1" -H 'Content-Type: application/json' -d'{"field" : "test", "field2":"value2"}'

Selecting field common to all documents:

GET _opendistro/_sql
{
"query": "select field from myindex*"
}

I get the error mentioned:

{
"error": {
"reason": "When using multiple indices, the mappings must be identical.",
"details": "VerificationException[When using multiple indices, the mappings must be identical.]",
"type": "VerificationException"
},
"status": 503
}

Selecting both fields on "myindex*":

GET _opendistro/_sql
{
"query": "select field,field2 from myindex*"
}

gives me the same error (it does not give me empty value for the field2 on the myindex1 doument).

Selecting both fields on myindex1 (the document does not have a field2 value):

GET _opendistro/_sql
{
"query": "select field,field2 from myindex1"
}

gives me the same output as searching for "field" on myindex1:

GET _opendistro/_sql
{
"query": "select field from myindex1"
}

So, the request is allowed but the missing fields are completely ignored.
No empty value for field2. Using "?format=csv" shows the same behaviour.
I'd expect querying for field and field2 give you as many "columns" as fields requested. With empty values if they are not defined, of course, but at least the number of columns should match the number of requested fields.

@abbashus
Copy link
Contributor

Case-1: Selecting field common to all documents.
A: This is expected as expanded indices have different mappings.

Case-2 : Selecting both fields on "myindex*".
A: This is expected as expanded indices have different mappings.

Case-3: Selecting both fields on myindex1, gives me the same output as searching for "field" on myindex1.
A: Since the default format is json, we don't modify the underlying response from elasticsearch.

In the below example, hello field does not exist even in the mappings. nickname exists in mappings, but not in document with id: 2.


GET bank/_mappings

{
  "bank" : {
    "mappings" : {
      "properties" : {
        "firstname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "lastname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "nickname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

POST _opendistro/_sql/?
{
  "query": "SELECT hello, nickname, firstname FROM bank"
}


{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "firstname" : "Abbas",
          "nickname" : "superman"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "firstname" : "Arsen"
        }
      }
    ]
  }
}

To get results for missing fields in the documents , jdbc format can be used here. Here the result has as many columns as requested.

POST _opendistro/_sql/?format=jdbc
{
  "query": "SELECT hello, nickname, firstname FROM bank"
}

{
  "schema": [
    {
      "name": "nickname",
      "type": "text"
    },
    {
      "name": "firstname",
      "type": "text"
    }
  ],
  "total": 2,
  "datarows": [
    [
      "superman",
      "Abbas"
    ],
    [
      null,
      "Arsen"
    ]
  ],
  "size": 2,
  "status": 200
}

For both Case-1 and Case-2 we are coming up with a fix, we don't have an ETA on this, as it requires major changes.

@horacimacias
Copy link
Author

thanks!

zhongnansu added a commit to zhongnansu/sql that referenced this issue Jun 14, 2019
…dices query with compatible mappings. Add new test cases and data. Delete useless test case
zhongnansu added a commit to zhongnansu/sql that referenced this issue Jun 15, 2019
…dices query with compatible mappings. Add new test cases and data. Delete useless test case
zhongnansu added a commit to zhongnansu/sql that referenced this issue Jun 18, 2019
…h compatible mappings. Give user detailed error message on incompatible mappings. Add new IT test cases and data. Delete useless test case
@horacimacias
Copy link
Author

horacimacias commented Jul 11, 2019

I've just tested with opendistro 1.0.0 and I'm still seeing the same behaviour for cases 1 and 2.

This is what I did, in case my expectations are incorrect:

Create two documents, one on each index, both having a common field and one having an extra field:

curl -X PUT "https://localhost:9200/myindex1/_doc/1" -H 'Content-Type: application/json' -d'{"field" : "test",}'
curl -X PUT "https://localhost:9200/myindex2/_doc/1" -H 'Content-Type: application/json' -d'{"field" : "test", "field2":"value2"}'

Case-1: Selecting field common to all documents/indices.
"select field from myindex*" returns "When using multiple indices, the mappings must be identical."

Case-2 : Selecting both fields on "myindex*".
"select field,field2 from myindex*" returns "When using multiple indices, the mappings must be identical."

are these supposed to work on opendistro 1.0.0? am I doing anything wrong?

@dai-chen
Copy link
Member

I've just tested with opendistro 1.0.0 and I'm still seeing the same behaviour for cases 1 and 2.

This is what I did, in case my expectations are incorrect:

Create two documents, one on each index, both having a common field and one having an extra field:
curl -X PUT "https://localhost:9200/myindex1/_doc/1" -H 'Content-Type: application/json' -d'{"field" : "test",}'
curl -X PUT "https://localhost:9200/myindex2/_doc/1" -H 'Content-Type: application/json' -d'{"field" : "test", "field2":"value2"}'

Case-1: Selecting field common to all documents/indices.
"select field from myindex*" returns "When using multiple indices, the mappings must be identical."

Case-2 : Selecting both fields on "myindex*".
"select field,field2 from myindex*" returns "When using multiple indices, the mappings must be identical."

are these supposed to work on opendistro 1.0.0? am I doing anything wrong?

I just double checked the fix is supposed to be in version 1.1.0+. Please try it out and let us know if you still see the issue. Thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants