"failed to parse field [error.grouping_name] of type [keyword] in document" after upgrading to 8.15 #14060

Open
carsonip opened this issue Sep 11, 2024 · 4 comments

Full error:

{
  "error": {
    "root_cause": [
      {
        "type": "document_parsing_exception",
        "reason": "[1:2013] failed to parse field [error.grouping_name] of type [keyword] in document with id 'OcTp4ZEBkEnqNYvJrysL'. Preview of field's value: '14 UNAVAILABLE: '"
      }
    ],
    "type": "document_parsing_exception",
    "reason": "[1:2013] failed to parse field [error.grouping_name] of type [keyword] in document with id 'OcTp4ZEBkEnqNYvJrysL'. Preview of field's value: '14 UNAVAILABLE: '",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Cannot index data directly into a field with a [script] parameter"
    }
  },
  "status": 400
}

APM Server version (apm-server version): Upgrade from pre-8.15 to 8.15+

Description of the problem including expected versus actual behavior: Indexing errors affect apm-server deployments upgraded from pre-8.15 to 8.15+ when a reroute processor is used in logs-apm.error@custom.

Steps to reproduce:

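Here's a minimal reproducible example that does not require an actual upgrade. The requests up to the search emulate the pre-upgrade setup; the requests after it emulate the post-upgrade index template, scripted mapping, and reroute processor: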

PUT _index_template/logs-foo
{
  "data_stream": {},
  "index_patterns": ["logs-foo-*"], 
  "composed_of": ["logs-foo@mappings"],
  "ignore_missing_component_templates": ["logs-foo@mappings"],
  "template": {
    "settings": {
      "index.default_pipeline": "logs-foo@old-pipeline"
    }
  }, 
  "priority": 250
}

PUT _component_template/logs-foo@mappings
{
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "foo": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT _ingest/pipeline/logs-foo@old-pipeline
{
  "processors": [
    {
      "set": {
        "field": "foo",
        "value": "old"
      }
    },
    {
      "pipeline": {
        "name": "logs-foo@pipeline"
      }
    }
  ]
}

PUT _ingest/pipeline/logs-foo@pipeline
{
  "processors": [
  ]
}

POST logs-foo-default/_doc
{
  "@timestamp": "1970-01-01T00:00:00.000Z"
}

GET logs-foo-*/_search
{
  "fields": ["foo"]
}

PUT _index_template/logs-foo
{
  "data_stream": {},
  "index_patterns": ["logs-foo-*"], 
  "composed_of": ["logs-foo@mappings"],
  "ignore_missing_component_templates": ["logs-foo@mappings"],
  "template": {
    "settings": {
      "index.default_pipeline": "logs-foo@pipeline"
    }
  }, 
  "priority": 250
}

PUT _component_template/logs-foo@mappings
{
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "foo": {
          "type": "keyword",
          "script": {
            "source": "emit(\"new\")",
            "lang": "painless"
          }
        }
      }
    }
  }
}


PUT _ingest/pipeline/logs-foo@pipeline
{
  "processors": [
    {
      "reroute": {
        "namespace": ["mynamespace"]
      }
    }
  ]
}

POST logs-foo-default/_doc
{
  "@timestamp": "1970-01-01T00:00:00.000Z"
}

Error from this example:

{
  "error": {
    "root_cause": [
      {
        "type": "document_parsing_exception",
        "reason": "[1:120] failed to parse field [foo] of type [keyword] in document with id 'kWE245EB4mZ7zYVt-HD2'. Preview of field's value: 'old'"
      }
    ],
    "type": "document_parsing_exception",
    "reason": "[1:120] failed to parse field [foo] of type [keyword] in document with id 'kWE245EB4mZ7zYVt-HD2'. Preview of field's value: 'old'",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Cannot index data directly into a field with a [script] parameter"
    }
  },
  "status": 400
}
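
For anyone triaging similar failures, here is a minimal Python sketch that checks whether an error body matches this specific failure and pulls out the offending field name. The function name `scripted_field_conflict` is hypothetical, and the sketch assumes the single-document response shape shown above; errors inside a bulk response are nested per item and would need to be unwrapped first.

```python
import re

# Cause string taken verbatim from the error responses in this issue.
SCRIPT_REASON = "Cannot index data directly into a field with a [script] parameter"
# Extracts the field name from "failed to parse field [<name>] ...".
FIELD_PATTERN = re.compile(r"failed to parse field \[([^\]]+)\]")


def scripted_field_conflict(body):
    """Return the offending field name if `body` matches this failure, else None."""
    error = body.get("error", {})
    cause = error.get("caused_by", {})
    if cause.get("reason") != SCRIPT_REASON:
        return None
    match = FIELD_PATTERN.search(error.get("reason", ""))
    return match.group(1) if match else None


# Trimmed copy of the error from the minimal example above.
sample = {
    "error": {
        "type": "document_parsing_exception",
        "reason": "[1:120] failed to parse field [foo] of type [keyword] in "
                  "document with id 'kWE245EB4mZ7zYVt-HD2'. "
                  "Preview of field's value: 'old'",
        "caused_by": {
            "type": "illegal_argument_exception",
            "reason": SCRIPT_REASON,
        },
    },
    "status": 400,
}

print(scripted_field_conflict(sample))  # prints "foo"
```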
@carsonip carsonip added the bug label Sep 11, 2024

carsonip commented Sep 11, 2024

The reason behind the bug, using an upgrade from 8.13.2 to 8.15.0 as an example:

  • apm-server always issues bulk requests to logs-apm.error-default (unless apm-server.data_streams.namespace is set in apm-server.yml). Each request is therefore subject to the ingest pipeline of the current backing index of the data stream logs-apm.error-default.
  • 8.13.2 uses an ingest pipeline to set error.grouping_name, while 8.15.0 sets it with a script in the mapping.
  • 8.13.2 uses the ingest pipeline logs-apm.error-8.13.2, while 8.15.0 uses the ingest pipeline logs-apm.error@default-pipeline. Both ingest pipelines reference logs-apm.error@custom, where the reroute processor is placed.
  • Let's say the reroute processor sends events to logs-apm.error-some_custom_namespace, creating that data stream.
  • After the 8.15 upgrade, logs-apm.error-default is not rolled over, while logs-apm.error-some_custom_namespace is.

The state of the system after the above:

  • The current backing index of logs-apm.error-default is from 8.13.2, so its ingest pipeline sets error.grouping_name and then calls logs-apm.error@custom, which performs the reroute.
  • The event is rerouted to logs-apm.error-some_custom_namespace, whose backing index has the 8.15.0 mapping. That mapping sets error.grouping_name using a script in the mapping (not an ingest pipeline).
  • A field whose mapping defines a script cannot also be supplied in _source. Because _source already contains error.grouping_name from the 8.13.2 ingest pipeline, indexing fails with the reason "Cannot index data directly into a field with a [script] parameter".
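
The interaction described above can be sketched as a toy Python model. This is not Elasticsearch or apm-server code: the function names are hypothetical, documents are flat dicts with dotted keys for simplicity, and the grouping name is a placeholder copied from the error preview.

```python
# Toy model only: fields that the 8.15-style mapping computes via a script.
SCRIPTED_FIELDS = {"error.grouping_name"}
REROUTE_TARGET = "logs-apm.error-some_custom_namespace"


class DocumentParsingError(Exception):
    """Stands in for the document_parsing_exception returned by Elasticsearch."""


def old_default_pipeline(doc):
    """8.13-style pipeline: set the field at ingest time, then reroute."""
    enriched = dict(doc)
    enriched["error.grouping_name"] = "14 UNAVAILABLE: "  # placeholder value
    return enriched, REROUTE_TARGET


def index_into_new_mapping(doc):
    """8.15-style backing index: reject documents that carry a scripted field."""
    for name in sorted(SCRIPTED_FIELDS):
        if name in doc:
            raise DocumentParsingError(
                "Cannot index data directly into a field with a [script] parameter"
            )
    return "created"


def ingest(doc):
    """Old pipeline on the source stream, new mapping on the reroute target."""
    enriched, target = old_default_pipeline(doc)
    return target, index_into_new_mapping(enriched)


try:
    ingest({"@timestamp": "1970-01-01T00:00:00.000Z"})
except DocumentParsingError as exc:
    print("indexing failed:", exc)
```

In this model, a document that reaches the new mapping without passing through the old pipeline is accepted, which is why the failure only appears when the source data stream has not rolled over.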

axw commented Sep 12, 2024

"after 8.15 upgrade, logs-apm.error-default is not rolled over while logs-apm.error-some_custom_namespace is."

That seems to be the crux of the issue. I suspect the issue is that we're relying on lazy rollover, which I suppose does not trigger when rerouting. I'll check on this.

axw commented Sep 12, 2024

Confirmed the above hypothesis:

  1. Create an index template which sets a default ingest pipeline with reroute
  2. Create a data stream matching the index template
  3. Send a document to the data stream; it will be rerouted
  4. Create another index template with the same index pattern and a higher priority, but with no default ingest pipeline
  5. Roll over the source data stream with "lazy=true"
  6. Send a document to the data stream; it will still be rerouted
  7. Roll over the source data stream with "lazy=false"
  8. Send a document to the data stream; it will not be rerouted
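
The eight steps above can be mirrored in a small Python toy model. It encodes the observed behavior as an assumption rather than reproducing Elasticsearch internals: the default pipeline is taken from the current write index, and a pending lazy rollover only fires when a document is actually written to that data stream. `ToyCluster` and the `-rerouted` suffix are hypothetical names.

```python
class ToyCluster:
    """Toy model of lazy rollover combined with a reroute pipeline."""

    def __init__(self, template_pipeline):
        # Default pipeline that a newly created backing index would receive.
        self.template_pipeline = template_pipeline
        # Data stream name -> default pipeline of its current write index.
        self.write_pipeline = {}
        # Data streams marked for rollover on their next write.
        self.lazy_pending = set()
        self.docs = {}

    def rollover(self, stream, lazy=False):
        if lazy:
            self.lazy_pending.add(stream)
            return
        # A new backing index picks up the currently matching template.
        self.write_pipeline[stream] = self.template_pipeline
        self.lazy_pending.discard(stream)

    def index(self, stream, doc):
        """Index a document and return the data stream it ended up in."""
        if stream not in self.write_pipeline:
            self.write_pipeline[stream] = self.template_pipeline
        if self.write_pipeline[stream] == "reroute":
            # Rerouted documents are never written to the source stream,
            # so a pending lazy rollover on the source is not triggered.
            target = stream + "-rerouted"
            self.docs.setdefault(target, []).append(doc)
            return target
        if stream in self.lazy_pending:
            self.rollover(stream)
        self.docs.setdefault(stream, []).append(doc)
        return stream


cluster = ToyCluster(template_pipeline="reroute")  # steps 1-2
first = cluster.index("src", {"n": 1})             # step 3: rerouted
cluster.template_pipeline = None                   # step 4: new template wins
cluster.rollover("src", lazy=True)                 # step 5
second = cluster.index("src", {"n": 2})            # step 6: still rerouted
cluster.rollover("src", lazy=False)                # step 7
third = cluster.index("src", {"n": 3})             # step 8: not rerouted
print(first, second, third)
```

Under these assumptions the lazy flag on the source data stream stays pending indefinitely, because every incoming document leaves through the reroute before a write can happen.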

axw commented Sep 12, 2024
