TSDB: numbers that are mapped as keyword are not validated correctly at routing time #96552

Closed
tetianakravchenko opened this issue Jun 5, 2023 · 4 comments
Labels: >bug, :StorageEngine/TSDB, Team:Analytics

@tetianakravchenko

tetianakravchenko commented Jun 5, 2023

I am trying to add dimensions to the core data_stream of the system integration package. With TSDB enabled, all documents are dropped with the error:

 {\"type\":\"illegal_argument_exception\",\"reason\":\"Error extracting routing: Routing values must be strings but found [VALUE_NUMBER]\",\"caused_by\":{\"type\":\"parsing_exception\",\"reason\":\"Routing values must be strings but found [VALUE_NUMBER]\",\"line\":1,\"col\":505}}, dropping event!"

This happens even though all fields selected as dimensions are of type keyword; see draft PR elastic/integrations#6454.

The field in question is system.core.id, with this mapping:

      "system": {
        "properties": {
          "core": {
            "properties": {
              "id": {
                "type": "keyword",
                "time_series_dimension": true
              },

The mapping is keyword, but the actual value in the document is a number.
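To illustrate the mismatch, here is a toy Python model of a strict routing check. It is a sketch under stated assumptions, not Elasticsearch's actual code: the function name `extract_routing_values_strict` and the way the offending type is reported are hypothetical, and only the field names and the "Routing values must be strings" wording come from this report.

```python
import json


def extract_routing_values_strict(doc, routing_path):
    """Toy model: collect routing values and reject any value that is not a string."""
    values = {}
    for path in routing_path:
        node = doc
        # Walk the dotted path (e.g. "system.core.id") through nested objects.
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node is None:
            continue  # field is absent in this document
        if not isinstance(node, str):
            # Hypothetical wording, modelled on the error quoted in this issue.
            raise ValueError(
                f"Routing values must be strings but found "
                f"[{type(node).__name__}] for [{path}]"
            )
        values[path] = node
    return values


# A cut-down version of the reported document: the id is a JSON number.
doc = json.loads('{"host": {"name": "kind-worker"}, "system": {"core": {"id": 1}}}')
routing_path = ["host.name", "system.core.id"]

try:
    extract_routing_values_strict(doc, routing_path)
except ValueError as err:
    print(err)
```

In this toy model the check never consults the mapping, so a numeric `system.core.id` is rejected even though the field is mapped as keyword.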

Document example
{
    "agent": {
        "ephemeral_id": "20d6ad4a-276c-491c-b4a1-a98bcc404be1",
        "id": "37188f3d-6747-4f0e-831d-87af1f5ea514",
        "name": "kind-worker",
        "type": "metricbeat",
        "version": "8.8.0"
    },
    "data_stream": {
        "dataset": "system.core",
        "namespace": "default",
        "type": "metrics"
    },
    "ecs": {
        "version": "8.0.0"
    },
    "elastic_agent": {
        "id": "37188f3d-6747-4f0e-831d-87af1f5ea514",
        "snapshot": false,
        "version": "8.8.0"
    },
    "event": {
        "dataset": "system.core",
        "duration": 226255,
        "module": "system"
    },
    "host": {
        "architecture": "x86_64",
        "containerized": false,
        "hostname": "kind-worker",
        "id": "d79d0441a1cd4e14a3361dd723319cb3",
        "ip": [
            "10.244.2.1",
            "10.244.2.1",
            "172.24.0.4",
            "172.18.0.4",
            "fc00:f853:ccd:e793::4",
            "fe80::42:acff:fe12:4"
        ],
        "mac": [
            "02-42-AC-12-00-04",
            "02-42-AC-18-00-04",
            "5E-9E-3C-8C-9C-E2",
            "CA-74-3B-E9-F5-32"
        ],
        "name": "kind-worker",
        "os": {
            "codename": "focal",
            "family": "debian",
            "kernel": "5.15.49-linuxkit",
            "name": "Ubuntu",
            "platform": "ubuntu",
            "type": "linux",
            "version": "20.04.6 LTS (Focal Fossa)"
        }
    },
    "metricset": {
        "name": "core",
        "period": 10000
    },
    "service": {
        "type": "system"
    },
    "system": {
        "core": {
            "core_id": 0,
            "id": 1,
            "idle": {
                "pct": 0.9586
            },
            "iowait": {
                "pct": 0.001
            },
            "irq": {
                "pct": 0
            },
            "mhz": 2400,
            "model_name": "Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz",
            "model_number": "158",
            "nice": {
                "pct": 0
            },
            "physical_id": 1,
            "softirq": {
                "pct": 0.001
            },
            "steal": {
                "pct": 0
            },
            "system": {
                "pct": 0.0186
            },
            "total": {
                "pct": 0.0404
            },
            "user": {
                "pct": 0.0207
            }
        }
    }
}
Index mapping
{
  "mappings": {
    "_meta": {
      "managed_by": "fleet",
      "managed": true,
      "package": {
        "name": "system"
      }
    },
    "_data_stream_timestamp": {
      "enabled": true
    },
    "dynamic_templates": [
      {
        "container.labels": {
          "path_match": "container.labels.*",
          "match_mapping_type": "string",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "strings_as_keyword": {
          "match_mapping_type": "string",
          "mapping": {
            "ignore_above": 1024,
            "type": "keyword"
          }
        }
      }
    ],
    "date_detection": false,
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "agent": {
        "properties": {
          "id": {
            "type": "keyword",
            "time_series_dimension": true
          }
        }
      },
      "cloud": {
        "properties": {
          "account": {
            "properties": {
              "id": {
                "type": "keyword",
                "time_series_dimension": true
              }
            }
          },
          "availability_zone": {
            "type": "keyword",
            "time_series_dimension": true
          },
          "image": {
            "properties": {
              "id": {
                "type": "keyword",
                "ignore_above": 1024
              }
            }
          },
          "instance": {
            "properties": {
              "id": {
                "type": "keyword",
                "time_series_dimension": true
              },
              "name": {
                "type": "keyword",
                "ignore_above": 1024
              }
            }
          },
          "machine": {
            "properties": {
              "type": {
                "type": "keyword",
                "ignore_above": 1024
              }
            }
          },
          "project": {
            "properties": {
              "id": {
                "type": "keyword",
                "ignore_above": 1024
              }
            }
          },
          "provider": {
            "type": "keyword",
            "time_series_dimension": true
          },
          "region": {
            "type": "keyword",
            "time_series_dimension": true
          }
        }
      },
      "container": {
        "properties": {
          "id": {
            "type": "keyword",
            "time_series_dimension": true
          },
          "image": {
            "properties": {
              "name": {
                "type": "keyword",
                "ignore_above": 1024
              }
            }
          },
          "name": {
            "type": "keyword",
            "ignore_above": 1024
          }
        }
      },
      "data_stream": {
        "properties": {
          "dataset": {
            "type": "constant_keyword"
          },
          "namespace": {
            "type": "constant_keyword"
          },
          "type": {
            "type": "constant_keyword"
          }
        }
      },
      "event": {
        "properties": {
          "agent_id_status": {
            "type": "keyword",
            "ignore_above": 1024
          },
          "dataset": {
            "type": "constant_keyword",
            "value": "system.core"
          },
          "ingested": {
            "type": "date",
            "format": "strict_date_time_no_millis||strict_date_optional_time||epoch_millis"
          },
          "module": {
            "type": "constant_keyword",
            "value": "system"
          }
        }
      },
      "host": {
        "properties": {
          "architecture": {
            "type": "keyword",
            "ignore_above": 1024
          },
          "containerized": {
            "type": "boolean"
          },
          "domain": {
            "type": "keyword",
            "ignore_above": 1024
          },
          "hostname": {
            "type": "keyword",
            "ignore_above": 1024
          },
          "id": {
            "type": "keyword",
            "ignore_above": 1024
          },
          "ip": {
            "type": "ip"
          },
          "mac": {
            "type": "keyword",
            "ignore_above": 1024
          },
          "name": {
            "type": "keyword",
            "time_series_dimension": true
          },
          "os": {
            "properties": {
              "build": {
                "type": "keyword",
                "ignore_above": 1024
              },
              "codename": {
                "type": "keyword",
                "ignore_above": 1024
              },
              "family": {
                "type": "keyword",
                "ignore_above": 1024
              },
              "full": {
                "type": "keyword",
                "ignore_above": 1024,
                "fields": {
                  "text": {
                    "type": "match_only_text"
                  }
                }
              },
              "kernel": {
                "type": "keyword",
                "ignore_above": 1024
              },
              "name": {
                "type": "keyword",
                "ignore_above": 1024,
                "fields": {
                  "text": {
                    "type": "match_only_text"
                  }
                }
              },
              "platform": {
                "type": "keyword",
                "ignore_above": 1024
              },
              "version": {
                "type": "keyword",
                "ignore_above": 1024
              }
            }
          },
          "type": {
            "type": "keyword",
            "ignore_above": 1024
          }
        }
      },
      "system": {
        "properties": {
          "core": {
            "properties": {
              "id": {
                "type": "keyword",
                "time_series_dimension": true
              },
              "idle": {
                "properties": {
                  "pct": {
                    "type": "scaled_float",
                    "meta": {
                      "unit": "percent"
                    },
                    "scaling_factor": 1000,
                    "time_series_metric": "gauge"
                  },
                  "ticks": {
                    "type": "long",
                    "time_series_metric": "counter"
                  }
                }
              },
              "iowait": {
                "properties": {
                  "pct": {
                    "type": "scaled_float",
                    "meta": {
                      "unit": "percent"
                    },
                    "scaling_factor": 1000,
                    "time_series_metric": "gauge"
                  },
                  "ticks": {
                    "type": "long",
                    "time_series_metric": "counter"
                  }
                }
              },
              "irq": {
                "properties": {
                  "pct": {
                    "type": "scaled_float",
                    "meta": {
                      "unit": "percent"
                    },
                    "scaling_factor": 1000,
                    "time_series_metric": "gauge"
                  },
                  "ticks": {
                    "type": "long",
                    "time_series_metric": "counter"
                  }
                }
              },
              "nice": {
                "properties": {
                  "pct": {
                    "type": "scaled_float",
                    "meta": {
                      "unit": "percent"
                    },
                    "scaling_factor": 1000,
                    "time_series_metric": "gauge"
                  },
                  "ticks": {
                    "type": "long",
                    "time_series_metric": "counter"
                  }
                }
              },
              "softirq": {
                "properties": {
                  "pct": {
                    "type": "scaled_float",
                    "meta": {
                      "unit": "percent"
                    },
                    "scaling_factor": 1000,
                    "time_series_metric": "gauge"
                  },
                  "ticks": {
                    "type": "long",
                    "time_series_metric": "counter"
                  }
                }
              },
              "steal": {
                "properties": {
                  "pct": {
                    "type": "scaled_float",
                    "meta": {
                      "unit": "percent"
                    },
                    "scaling_factor": 1000,
                    "time_series_metric": "gauge"
                  },
                  "ticks": {
                    "type": "long",
                    "time_series_metric": "counter"
                  }
                }
              },
              "system": {
                "properties": {
                  "pct": {
                    "type": "scaled_float",
                    "meta": {
                      "unit": "percent"
                    },
                    "scaling_factor": 1000,
                    "time_series_metric": "gauge"
                  },
                  "ticks": {
                    "type": "long",
                    "time_series_metric": "counter"
                  }
                }
              },
              "user": {
                "properties": {
                  "pct": {
                    "type": "scaled_float",
                    "meta": {
                      "unit": "percent"
                    },
                    "scaling_factor": 1000,
                    "time_series_metric": "gauge"
                  },
                  "ticks": {
                    "type": "long",
                    "time_series_metric": "counter"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
Index settings
{
  "settings": {
    "index": {
      "mapping": {
        "total_fields": {
          "limit": "10000"
        }
      },
      "hidden": "true",
      "time_series": {
        "end_time": "2023-06-02T15:09:20.000Z",
        "start_time": "2023-06-02T10:47:46.000Z"
      },
      "provided_name": ".ds-metrics-system.core-default-2023.06.02-000002",
      "final_pipeline": ".fleet_final_pipeline-1",
      "query": {
        "default_field": [
          "cloud.account.id",
          "cloud.availability_zone",
          "cloud.instance.id",
          "cloud.instance.name",
          "cloud.machine.type",
          "cloud.provider",
          "cloud.region",
          "cloud.project.id",
          "cloud.image.id",
          "container.id",
          "container.image.name",
          "container.name",
          "host.hostname",
          "host.id",
          "host.os.build",
          "host.os.codename",
          "host.os.family",
          "host.os.full",
          "host.os.kernel",
          "host.os.name",
          "host.os.platform",
          "host.os.version",
          "host.architecture",
          "host.mac",
          "host.name",
          "host.type",
          "agent.id",
          "system.core.id"
        ]
      },
      "creation_date": "1685710066976",
      "number_of_replicas": "1",
      "routing_path": [
        "host.name",
        "container.id",
        "cloud.instance.id",
        "cloud.availability_zone",
        "cloud.account.id",
        "agent.id",
        "system.core.id",
        "cloud.provider",
        "cloud.region"
      ],
      "uuid": "eiJy8iZKTnOrOW85vY6PQg",
      "version": {
        "created": "8080099"
      },
      "lifecycle": {
        "name": "metrics"
      },
      "mode": "time_series",
      "codec": "best_compression",
      "routing": {
        "allocation": {
          "include": {
            "_tier_preference": "data_hot"
          }
        }
      },
      "number_of_shards": "1",
      "default_pipeline": "metrics-system.core-1.32.0-beta.1"
    }
  },
  "defaults": {
    "index": {
      "flush_after_merge": "512mb",
      "time_series": {
        "es87tsdb_codec": {
          "enabled": "true"
        }
      },
      "max_inner_result_window": "100",
      "unassigned": {
        "node_left": {
          "delayed_timeout": "1m"
        }
      },
      "max_terms_count": "65536",
      "rollup": {
        "source": {
          "name": "",
          "uuid": ""
        }
      },
      "lifecycle": {
        "parse_origination_date": "false",
        "prefer_ilm": "true",
        "step": {
          "wait_time_threshold": "12h"
        },
        "indexing_complete": "false",
        "rollover_alias": "",
        "origination_date": "-1"
      },
      "routing_partition_size": "1",
      "force_memory_term_dictionary": "false",
      "max_docvalue_fields_search": "100",
      "merge": {
        "scheduler": {
          "max_thread_count": "4",
          "auto_throttle": "true",
          "max_merge_count": "9"
        },
        "policy": {
          "merge_factor": "32",
          "floor_segment": "2mb",
          "max_merge_at_once_explicit": "30",
          "max_merge_at_once": "10",
          "max_merged_segment": "0b",
          "expunge_deletes_allowed": "10.0",
          "segments_per_tier": "10.0",
          "type": "UNSET",
          "deletes_pct_allowed": "20.0"
        }
      },
      "max_refresh_listeners": "1000",
      "max_regex_length": "1000",
      "load_fixed_bitset_filters_eagerly": "true",
      "number_of_routing_shards": "1",
      "write": {
        "wait_for_active_shards": "1"
      },
      "verified_before_close": "false",
      "mapping": {
        "coerce": "false",
        "nested_fields": {
          "limit": "50"
        },
        "depth": {
          "limit": "20"
        },
        "field_name_length": {
          "limit": "9223372036854775807"
        },
        "nested_objects": {
          "limit": "10000"
        },
        "ignore_malformed": "false",
        "dimension_fields": {
          "limit": "21"
        }
      },
      "source_only": "false",
      "soft_deletes": {
        "enabled": "true",
        "retention": {
          "operations": "0"
        },
        "retention_lease": {
          "period": "12h"
        }
      },
      "max_script_fields": "32",
      "query": {
        "parse": {
          "allow_unmapped_fields": "true"
        }
      },
      "format": "0",
      "frozen": "false",
      "sort": {
        "missing": [],
        "mode": [],
        "field": [],
        "order": []
      },
      "priority": "1",
      "version": {
        "compatibility": "8080099"
      },
      "max_rescore_window": "10000",
      "bloom_filter_for_id_field": {
        "enabled": "true"
      },
      "max_adjacency_matrix_filters": "100",
      "analyze": {
        "max_token_count": "10000"
      },
      "gc_deletes": "60s",
      "top_metrics_max_size": "10",
      "optimize_auto_generated_id": "true",
      "max_ngram_diff": "1",
      "translog": {
        "flush_threshold_age": "1m",
        "generation_threshold_size": "64mb",
        "flush_threshold_size": "10gb",
        "sync_interval": "5s",
        "retention": {
          "size": "-1",
          "age": "-1"
        },
        "durability": "REQUEST"
      },
      "auto_expand_replicas": "false",
      "recovery": {
        "type": ""
      },
      "requests": {
        "cache": {
          "enable": "true"
        }
      },
      "data_path": "",
      "highlight": {
        "max_analyzed_offset": "1000000"
      },
      "routing": {
        "rebalance": {
          "enable": "all"
        },
        "allocation": {
          "disk": {
            "watermark": {
              "ignore": "false"
            }
          },
          "enable": "all",
          "total_shards_per_node": "-1"
        }
      },
      "search": {
        "slowlog": {
          "level": "TRACE",
          "threshold": {
            "fetch": {
              "warn": "-1",
              "trace": "-1",
              "debug": "-1",
              "info": "-1"
            },
            "query": {
              "warn": "-1",
              "trace": "-1",
              "debug": "-1",
              "info": "-1"
            }
          }
        },
        "idle": {
          "after": "30s"
        },
        "throttled": "false"
      },
      "fielddata": {
        "cache": "node"
      },
      "look_ahead_time": "2h",
      "max_slices_per_scroll": "1024",
      "shard": {
        "check_on_startup": "false"
      },
      "xpack": {
        "watcher": {
          "template": {
            "version": ""
          }
        },
        "version": "",
        "ccr": {
          "following_index": "false"
        }
      },
      "percolator": {
        "map_unmapped_fields_as_text": "false"
      },
      "allocation": {
        "max_retries": "5",
        "existing_shards_allocator": "gateway_allocator"
      },
      "refresh_interval": "1s",
      "indexing": {
        "slowlog": {
          "reformat": "true",
          "threshold": {
            "index": {
              "warn": "-1",
              "trace": "-1",
              "debug": "-1",
              "info": "-1"
            }
          },
          "source": "1000",
          "level": "TRACE"
        }
      },
      "compound_format": "1gb",
      "blocks": {
        "metadata": "false",
        "read": "false",
        "read_only_allow_delete": "false",
        "read_only": "false",
        "write": "false"
      },
      "max_result_window": "10000",
      "store": {
        "stats_refresh_interval": "10s",
        "type": "",
        "fs": {
          "fs_lock": "native"
        },
        "preload": [],
        "snapshot": {
          "snapshot_name": "",
          "index_uuid": "",
          "cache": {
            "prewarm": {
              "enabled": "true"
            },
            "enabled": "true",
            "excluded_file_types": []
          },
          "repository_uuid": "",
          "uncached_chunk_size": "-1b",
          "delete_searchable_snapshot": "false",
          "index_name": "",
          "partial": "false",
          "blob_cache": {
            "metadata_files": {
              "max_length": "64kb"
            }
          },
          "repository_name": "",
          "snapshot_uuid": ""
        }
      },
      "queries": {
        "cache": {
          "enabled": "true"
        }
      },
      "shard_limit": {
        "group": "normal"
      },
      "warmer": {
        "enabled": "true"
      },
      "downsample": {
        "source": {
          "name": "",
          "uuid": ""
        },
        "status": "unknown"
      },
      "override_write_load_forecast": "0.0",
      "max_shingle_diff": "3",
      "query_string": {
        "lenient": "false"
      }
    }
  }
}

cc @lalit-satapathy @mvg

elasticsearchmachine added the needs:triage label on Jun 5, 2023.
lalit-satapathy added the :StorageEngine/TSDB label on Jun 5, 2023.
elasticsearchmachine added the Team:Analytics label and removed the needs:triage label on Jun 5, 2023.
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@martijnvg
Member

In this case the id field is mapped as a keyword field and as a dimension field, so Elasticsearch automatically adds it to the index.routing_path index setting. Routing path fields are used on the coordinating node to route documents to the right shard. Parsing numbers there can be dangerous because, depending on the mappings, a JSON field can be parsed differently on the data node at indexing time, and the mapping of a backing index may not be available on the coordinating node. However, in this case the id field is mapped as keyword, and at indexing time the data node parses the values as strings, even if the JSON document specifies them as numbers. We can do the same on the coordinating node, so there shouldn't be a reason to fail indexing as reported here. I think this is safe, also because iirc only keyword fields can be defined as routing fields; otherwise a failure occurs when defining the template (mapping and settings).
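The idea in this comment can be sketched as a toy Python model; it is an illustration under stated assumptions, not the actual Elasticsearch change. The helper names `lookup` and `extract_routing_values_lenient` are hypothetical. The sketch relies on `json.loads` passing the literal number text to `parse_int` and `parse_float`, so a numeric routing value is read as the string that appears in the JSON source.

```python
import json


def lookup(doc, dotted_path):
    """Return the value at a dotted path such as "system.core.id", or None if absent."""
    node = doc
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def extract_routing_values_lenient(raw_json, routing_path):
    """Toy model: read numeric routing values as their literal text instead of failing."""
    # parse_int / parse_float receive the number exactly as written in the source,
    # so `1` becomes "1" and `1.50` stays "1.50".
    doc = json.loads(raw_json, parse_int=str, parse_float=str)
    values = {}
    for path in routing_path:
        value = lookup(doc, path)
        if value is None:
            continue  # field is absent (or null) in this document
        if not isinstance(value, str):
            # This toy only handles scalar strings and numbers.
            raise ValueError(f"Unsupported routing value for [{path}]")
        values[path] = value
    return values


raw = '{"host": {"name": "kind-worker"}, "system": {"core": {"id": 1}}}'
print(extract_routing_values_lenient(raw, ["host.name", "system.core.id"]))
```

Keeping the literal text rather than formatting a parsed number avoids a round trip through a numeric type, which could otherwise change the string (for example, by dropping a trailing zero).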

@ruflin
Member

ruflin commented Jun 14, 2023

As this is a bug in Elasticsearch, @tetianakravchenko I suggest we wait with elastic/integrations#6454 until this change makes it in.

@tetianakravchenko
Author

> As this is a bug in Elasticsearch, @tetianakravchenko I suggest we wait with elastic/integrations#6454 until this change makes it in.

@ruflin then I will revert the field type change so the field stays keyword (to avoid changing field types) and merge elastic/integrations#6454. Enabling TSDB on the core data_stream will wait for the fix.
