Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Validation] Detect incorrectly defined dynamic mappings #624

Closed
Tracked by #539
andrewkroh opened this issue Sep 29, 2023 · 4 comments · Fixed by #628
Closed
Tracked by #539

[Validation] Detect incorrectly defined dynamic mappings #624

andrewkroh opened this issue Sep 29, 2023 · 4 comments · Fixed by #628
Assignees
Labels
discuss Issue needs discussion

Comments

@andrewkroh
Copy link
Member

Detect cases where a developer thinks they have declared a dynamic mapping, but have not. Example:

    - name: disks.states.*
      type: object
      description: |
        map of raw disk states

The rule would detect fields where

  • name contains an asterisk
  • type == "object"
  • object_type is not set

Without specifying an object_type the result index template will declare a mapping for a literal name containing * asterisks and that is unexpected/surprising/wrong.

            "disks": {
              "properties": {
                "states": {
                  "properties": {
                    "*": {
                      "type": "object"
                    }
                  }
                }
              }
            },
@andrewkroh
Copy link
Member Author

As of elastic/integrations@eaa7acf this problem occurs 61 times across 12 integrations.

@jsoriano
Copy link
Member

jsoriano commented Oct 2, 2023

@andrewkroh thanks for reporting, this actually looks like a definition we shouldn't allow. I wonder if it was assumed that this should behave as if object_type would be keyword.

I have started #628 to fix this.

I am disallowing any case of type: object without object_type. There can be fields with type object and without wildcards, on these cases it is assumed that the name is the base of the path and .* is appended to the path_match. Implemented in Fleet in elastic/kibana#137772, in coherence with the behaviour in Beats.

jsoriano added a commit that referenced this issue Oct 2, 2023
A field of type object without object_type probably assumes that the type should be keyword,
but Fleet is producing invalid mappings (see #624). So better to disallow these mappings to
remove ambiguities.
@andrewkroh
Copy link
Member Author

andrewkroh commented Oct 2, 2023

I think we are missing a comprehensive guide to declaring dynamic mappings. We need a list of all the conditions that lead to a dynamic mapping being added by Fleet. This will help guide us in deciding what validations to apply. These are case I have observed, and I wonder if there are more.

  1. Wildcard name and type (excluding type: object). In the case Fleet treats the type value like the object_type.
  2. Wildcard name and type: object and object_type

I am disallowing any case of type: object without object_type. There can be fields with type object and without wildcards, on these cases it is assumed that the name is the base of the path and .* is appended to the path_match.

It is true that there can be type: object without a wildcard name. However I don't see any dynamic_template fields being created in Fleet as a result. A type: object results in a object field type mapping (example). So we may not want to add validation that requires object_type unless we want to disallow those current usages.

There is another area where this proposed validation would cause issues in the incorrect usage of type: object in places that should use type: group (example). We don't want users to add object_type to these usages. It should probably be a requirement to use type: group when fields is non-null.

@maxcold
Copy link

maxcold commented Dec 4, 2023

Hi @andrewkroh @jsoriano , after specifying object_type: keyword for cloud_security_posture integration package in https://github.com/elastic/integrations/pull/8162/files#diff-18ed2920635e3b0afa22206ee3eed89a0454388db04fc2e8114ae16983ad048cR14 I discovered a behavior which was unexpected for me. We have this field mapping now

  - name: raw
      type: object
      object_type: keyword
      enabled: false

and as we have enabled: false I would expect that the ES would ignore the value of this field. But I see errors like failed to parse field [resource.raw.resource.data.status] of type [keyword] in document with id 'ZdYoL4wBECcgdAPTdo-k'

Am I missing smth in the understanding of how the indexing work in that case?

Btw object_type is not documented on https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html which might be confusing as it's required now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issue needs discussion
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants