Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing properties to fields.yml spec #314

Merged
merged 18 commits into from
May 9, 2022

Conversation

andrewkroh
Copy link
Member

@andrewkroh andrewkroh commented Apr 5, 2022

What does this PR do?

I looked over the code in Fleet EPM to see what properties are used when generating templates and then updated the specification to match.

The current version is a draft because it contains commentary as "TODO"s.

Why is it important?

A detailed and accurate specification is required to ensure the developers of integration packages and the implementors of the specification agree on the behavior.

Checklist

Related issues

Author Notes

Based on feedback in this PR we need to follow up on these tasks.

  • Answer why Fleet has include_in_root, include_in_parent. These are for nested type fields and are now in the spec.
  • Answer why Fleet has "group-nested" and "nested". It's internal to Fleet. No integrations will use it.
  • null_value should be supported by types other than wildcard.
  • Add support from dynamic: runtime.
  • Update schema to support dynamic mappings (e.g. path_match, path_unmatch, match, unmatch, etc). This blocks development of [Fleet] dynamic_template mappings for wildcard field names are not installed kibana#129344. We'll want to address this before incorporating this into elastic-package so that there is a migration path for fields that "think" they are creating dynamic mappings today and are using unsupported properties.
  • Fleet should support multi_field on match_only_text field types.
  • Fleet should add support for norms on text fields.
  • Need to determine how deal with what format does in Beats (e.g. applying a formatter to the data view).

- geo_point
- object
- group
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fleet EPM has a group-nested type. Does that need added to the list? Why would we have both group-nested and nested?

https://github.com/elastic/kibana/blob/0695df649705c91ed28dbc73037aea878ce419f7/x-pack/plugins/fleet/server/services/epm/elasticsearch/template/template.ts#L120

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this type used in integrations nor in beats, I would say to don't add it here, we can add it in a follow up if needed.

@joshdover do you know why this type is in Fleet?

Copy link

@juliaElastic juliaElastic Apr 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't try to follow the logic, but my take is that is seems like an internal-only value. So none of the integrations should use group-nested and it does not need to be in the spec.

@elasticmachine
Copy link

elasticmachine commented Apr 5, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-04-29T20:37:33.987+0000

  • Duration: 2 min 28 sec

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @andrewkroh! Awesome revamp of the fields spec.

versions/1/data_stream/fields/fields.spec.yml Outdated Show resolved Hide resolved
versions/1/data_stream/fields/fields.spec.yml Outdated Show resolved Hide resolved
- geo_point
- object
- group
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this type used in integrations nor in beats, I would say to don't add it here, we can add it in a follow up if needed.

@joshdover do you know why this type is in Fleet?

versions/1/data_stream/fields/fields.spec.yml Outdated Show resolved Hide resolved
versions/1/data_stream/fields/fields.spec.yml Outdated Show resolved Hide resolved
Only honored for 'type: text/keyword/wildcard'.
$ref: "#" # JSON-schema syntax for pointing to the root of the schema

# TODO: Fleet only honors this for 'type: wildcard'. That seems wrong.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @joshdover

versions/1/data_stream/fields/fields.spec.yml Outdated Show resolved Hide resolved
Comment on lines 9 to 11
# TODO: We should disable additionalProperties. There are properties used in
# elastic/integrations that have no purpose or are typos.
additionalProperties: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, specially after adding these missing properties. @mtojek wdyt?

Suggested change
# TODO: We should disable additionalProperties. There are properties used in
# elastic/integrations that have no purpose or are typos.
additionalProperties: true
# TODO: We should disable additionalProperties. There are properties used in
# elastic/integrations that have no purpose or are typos.
additionalProperties: false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on my side too.

I think that we left it open to allow for extra properties like examples, patterns, etc. It seems that we aren't using the actively, so let's close this option.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set additionalProperties: false and adjusted some of the test packages to remove level, group, and title.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many fields using level, group and/or title in the integrations repo, wdyt about adding these fields here by now? We could add them directly as deprecated, and eventually phase them out.

I have created this PR to check what would fail in the integrations repo after this change: elastic/integrations#3024

@andrewkroh andrewkroh marked this pull request as ready for review April 6, 2022 21:50
@andrewkroh andrewkroh requested a review from a team as a code owner April 6, 2022 21:50
@joshdover
Copy link
Contributor

@juliaElastic Do you have time to review this? I believe you're more up to date on our mapping field logic than I am.

@juliaElastic
Copy link

@juliaElastic Do you have time to review this? I believe you're more up to date on our mapping field logic than I am.

I can have a look later today.

juliaElastic
juliaElastic previously approved these changes Apr 29, 2022
Copy link

@juliaElastic juliaElastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jsoriano
Copy link
Member

I have updated elastic/integrations#3024, and there are still some missing fields (level, group, title, example...). It may be hard to support all fields on a first pass.

@andrewkroh @mtojek wdyt about setting additionalProperties back to true so we can merge all the fields and docs included here, and continue adding fields in future PRs?

@andrewkroh
Copy link
Member Author

additionalProperties back to true

👍 That's what I'm thinking to do. Even if we remove level, group, title, example there will still be some keys related to dynamic fields that we need to account for. So after we specify the dynamic field behavior we can make this strict.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I have updated elastic/integrations#3024 as a last check in case there is something unexpected.

@@ -35,3 +35,15 @@
- name: error.message
description: Error message.
type: match_only_text
- name: metric.*_bytes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. Is this an intentional test case? We don't use to have wildcards mixed with suffixes, not sure if everything works as expected with this.

Suggested change
- name: metric.*_bytes
- name: metric.*.bytes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intentional to test the pattern specified for name. It really is an area of unspecified behavior at the moment. I was operating under the assumption that wildcards will be used to implement behavior similar to path_match. But for now this is a test case just to check that the pattern matches fields like this:

https://github.com/elastic/integrations/blob/3c88555054fd24a461babebe624d8645b5feea6c/packages/aws/data_stream/cloudwatch_metrics/fields/package-fields.yml#L16

We should refine this area of the spec for dynamic fields.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could remove the pattern for now. Then when we get to dynamic fields add it back to match the definition there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok, lets leave it. Thanks.

jsoriano
jsoriano previously approved these changes Apr 29, 2022
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending failures in elastic/integrations#3024 don't seem to be related to this, but to #309.

@@ -35,3 +35,15 @@
- name: error.message
description: Error message.
type: match_only_text
- name: metric.*_bytes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok, lets leave it. Thanks.

@andrewkroh
Copy link
Member Author

I've been looking over the issues in the integrations repo. After fixing many unused fields and typos (see https://github.com/elastic/integrations/pull/3239/commits) this is what's left:

  • 763 instances of format - used to specify things like byte, percent
  • 1162 instances of norms - used to specify https://www.elastic.co/guide/en/elasticsearch/reference/current/norms.html
  • 164 instances of object_type_mapping_type - part of dynamic fields behavior that is TBD
  • 61 issues with field names that don't match pattern. We need to include - (dash) in the list of accepted characters.

@andrewkroh
Copy link
Member Author

andrewkroh commented Apr 29, 2022

After allowing a dash there are three issues with names, and they all seems like problems with the integration (ignoring issues from the norms, format, object_type_mapping_type, and duplicate fields):

DBClusterIdentifier,Role - https://github.com/elastic/integrations/blob/4247dfe2b3fe14d5e3dd02eb2e9891814e9f92e0/packages/aws/data_stream/rds/fields/fields.yml#L13

DbClusterIdentifier, EngineName - https://github.com/elastic/integrations/blob/4247dfe2b3fe14d5e3dd02eb2e9891814e9f92e0/packages/aws/data_stream/rds/fields/fields.yml#L16

Application,Platform - https://github.com/elastic/integrations/blob/8770467808f03a38de9563d7052a22ea1e472eea/packages/aws/data_stream/sns/fields/fields.yml#L10 (edited: to fix line reference)

So I think this is good now.

@kaiyan-sheng
Copy link
Contributor

DBClusterIdentifier,Role

@jsoriano Hmmm good catch, I don't think we are actually storing dimensions like this. For example, this dimension is stored as aws.dimensions.DBClusterIdentifier and aws.dimensions.Role as two separate fields. These fields are documented wrong, I will fix them.

@andrewkroh
Copy link
Member Author

@kaiyan-sheng Thanks for addressing RDS in elastic/integrations#3253. Is the SNS dimension a bug or are we expecting to have JSON key names containing commas?

@kaiyan-sheng
Copy link
Contributor

Is the SNS dimension a bug or are we expecting to have JSON key names containing commas?

@andrewkroh Good catch! SNS dimension is also a bug and I just added the fix into the same PR. Thank you!

@andrewkroh andrewkroh requested a review from jsoriano May 4, 2022 12:46
@andrewkroh
Copy link
Member Author

I think this is ready now. I will start creating follow-up issues for the things listed in the PR description.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks Andrew! @mtojek @juliaElastic any objection to merge this?

@jsoriano
Copy link
Member

jsoriano commented May 9, 2022

Oh, one thing would be missing, @andrewkroh could you please add a changelog entry?

@jsoriano
Copy link
Member

jsoriano commented May 9, 2022

Ah, changelog is already here 🤦 Merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants