Allow customizing managed data streams at different levels of granularity #97664

Open
felixbarny opened this issue Jul 13, 2023 · 19 comments
Labels
:Data Management/Data streams Data streams and their lifecycles Team:Data Management Meta label for data/management team

Comments

@felixbarny
Member

What are we trying to achieve?

On several occasions, we've discussed adding ways for users to customize data streams that are set up via Fleet and via the built-in index templates, without having to create a copy of the index template and take on the onus of maintaining the whole index template going forward. Instead, we'd want to offer dedicated extension points so that users can configure different settings/mappings/lifecycles at different levels of the data stream naming scheme:

  • All data streams (*-*-*)
  • All data streams with a certain type ({type}-*-*)
  • All data streams with a certain type and dataset ({type}-{dataset}-*)
  • All data streams with a certain type, dataset, and namespace ({type}-{dataset}-{namespace})
  • All data streams with a certain type and namespace ({type}-*-{namespace})
  • All data streams with a certain namespace (*-*-{namespace})

Some concrete use cases:

  • A user wants to send the observability signals of their tier 1 applications to a separate namespace to keep the data in the hot tier for longer and to have a longer retention
  • Setting the default retention for logs to 30 days and for metrics to 90 days
  • Enable synthetic _source for the logs-foo-* data stream that is using the logs-*-* index template, without having to create a copy of the index template with a logs-foo-* index pattern.

Why this should be in Elasticsearch

The previous discussions (elastic/kibana#149484, elastic/kibana#121118) have mostly been focussed on Fleet. But I have a strong preference for not putting this into Fleet but into Elasticsearch so that data streams that are not managed by Fleet (such as the data streams for the built-in index templates logs-*-* and metrics-*-*) can benefit from that as well.

Why is this important

This gets more important in the context of the reroute processor, as documents can be routed to data streams that aren't managed by or known to Fleet. Also, we're considering moving the APM index templates out of Fleet and into Elasticsearch (see #97546).

A potential solution

I've proposed one potential solution to this here: elastic/kibana#121118 (comment)

Essentially, we'd add a couple of component templates into the index templates that are managed by Fleet and Elasticsearch. For example, the composed_of section of the logs-*-* index template that is built into Elasticsearch would be extended by component templates that have a placeholder in them (exact naming tbd).

  composed_of:
   - logs@custom
   - logs-*-{{data_stream.namespace}}@custom
   - logs-{{data_stream.dataset}}@custom
   - logs-{{data_stream.dataset}}-{{data_stream.namespace}}@custom

Valid placeholders are any constant_keyword fields.

If a user wants to customize a concrete data stream logs-foo-bar, they can create the following component templates:

  • logs@custom
  • logs-*-bar@custom
  • logs-foo@custom
  • logs-foo-bar@custom
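
As a rough illustration of how the placeholders above could resolve for logs-foo-bar, here is a minimal Python sketch. It assumes the placeholder values are taken from the data stream name itself (the proposal talks about constant_keyword fields), and `expand_placeholders` is a hypothetical helper, not an Elasticsearch API.

```python
# Hypothetical sketch: expand {{data_stream.*}} placeholders in a composed_of list.
COMPOSED_OF = [
    "logs@custom",
    "logs-*-{{data_stream.namespace}}@custom",
    "logs-{{data_stream.dataset}}@custom",
    "logs-{{data_stream.dataset}}-{{data_stream.namespace}}@custom",
]


def expand_placeholders(composed_of: list[str], data_stream: str) -> list[str]:
    """Resolve placeholders for a {type}-{dataset}-{namespace} data stream name."""
    # Assumption: the type is everything before the first hyphen and the
    # namespace is everything after the last one.
    type_, rest = data_stream.split("-", 1)
    dataset, namespace = rest.rsplit("-", 1)
    values = {
        "data_stream.type": type_,
        "data_stream.dataset": dataset,
        "data_stream.namespace": namespace,
    }
    resolved = []
    for name in composed_of:
        for key, value in values.items():
            name = name.replace("{{" + key + "}}", value)
        resolved.append(name)
    return resolved


print(expand_placeholders(COMPOSED_OF, "logs-foo-bar"))
```

For logs-foo-bar this yields the four component template names listed above, in the same order.
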
@felixbarny felixbarny added the :Data Management/Data streams Data streams and their lifecycles label Jul 13, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Jul 13, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@BBQigniter

imho this also relates to #91370

@joshdover
Contributor

Big +1 on solving this with an ability to reference a "templated" component template name. I have one suggestion on the solution which may make it a bit simpler.

I think there will be a slight issue with trying to use constant_keyword fields as candidates for replacement variables, as those fields themselves may be defined in component templates (and in Fleet's case, are). It's a chicken-and-egg problem of having to look up the component templates to know which other component templates match. I guess this could be solved by looking up the component templates without variables first, and then matching the rest, but I think it's more complicated than necessary and confusing from a user perspective.

Instead, I'd suggest we have some ability to name the wildcards in the main template's index pattern (similar to a named regexp capture group) and then reference those as variables in the composed_of array, like this:

index_patterns:
  - logs-(*:dataset)-(*:namespace)
composed_of:
  - logs@custom
  - logs-*-{{namespace}}@custom
  - logs-{{dataset}}@custom
  - logs-{{dataset}}-{{namespace}}@custom

I think this is simpler, more obvious, and less tied to any specific convention. It also has the nice side benefit that it constrains the possibilities to only strings that appear in the actual name of the index/data stream, rather than fields in the document that may not be part of the index name.
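
To make the named-wildcard idea concrete, here is a minimal Python sketch. The `(*:name)` syntax is the one suggested above and does not exist in Elasticsearch; `compile_pattern` and `resolve_composed_of` are hypothetical helpers, and the sketch assumes each named wildcard matches a run of characters without hyphens.

```python
import re


def compile_pattern(index_pattern: str) -> re.Pattern:
    """Turn 'logs-(*:dataset)-(*:namespace)' into a regex with named groups."""
    parts = []
    pos = 0
    for m in re.finditer(r"\(\*:(\w+)\)", index_pattern):
        # Literal text between named wildcards is matched verbatim.
        parts.append(re.escape(index_pattern[pos:m.start()]))
        # Assumption: a captured segment never contains a hyphen.
        parts.append(f"(?P<{m.group(1)}>[^-]+)")
        pos = m.end()
    parts.append(re.escape(index_pattern[pos:]))
    return re.compile("".join(parts))


def resolve_composed_of(index_pattern, composed_of, name):
    """Return composed_of with {{variables}} filled in, or None if no match."""
    match = compile_pattern(index_pattern).fullmatch(name)
    if match is None:
        return None
    resolved = []
    for template in composed_of:
        for key, value in match.groupdict().items():
            template = template.replace("{{" + key + "}}", value)
        resolved.append(template)
    return resolved


composed_of = [
    "logs@custom",
    "logs-*-{{namespace}}@custom",
    "logs-{{dataset}}@custom",
    "logs-{{dataset}}-{{namespace}}@custom",
]
print(resolve_composed_of(
    "logs-(*:dataset)-(*:namespace)", composed_of, "logs-nginx.access-prod"
))
```

Because the variables come only from the matched name, the set of possible component template names is limited to strings that appear in the index or data stream name.
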

@joshdover
Contributor

@BBQigniter do you think this will fully solve the problems described in #91370 or is there more we need to accommodate?

@BBQigniter

@joshdover not completely sure but your proposal looks good to me :)

@felixbarny
Member Author

Instead, I'd suggest we have some ability to name the wildcards in the main template's index pattern (similar to a named regexp capture group) and then reference those as variables in the composed_of array

Seems like a much better and simpler idea compared to relying on constant_keyword fields! Love it!

From what I can tell, these are some aspects of #91370 that this proposal wouldn't tackle:

  • Having separate component templates for different ECS namespaces. However, we're moving towards using a dynamic template-based approach to mapping ECS fields: [Logs+] Adding ECS dynamic templates #96171, Dynamic ECS mapping progress integrations#5055
  • Customize data streams at the integration granularity. While this proposal allows customizing at the data stream level, if an integration contains multiple data streams, you can't easily apply configurations to all data streams of an integration. I suppose this can be achieved by Fleet automatically adding a custom component template for an integration.

@joshdover
Contributor

Customize data streams at the integration granularity. While this proposal allows customizing at the data stream level, if an integration contains multiple data streams, you can't easily apply configurations to all data streams of an integration. I suppose this can be achieved by Fleet automatically adding a custom component template for an integration.

Fleet adding an explicit component template would work.

Another option would be to make the dotted part of the dataset part of the pattern, so you could do something like this (not sure I like the names I used, but you get the idea):

index_patterns:
  - logs-(*:dataset_prefix).(*:dataset_suffix)-(*:namespace)
composed_of:
  - logs@custom
  - logs-*-{{namespace}}@custom
  - logs-{{dataset_prefix}}@custom
  - logs-{{dataset_prefix}}.{{dataset_suffix}}@custom
  - logs-{{dataset_prefix}}.{{dataset_suffix}}-{{namespace}}@custom

I've wondered if the two-part dataset should be part of the DSNS convention or not - we use this pattern fairly consistently, though not everywhere.

@felixbarny
Member Author

As not all datasets have a prefix and suffix separated by a dot, the logs-*.*-* index pattern wouldn't match all data streams.

But either way, it seems the placeholders in component template references could also be used to add extension points to all data streams of an integration.
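
The matching concern above can be checked with a quick Python sketch. It uses `fnmatchcase` from the standard library as a stand-in for Elasticsearch's wildcard matching (an assumption; the two are not guaranteed to behave identically), and the data stream names are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical data stream names: one dotted dataset, one without a dot.
names = ["logs-nginx.access-default", "logs-generic-default"]

# Which names does each pattern cover?
dotted = {name: fnmatchcase(name, "logs-*.*-*") for name in names}
plain = {name: fnmatchcase(name, "logs-*-*") for name in names}

print(dotted)
print(plain)
```

Only the dataset containing a dot matches logs-*.*-*, while both names match logs-*-*.
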

@ruflin
Member

ruflin commented Jul 31, 2023

++ on moving forward with the placeholder approach. It will not solve all problems but I think it will solve quite a few.

@dakrone Would be great to get your feedback on this.

@dakrone
Member

dakrone commented Aug 3, 2023

Thanks for bringing this up Felix, and others for the discussion so far. We met today as a team to discuss this. We have a couple of reservations and some thoughts I'll try to share.

First, I don't think the proposed solution of having placeholders where wildcards are essentially "captured" (the logs-(*:dataset_prefix).(*:dataset_suffix)-(*:namespace) suggestion) is going to be a good solution. We rely on knowing exactly how templates are composed in order to be able to validate changes to both index and component templates when they're added/updated/removed. If we went with the pattern capturing solution it would mean that we could no longer validate the templates, because we wouldn't know what the composition is going to be until index or data stream creation time.

Second, the other option that I see currently would be for us to use a naming scheme for customizing component templates, for example, we'd change all of our logs integrations and built-in templates to reference the logs@custom component template, so that any user-customization can be done there. We'd do this at varying levels of granularity, so we'd end up with a logs-nginx@custom one for the Nginx integration, and so on. This would require Fleet to specify the correct names of the customization when installing an integration. This would be better than the previous solution, since we would still be able to validate both component and index templates. ...@custom component templates would not have to exist because we can specify in the index template to skip failing before they're created.

The challenging part of the second solution is that we run into a composition problem when it comes to a change that a user wants to make with respect to a particular attribute of a data stream. For example, imagine a user that wants to make a change to the "global" data stream configuration, to set a project-level retention for all *-*-* data. Ideally this would mean they would add a {"data_retention": "30d"} configuration to a global@custom component template. But what happens when this global component template is composed into an index template that does not specify "data_stream": {} in its configuration? Or an index template that is managed but that disables data stream lifecycles? We could either be lenient and allow the composition, ignoring it (which is unfortunate because it introduces leniency), or disallow it and force a user to reckon with varying configuration parameters at different levels of granularity that may or may not be allowed. The example is for retention, but it can be extrapolated to any template configuration such as index settings, mappings, or aliases. Depending on how strict we want to be and exactly how we want to use the ...@custom component templates, we may minimize or increase this risk.

I don't think the placeholder meets the needs we have without introducing unacceptable leniency. The second is more workable but has some pieces and use-cases that we'd need to work through to make sure that we don't end up with a rigid or brittle system. What do you think?

@felixbarny
Member Author

If we went with the pattern capturing solution it would mean that we could no longer validate the templates, because we wouldn't know what the composition is going to be until index or data stream creation time.

If all individual component templates are valid themselves, in what situation can the composition be invalid?

This would require Fleet to specify the correct names of the customization when installing an integration.

This sounds similar to elastic/kibana#121118. We've closed this issue because we'd like a solution that doesn't rely on Fleet to set up the data streams in the right way so that we can have the same extension points for the index templates that ship with Elasticsearch, such as logs-*-*. Even if relying on Fleet for this, there wouldn't be a way to allow customization at the namespace level, without having to create an index template for each namespace (which may not be known upfront but dynamically determined by a reroute processor).

@dakrone
Member

dakrone commented Aug 3, 2023

If all individual component templates are valid themselves, in what situation can the composition be invalid?

It's not just component templates that must be valid, but also their use by the index template. For a (contrived) example, this is valid and allows an index to be created:

PUT /_component_template/one
{
  "template": {
    "mappings": {
      "properties": {
        "field": {
          "type": "text"
        }
      }
    }
  }
}

PUT /_index_template/it
{
  "index_patterns": ["foo"],
  "data_stream": {},
  "composed_of": ["one"],
  "template": {
    "mappings": {
      "properties": {
        "alias-field": {
          "type": "alias",
          "path": "field"
        }
      }
    }
  }
}

But if you try to change the name of the field, you get an error:

PUT /_component_template/one
{
  "template": {
    "mappings": {
      "properties": {
        "other-field": {
          "type": "text"
        }
      }
    }
  }
}
// Returns:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "updating component template [one] results in invalid composable template [it] after templates are merged"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "updating component template [one] results in invalid composable template [it] after templates are merged",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "composable template [it] template after composition with component templates [one] is invalid",
      "caused_by" : {
        "type" : "illegal_argument_exception",
        "reason" : "invalid composite mappings for [it]",
        "caused_by" : {
          "type" : "mapper_parsing_exception",
          "reason" : "Invalid [path] value [field] for field alias [alias-field]: an alias must refer to an existing field in the mappings."
        }
      }
    }
  },
  "status" : 400
}

This is just one contrived example.
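
The kind of check behind the error above can be sketched in a few lines of Python. This is only an illustration of validating the composed result, not how Elasticsearch implements it; it handles flat mapping properties only, and both helper names are hypothetical.

```python
def composed_properties(component_template: dict, index_template: dict) -> dict:
    """Merge flat mapping properties; the index template wins on conflicts."""
    merged = dict(component_template["template"]["mappings"]["properties"])
    merged.update(index_template["template"]["mappings"]["properties"])
    return merged


def dangling_aliases(properties: dict) -> list[str]:
    """Return alias fields whose path is not a field in the merged mappings."""
    return [
        name
        for name, field in properties.items()
        if field.get("type") == "alias" and field.get("path") not in properties
    ]


index_template = {"template": {"mappings": {"properties": {
    "alias-field": {"type": "alias", "path": "field"},
}}}}
# Component template "one" before and after the field is renamed.
before = {"template": {"mappings": {"properties": {"field": {"type": "text"}}}}}
after = {"template": {"mappings": {"properties": {"other-field": {"type": "text"}}}}}

print(dangling_aliases(composed_properties(before, index_template)))
print(dangling_aliases(composed_properties(after, index_template)))
```

Each template is fine on its own; only the merged result reveals that alias-field points at a field that no longer exists, which is why the composition has to be known when a template changes.
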

@joshdover
Contributor

Even if relying on Fleet for this, there wouldn't be a way to allow customization at the namespace level, without having to create an index template for each namespace (which may not be known upfront but dynamically determined by a reroute processor).

I think this is the biggest downside - these potential index templates needed are not known at integration installation time. They may only exist later.

Now you could argue that the user won't really need to make any namespace-specific customizations until there is a known namespace they want to customize, so creating a new index template is a viable option. But now the user needs to either (1) manually copy the index template and keep it up-to-date with changes to the integration; or (2) use Fleet/Integration APIs in Kibana to add customizations to handle this for them, which is a confusing experience to have to switch between ES and Kibana APIs for template management.

A similar alternative that would not have this downside is to add data stream naming scheme template management APIs to Elasticsearch directly so that users could more easily manage this directly from ES. IMO this might be the best middle ground, but I'd like to hear from @felixbarny on whether or not this fully solves the problem.

Another idea is to solve the validation problem at indexing time instead of template creation, with a fallback to a "failure data stream" - the idea we discussed at EAH for documents that fail to be processed or indexed. This case feels pretty similar and could make use of the same mechanism. That said, I believe that's a fairly large enhancement that we have not begun work on and it would be unfortunate to block on this.

@felixbarny
Member Author

A similar alternative that would not have this downside is to add data stream naming scheme template management APIs to Elasticsearch directly so that users could more easily manage this directly from ES. IMO this might be the best middle ground, but I'd like to hear from @felixbarny on whether or not this fully solves the problem.

Could you elaborate on how that would work?

One potential issue with that may be how the precedence of these custom component templates is defined. How are they ordered among themselves, and how are they ordered with the component templates that already exist on the data stream?

@joshdover
Contributor

joshdover commented Aug 8, 2023

I'm thinking a higher level API for managing templates that are part of the data stream naming scheme, like we've brainstormed in the past. This would solve the problem of being able to direct users to use a single API surface for template management (Elasticsearch) and having Elasticsearch manage the namespace-specific settings.

I think these APIs would need to support all of the granularity levels in the main issue description, in addition to global defaults. Under the hood it would need to dynamically create and update the required index and component templates, validating them all before committing the change.

This API would probably also need to distinguish between user-customized settings and package-managed ones. The package API would be restricted to Kibana's system user only, so that end users stick to the @custom templates. I'd recommend a generic form like:

PUT /_data_stream_template/{type}-{dataset}-{namespace}/(@package|@custom)
{
  "settings": { },
  "mappings": { },
  "data_retention": { }
}

For a basic case like setting a type-wide default, no new index templates need to be created; only the logs@custom component template, which is referenced in all index templates, needs to be updated:

PUT /_data_stream_template/logs-*-*/@custom
{
  "lifecycle": {
    "data_retention": "7d"
  }
}

// Under the hood ES does
PUT /_component_template/logs@custom
{
  "template": {
    "lifecycle": {
      "data_retention": "7d"
    }
  }
}

Namespace-specific customizations require more work under the hood to create index templates with higher priority if needed. In this example, every data stream managed by the system would need a new namespace-specific index template created with a higher priority, referencing a namespace-specific component template:

PUT /_data_stream_template/logs-*-foo/@custom
{
  "lifecycle": {
    "data_retention": "7d"
  }
}

// Under the hood ES does something sort of like this (for each logs dataset):
PUT /_component_template/logs-*-foo@custom
{
  "template": {
    "lifecycle": {
      "data_retention": "7d"
    }
  }
}

PUT /_index_template/logs-my.dataset-foo
{
  "index_patterns": ["logs-my.dataset-foo"],
  "data_stream": { },
  "priority": 250, // higher than whatever logs-my.dataset template is
  "composed_of": [
    "logs@global",
    "logs@custom",
    "logs-my.dataset@package",
    "logs-my.dataset@custom",
    "logs-*-foo@custom",
    "logs-my.dataset-foo@custom"
  ],
  "allow_missing": [
    "logs@custom",
    "logs-my.dataset@custom",
    "logs-*-foo@custom",
    "logs-my.dataset-foo@custom"
  ]
}

This has an added benefit of having Elasticsearch be the source of truth for how these customization layers are added on top of one another, instead of spreading that out across Fleet and Elasticsearch's default templates.
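
The layering in the example above can be sketched in Python. Component templates are merged in composed_of order, with later entries taking precedence, and names listed under allow_missing (the field name used in the example above) are skipped when they don't exist yet. The `deep_merge` and `compose` helpers and the template contents are hypothetical.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge two dicts; values from override win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def compose(composed_of, allow_missing, component_templates):
    """Merge component templates in order, skipping allowed missing ones."""
    result = {}
    for name in composed_of:
        if name not in component_templates:
            if name in allow_missing:
                continue
            raise KeyError(f"missing component template: {name}")
        result = deep_merge(result, component_templates[name])
    return result


# Made-up template bodies for illustration.
component_templates = {
    "logs@global": {
        "lifecycle": {"data_retention": "30d"},
        "settings": {"number_of_shards": 1},
    },
    "logs-*-foo@custom": {"lifecycle": {"data_retention": "7d"}},
}
effective = compose(
    ["logs@global", "logs@custom", "logs-*-foo@custom"],
    ["logs@custom", "logs-*-foo@custom"],
    component_templates,
)
print(effective)
```

Here the namespace-level retention of 7d overrides the global 30d, while the unrelated global setting is kept.
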

@felixbarny
Member Author

One potential challenge I see is what happens when you create a new data stream after adding a namespace customization.
Example:

  1. PUT _data_stream/logs-ds1-foo
  2. PUT /_data_stream_template/logs-*-foo/@custom
  3. PUT _data_stream/logs-ds2-foo

How do we ensure that ds2 also gets the customizations from step 2?
Also, the approach of creating a copy of the index template with a higher priority seems problematic: when changes are made to the original index template, they won't be reflected in the copy.

Dataset customizations (such as logs-foo-*) aren't trivial either: a data stream such as logs-foo-default may be created by the built-in logs-*-* index template, which doesn't import a component template for logs-foo@custom, so we'd also need to create a copy of that index template.

But maybe it's fine to rely on copying the index templates? On the pro side, it makes existing data streams more immune to breaking changes caused by modifications in the global templates. However, they also don't benefit from improvements in these templates. Maybe that's the right tradeoff if it allows us to statically verify that the merged index templates are valid.

@joshdover
Contributor

We had a brief brainstorming session on this today and discussed these requirements & constraints:

  • Users can make modifications at the type, dataset, or namespace level, or at any combination of two or three of them
  • Users need to be able to define customizations without forking index templates that come from integrations or are included in Elasticsearch
  • Customizations are applied automatically, in a declarative way without needing to update every index outside the customization update
  • Precedence between combinations will be strictly defined by the system and not user-definable
  • These customizations will always take precedence over index templates
  • Dependent settings across separate customizations are not supported, they must be contained in the same customization
    • Example: a field alias can’t depend on a field defined in a different customization or the index template
  • We need to validate as much as possible when the customization is created

Next step is for @tylerperk to flesh these requirements out more and we'll then meet again for another brainstorming session on potential solutions.

axw added a commit to axw/elasticsearch that referenced this issue Jan 10, 2024
Use `<data_stream.type>@custom` instead of `apm@custom`.
This is an enhancement over what Fleet sets up; it is
an additive improvement in the direction of
elastic#97664.

The rollup data streams' `@custom` component templates
now include the duration, like what Fleet sets up.

Add a YAML REST test, and a unit test ensuring consistency
across the index templates.
elasticsearchmachine pushed a commit that referenced this issue Jan 17, 2024
Use `<data_stream.type>@custom` instead of `apm@custom`. This is an
enhancement over what Fleet sets up; it is an additive improvement in
the direction of #97664.

The rollup data streams' `@custom` component templates now include the
duration, like what Fleet sets up.

Add a YAML REST test, and a unit test ensuring consistency across the
index templates.
@kpollich
Member

kpollich commented Aug 5, 2024

Hey @bytebilly and @dakrone, I wanted to bump this issue based on a recent thread we've been chatting in with @lucabelluccini about various pain points support observes when users work with component templates.

The most recent example of this was elastic/integrations#8542, where we started including the ecs@mappings component template in integration index templates. This introduces a potential breaking change for users who have cloned the index template to apply an ILM policy to a given namespace, e.g. https://www.elastic.co/guide/en/fleet/current/data-streams-ilm-tutorial.html.

Here's a writeup of this specific issue from Luca:

TL;DR
The customer started seeing their APM data being rejected:

  • upgrading to 8.13 triggered the switch to use "ecs@mappings"
  • (some) APM index templates use "dynamic": "runtime"; the fields added via ingest pipelines triggered the mapping conflict (but they were working prior to the upgrade)

Context
The user was on 8.11.
The user was using a global@custom ingest pipeline to add some fields called "related.ip" and "related.host" (copying the values from other fields).
The user upgraded to 8.13.
They started getting rejections due to "related.*" fields having mapping conflicts -
(illegal_state_exception): "Missing intermediate object related" ("related" is the field).

In this particular instance, it seems like the ecs@mappings component template and the dynamic: runtime feature don't compose very well.

Overall, recommending users clone index templates for customizations has proven brittle and problematic. Having some more stack-level functionality for customization and preventing breakages would go a long way to helping with use cases like the above.

One of the bigger things from this issue that would help is being able to customize data streams at the namespace level, namely applying an ILM policy to all data streams under a given namespace. We don't support this today, and this has led users down hacky paths that introduce lots of headaches for them when attempting stack upgrades.

It seems like the list Josh provided previously would still be very relevant today, and having a means to provide customizations outside of actually creating new templates with their own set of rules/logic would help a lot with cases like this one above. What would it take to bump the priority on this, and what might a path forward look like?

@lucabelluccini
Contributor

Thanks @kpollich for sharing the feedback.

We instruct users to clone index templates to customize per namespace in the Fleet docs & APM docs.

At the moment this is not playing very well with our Index Templates "structure" updates.
Examples:

In parallel, spotting those situations can only be done using heuristic approaches, as we do not have a "marker" for cloned index templates.


8 participants