
[Fleet] logs-* and metrics-* index patterns get overwritten on install/removal/upgrade of packages, breaking runtime fields and CCS #120340

Closed
hop-dev opened this issue Dec 3, 2021 · 29 comments
Assignees
Labels
bug Fixes for quality problems that affect the customer experience impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@hop-dev
Contributor

hop-dev commented Dec 3, 2021

Kibana version: > 7.10

Describe the bug:

On package install, upgrade, or removal, the Fleet index patterns are regenerated, meaning any settings the user may have specified on the index pattern (e.g. runtime or scripted fields) are lost.

Steps to reproduce:
Given a Kibana instance with Fleet set up:

  1. Add a scripted field to the metrics-* index pattern

[screenshot]

  2. Install the assets of a package (or remove a package, or upgrade a package)

[screenshot]

  3. Return to view the metrics-* index pattern to see that the scripted field has been removed

[screenshot]

Expected behavior: User changes are preserved.

@hop-dev hop-dev added the bug Fixes for quality problems that affect the customer experience label Dec 3, 2021
@botelastic botelastic bot added the needs-team Issues missing a team label label Dec 3, 2021
@hop-dev hop-dev added the Team:Fleet Team label for Observability Data Collection Fleet team label Dec 3, 2021
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Dec 3, 2021
@joshdover joshdover changed the title [Fleet] logs-* and metrics-* index patterns overwritten on install/removal/upgrade of package [Fleet] logs-* and metrics-* index patterns get overwritten on install/removal/upgrade of packages, breaking runtime fields support Dec 3, 2021
@joshdover
Contributor

Sorta surprised we haven't heard this one before. Though most objects installed by Fleet are considered "managed", I think this is a pretty broken experience for one of the headline features in 7.x: runtime fields + frozen tier storage. I don't think we should be overwriting these objects at all if possible.

The issue that is blocking us from excluding the index patterns from the import is this one: #120312

We could either wait for that to be fixed (which is being considered), or we could work around it by fetching the existing index pattern before the import and re-applying the existing values afterwards. This workaround could still result in lost writes due to a data race between a user updating the index pattern and a concurrent package upgrade.
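For illustration, a minimal sketch of that workaround, assuming a saved-objects-client style API. The preserved attribute names and the helper itself are assumptions for the sketch, not Fleet's actual install code, and the race described above still applies between the read and the write-back:

```typescript
import type { SavedObjectsClientContract } from '@kbn/core/server';

// Attributes a user might have customized on the index-pattern saved
// object. These names are assumptions for the sketch.
const PRESERVED_KEYS = ['runtimeFieldMap', 'fieldFormatMap'] as const;

async function importPreservingDataView(
  soClient: SavedObjectsClientContract,
  dataViewId: string, // e.g. 'logs-*' or 'metrics-*'
  runImport: () => Promise<void>
): Promise<void> {
  // 1. Snapshot the existing index pattern (may not exist on first install).
  const existing = await soClient
    .get<Record<string, unknown>>('index-pattern', dataViewId)
    .catch(() => undefined);

  // 2. Run the package import, which overwrites the data view.
  await runImport();

  // 3. Re-apply the user's customizations on top of the imported object.
  //    A user edit made between steps 1 and 3 is silently lost (the race).
  if (existing) {
    const preserved: Record<string, unknown> = {};
    for (const key of PRESERVED_KEYS) {
      if (key in existing.attributes) {
        preserved[key] = existing.attributes[key];
      }
    }
    await soClient.update('index-pattern', dataViewId, preserved);
  }
}
```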

@joshdover
Contributor

@jen-huang is this a known limitation that we've avoided fixing for any reason?

@jen-huang
Contributor

@joshdover I think this is a known (but undocumented) limitation. AFAIK the only reason for the lack of a fix is just that other things took priority :) I think we've also been under the assumption that these are sort of "managed index patterns" and haven't thought through the UX for how they should work with runtime fields.

@jportner
Contributor

jportner commented Dec 7, 2021

The issue that is blocking us from excluding the index patterns from the import is this one: #120312

We could either wait for that to be fixed (which is being considered)

We are going to fix this for the 8.0 release 👍 working on a PR now.

@mattkime
Contributor

mattkime commented Dec 8, 2021

We should probably come up with a full user story for 'managed' data views. This would be driven by Fleet needs and the expected user experience. App services is happy to help once there's a good definition of what needs to be accomplished.

@nerophon
Contributor

nerophon commented Dec 9, 2021

The following has been proposed by a user:

I was thinking of a solution for the CCS index patterns when deploying from Fleet.

  • Fleet should name the logs and metrics index patterns "logs" and "metrics" explicitly. Don't use random IDs.
  • Fleet shouldn't overwrite the logs and metrics index patterns if they already exist
  • Users delete and re-create the logs and metrics index patterns so they include the remote cluster name (i.e. *-cluster), and use the same logs and metrics name.

If we follow this process the dashboards should work.

@joshdover joshdover added enhancement New value added to drive a business result and removed bug Fixes for quality problems that affect the customer experience labels Mar 14, 2022
@mbudge

mbudge commented Aug 23, 2023

We're an enterprise customer with data in multiple remote clusters.

Users access data from a central search cluster.

We run Fleet in the search cluster to make it easier to deploy and update integration assets. This is because we don't have time to copy them from the remote clusters and use Python scripts to update the configuration.

Please can this get fixed asap?

@joshdover
Contributor

Hi @mbudge, thanks for chiming in here.

One question for you: do you need to have full control over the CCS pattern used or would it be ok if the data view was configured with a wildcard to cover all remote clusters, such as *:metrics-*-*?

@mbudge

mbudge commented Aug 23, 2023

I have no problem having these added by default to fix this:

*:metrics-*
*:logs-*

Thanks in advance for fixing it.

@joshdover
Contributor

@hop-dev How do you think we should handle this? I think we also need to meet the runtime fields requirement. So broadly speaking, I think we should:

  1. Change the index pattern by default to also cover CCS clusters by adding the *: prefix
  2. Avoid overwriting any fields in the data view object that are not the index pattern and name, to preserve any other customizations the user has made (like runtime fields).
  3. Rename the data views to "Logs" and "Metrics" (optional)

So we need a way to solve item (2). My understanding is that we have to import the data views at the same time as the other SOs with the current import logic, and we don't have a way to overwrite some objects but not others in the same import call.

I see two routes here:

  1. Try to work around the issue by reading the data view before the import and writing back any fields after the import
    • slightly hacky, probably some race-condition bugs here
  2. Add a new option to the import that allows us to not overwrite some objects while still overwriting the others (a hypothetical shape is sketched below)
    • This one still isn't perfect because I also don't want users to break dashboards by changing the index pattern to something that doesn't work with our integrations.
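To make option (2) concrete, here is a purely hypothetical shape for such an import option. Nothing like overwriteExclusions exists in Kibana's saved object import API today; this only sketches the idea:

```typescript
// Hypothetical sketch only: `overwriteExclusions` is NOT a real option of
// Kibana's saved object import API. It illustrates a per-object control
// where excluded objects are created if missing but never overwritten.
interface ImportOptionsSketch {
  overwrite: boolean; // today's all-or-nothing behavior
  overwriteExclusions?: Array<{ type: string; id: string }>;
}

// How Fleet might use it to protect the managed data views:
const fleetImportOptions: ImportOptionsSketch = {
  overwrite: true,
  overwriteExclusions: [
    { type: 'index-pattern', id: 'logs-*' },
    { type: 'index-pattern', id: 'metrics-*' },
  ],
};
```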

@mbudge

mbudge commented Aug 23, 2023

I don't think you need to do 3.

Anyone who's built process documentation on Elastic will have to update it.

@mbudge

mbudge commented Sep 18, 2023

When Fleet overwrites the data view on install/removal/upgrade of packages, does the logs-* data view object ID change?

The logs-* data view object ID has changed, which has broken all our dashboards. All the visuals in the dashboards are showing a "Could not find the data view" error, which is having a high impact on the operations we've built on Elastic. We need to get this resolved ASAP.

@mattkime
Contributor

When Fleet overwrites the data view on install/removal/upgrade of packages, does the logs-* data view object ID change?

I don't think it should. I'm pretty sure the data views Fleet creates always have the same ID, so I'm surprised that dependent saved objects would be broken. It would be worthwhile to go to saved object management and inspect the data view and a dependent broken saved object, paying attention to IDs and relationships.

Are you using spaces?

@joshdover joshdover added bug Fixes for quality problems that affect the customer experience impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. and removed enhancement New value added to drive a business result labels Sep 19, 2023
@joshdover
Contributor

joshdover commented Sep 19, 2023

Are you using spaces?

I'm curious about this as well. Spaces is the only situation I can think of where objects may have different IDs. Integration assets don't properly support spaces at the moment, mostly because the underlying assets (dashboards, etc.) don't support multiple spaces yet. So you may have run into a bug/limitation around package upgrades when installed in different spaces. For instance, if the package was installed in one space but then upgraded while the user was in another, the assets in the original space are deleted, which could break a reference.

@mbudge I'm also curious what version of Kibana you were on when this happened. There was an import issue recently affecting 8.8.0-8.9.2 that could have impacted you, depending on the integration that was modified: #164712.

I've also changed this overall issue from an enhancement to a bug to help with properly prioritizing this. Data views from Integrations should work with the rest of the features in the Stack, including CCS and runtime fields.

@joshdover joshdover changed the title [Fleet] logs-* and metrics-* index patterns get overwritten on install/removal/upgrade of packages, breaking runtime fields support [Fleet] logs-* and metrics-* index patterns get overwritten on install/removal/upgrade of packages, breaking runtime fields and CCS Sep 19, 2023
@mbudge

mbudge commented Sep 19, 2023

We are using spaces. v8.9.1 I think.

Different operations teams use different spaces so spaces are very important. Fleet needs to support them.

We use CCS to search data in remote clusters. The Fleet policies+assets are in the remote clusters.

Elastic previously recommended exporting assets and using a Python script to change the data view object IDs in the JSON data before importing them to the search cluster the users access. We can't support this anymore due to the large number of assets we have to transfer when we upgrade.

We tried setting up Fleet in the search cluster to deploy assets to one of the spaces used by the operations teams, without any Elastic Agents connecting to the cluster. It also looks like assets can only be deployed to one space through Fleet, so we have to manually transfer assets between spaces when multiple operations teams need to access them.

Would it not be easier if Fleet created the logs-* and metrics-* data views when they don't exist, but didn't re-create/overwrite them? Or have all the assets use saved object names instead of IDs so Fleet becomes space-aware?

@kpollich
Member

kpollich commented Oct 6, 2023

Change the index pattern by default to also cover CCS clusters by adding the *: prefix

Can we actually change these defaults? It seems like we should instead add *:logs-* and *:metrics-* as new data views to Fleet. I tried changing these in a working branch and it results in no data showing up in Discover when the wildcard-prefixed, CCS-friendly versions of these data views are used (a leading *: pattern only matches remote clusters, so purely local data no longer matches).

@kpollich
Member

kpollich commented Oct 6, 2023

Example of the above:

[screen recording: Screen.Recording.2023-10-06.at.10.52.48.AM.mov]

@joshdover
Contributor

Does it work if you use a pattern like logs-*,*:logs-*?

@joshdover
Contributor

I think we need a single data view so that all of the dashboards will just work. We would also need the id of the index-pattern object not to change, only the underlying index pattern.
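For reference, this could be expressed with Kibana's public Data Views create API (the endpoint, body shape, and override flag are documented; the concrete id/title/name values below just illustrate the proposal and are not what Fleet ships):

```typescript
// Sketch: create the managed logs data view with a stable id and a
// CCS-inclusive pattern via Kibana's documented Data Views create API.
// The id/title/name values are illustrative, not Fleet's actual values.
const KIBANA_URL = 'http://localhost:5601'; // assumption: local dev Kibana

async function createCombinedLogsDataView(): Promise<void> {
  const res = await fetch(`${KIBANA_URL}/api/data_views/data_view`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'kbn-xsrf': 'true', // Kibana requires this header on API writes
    },
    body: JSON.stringify({
      override: false, // fail rather than replace an existing data view
      data_view: {
        id: 'logs-*',             // stable id, so dashboard references keep resolving
        title: 'logs-*,*:logs-*', // local + cross-cluster pattern
        name: 'Logs',             // friendly display name
        timeFieldName: '@timestamp',
      },
    }),
  });
  if (!res.ok) {
    throw new Error(`Failed to create data view: ${res.status} ${await res.text()}`);
  }
}
```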

@kpollich
Member

Does it work if you use a pattern like logs-*,*:logs-*?

Yes, this looks like it works as expected. However, we should definitely do number 3 above and give these a vanity title if we're going to use this pattern. It's quite a bit less presentable than logs-* or metrics-* in the UI, imho.

[screenshot]

I think we need a single data view so that all of the dashboards will just work. We would also need the id of the index-pattern object not to change, only the underlying index pattern.

This is correct - we need to make sure the underlying ID stays the same. Since we'll be passing overwrite: false in the installation process for these data views, we'll want a one-time migration that updates the name and title (the property that holds the actual pattern; see https://www.elastic.co/guide/en/kibana/current/data-views-api-create.html) to conform with this new spec. Then the code can continue working with overwrite: false, which will prevent blowing away users' customizations to these data views.

@kpollich
Member

Hmm actually I'm not sure that we can create a migration for these data views since they aren't registered as Fleet saved objects. Maybe we can do a "just-in-time migration" by detecting a case where a dataview with a given ID (logs-* or metrics-*) already exists, but it has the wrong name. That first save will happen with override: true to set the dataview to the new values following this fix, then we'll use override: false for every save in the future, or just skip the create call entirely if a dataview with the desired ID already exists.
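A sketch of that just-in-time migration flow. The getDataView/saveDataView helpers and the expected values are stand-ins, assuming save semantics like the override flag above:

```typescript
// Sketch of the "just-in-time migration": if the managed data view exists
// but still has the old name, rewrite it once with override: true;
// afterwards (or if it already matches) skip the save so user
// customizations such as runtime fields are preserved.
interface ManagedViewSpec {
  id: string;
  title: string;
  name: string;
}

const EXPECTED: ManagedViewSpec = {
  id: 'logs-*',
  title: 'logs-*,*:logs-*',
  name: 'Logs',
};

async function ensureManagedDataView(
  getDataView: (id: string) => Promise<ManagedViewSpec | undefined>,
  saveDataView: (spec: ManagedViewSpec, opts: { override: boolean }) => Promise<void>
): Promise<void> {
  const existing = await getDataView(EXPECTED.id);

  if (!existing) {
    // First install: create, but never clobber a concurrently created view.
    await saveDataView(EXPECTED, { override: false });
  } else if (existing.name !== EXPECTED.name) {
    // One-time migration: old-style managed view detected, rewrite it once.
    await saveDataView(EXPECTED, { override: true });
  }
  // Otherwise: already migrated, so skip the save entirely.
}
```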

@mbudge

mbudge commented Oct 16, 2023

Just an FYI: we do this to make it easier for users to find log data. When using logs-*, users reported that constantly filtering to find data from a particular Fleet integration was a bit cumbersome. This way they know what data is available, as opposed to using a dashboard/visual to display a table of data_stream.dataset, since datasets disappear from view if there are no logs within the time range they're filtering on or if the feed drops.

[screenshot]

I find users just want to view the data without knowing if it's in a local or remote cluster.

@kpollich
Member

users reported that constantly filtering to find data from a particular Fleet integration was a bit cumbersome

This can definitely be cumbersome, especially if you're creating a data view for every single dataset.

As a potential alternative that might alleviate some pain or prove useful elsewhere, I can recommend adding a filter via the + icon on the event.dataset value to see data limited to a particular integration.


I've filed a PR for the root cause fixes here, though I expect there might be some churn on the implementation specifics. See #169409.

@ruflin
Contributor

ruflin commented Oct 24, 2023

@mbudge There is a feature we are working on that won't solve your CCS issue yet, but should help with the integration/dataset selection. If you are on 8.10.* you can type "Logs Explorer" into the Kibana search bar and a Logs Explorer entry will show up. If you jump there, you'll see an experience similar to Discover, but with an integration drop-down to select your data. There is more to come here, but it would be great to hear if this helps with your selection of datasets.

[screenshots: the Logs Explorer entry in the Kibana search bar, and the Logs Explorer integration drop-down]

@mbudge

mbudge commented Oct 24, 2023

@ruflin yes this looks very good. Our analysts will be very keen to start using this.

To get this working on our search cluster we need:

  • Support for CCS
  • Fleet support for Kibana spaces (the Kibana instance we use has spaces for different business departments, and I don't think Fleet works with spaces yet)

Thanks

@kpollich
Member

@mbudge What does better CCS support look like to you in this context? Fleet isn't "doing" much with its managed logs/metrics data views. Would having a managed data view for cross-cluster logs/metrics be helpful, or would having the pattern on the existing managed data views be preferred?

@mbudge

mbudge commented Oct 24, 2023

@kpollich I find our users just want to search the logs and don't want to know about the remote clusters, so adding it to the existing data view is preferred. Elastic already has a steep learning curve, so we like to keep it simple and keep training to a minimum, and we can add our own data views if required. For us, the Logs Explorer also needs to work with CCS.

@kpollich
Member

kpollich commented Nov 2, 2023

The root issue with overwriting the logs-* and metrics-* data views has been fixed in #170188

@kpollich kpollich closed this as completed Nov 2, 2023
@kpollich kpollich added QA:Needs Validation Issue needs to be validated by QA and removed QA:Needs Validation Issue needs to be validated by QA labels Nov 8, 2023