[APM] Investigate using transforms for APM UI data #74498

dgieselaar · 2020-08-06T10:52:02Z

We are currently experimenting with APM Server creating transaction duration metrics from transaction events. See elastic/apm#104 for more details.

These metric documents are more efficient in terms of storage and, in some cases, will speed up our APM UI requests as well. However, they serve a broad purpose (almost every use case in the APM UI), and if we create more specific metrics for certain views, we can make bigger gains. In one example (based on real-world data), the generic transaction duration metrics aggregation creates about 1 metric document for every 7 transaction events. Metrics that only support the service overview would create 1 metric document for every 6000 transaction events. Elasticsearch supports transforms, which could allow us to easily create these metrics.
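To make the idea concrete, here is a minimal sketch of what a request body for a continuous pivot transform (`PUT _transform/<id>`) might look like for service overview metrics. The index names, the one-minute bucket, the sync delay, and the output field names are assumptions for illustration, not something decided in this issue; only the grouping by `service.name` and `service.environment` reflects the dimensions discussed here.

```python
import json

# Hypothetical index names, chosen only for this sketch.
SOURCE_INDEX = "apm-*-transaction-*"
DEST_INDEX = "apm-ui-service-overview"


def service_overview_transform(interval: str = "1m") -> dict:
    """Build a body for PUT _transform/<id> that pivots transaction events
    into one document per service, environment and time bucket."""
    return {
        "source": {"index": SOURCE_INDEX},
        "dest": {"index": DEST_INDEX},
        # Continuous mode: keep processing new documents as they arrive.
        "sync": {"time": {"field": "@timestamp", "delay": "60s"}},
        "pivot": {
            "group_by": {
                "service_name": {"terms": {"field": "service.name"}},
                "service_environment": {"terms": {"field": "service.environment"}},
                "timestamp": {
                    "date_histogram": {"field": "@timestamp", "fixed_interval": interval}
                },
            },
            "aggregations": {
                # Sum and count are enough to derive an average duration later.
                "duration_sum_us": {"sum": {"field": "transaction.duration.us"}},
                "transaction_count": {"value_count": {"field": "transaction.duration.us"}},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(service_overview_transform(), indent=2))
```

Because the pivot drops `transaction.name`, `transaction.type` and every other dimension, the number of output documents depends only on the number of service/environment pairs per bucket, which is where a ratio like 1 in 6000 would come from.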

Possible benefits are:

Storage cost: these metrics are extremely efficient in terms of storage because they only store the data we need in the UI. They could be retained for longer, at a lower cost.
Performance improvements: because these metrics generate significantly fewer documents, searches and aggregations should be significantly faster.
Backwards compatible: if we create a new metric or change an existing one, simply re-installing the transform should give us metrics for historical data as well (depending on its availability).
Easier integration with other Kibana apps: some of our charts require post-processing or complicated queries, which makes them hard to visualise in other Kibana apps. Pre-aggregating this data could make this more straightforward. We could also more easily leverage things like search strategies and embeddables.

Here's a rough idea for what questions a POC should aim to answer:

Which visualisations in our UI would benefit from transforms? We can select two or three for this POC.
What pieces are missing? E.g., ES transforms don't support creating HDR histograms. Also, the kibana_system user doesn't have the appropriate permissions to manage the transforms or indices. What else?
What's the cost of trying to support multiple layers of data? Ideally we would show UI metrics first, then allow the user to drill down into higher-fidelity data (for instance, when they use the query bar). Is that doable?
Can we more easily create dashboards or leverage concepts like Kibana's search strategies?
What's the performance gain and the storage savings?
What role should rollups play?
If we install a transform, can we configure it so that newer data is processed first? This would mean that the user doesn't have to wait until the transform is caught up before using the UI.
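The last question is about ordering only. As a rough illustration of what "newer data first" would mean for a backfill, the sketch below splits a time range into windows and yields them from newest to oldest. It is not transform configuration (the issue asks whether transforms can do this at all); the function name and window size are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def windows_newest_first(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield (window_start, window_end) pairs covering [start, end),
    beginning with the most recent window and walking backwards."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    window_end = end
    while window_end > start:
        # Clamp the oldest window so it never reaches before `start`.
        window_start = max(start, window_end - step)
        yield window_start, window_end
        window_end = window_start


if __name__ == "__main__":
    begin = datetime(2020, 8, 6, 0, 0, tzinfo=timezone.utc)
    for lo, hi in windows_newest_first(begin, begin + timedelta(hours=3), timedelta(hours=1)):
        print(lo.isoformat(), "->", hi.isoformat())
```

Processing windows in this order means the most recent hour is available in the UI after the first window completes, while older data fills in afterwards.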

Some possible UI metrics we can create:

Service overview metrics
Derived service annotations
Transaction breakdown data
Garbage collection metrics
A list of services (to be used in various configuration wizards)

elasticmachine · 2020-08-06T10:52:04Z

Pinging @elastic/apm-ui (Team:apm)

felixbarny · 2020-08-18T06:52:07Z

An alternative to consider is that APM Server "collapses" some dimensions that are known to be high in cardinality and are not needed for a lot of aggregations.

Looking at https://github.com/elastic/apm-server/blob/5cb4101d705effdf2f54e2e45847ee92033e806e/x-pack/apm-server/aggregation/txmetrics/aggregator.go#L414, the service map or the service landing page probably only need the service.name and service.environment dimensions.

The server could collect aggregate metrics that set a special value, for example service.name: _all, to group the metrics of all services together. That way there's just one time series for the UI to look at, instead of having to aggregate on the fly.
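A minimal sketch of this collapsing technique, assuming a simple in-memory aggregator that tracks a count and a duration sum per service name. The class and method names are hypothetical and do not mirror the APM Server aggregator linked above; only the `_all` value comes from this comment.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Tuple

ALL = "_all"  # special value that stands for "every service"


@dataclass
class Stats:
    count: int = 0
    sum_us: int = 0


class CollapsingAggregator:
    """Records each transaction under its own service name and under `_all`."""

    def __init__(self) -> None:
        self._stats: Dict[str, Stats] = defaultdict(Stats)

    def record(self, service_name: str, duration_us: int) -> None:
        # Update the per-service series and the collapsed series together,
        # so no query-time aggregation across services is needed.
        for key in (service_name, ALL):
            stats = self._stats[key]
            stats.count += 1
            stats.sum_us += duration_us

    def snapshot(self) -> Dict[str, Tuple[int, int]]:
        """Return {service name: (count, total duration in microseconds)}."""
        return {name: (s.count, s.sum_us) for name, s in self._stats.items()}


if __name__ == "__main__":
    agg = CollapsingAggregator()
    agg.record("checkout", 100)
    agg.record("cart", 300)
    print(agg.snapshot())
```

The extra memory is one additional entry per collapsed dimension, which matches the expectation that this would not require significantly more memory.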

It should be fairly simple to do that server-side, wouldn't require significantly more memory, and the metrics would be instantly available, without a delay.

I used this technique in my previous project to significantly speed up aggregate graphs.

dgieselaar · 2020-08-18T08:32:04Z

@felixbarny Do you mean that APM Server would create the specific metrics this POC aims to create via transforms? E.g., a service overview metric that doesn't record transaction.name, transaction.type, etc., so it's more efficient?

FWIW, I don't think APM Server is ideal here. The fact that aggregation happens per-instance means that the efficiency of recorded metrics will always be more limited than using ES for those aggregations.
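To illustrate the per-instance limitation with a small sketch: assume each APM Server instance flushes one metric document per group it has seen in an interval, while an aggregation done centrally in Elasticsearch produces one document per distinct group overall. The function name and the sample data are hypothetical.

```python
from typing import List, Set, Tuple


def metric_docs_per_interval(groups_seen_by_instance: List[Set[str]]) -> Tuple[int, int]:
    """Return (docs written with per-instance aggregation,
    docs written with a single central aggregation) for one interval."""
    # Every instance writes its own document for each group it observed.
    per_instance = sum(len(groups) for groups in groups_seen_by_instance)
    # A central aggregation writes each distinct group exactly once.
    central = len(set().union(*groups_seen_by_instance))
    return per_instance, central


if __name__ == "__main__":
    instances = [{"checkout", "cart"}, {"checkout"}, {"checkout", "cart", "search"}]
    print(metric_docs_per_interval(instances))
```

Under these assumptions the gap grows with the number of instances that see the same groups, which is why a server-per-host deployment makes it more pronounced.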

felixbarny · 2020-08-18T09:07:04Z

> Do you mean that APM Server would create the specific metrics this POC aims to create via transforms?

Yes, that's what I meant.

> The fact that aggregation happens per-instance means that the efficiency of recorded metrics will always be more limited than using ES for those aggregations.

Excellent point. It's probably not a big issue when you just have a couple of central APM Servers, but with the server-per-host model we're heading towards, it is an issue.

sophiec20 · 2020-09-10T11:12:59Z

ping @elastic/ml-core for visibility

sorenlouv · 2021-02-02T13:13:08Z

@dgieselaar What do you think about tackling this for 7.13?

dgieselaar · 2021-02-02T13:21:16Z

@sqren sounds good to me!

botelastic · 2022-02-09T08:05:53Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

felixbarny · 2022-02-10T12:42:02Z

Entity extraction is one of the top asks that we have for the platform team.

They want to know what's missing in transforms in order for us to be able to use them. I've summarized some raw feedback in a dedicated section of the Stream processing use cases document.

Note that it seems like the security team has been able to successfully adopt transforms. See https://github.com/elastic/security-team/issues/157.

dgieselaar · 2023-07-12T09:28:29Z

Not planned.

dgieselaar added Team:APM, enhancement, v7.11.0 labels Aug 6, 2020

dgieselaar mentioned this issue Aug 6, 2020

Process time-based transform from newest to oldest data elastic/elasticsearch#60829

Open

cauemarcondes changed the title ~~[APM] Investigate using transforms for APM UI data~~ [POC][APM] Investigate using transforms for APM UI data Aug 10, 2020

cauemarcondes added [zube]: Inbox and removed [zube]: Inbox labels Aug 10, 2020

axw mentioned this issue Aug 18, 2020

docs/agents: add sampling spec elastic/apm#307

Merged

dgieselaar mentioned this issue Aug 25, 2020

[APM] Metrics-powered UI #73953

Merged

sorenlouv added [zube]: (7.11) and removed [zube]: Backlog v7.11.0 labels Sep 29, 2020

dgieselaar changed the title ~~[POC][APM] Investigate using transforms for APM UI data~~ [APM] Investigate using transforms for APM UI data Oct 14, 2020

sorenlouv added [zube]: Backlog v7.12.0 and removed [zube]: (7.11) labels Dec 2, 2020

sorenlouv added v7.13.0 and removed v7.12.0 labels Jan 12, 2021

sorenlouv added v7.15.0 and removed v7.13.0 labels May 25, 2021

sorenlouv removed the v7.15.0 label Aug 13, 2021

botelastic bot added stale and removed stale labels Feb 9, 2022

sorenlouv removed the [zube]: Backlog label Jul 4, 2023

dgieselaar closed this as completed Jul 12, 2023
