[APM] Idea: Alternative transaction navigation for RUM #26544

Open
roncohen opened this issue Dec 3, 2018 · 26 comments
Labels
discuss needs design roadmap Team:APM All issues that need APM UI Team support

Comments

@roncohen
Contributor

roncohen commented Dec 3, 2018

The RUM agent does not know about the abstract page patterns used by the website it is installed on (/blog/:blogID). It only knows the concrete page path: /blog/10-tips-when-youre-building-your-own-airplane. Sorting in Elasticsearch #26443 will greatly reduce the problems we have with high cardinality caused by concrete page names.

However, there are still cases where a whole section of a website should have a high impact but does not show up at the top of the list, because each page view is counted only once or very few times when the path names contain variables or similar. For example, you might have a group of very slow pages under /feed/:userID. Because the path contains a user ID, the transaction names will be /feed/42, /feed/43, etc. Because each user only looks at their own feed a few times, each feed is counted separately and never sums up to something significant compared to, for example, /blog/10-tips-when-youre-building-your-own-airplane, which is a page many people will load. Another example is something like /my/search?q=every-search-is-a-snowflake.
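To make the dilution effect concrete, here is a minimal Python sketch. The sample paths and durations are made up, and `total_duration_by` and `first_segment` are hypothetical helpers, not anything in the agent or the UI. It ranks groups by summed duration (the same avg * count notion of impact used in the queries further down), first by concrete path and then by first path segment.

```python
from collections import defaultdict

# Made-up page-load samples: (concrete path, duration in ms).
samples = [
    ("/blog/10-tips-when-youre-building-your-own-airplane", 300),
    ("/blog/10-tips-when-youre-building-your-own-airplane", 300),
    ("/blog/10-tips-when-youre-building-your-own-airplane", 300),
    ("/blog/10-tips-when-youre-building-your-own-airplane", 300),
    ("/feed/42", 900),
    ("/feed/43", 950),
    ("/feed/44", 1000),
    ("/feed/45", 980),
]


def total_duration_by(samples, key):
    """Sum durations per group; the sum (avg * count) stands in for impact."""
    totals = defaultdict(int)
    for path, duration in samples:
        totals[key(path)] += duration
    return dict(totals)


def first_segment(path):
    """Return the first path segment, e.g. '/feed/42' -> '/feed'."""
    return "/" + path.strip("/").split("/")[0]


by_full_path = total_duration_by(samples, lambda p: p)
by_prefix = total_duration_by(samples, first_segment)

# Grouped by concrete path, every /feed/<id> page stays below the blog post.
top_full = max(by_full_path, key=by_full_path.get)
# Grouped by prefix, the slow /feed section comes out on top.
top_prefix = max(by_prefix, key=by_prefix.get)
```

With these numbers the blog post leads the per-path ranking even though the /feed section as a whole accounts for more than three times as much load time.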

The effect is that when the user logs in, they will not see pages that have been loaded only once or very rarely but have a high average response time, because those pages drown in the sea of pages whose names contain no IDs or parameters.

Details on why we can't get better transaction names

We rely on an API call that web developers installing the agent must make on each page load to set the transaction name. We hoped that setting the default transaction name to "unknown" would make it obvious that developers need to make a conscious decision and an effort to figure out the abstract path names and use them in the API call. Setting the transaction name to the concrete path name would look correct in the UI at first glance, so developers would just move on, thinking that it was installed correctly (we saw this in Opbeat).

However, developers don't necessarily have a single place where the URL structure is defined that they can just pull in and pass on to the RUM agent. Additionally, lots of users only have sporadic access to the "master" template of their website. They might be using an external consultancy to develop it etc.

So in the end, because it is the only convenient option, web developers just resort to setting the concrete page name as the transaction name.


Instead of trying to come up with better transaction names automatically or asking users to write complicated custom code to fix it, I suggest we change the navigation to a path-hierarchy based navigation for RUM. This is similar to the "Content" navigation in Google Analytics. You see the top-level paths first, with stats aggregated over every page that has that URL prefix:

1:
image
(numbers and order here are totally made up)

User then clicks "https://www.elastic.co/guide/en" and sees subpages for that with the stats for each:

2:
image
(again, numbers and order here are totally made up)

This should fix the problem of single/rare page URLs not being counted/seen anywhere. It would also mean we can probably use the page address as the default in the RUM agent instead of asking users to call apm.setInitialPageLoadName(name). If this works as intended, it will make setting up RUM much, much easier.

When using the hierarchy-based navigation, we could also consider adding the option to go from the list to the transaction group details for a specific path prefix instead of the full URL. In other words, give the user the choice between going a level deeper and going to a page showing all the transactions that match the prefix:

image
(again, a totally faked screenshot)

Path hierarchical querying

query for (1)

  • the number 29 is the length of the top-level filter https://www.elastic.co/guide/ (including the trailing slash).
  • sum is actually avg * count, so we can use that for impact directly
GET apm-6.4.0-transaction-2018.11.23-reindex/_search?size=0
{
  "query": {
    "match": {"context.page.url.hierarchical": "https://www.elastic.co/guide"}
  },
  "aggs": {
    "txs": {
      "terms": {
        "script": {
          "source": "def d = doc['context.page.url'].value; if (d.length() > 29) { def c = d.indexOf('/', 29); if (c>0) { return d.substring(0,c);}} return d",
          "lang": "painless"
        },
        "order": {
          "duration_sum": "desc"
        }
      },
      "aggs" : {
        "duration_avg" : { "avg": { "field" : "transaction.duration.us" } },
        "duration_sum" : { "sum": { "field" : "transaction.duration.us" } },
        "duration_p99" : { "percentiles": { "field" : "transaction.duration.us", "percents" : [99] } }
      }
    }
  }
}
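For readers less familiar with Painless, the grouping key computed by the terms script above can be sketched in Python. This is only an illustration of the string logic, not something Elasticsearch runs; `group_key` and the sample URLs are hypothetical.

```python
def group_key(url: str, prefix_len: int) -> str:
    """Mirror the Painless script: cut the URL at the first '/' found at or
    after prefix_len, otherwise return the URL unchanged."""
    if len(url) > prefix_len:
        cut = url.find("/", prefix_len)
        if cut > 0:
            return url[:cut]
    return url


# The top-level filter, including its trailing slash, is 29 characters long.
PREFIX = "https://www.elastic.co/guide/"

# A page two levels below the filter is grouped under its first sub-level.
deep = group_key("https://www.elastic.co/guide/en/elasticsearch/index.html", len(PREFIX))

# A page directly below the filter has no further '/', so it keeps its full URL.
leaf = group_key("https://www.elastic.co/guide/index.html", len(PREFIX))
```

Every URL under /guide/en/ therefore lands in the same bucket, which is what lets the aggregation sum durations per sub-level.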

query for (2)

GET apm-6.4.0-transaction-2018.11.23-reindex/_search?size=0
{
  "query": {
    "match": {"context.page.url.hierarchical": "https://www.elastic.co/guide/en"}
  },
  "aggs": {
    "txs": {
      "terms": {
        "script": {
          "source": "def d = doc['context.page.url'].value; if (d.length() > 32) { def c = d.indexOf('/', 32); if (c>0) { return d.substring(0,c);}} return d",
          "lang": "painless"
        },
        "order": {
          "duration_sum": "desc"
        }
      },
      "aggs" : {
        "duration_avg" : { "avg": { "field" : "transaction.duration.us" } },
        "duration_sum" : { "sum": { "field" : "transaction.duration.us" } },
        "duration_p99" : { "percentiles": { "field" : "transaction.duration.us", "percents" : [99] } }
      }
    }
  }
}

These queries work by relying on the path hierarchical analyzer for context.page.url:

PUT apm-6.4.0-transaction-2018.11.23-reindex
{
  "settings": {
    "analysis": {
      "filter": {
        "url_stop": { 
          "type": "stop"
        }
      },
      "analyzer": {
        "page_hierarchy_analyzer": {
          "tokenizer": "path_hierarchy"
        }
      }
    }
  }
}

PUT apm-6.4.0-transaction-2018.11.23-reindex/doc/_mapping
{
  "properties": {
    "context.page.url": {
      "type": "keyword", 
      "fields": {
        "hierarchical": {
          "type": "text",
          "analyzer": "page_hierarchy_analyzer",
          "search_analyzer": "keyword"
        }
      }
    }
  }
}
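To see why the match query on context.page.url.hierarchical acts as a prefix filter, here is a simplified Python emulation of what a path_hierarchy tokenizer emits at index time. `path_hierarchy_tokens` is a hypothetical helper, and the real tokenizer may treat edge cases such as trailing delimiters differently. Because the search analyzer is keyword, the query string is kept as one token and only matches documents that indexed exactly that prefix.

```python
def path_hierarchy_tokens(text: str, delimiter: str = "/") -> list:
    """Emit every prefix of text that ends just before a delimiter,
    followed by the full text (simplified path_hierarchy emulation)."""
    tokens = []
    for i, ch in enumerate(text):
        if ch == delimiter and i > 0:
            tokens.append(text[:i])
    tokens.append(text)
    return tokens


indexed = path_hierarchy_tokens("https://www.elastic.co/guide/en/index.html")

# The query string is not split (keyword search analyzer), so a document
# matches only when the whole query equals one of its indexed prefixes.
matches_guide = "https://www.elastic.co/guide" in indexed
matches_partial_segment = "https://www.elastic.co/gui" in indexed
```

A prefix that stops in the middle of a path segment does not match, which is the difference between this approach and a plain string-prefix query.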

Optimizations

We can avoid the performance hit from script based terms aggregation by trading for an increased index size. To avoid the script based term aggregation, we would instead create fields for the first 3-4 levels and store them in the index. That would allow us to avoid the scripted aggregation on the first 3-4 levels where the amount of data is the largest, and only use the scripted aggregation for levels that are deeper than those, where the amount of data that we need to aggregate over is significantly less.

Example:

{
  "context.page.url": "https://www.elastic.co",
  "context.page.url.level1": "https://www.elastic.co",
  "context.page.url.level2": "https://www.elastic.co/guide",
  "context.page.url.level3": "https://www.elastic.co/guide/en"
}
Ingest pipeline to achieve this

This rudimentary ingest pipeline will parse the first levels. We could also imagine doing it in APM Server instead.

PUT _ingest/pipeline/levels
{
    "description": "parse levels",
    "processors": [
      {
        "script": {
          "source": """
            def s = ctx['context.page.url'];
            def i1 = s.indexOf('/', 8);
            def i2 = s.indexOf('/', i1+1);
            ctx['context.page.url-levels.level1']= s.substring(0, i1);
            ctx['context.page.url-levels.level2'] = s.substring(0, i2);
            ctx['context.page.url-levels.level3'] = s.substring(0, s.indexOf('/', i2+1));
          """
        }
      }
    ]
}

Note: this also needs a separate mapping update
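One thing to watch in the pipeline above: as far as I can tell, a URL with fewer than three path levels makes indexOf return -1, and substring(0, -1) would then throw. A minimal Python sketch of the same level extraction with that case handled could look like this; `url_levels` is a hypothetical helper, and stripping the trailing slash is an assumption rather than part of the original script.

```python
def url_levels(url: str, max_levels: int = 3) -> dict:
    """Return {'level1': ..., 'level2': ...} prefixes of the URL.
    Levels deeper than the URL itself are simply omitted."""
    url = url.rstrip("/")  # assumption: a trailing slash does not add a level
    scheme_end = url.find("://")
    pos = scheme_end + 3 if scheme_end >= 0 else 0
    levels = {}
    for n in range(1, max_levels + 1):
        cut = url.find("/", pos)
        if cut < 0:
            # No further separator: the whole URL is the last available level.
            levels[f"level{n}"] = url
            break
        levels[f"level{n}"] = url[:cut]
        pos = cut + 1
    return levels


full = url_levels("https://www.elastic.co/guide/en/elasticsearch")
short = url_levels("https://www.elastic.co/guide")
```

Looking for "://" instead of hard-coding an offset of 8 also keeps http:// URLs with very short host names from being split in the wrong place.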


For level 4 and up, we'd fall back to the scripted aggregation. This would be trading index size for speedier queries.

This query would show all sub paths to https://www.elastic.co/guide and group by the third level: https://www.elastic.co/guide/*

GET apm-6.4.0-transaction-2018.11.23-reindex/_search?size=0
{
  "query": {
    "match": {"context.page.url.hierarchical": "https://www.elastic.co/guide"}
  },
  "aggs": {
    "txs": {
      "terms": {
        "field": "context.page.url.level3",
        "order": {
          "duration_sum": "desc"
        }
      },
      "aggs" : {
        "duration_avg" : { "avg": { "field" : "transaction.duration.us" } },
        "duration_sum" : { "sum": { "field" : "transaction.duration.us" } },
        "duration_p99" : { "percentiles": { "field" : "transaction.duration.us", "percents" : [99] } }
      }
    }
  }
}

It's possible that there's an even better way to do the querying. We should investigate that if we choose to do this.

@roncohen roncohen added discuss Team:APM All issues that need APM UI Team support labels Dec 3, 2018
@elasticmachine
Contributor

Pinging @elastic/apm-ui

@roncohen
Contributor Author

roncohen commented Dec 3, 2018

in addition to APM UI team, it would be great to get your input @makwarth @jahtalab

@makwarth

makwarth commented Dec 3, 2018

Nice @roncohen. I’m ++ on this. It’s a bummer to not be able to keep the generic design across languages, but it was bound to end some day. Some of the geo / user agent stuff that I’ve mentioned earlier, would also require a unique UI element(s) for RUM.

It’d be nice if we can avoid groupings like so:

http://google.com
http://www.google.com
http://www.google.com/
https://www.google.com

Not sure how we'll solve the IA. Looks like most of the filtering can happen in the list view?

@roncohen
Copy link
Contributor Author

roncohen commented Dec 3, 2018

thanks!

http://www.google.com
vs.
http://www.google.com/

shouldn't be a problem. As for the rest, we could massage the page url to collapse those, but i could also see some users asking us to show them as individual items. For example, you might want to see the load times on https:// vs. http://. Could get tricky. If that's what we end up going with, we could potentially add a config option in the pipeline or APM Server which would massage the url to remove prefixes like http://www., https://www., http://, https:// etc., or only operate on the /path segment of the url.

That means it would happen at ingest time instead of query time and you would not be able to change the setting for old data.

In theory, we could do both, e.g. keep the original url and add another set of fields that would contain the massaged url. You could then switch between using the two sets of fields (original full length and the massaged) in the UI. But that's definitely something i'd defer to a later iteration.
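A minimal Python sketch of the massaging described in this comment, assuming it sits behind a config switch. `massage_url` and `strip_prefix` are hypothetical names, not existing APM Server or pipeline settings.

```python
import re

# Matches http:// or https://, optionally followed by www.
_PREFIX = re.compile(r"^https?://(www\.)?", re.IGNORECASE)


def massage_url(url: str, strip_prefix: bool = True) -> str:
    """Collapse URL variants: drop a trailing slash and, when strip_prefix is
    on, the scheme and a leading 'www.' as well."""
    if strip_prefix:
        url = _PREFIX.sub("", url)
    return url.rstrip("/")


variants = [
    "http://google.com",
    "http://www.google.com",
    "http://www.google.com/",
    "https://www.google.com",
]

# Keep the original URL next to the massaged one so a UI could switch
# between the two sets of fields.
docs = [{"original": u, "massaged": massage_url(u)} for u in variants]
collapsed = {d["massaged"] for d in docs}
```

With the switch off, only the trailing slash is dropped, so https:// and http:// remain separate items.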

@hmdhk

hmdhk commented Dec 5, 2018

Thanks @roncohen ! I like the idea!

There could be an issue with having too many hierarchies! For example, some websites have a common prefix for all pages, which would result in top-level hierarchies having only one child.

One solution would be for the UI to group all parents with one child together and show the deepest child with more than one child!
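As a rough illustration of that grouping, here is a Python sketch that builds a tree of path segments and merges every chain of single-child parents into one entry. `build_tree` and `collapse_single_children` are hypothetical helpers and the paths are made up; the sketch also does not distinguish a prefix that is itself a visited page.

```python
def build_tree(paths):
    """Build a nested dict of path segments from paths like '/app/v2/feed'."""
    root = {}
    for path in paths:
        node = root
        for segment in path.strip("/").split("/"):
            node = node.setdefault(segment, {})
    return root


def collapse_single_children(tree):
    """Merge each parent that has exactly one child into that child."""
    collapsed = {}
    for name, children in tree.items():
        while len(children) == 1:
            (child_name, grandchildren), = children.items()
            name = f"{name}/{child_name}"
            children = grandchildren
        collapsed[name] = collapse_single_children(children)
    return collapsed


paths = ["/app/v2/feed/42", "/app/v2/feed/43", "/app/v2/blog/post"]
tree = collapse_single_children(build_tree(paths))
```

Here the common /app/v2 prefix becomes a single top-level entry, so the user starts at the first level that actually branches.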

Another point: is there a way to make this a configuration on the Kibana side? I would prefer that, since users could just change this configuration when they need to instead of having to change their ingest configuration!

@sorenlouv
Member

sorenlouv commented Dec 5, 2018

Sounds super useful for RUM.

Opt in/out
I imagine this should be enabled by default for RUM data. Should it also be possible to opt out if the user is not interested in this?
Should it be possible for users to enable this behaviour for other agents?

Complexity
This is definitely more complex than the solution we use today. Considering the value it'll likely add for RUM users I think it's still worth it. We should just plan accordingly.

One solution to that is if the UI would just group all parents with one child together and show the deepest child with more than one child!

Good point @jahtalab. That would be a good enhancement.

@sorenlouv
Member

Another point is if there's a way to make this a configuration on the Kibana side?

If we decide that it should be configurable (perhaps even per agent) we can do this via kibana.yml.

@alvarolobato

@elastic/apm-ui @roncohen @makwarth what's the situation on this discussion? can we try to wrap it up and create an implementation issue?

@sorenlouv
Member

This would be great to have, but also sounds like a huge effort. How requested is something like this, now that we have made improvements to the issue that originally spurred this?

@formgeist formgeist self-assigned this Apr 16, 2019
@formgeist formgeist changed the title [APM] Idea: Alternative transaction navigation for RUM [APM] Idea: Alternative transaction navigation for RUM May 2, 2019
@formgeist
Contributor

I forgot to link it here, but I've created a design document for the UI enhancements for RUM (public doc)

@formgeist
Contributor

Thought I'd post the GIF from the design document in here too;

Kapture 2019-06-03 at 13 43 50

@makwarth Interested in your thoughts on this breadcrumb navigation concept.

@makwarth

makwarth commented Jun 6, 2019

Agree, I guess there's really two issues: Grouping and navigation. Are you worried about the group aggregate value in use cases like the SaaS use case with /:org:/ - or with /2019/? (One org might be super slow but rest are fast)?

@roncohen
Contributor Author

roncohen commented Jun 7, 2019

A bit worried, but i think the proposed navigation will still be a significant improvement over what people experience today. I think it's interesting to think about how we can incorporate search here, but i think we could do it separately.

@formgeist
Contributor

As for the navigation concept, I've updated it according to Ron's comment about being able to enter detail page for any prefix in the URL

Original comment

I think it wasn't clear from my proposal that you need to be able to see the details page for a prefix, not just for a single page. E.g. you shouldn't need to go all the way "in" to get the details page.

Marvel prototype

  • Clicking on the accordion arrows will navigate the tree structure inside the table.
  • Click the prefix or transaction names to navigate to the transaction group details page for each.

Kapture 2019-06-07 at 15 18 01

I can add the search option in there too, but I imagine that's considered a search option in the table itself.

Thoughts?

@roncohen
Contributor Author

roncohen commented Jun 7, 2019

the nested one is interesting, but i worry there might be millions of subpages. For the navigation to work, maybe we need to step back and rethink how the graphs and the navigation connect? Does it make sense to update the graphs every time you click "into" something? e.g. you click /guide/ and the graphs update to show you all of "guide" and the list updates to show you sub pages?

That would perhaps require the graphs to move around so they are below the list or on the side? Just brainstorming here.

@vigneshshanmugam
Member

Regarding the navigation grouping in the newly proposed design: will it be possible for the user to view the aggregated graphs for /guide/elasticsearch across all locales? Having the ability to go deeper per locale seems good, but it would be better if the UI allowed some form of grouping (group /guide/elasticsearch across all locales) instead of having it per sublevel by default.

@formgeist
Contributor

@vigneshshanmugam I think that was what @roncohen was getting at in his feedback above about being able to render a full transaction detail page per transaction "group" with a list of children. I'll have some time soon to explore other designs so stay tuned.

@alvarolobato

alvarolobato commented Oct 23, 2019

@roncohen are we still doing this or will we go to a completely different UI for RUM first?

cc @nehaduggal @Tanya-Bragin

@roncohen
Contributor Author

I don't remember seeing much discussion on a completely different UI. Sounds like we should set up a call to discuss

@zube zube bot removed the v7.6.0 label Dec 4, 2019
@formgeist
Contributor

I removed the 7.6 version label from this issue. Still need to figure out the prioritization of this UI.

@hmdhk

hmdhk commented Jan 17, 2020

Another example use-case for improving the grouping of transactions: https://discuss.elastic.co/t/elastic-apm-rum-js-agent-grouping-http-request-transactions/215380.

We should have a zoom call to discuss the potential solution we can take with this one.
cc @lreuven @drewpost

@botelastic

botelastic bot commented Dec 13, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added the stale Used to mark issues that were closed for being stale label Dec 13, 2021
@stale stale bot removed the stale Used to mark issues that were closed for being stale label Jul 4, 2023