[APM] Latency correlations meta issue #118009

Open
2 of 16 tasks
walterra opened this issue Nov 9, 2021 · 1 comment

walterra commented Nov 9, 2021

Follow up to #109220.

Must have

Should have

Could have

Backlog

  • Support metrics-based indices with a first pass to analyze metric data --> pending the ability to self-select which fields can be summarised
  • Improve progress bar, e.g. show current service task instead of raw %
  • Deduplicate results: revisit the initial design thoughts. --> pending further customer feedback
    • Feedback on the initial deduplication prototype: the display of duplicates looks odd. The first-found field gets its own columns for name and value, whereas the duplicates are concatenated (name: value). The first-found field is likely to be from the priority list, but the choice is still somewhat arbitrary, and the duplicates might also be priority fields. I suspect it would look better as a table of name+value pairs within a row.
    • There tend to be lots of duplicates in the data examples we have so far, e.g. kubernetes.pod.name and kubernetes.pod.uid. This seems to be even more the case for failures.
  • Field/value candidate prioritization based on user-selected fields (this could reuse the existing UI in the correlation tab where a user selects field candidates)
  • Caching results in tabs. Previous thoughts on caching in the flyout: clicking on a filter value should not take you directly back to the main page. If correlations take several minutes to calculate, the user should not lose this time just because it is too easy to click on a filter.
  • Indicate percentile where 'slow' transactions begin (Need more thoughts from Steve D)
  • Should we continue to limit correlation analysis to a single named transaction? --> pending development of a "generic and transversal APM trace explorer"
  • Candidate terms selection - we have an optimization that works well on test data. --> pending performance assessment on large customer data
  • The trace samples page size is now 500, but with the compressed style of the EUI pagination control this means the user has to click 499 times to get to page 500. Could the compressed mode be removed? Otherwise, setting the page size to 500 seems to add very little value.
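
The deduplication feedback in the backlog above suggests showing duplicates as a set of name+value pairs within a single row, rather than promoting the first-found field to its own columns. A minimal sketch of that grouping follows. The types, the function name `groupDuplicates`, and the rule that two results count as duplicates when they share the same `correlation` and `docCount` are all assumptions made for illustration; they are not taken from the prototype.

```typescript
// Hypothetical shape of a single correlation result (not the actual Kibana type).
interface CorrelationResult {
  fieldName: string;
  fieldValue: string;
  correlation: number;
  docCount: number;
}

// One table row: shared statistics plus every name+value pair that produced them.
interface GroupedRow {
  correlation: number;
  docCount: number;
  pairs: Array<{ fieldName: string; fieldValue: string }>;
}

// Assumption: results with identical correlation and docCount are duplicates.
function groupDuplicates(results: CorrelationResult[]): GroupedRow[] {
  const rows = new Map<string, GroupedRow>();
  for (const { fieldName, fieldValue, correlation, docCount } of results) {
    const key = `${correlation}|${docCount}`;
    let row = rows.get(key);
    if (row === undefined) {
      row = { correlation, docCount, pairs: [] };
      rows.set(key, row);
    }
    // No pair is treated as the "primary" one; all are listed equally.
    row.pairs.push({ fieldName, fieldValue });
  }
  // Map preserves insertion order, so rows keep the order of first appearance.
  return Array.from(rows.values());
}
```

With this shape, kubernetes.pod.name and kubernetes.pod.uid would end up as two pairs in the same row whenever they carry identical statistics, so neither is arbitrarily chosen as the first-found field.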

Tech Debt

  • Optimize ES queries (e.g. summarize field/value queries as part of nested aggs or multi search), investigate use of p-limit
  • destructure arguments of query calls
    [ML] APM Correlations: Chart for failed transactions correlations tab. #110172 (comment)
  • Revisit the naming of callbacks of CorrelationTable, for example setSelectedSignificantTerm still references significant terms, which were used by the previous version of correlation analysis.
  • Revisit hooks that fetch correlations results and possibly consolidate duplicate code like fetching the overall histogram.
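
As a rough illustration of the first two tech debt items above, the sketch below caps how many field/value queries run at once and passes query arguments as a single destructured object. It is a hand-rolled stand-in for what a library such as p-limit provides, not p-limit itself, and `runWithLimit`, `FieldValuePair` and `fakeFieldValueQuery` are hypothetical names; the fake query only simulates latency and does not talk to Elasticsearch.

```typescript
// Hypothetical argument object for a single field/value query.
interface FieldValuePair {
  fieldName: string;
  fieldValue: string;
}

// Runs the given tasks with at most `concurrency` of them in flight at a time.
// Results are returned in the same order as the tasks.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  concurrency: number
): Promise<T[]> {
  const results = new Array<T>(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task until none are left.
  const worker = async (): Promise<void> => {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  };

  const workerCount = Math.max(1, Math.min(concurrency, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Counters used only to observe concurrency in this sketch.
let inFlight = 0;
let maxInFlight = 0;

// Stand-in for a real query call; note the destructured argument object.
async function fakeFieldValueQuery({ fieldName, fieldValue }: FieldValuePair): Promise<string> {
  inFlight += 1;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  inFlight -= 1;
  return `${fieldName}:${fieldValue}`;
}
```

A caller would wrap each pair in a thunk, for example `runWithLimit(pairs.map((p) => () => fakeFieldValueQuery(p)), 2)`. Whether a client-side limit like this or a server-side approach (nested aggs or multi search) is the better fit is exactly what the item proposes to investigate.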

Research

  • Benchmark edge cluster results with a variant that just uses a hard coded list of fields instead of identifying all fields by itself
@elasticmachine

Pinging @elastic/ml-ui (:ml)
