Usage Analytics on Saved Object Tags #81847

Closed
alexfrancoeur opened this issue Oct 27, 2020 · 8 comments · Fixed by #83160
Labels
Feature:Saved Object Tagging Saved Objects Tagging feature Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@alexfrancoeur

alexfrancoeur commented Oct 27, 2020

Relates to #79096 & #74571

As we introduce tags, it'll be great to understand how they are used. While I'm sure we can iterate here, there are some initial questions we'd like to answer. I realize it's late in the game to make this request, but given the long-standing need for such content management mechanisms, it'd be nice to capture usage metrics in 7.11 along with the tags themselves.

Number of tags

Questions we want answered

  • How many tags are actively in use in a cluster?
  • What % of all objects are tagged?
  • How many saved objects are tagged?

Metrics to help answer these questions

  • Total number of tags attached to a saved object
  • Total number of user consumable saved objects (may already exist)
  • Total number of saved objects tagged

Number of tags by type

Questions we want answered

  • How many tags are actively in use for [dashboards, visualizations, etc.]?
  • What % of [dashboard, visualization, etc.] objects are tagged?
  • How many [dashboard, visualization, etc.] saved objects are tagged?

Metrics to help answer these questions

  • Total number of tags by saved object type
  • Total number of saved objects by type (may already exist)
  • Total number of saved objects tagged by type

Open questions

  • Can we differentiate between tag authors? While I don't believe there is a difference technically, we'll have user-generated and system / integration-generated tags. If we can differentiate, it would be good to include the author in the above metrics so we can separate integration-generated from user-generated tags.
  • How important is it to understand tags by spaces? I'm leaning towards less important, but might be a nice to have metric.

cc: @joshdover @pgayvallet @ryankeairns , let me know what you think

@alexfrancoeur alexfrancoeur added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Oct 27, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-platform (Team:Platform)

@pgayvallet
Contributor

pgayvallet commented Oct 30, 2020

Total number of tags attached to a saved object

Is that the total number of tag objects that are assigned to at least one object?

Total number of user consumable saved objects (may already exist)

The kibana usage collector is already returning this info

Total number of saved objects by type (may already exist)

Not sure about this one. @elastic/kibana-telemetry can you help us here?

Can we differentiate between tag author

At the moment we can't. I'm not sure how we could distinguish user-created from system-created tags. We would need a 'createdBy' meta field, which I did not add.

How important is it to understand tags by spaces? I'm leaning towards less important, but might be a nice to have metric.

TBH, I don't really know if we already have collectors that are doing 'per space' collection / distinction. I don't think we do?

@pgayvallet
Contributor

cc @elastic/kibana-telemetry could you take a look and check if the metrics suggested by @alexfrancoeur make sense to you? Do you see anything we should avoid?

Also, I see that the kibana usage collector that collects the per-type SO counts is using raw ES queries instead of the SO APIs to perform aggregations:

const savedObjectCountSearchParams = {
  index: kibanaIndex,
  ignoreUnavailable: true,
  filterPath: 'aggregations.types.buckets',
  body: {
    size: 0,
    query: {
      terms: { type: TYPES },
    },
    aggs: {
      // (excerpt truncated in the original comment)

Would we be forced to do the same for tag-related SO count metrics, or would it be alright to just perform calls to the SO apis?

How important is it to understand tags by spaces? I'm leaning towards less important, but might be a nice to have metric.

Do you know if we already have per-space metrics / if that makes sense to have some here?

@afharo
Member

afharo commented Nov 10, 2020

Hey! Sorry for the late response!

As @pgayvallet rightly mentioned, we already capture the number of SOs per type in the collector he shared. The output of that collector has the following structure:

{
  dashboard: { total: number };
  visualization: { total: number };
  search: { total: number };
  index_pattern: { total: number };
  graph_workspace: { total: number };
  timelion_sheet: { total: number };
}

Maybe we can extend it to report the number of tags per type?
We could also include a total aggregator that will include all the SOs (not only those 6), but we should be aware that there are many telemetry and system SOs that might affect the %, implying low usage when it's not necessarily true. What do you think @alexfrancoeur?

Would we be forced to do the same for tag-related SO count metrics, or would it be alright to just perform calls to the SO apis?

@pgayvallet this piece of logic was migrated from the Legacy codebase to KP. I'm not a big fan of using the raw index. However, when figuring out the implementation via the SO APIs, we found we might hit the pagination limit (10k docs max). It'd be great if the SO APIs allowed aggregations of some sort.
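To illustrate the pagination concern above, here is a minimal sketch of counting objects per type through a paged "find" call. The `PagedFind` shape and `countByType` helper are hypothetical stand-ins for illustration, not the real Kibana saved objects client; the 10,000 default mirrors Elasticsearch's default `index.max_result_window`.

```typescript
// Hypothetical, simplified paged "find" API (NOT the real SavedObjectsClient).
interface FindPage {
  total: number; // total number of matching documents
  items: Array<{ type: string }>; // documents on the requested page
}
type PagedFind = (page: number, perPage: number) => FindPage;

interface TypeCounts {
  counts: Record<string, number>;
  truncated: boolean; // true when the result window was exhausted before all docs were seen
}

// Counts documents per type by walking pages, stopping once page * perPage
// would exceed the result window (assumed to be 10,000 by default).
function countByType(find: PagedFind, perPage = 1000, maxWindow = 10000): TypeCounts {
  const counts: Record<string, number> = {};
  let page = 1;
  let seen = 0;
  let total = Infinity;
  while (seen < total && page * perPage <= maxWindow) {
    const res = find(page, perPage);
    total = res.total;
    if (res.items.length === 0) break;
    for (const item of res.items) {
      counts[item.type] = (counts[item.type] || 0) + 1;
    }
    seen += res.items.length;
    page++;
  }
  return { counts, truncated: seen < total };
}
```

When `truncated` comes back `true`, the counts are only a lower bound, which is why a server-side aggregation is attractive for this kind of metric.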

Do you know if we already have per-space metrics / if that makes sense to have some here?

AFAIK, the only space-related metrics are counting the number of spaces and the number of disabled features in them. But no SO-count per space apparently.
https://github.com/elastic/kibana/blob/70a91647905f131ad7575a4a8c91993bcf26b7a1/x-pack/plugins/spaces/server/usage_collection/spaces_usage_collector.ts

@pgayvallet
Contributor

@afharo thanks for the pointers

Maybe we can extend it to report the number of tags per each type?

Tags are an xpack / licensed feature. We can't use the same OSS collector here.

when figuring out the implementation via the SO APIs, we found we might hit the pagination limit (10k docs max). It'd be great if the SO APIs allows aggregations of some sort.

Yeah, I kinda figured that would be the answer. Performing such raw queries with the required filters is going to be harder than just filtering / aggregating by type though, so we might still have to use the SO APIs.

Btw regarding aggregations for SOs: it's kinda planned, even if far from ready yet: #64002

@pgayvallet
Contributor

pgayvallet commented Nov 11, 2020

@alexfrancoeur

What % of all objects are tagged?
What % of [dashboard, visualization, etc.] objects are tagged?

These two indicators depend on the following metrics:

Total number of user consumable saved objects (may already exist)
Total number of saved objects by type (may already exist)

Both are already available from the kibana usage collector. To avoid performing duplicate requests, I would not expose these metrics directly from the tag usage collector. This means that to answer "What % of all objects are tagged?", we would need to use data from both collectors. I guess that would be alright.

I would go with the following data structure for the tag collector:

{
  usedTags: number;
  taggedObjects: number;
  types: Record<string, { usedTags: number; taggedObjects: number }>;
}

An example would be:

{
  usedTags: 3,
  taggedObjects: 7,
  types: {
    dashboard: {
      usedTags: 3,
      taggedObjects: 5,
    },
    visualization: {
      usedTags: 1,
      taggedObjects: 2,
    },
  },
}
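As a rough illustration of how the proposed structure could be derived, the sketch below computes it from an in-memory list of objects. The `TaggedObject` input shape and the `computeTagUsage` helper are hypothetical and only show the counting logic; they are not the implementation from #83160, which would query Elasticsearch rather than iterate over objects in memory.

```typescript
// Hypothetical input shape: each saved object lists the ids of the tags it references.
interface TaggedObject {
  id: string;
  type: string;
  tagIds: string[];
}

interface TagUsage {
  usedTags: number; // distinct tags assigned to at least one object
  taggedObjects: number; // objects with at least one tag
}

interface TaggingUsageData extends TagUsage {
  types: Record<string, TagUsage>;
}

function computeTagUsage(objects: TaggedObject[]): TaggingUsageData {
  const allTags = new Set<string>();
  const tagsByType = new Map<string, Set<string>>();
  const objectsByType = new Map<string, number>();
  let taggedObjects = 0;

  for (const obj of objects) {
    if (obj.tagIds.length === 0) continue; // untagged objects are not counted
    taggedObjects++;
    objectsByType.set(obj.type, (objectsByType.get(obj.type) || 0) + 1);
    let typeTags = tagsByType.get(obj.type);
    if (!typeTags) {
      typeTags = new Set<string>();
      tagsByType.set(obj.type, typeTags);
    }
    for (const tagId of obj.tagIds) {
      allTags.add(tagId);
      typeTags.add(tagId);
    }
  }

  const types: Record<string, TagUsage> = {};
  tagsByType.forEach((tags, type) => {
    types[type] = { usedTags: tags.size, taggedObjects: objectsByType.get(type) || 0 };
  });
  return { usedTags: allTags.size, taggedObjects, types };
}
```

Note that the per-type `usedTags` values do not sum to the top-level `usedTags`, because the same tag can be assigned to objects of several types.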

@alexfrancoeur
Author

These two indicators depend on the following metrics

++ that was my assumption as well. I don't think these need to be part of the tag usage collector; they can instead be an analysis done in Kibana later on. What we need here is exactly what you've surfaced.

The example provided LGTM. Thanks for jumping on this @pgayvallet !
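For reference, the later analysis could look roughly like the sketch below, which combines the per-type totals from the kibana usage collector with the per-type tag usage. The `taggedPercentByType` helper is hypothetical, and it assumes both payloads use the same type keys, which may not hold in practice (for example `index_pattern` in the collector output shown earlier in the thread).

```typescript
// Shape of the existing per-type counts, as described earlier in the thread.
interface SavedObjectCounts {
  [type: string]: { total: number };
}

// Per-type part of the proposed tag usage payload.
interface TagUsageByType {
  [type: string]: { usedTags: number; taggedObjects: number };
}

// Hypothetical helper: percentage of objects of each type that carry at least one tag.
// Types missing from the counts, or with a total of 0, are reported as 0%.
function taggedPercentByType(
  counts: SavedObjectCounts,
  tagUsage: TagUsageByType
): Record<string, number> {
  const result: Record<string, number> = {};
  for (const type of Object.keys(tagUsage)) {
    const total = counts[type] ? counts[type].total : 0;
    result[type] = total > 0 ? (100 * tagUsage[type].taggedObjects) / total : 0;
  }
  return result;
}
```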

@pgayvallet
Contributor

Ok great. #83160 is ready then 😄


4 participants