[Telemetry] Count of Visualization by Type #22010

Closed
alexfrancoeur opened this issue Aug 15, 2018 · 19 comments
Assignees
Labels
Feature:Telemetry Feature:Visualizations Generic visualization features (in case no more specific feature label is available) Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@alexfrancoeur

When we initially introduced telemetry to Kibana, we wanted to be able to track visualization type usage. This will help us determine usage and popularity of our visualizations, specifically those that are marked as experimental or lab.

We are currently reporting metrics across multiple clusters and I'm not sure of the plans for telemetry in Spaces, but it probably makes sense to report back metrics min, max, avg and total for each visualization type.

Below is an example from the original issue. We should discuss how we want to implement this. For example, should we prefix experimental or lab visualizations with their own label? Or use a boolean value? There are also some missing visualizations, such as vega and input controls.

{
  "cluster_uuid": "blah_blah_blah",
  "stack_stats": {
    "xpack": { },
    "kibana": {
      "count": 3,
      "versions": [
        { "version": "5.3.0", "count": 2 },
        { "version": "5.2.2", "count": 1 }
      ],
      "index_pattern": {"total": 10, "min": 1, "max": 5, "avg": 3.333333333333333},
      "dashboards": {"total": 20, "min": 3, "max": 15, "avg": 6.666666666666667},
      "visualizations": {"total": 30, "min": 5, "max": 20, "avg": 10},
      "saved_searches": {"total": 7, "min": 1, "max": 5, "avg": 2.333333333333333},
      "timelion_sheets": {"total": 1, "min": 1, "max": 1, "avg": 0.333333333333333},
      "daily_reports": {"total": 25, "min": 5, "max": 20, "avg": 8.333333333333333},
      "viz_basic_area": {"total": 1, "min": 1, "max": 1, "avg": 0.333333333333333},
      "viz_basic_heatmap": {"total": 1, "min": 1, "max": 1, "avg": 0.333333333333333},
      "viz_basic_hbar": {"total": 3, "min": 3, "max": 3, "avg": 1},
      "viz_basic_line": {"total": 20, "min": 5, "max": 15, "avg": 6.666666666666667},
      "viz_basic_pie": {"total": 1, "min": 1, "max": 1, "avg": 0.333333333333333},
      "viz_basic_vbar": {"total": 0, "min": 0, "max": 0, "avg":0},
      "viz_data_table": {"total": 1, "min": 1, "max": 1, "avg": 0.333333333333333},
      "viz_data_metric": {"total": 0, "min": 0, "max": 0, "avg":0},
      "viz_map_tile": {"total": 0, "min": 0, "max": 0, "avg":0},
      "viz_map_vector": {"total": 0, "min": 0, "max": 0, "avg":0},
      "viz_ts_timelion": {"total": 3, "min": 3, "max": 3, "avg": 1},
      "viz_ts_vb": {"total": 0, "min": 0, "max": 0, "avg":0},
      "viz_other_markdown": {"total": 0, "min": 0, "max": 0, "avg":0},
      "viz_other_tagcloud": {"total": 0, "min": 0, "max": 0, "avg":0}
    },
    "logstash": { }
  }
}
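As a rough illustration of the min, max, avg and total roll-up described above, here is a minimal sketch. It assumes each Kibana instance reports a flat map of counts; the `InstanceCounts` shape and the `summarize` name are hypothetical and are not part of the telemetry code.

```typescript
// Hypothetical shape: one entry per Kibana instance, mapping a stat name to its count.
type InstanceCounts = Record<string, number>;

interface Summary {
  total: number;
  min: number;
  max: number;
  avg: number;
}

// Roll per-instance counts up into total/min/max/avg for every key seen.
// Assumption: an instance that does not report a key counts as 0 for that key.
function summarize(instances: InstanceCounts[]): Record<string, Summary> {
  const keys = new Set<string>();
  for (const inst of instances) {
    for (const key of Object.keys(inst)) {
      keys.add(key);
    }
  }

  const result: Record<string, Summary> = {};
  keys.forEach((key) => {
    const values = instances.map((inst) => inst[key] ?? 0);
    const total = values.reduce((a, b) => a + b, 0);
    result[key] = {
      total,
      min: Math.min(...values),
      max: Math.max(...values),
      avg: total / values.length,
    };
  });
  return result;
}

// Three instances, mirroring "count": 3 in the example.
const summary = summarize([
  { viz_basic_line: 5, viz_basic_hbar: 3 },
  { viz_basic_line: 15 },
  { viz_basic_line: 0 },
]);
```

Treating a missing type as 0 is a design choice: it pulls `min` down to 0 for any type that one instance does not use. Whether `min` should instead cover only the instances that report a type is one of the details still to be decided.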

We should be able to utilize the new stats API #20577

Something worth noting, experimental visualizations like TSVB are packaged with modules or sample data. So we'd need a way to differentiate visualizations that were loaded with modules and visualizations that were not. Maybe we can prefix visualizations that come from a module, module_viz_tsvb / sample_data_viz_tsvb? Or we could only count visualizations that are new and saved through Kibana? Similar to the way we might handle sample data telemetry. Definitely some things to discuss in more detail. Having this type of data will help if we decide to consolidate editors.

cc: @elastic/kibana-visualizations, @tbragin, @AlonaNadler @jimgoodwin @rayafratkina

@alexfrancoeur alexfrancoeur added Feature:Visualizations Generic visualization features (in case no more specific feature label is available) Feature:Telemetry labels Aug 15, 2018
@markov00
Member

Hey @alexfrancoeur, this is a great idea for sure. Do you think it's feasible to add some more details on each visualization type, like the number of metrics and buckets configured, split series, and split charts? It would be very useful to have such information so we can understand where we need to put our effort: more on basic charts, or more on complex, grouped, split charts.
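To make the request concrete, here is a minimal sketch of how such per-visualization details might be tallied. It assumes each visualization carries a list of aggregation configs with a `schema` field taking values like `"metric"`, `"segment"`, `"group"` (split series) and `"split"` (split chart). Those values, the `AggConfig` shape and the `summarizeAggs` name are assumptions for illustration, not something confirmed in this thread.

```typescript
// Assumed aggregation entry: `schema` says what role the aggregation plays.
interface AggConfig {
  schema: string; // assumed values: "metric", "segment", "group", "split"
}

interface AggSummary {
  metrics: number;
  buckets: number;
  splitSeries: number;
  splitCharts: number;
}

// Count metrics and buckets, and single out split series and split charts.
function summarizeAggs(aggs: AggConfig[]): AggSummary {
  const summary: AggSummary = { metrics: 0, buckets: 0, splitSeries: 0, splitCharts: 0 };
  for (const agg of aggs) {
    if (agg.schema === "metric") {
      summary.metrics += 1;
    } else {
      // Every non-metric aggregation is counted as a bucket.
      summary.buckets += 1;
      if (agg.schema === "group") summary.splitSeries += 1;
      if (agg.schema === "split") summary.splitCharts += 1;
    }
  }
  return summary;
}

const aggSummary = summarizeAggs([
  { schema: "metric" },
  { schema: "segment" },
  { schema: "group" },
  { schema: "split" },
]);
```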

@alexfrancoeur
Author

@markov00 I would love this information, but thought it made sense to start small. Want to add the metrics you're looking for specifically? Let's use this issue to iterate on metrics we'd like to see

@rayafratkina
Contributor

All such data will be very useful, but I wonder if the best approach here is to leave the details up to the developers and add what is easy to include? Per Alex's comment, we don't want to complicate the initial implementation if that will delay it, but if something is easy to include, we should totally do that.

@timroes timroes added the Team:Visualizations Visualization editors, elastic-charts and infrastructure label Sep 13, 2018
@tsullivan tsullivan self-assigned this Nov 9, 2018
@tsullivan
Member

Assigning to myself. Our goal is to use the Task Manager to schedule a task that summarizes the state of visualizations at long intervals.

@tsullivan
Member

Our goal is to use the Task Manager to schedule a task that summarizes the state of visualizations at long intervals.

Blocked on #24356 (Task Manager)

@epixa @njd5475 to keep you in the loop on any work I'd want to do using Task Manager without an Alerting service, this is one of them.

@tsullivan
Member

@alexfrancoeur I'm working on fetching data that will look something like this:

    "visualization_types": {
      "area": {
        "total": 9
      },
      "heatmap": {
        "total": 1
      },
      "histogram": {
        "total": 1
      },
      "input_control_vis": {
        "total": 1
      },
      "line": {
        "total": 8
      },
      "markdown": {
        "total": 2
      },
      "metric": {
        "total": 11
      },
      "metrics": {
        "total": 70
      },
      "pie": {
        "total": 7
      },
      "table": {
        "total": 10
      },
      "timelion": {
        "total": 10
      }
    }

This is the data I'm seeing after loading the Metricbeat saved objects: https://www.elastic.co/guide/en/beats/metricbeat/current/running-on-docker.html#_run_the_metricbeat_setup
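For readers wondering how counts like these could be derived, here is a minimal sketch. It assumes each saved visualization stores a `visState` JSON string whose `type` field names the visualization type; that shape and the `countByType` name are assumptions for illustration, not the actual implementation.

```typescript
// Assumed shape of a saved visualization: `visState` is a JSON string whose
// `type` field names the visualization type (e.g. "area", "metrics").
interface SavedVisualization {
  visState: string;
}

// Tally visualizations by type, skipping documents that cannot be parsed.
function countByType(docs: SavedVisualization[]): Record<string, { total: number }> {
  const counts: Record<string, { total: number }> = {};
  for (const doc of docs) {
    let type: unknown;
    try {
      type = JSON.parse(doc.visState).type;
    } catch (err) {
      continue; // visState is not valid JSON, or parsed to null
    }
    if (typeof type !== "string") continue;
    const entry = counts[type] ?? { total: 0 };
    entry.total += 1;
    counts[type] = entry;
  }
  return counts;
}

const visualizationTypes = countByType([
  { visState: '{"type":"area"}' },
  { visState: '{"type":"area"}' },
  { visState: '{"type":"metrics"}' },
  { visState: "not json" },
]);
```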

@tsullivan
Member

Something worth noting, experimental visualizations like TSVB are packaged with modules or sample data. So we'd need a way to differentiate visualizations that were loaded with modules and visualizations that were not. Maybe we can prefix visualizations that come from a module, module_viz_tsvb / sample_data_viz_tsvb?

I'm afraid I'm not following this comment clearly. Is the concern about getting counts of visualizations that were added by internal tools like in my Metricbeat setup example, right above this comment?

@alexfrancoeur
Author

@tsullivan yes, that was the original concern. Basically, better understanding what users create vs. what they use out of the box. There are probably better ways to determine this though. I'm fine with removing this requirement from the telemetry if it helps speed things along.

In regards to the current structure, this looks good to me overall but I do have one question. Given that we'll have multiple spaces, is there any way to include these metrics per space as well? Something like this:

"visualization_types": {
      "area": {
        "total": 9,
        "space_max": 7,
        "space_min": 0,
        "space_avg": 3
      },
...

If it's something to defer to a future PR, that's fine too. Any visibility we can get into visualizations would be amazing, so I don't want to delay any progress here.

Also, would we have support for all visualization types (TSVB, vega, etc.)? We really want to understand how these are being used, and they would be important to these metrics. I only ask because they weren't in your comment with the Metricbeat saved objects, so I thought I'd confirm.

@AlonaNadler

Thanks for making progress on this @tsullivan. It will be important to also get TSVB: we're really trying to understand the usage of TSVB and how it compares to other visualization types in Kibana. In addition, having the granularity of which visualization within TSVB is used will help as well.
Regarding the sample data and Beats modules, I am a bit concerned, specifically about sample data visualizations, which might impact our analysis if we are counting sample data visualization objects. Any suggestions?
@alexfrancoeur do you think it is important to have it per space? How did you think of using it per space?

@alexfrancoeur
Author

@AlonaNadler as I mentioned in my previous comment, I think of it as a nice-to-have and do not want to block telemetry of visualizations on it. If there are 20 spaces in a Kibana instance, I'm not sure how useful a total number is; instead I would like to better understand what a normal distribution looks like. This applies to more than just visualizations though. Having this type of granularity would be great to better understand environments for our users, but I don't think it is necessary for a V1.

I had a similar concern for sample data and am already running into this issue with Canvas workpads and, soon, Maps maps. When we have #19319 we'll at least be able to determine if clusters either used or are still using sample data.

+1 on TSVB and TSVB granularity (though I understand the complexity there)

@tsullivan
Member

is there any way to include these metrics per space as well? Something like this:

I will look into this. Thanks for bringing that up, it explains the min/max/avg metrics in the original description.

Also, would we have support for all visualization types (TSVB, vega, etc.)?

It would support any visualization that stores a "visualization" type of saved object in the .kibana index. As far as I know, they all do. I don't know if it is possible to implement a visualization type that strays from the schema - I will find out.

BTW, TSVB is the visualization type identified by the keyword metrics (yes I know it is horribly confusing because there is also a metric type). There are 70 of those in my data sample.
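Since `metrics` and `metric` are easy to confuse when reading the raw data, a small lookup can make reports clearer. This is only a sketch: the `metrics` to TSVB pairing comes from the comment above, while the `displayName` helper and its name are hypothetical.

```typescript
// Only the "metrics" -> "TSVB" pairing comes from the thread; everything else
// here is a hypothetical helper for illustration.
const DISPLAY_NAMES = new Map<string, string>([["metrics", "TSVB"]]);

// Return a friendlier label for a visualization type, falling back to the raw keyword.
function displayName(type: string): string {
  return DISPLAY_NAMES.get(type) ?? type;
}

const tsvbLabel = displayName("metrics"); // "TSVB"
const metricLabel = displayName("metric"); // "metric" (the separate metric type)
```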

@alexfrancoeur
Author

BTW, TSVB is the visualization type identified by the keyword metrics (yes I know it is horribly confusing because there is also a metric type). There are 70 of those in my data sample.

Ahh good to know. That was the original name and I believe some advanced settings use this terminology as well. Not too big of a deal because this is internal facing, but something we may want to address in the future @AlonaNadler

Thanks Tim! Again, if it's too many cycles to include spaces I think we can treat it as an enhancement and open a separate issue to track.

@tsullivan
Member

Ok, I found that the space identifier is part of the saved object _id (which is familiar from all the walkthroughs of Spaces that we've had in our team). By breaking up those ID strings, I can find which visualizations are part of the default space, and which are in a special space.

Example of what the stats look like, when I have 1 extra space with 1 area chart visualization:

[Image: screenshot of the resulting stats]

As it is anonymized, it doesn't say much about what is in what space. But my default space has 10 area charts, and my custom space has 1 area chart. So I think this gets us in pretty good shape.
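The ID-splitting approach can be sketched roughly as follows. This assumes that default-space IDs look like `visualization:<id>` and that other spaces prepend `<space>:`. The exact ID format, the `spaceOf` and `statsBySpace` names, and the choice to count a space with none of a given type as 0 are all assumptions for illustration.

```typescript
interface VisDoc {
  id: string; // raw document _id, e.g. "visualization:abc" or "sales:visualization:abc"
  type: string; // visualization type, e.g. "area"
}

interface SpaceStats {
  total: number;
  spaces_min: number;
  spaces_max: number;
  spaces_avg: number;
}

// Assumption: IDs in the default space start with "visualization:"; anything
// before ":visualization:" is the space identifier.
function spaceOf(id: string): string {
  const idx = id.indexOf("visualization:");
  return idx > 0 ? id.slice(0, idx - 1) : "default";
}

function statsBySpace(docs: VisDoc[]): Record<string, SpaceStats> {
  // counts[type][space] = number of visualizations of that type in that space
  const counts: Record<string, Record<string, number>> = {};
  const spaces = new Set<string>();
  for (const doc of docs) {
    const space = spaceOf(doc.id);
    spaces.add(space);
    const perSpace = counts[doc.type] ?? {};
    perSpace[space] = (perSpace[space] ?? 0) + 1;
    counts[doc.type] = perSpace;
  }

  const spaceNames = Array.from(spaces);
  const result: Record<string, SpaceStats> = {};
  for (const type of Object.keys(counts)) {
    const perSpace = counts[type] ?? {};
    // A space with no visualization of this type contributes 0.
    const values = spaceNames.map((s) => perSpace[s] ?? 0);
    const total = values.reduce((a, b) => a + b, 0);
    result[type] = {
      total,
      spaces_min: Math.min(...values),
      spaces_max: Math.max(...values),
      spaces_avg: total / values.length,
    };
  }
  return result;
}

// Mirrors the setup described above: 10 area charts in the default space, 1 in a custom space.
const docs: VisDoc[] = [];
for (let i = 0; i < 10; i++) {
  docs.push({ id: `visualization:${i}`, type: "area" });
}
docs.push({ id: "custom:visualization:10", type: "area" });
const spaceStats = statsBySpace(docs);
```

Because only aggregate numbers are emitted, the output stays anonymized: it says nothing about which space holds which visualizations.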

@alexfrancoeur
Author

@tsullivan this is perfect! Thanks for taking the time to look into it. Thoughts @AlonaNadler?

A quick observation. Do we want spaces_avg instead of space_avg to be consistent with the other fields? Or space_max and space_min alternatively.

@tsullivan
Member

Do we want spaces_avg instead of space_avg to be consistent with the other fields? Or space_max and space_min alternatively.

Good catch, that's a bug with my WIP code. I think the prefix should be spaces_*. Thanks!

@alexfrancoeur
Author

@tsullivan are we good to close now that the PR is merged?

@tsullivan
Member

Closed via #28793

@AlonaNadler

Hi I still can't see TSVB in telemetry, any idea if we can fix that?

@timroes
Contributor

timroes commented Jun 11, 2019

@AlonaNadler When I last checked, you should see metrics as a visualization type inside the data (also shown in the PR Tim references above). metrics is the plugin name of TSVB, so every entry counted under metrics is one TSVB chart. We can't distinguish between different types of TSVB charts, since that's a TSVB-specific detail and not covered by the general vis types at all.

6 participants