basicstats aggregator - Add filtering based on fields #3402

Closed
xkilian opened this issue Oct 27, 2017 · 8 comments

@xkilian
xkilian commented Oct 27, 2017

The basicstats aggregator, as proposed by @toni-moreno, creates an additional set of statistics for every metric generated by Telegraf. You can configure it not to send the original metrics, but that is a very blunt control.

Use case
Using the basicstats aggregator to get actionable/alertable values for volatile metrics (cpu, diskio, context switches, memory usage, number of emails received/sent, etc.).

First, only about 5-10% of collected metrics are volatile and need 5-minute aggregation.
Second, for actionable metrics, only the mean or stdev is of interest.

Feature Request

Implement a simple filter based on fields, as implemented in the Histogram Aggregator:
only aggregate the listed fields or the fields matching a glob pattern.

fields = ["Percent_Processor_Time","Context_Switches_persec"]

Bonus feature request: an option to limit the new values to a single calculation (stdev, median, or all).

metric = "median" or metric = "all"

This will make the new aggregator much more flexible for hosts that generate lots of metrics.

@danielnelson
Contributor

You can use the measurement filtering options to select which metrics are added to an aggregator. Metrics that do not match are still sent onwards to the outputs. We probably need to improve the documentation around this.
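As a minimal sketch of that behavior, an aggregator restricted with namepass could look like the following. The 5m period and the `win_cpu` measurement name are assumptions chosen for illustration. Note that namepass matches measurement names, while fieldpass matches field names.

```toml
# Sketch only: period and measurement name are illustrative assumptions.
[[aggregators.basicstats]]
  # Emit aggregated values once per period.
  period = "5m"
  # Keep forwarding the original metrics to the outputs.
  drop_original = false
  # Only metrics whose measurement name matches are handed to the aggregator.
  # Metrics that do not match skip the aggregator and still reach the outputs.
  namepass = ["win_cpu"]
```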

I think the idea to control which stats are calculated is a good one. It could look like:

stats = ["stdev", "s2"]

If unset it would do all.
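Put together, a configuration using the proposed option might look like the sketch below. The `stats` option is only a proposal at this point in the thread, so treat the option name and the stat names (`mean`, `stdev`) as assumptions rather than a confirmed API; the period is also illustrative.

```toml
# Sketch only: `stats` is the option proposed above, not a confirmed setting.
[[aggregators.basicstats]]
  period = "5m"
  drop_original = false
  # Hypothetical: compute and forward only these statistics.
  # Leaving `stats` unset would compute all of them, as described above.
  stats = ["mean", "stdev"]
```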

@danielnelson
Contributor

I updated the documentation to better describe what happens when you use the measurement filtering parameters with aggregators. Let me know if there is something else we should add.

@xkilian
Author

xkilian commented Oct 31, 2017

Thank you for the updated documentation. I was using the namepass and fieldpass options, which were dropping all metrics, so I was under the impression that the filtering was not working.

namepass = ["Percent_Processor_Time"]
or
fieldpass = ["Percent_Processor_Time"]

Neither of these measurement filters works in an aggregator (basicstats or minmax) for data from the Windows input [[inputs.win_perf_counters.object]].

Unfortunately, I cannot for the life of me figure out what the actual field names are for the Windows inputs.
Is there a way to know what the measurement filtering is acting against?

@xkilian
Author

xkilian commented Oct 31, 2017

You have my thumbs up for the way you propose to select which statistical functions to apply and forward.

stats = ["stdev", "s2"]

@xkilian
Author

xkilian commented Oct 31, 2017

For data from the Windows input [[inputs.win_perf_counters.object]].

This works as expected:
[[aggregators.minmax]]
period = "60s"
drop_original = false
fieldpass = ["*"]

These do not work.
fieldpass = ["Percent*"]
fieldpass = ["*Time"]
fieldpass = ["*time"]

[[inputs.win_perf_counters.object]]
# Processor usage, alternative to native, reports on a per core.
ObjectName = "Processor"
Instances = ["*"]
Counters = ["% Idle Time", "% Interrupt Time", "% Privileged Time", "% User Time", "% Processor Time"]
Measurement = "win_cpu"
#IncludeTotal=false #Set to true to include _Total instance when querying for all (*).

The expected names to use with namepass and fieldpass are Percent_Idle_Time, Percent_Processor_Time, etc.

This issue should be reopened, as aggregator measurement filtering does not work for Windows perf counters. Alternatively, I can create a new issue specifically for Windows perf counters, since this one references the basicstats plugin.

@danielnelson
Contributor

The way I usually check the actual field names is by running with --test, or when using a service input I will run with the file output for a minute.

I used a socket_listener input and the basicstats aggregator configured like this:

[[inputs.socket_listener]]
  service_address = "udp://:8094"

[[aggregators.basicstats]]
  period = "10s"
  drop_original = false
  fieldpass = ["*Time"]

[[outputs.file]]
  files = ["stdout"]

Then I write the following point to the socket_listener. This is just an easy way to add arbitrary test data; on Windows it might be easier to use the http_listener or another input.

echo "test Percent_Processor_Time=123,Something_Else=42" | nc -u localhost 8094

The output to the file output was:

test Percent_Processor_Time=123,Something_Else=42 1509474476939113369
test Percent_Processor_Time_count=1,Percent_Processor_Time_min=123,Percent_Processor_Time_max=123,Percent_Processor_Time_mean=123 1509474485000000000

This is what I expected: the non-matching field was not aggregated.

@danielnelson
Contributor

I opened the stat limiting request as a new issue.

@xkilian
Author

xkilian commented Oct 31, 2017

Thank you for opening the stats issue.

As for the Windows issue, I checked with --test and the name is, as expected, Percent_Processor_Time. It would be great if someone could test namepass or fieldpass with actual data from a Windows perf counter to confirm my issue.
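
For anyone willing to try, a minimal reproduction config might look like the sketch below. It is untested and simply combines pieces already posted in this thread: the win_perf_counters object (reduced to a single counter), the minmax aggregator with an exact-name fieldpass, and the file output writing to stdout. The `Instances = ["*"]` value is an assumption based on the plugin's sample configuration.

```toml
# Untested reproduction sketch assembled from the configs in this thread.
[[inputs.win_perf_counters]]
  [[inputs.win_perf_counters.object]]
    ObjectName = "Processor"
    # Assumption: query all instances, as in the plugin's sample config.
    Instances = ["*"]
    Counters = ["% Processor Time"]
    Measurement = "win_cpu"

[[aggregators.minmax]]
  period = "60s"
  drop_original = false
  # Exact field name as reported by --test.
  fieldpass = ["Percent_Processor_Time"]

[[outputs.file]]
  # Print every metric so original and aggregated points can be compared.
  files = ["stdout"]
```

If the filter works, the stdout output should contain win_cpu points carrying Percent_Processor_Time_min and Percent_Processor_Time_max in addition to the original points. If only the original points appear, that would confirm the problem described above.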
