adds an alternate windows performance counter input plugin #1629
@politician thank you for the contribution but I don't think I can merge this in its current state. If you would like to change the windows perf counters, we will need to do a straight replacement. Windows support is still in an "experimental" state so currently it's OK to make breaking changes to the measurement schema. FWIW, I completely support the idea behind this PR, but I was under the impression that windows users prefer the verbose and complicated names. This being said, we will need to open up a discussion and get input from other cc @TheFlyingCorpse @butitsnotme @ricardclau @steverweber @elvarb @cwegener @G-regL please let us know your thoughts on normalizing the |
I'm inclined to agree with you Cam. I'm a user of both platforms in my environment, and I'm happy to use the names provided by each. I can't speak to the use of Influx as a TSDB, but with Graphite, I use a relay to rewrite metrics names I don't like into ones I do. I also use I actually rather like the current plugin. That said, if I had to make a change, it would be geared towards the source of the metrics. |
@G-regL if you use the regular plugins (inputs.cpu, inputs.mem, etc.), those should use WMI. The reason I decided to default to |
@sparrc
|
@G-regL have you tried using a PowerShell script from the |
@politician, no. I hadn't thought of it, and now that I have, I think it would be slower than having something built-in. Something to test though I suppose. |
We run a mixed environment of Mac, Windows, and Linux systems... it would be best if the metric names were uniform across all the OS types. This would simplify queries to display the data... +1 |
I am inclined to believe that telegraf should produce as close to the same set of metrics across all platforms as possible, including using the same names. This means less re-work of the data to be able to compare across platforms. @politician I have a pull request in progress which will remove carriage returns allowing the data from a powershell script (or other program) to be processed on Windows. See pull request #1606. |
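To illustrate the carriage-return problem mentioned above, here is a minimal sketch in Python. It is hypothetical and is not the code from pull request #1606; the sample output and the function name are assumptions. A script on Windows typically emits CRLF line endings, and a parser that splits only on `\n` would leave a stray `\r` at the end of each line.

```python
def normalize_line_endings(raw: str) -> list[str]:
    """Split script output into lines, dropping Windows carriage returns."""
    # str.splitlines() treats \r\n, \r and \n (among others) as line boundaries,
    # so no trailing \r survives; blank lines are filtered out.
    return [line for line in raw.splitlines() if line.strip()]


# Hypothetical output from a PowerShell script, with CRLF endings.
sample = "cpu usage_idle=97.5\r\nmem available_bytes=1024\r\n"
lines = normalize_line_endings(sample)
```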
Thanks for the quick replies - it seems like there are two camps forming: folks who like origin names or can post-process metrics, and folks that prefer uniform names or don't want to post-process metrics. This discussion might be raising the need for general transform plugins that can alter/regroup metrics before they're emitted to an output plugin. But short of that sort of large change, here are a couple of other ideas that I was playing around with before settling on the
That said, I suppose I could make the field rewriting aspect optional. It doesn't sound like the series minimization code is contentious. |
https://github.com/mozilla-services/heka |
@steverweber I'll admit to never getting heka to actually work. On Windows or Linux, even with the default "Hello, World" example. On the other hand, |
i also found heka kinda frustrating to get working... that's why i'm here :) Seemed more simple. |
Graphite powershell https://github.com/MattHodge/Graphite-PowerShell-Functions has this feature of renaming metrics, it's just in the powershell code but very easy to modify there. It is one way of doing this, have telegraf rename metrics before they are sent. Regarding unified naming conventions between platforms I would be extremely cautious. Not all platforms report basic metrics in the same format, cpu load for example. |
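The "rename before sending" approach can be sketched roughly as a lookup table applied to each batch of metrics. This is a hypothetical Python illustration, not code from Graphite-PowerShell-Functions or telegraf; the table contents and the `rename_metrics` helper are assumptions made for the example.

```python
# Hypothetical rename table: Windows counter path -> friendlier metric name.
RENAMES = {
    r"\Processor(_Total)\% Idle Time": "cpu.usage_idle",
    r"\Memory\Available Bytes": "mem.available_bytes",
}


def rename_metrics(metrics: dict[str, float]) -> dict[str, float]:
    """Return a copy of metrics with known names rewritten; others pass through."""
    return {RENAMES.get(name, name): value for name, value in metrics.items()}


collected = {
    r"\Processor(_Total)\% Idle Time": 97.5,
    r"\Memory\Available Bytes": 1024.0,
    r"\System\Processes": 80.0,  # no rename rule, kept under its original name
}
renamed = rename_metrics(collected)
```

Names without a rule pass through unchanged, so a partial table does not drop data.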
@steverweber please keep on-topic, your opinion about merging telegraf & heka has been heard many times by the telegraf committee (of one).....I think you can guess by now that it's not going to happen. |
@politician what is an I would support having the ability to specify arbitrary WQL statements and lastly, remember that most of the regular system plugins work and produce the same names as the linux plugins (inputs.cpu, inputs.mem, etc). These were not made the default because WMI is resource-intensive. if anyone has time & expertise to rewrite the code behind these to use windows perf counters instead of WMI, I'm sure that @shirou would appreciate it a lot: https://github.com/shirou/gopsutil |
@sparrc A primary use case is monitoring the standard Windows HTTP server, IIS. So, I briefly considered building a dedicated plugin for it (cf. nginx, etc). |
My preference is for the format suggested here, but we would need to replace Having a plugin-like interface for modifying metrics as they pass through the system is in the pipeline, and a high priority. |
Sorry about the delay answering here. In our case, we never compare Linux and Windows metrics as they do completely different things in our setup so win_perf_counters is totally fine for us. I agree the names are a bit cumbersome but this is just how Windows stores them. On the other hand, I agree, it is very difficult to show the same metric (even something as simple as Free Memory) for both Win and Linux hosts in the same Grafana dashboard. If you ask me, I am happy with the way win_perf_counters plugin works but if you go ahead with this new plugin (which makes total sense, as Windows support is experimental) I would appreciate some comments with an easy migration guide for the telegraf.conf files. We have hundreds of servers reporting metrics to our Grafana / InfluxDB setups and CI/CD pipelines to generate and install dashboards and this change can be a bit tricky for us :) |
Hi everybody. I would like to contribute to this discussion. We are currently working with Graphite Powershell https://github.com/MattHodge/Graphite-PowerShell-Functions . We now rename metric names to something more user friendly, and we would very much like to have this capability in telegraf as well. (I think this is really important for users like us who will need to migrate from Graphite Powershell to telegraf in the future.) We also need a way to get data from other Windows sources, like WMI. For example, we need to get the total physical memory in the system (not available in native performance counters on Windows 2008/2012 servers). Thank you very much. |
Hi all, gopsutil author here. I noticed lxn/win no longer uses cgo. Since gopsutil has a "pure golang" policy, I could not use lxn/win before, but that now looks to have changed. I am thinking about changing gopsutil to use lxn/win, but if someone makes a PR, I would really appreciate it. (Sorry, not directly related to telegraf itself) |
@sparrc It sounds like there is a general consensus in favor of the following:
There hasn't been enough discussion to develop a consensus around the following questions:
I'd love to take a look at the progress on this - can you point me to any commits? |
Can't this be done already via configuration?
I'm not sure....what would be the benefit? can you provide an example of how that would look vs. the current plugin?
there is none so far |
I should have been more specific. The current plugin will coalesce points by objectname via configuration, but
The proposed The sample below is slightly modified from the README.md.
# A plugin to collect stats from Windows Performance Counters
[[inputs.wpc]]
  ## If the system being polled for data does not have a particular Counter at startup
  ## of the Telegraf agent, it will not be gathered.
  # Prints all matching performance counters (useful for debugging)
  # PrintValid = false
  [[inputs.wpc.template]]
    # Processor usage, alternative to native.
    Counters = [
      # Use double-backslashes to work around a TOML parsing issue.
      [ "usage_idle", "\\Processor(_Total)\\%% Idle Time" ],
      [ "usage_user", "\\Processor(_Total)\\%% User Time" ],
      [ "usage_system", "\\Processor(_Total)\\%% Processor Time" ],
      [ "available_bytes", "\\Memory\\Available Bytes" ]
    ]
    Measurement = "win_system"
    # Print out when the performance counter is missing from object, counter or instance.
    # WarnOnMissing = false
The current |
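As a rough illustration of the template idea in the sample above (a hypothetical Python sketch, not the plugin's actual Go code): each template pairs a short field name with a counter path and collects all of them into one point under a single measurement. The `Template` class, the `build_point` helper, and the fake counter values are assumptions made for the example. Counter paths are written here with a single `%`, as Windows itself names them.

```python
from dataclasses import dataclass


@dataclass
class Template:
    measurement: str
    counters: list[tuple[str, str]]  # (field name, counter path)


def build_point(template, read_counter):
    """Collect every readable counter in the template into a single point."""
    fields = {}
    for field_name, path in template.counters:
        value = read_counter(path)
        if value is not None:  # skip counters that are missing on this host
            fields[field_name] = value
    return {"measurement": template.measurement, "fields": fields}


# Stand-in for a real performance counter query (hypothetical values).
FAKE_COUNTERS = {
    r"\Processor(_Total)\% Idle Time": 97.5,
    r"\Memory\Available Bytes": 1024.0,
}

win_system = Template(
    measurement="win_system",
    counters=[
        ("usage_idle", r"\Processor(_Total)\% Idle Time"),
        ("available_bytes", r"\Memory\Available Bytes"),
        ("usage_user", r"\Processor(_Total)\% User Time"),  # absent from FAKE_COUNTERS
    ],
)
point = build_point(win_system, FAKE_COUNTERS.get)
```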
@politician putting all of your fields into a single measurement is not an encouraged way to set up your influxdb schema. It's important to note that it's exponential but where
So adding just a few more series to separate CPU and memory usage shouldn't have any significant impact. The other consideration is that if the fields are part of the same series, then they can never be differentiated from each other, meaning that you can't separate out CPU usage of the various CPUs. It works if you are only differentiating based on the hostname, but it falls apart if you want any more granularity beyond that. |
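The point about granularity can be sketched as follows. This is a hypothetical Python illustration that assumes a series is identified by its measurement plus its tag set; the measurement name, the tag names, and the `series_keys` helper are made up for the example and are not taken from the plugin.

```python
def series_keys(points):
    """Return the distinct (measurement, sorted tag set) pairs in points."""
    return {(p["measurement"], tuple(sorted(p["tags"].items()))) for p in points}


# Per-CPU points: the instance tag keeps each CPU in its own series.
tagged = [
    {"measurement": "win_cpu", "tags": {"host": "web01", "instance": "0"},
     "fields": {"usage_idle": 90.0}},
    {"measurement": "win_cpu", "tags": {"host": "web01", "instance": "1"},
     "fields": {"usage_idle": 70.0}},
]

# Same data with only a host tag: both points collapse into one series,
# so the two CPUs can no longer be told apart.
untagged = [{**p, "tags": {"host": p["tags"]["host"]}} for p in tagged]
```

Under this assumption, the tagged points form two series and the host-only points form one, which is the loss of per-CPU detail described above.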
I'm closing this for now as I don't want to merge duplicate plugins. If there is something lacking in the current win_perf_counters plugin, the proper way to go about requesting/discussing changes would be to open an issue. Then we can come to a consensus over whether we can introduce breaking changes if they would be of use to the community. |
@sparrc Is there a contrib repository for plugins like this? |
not at the moment, no, Go doesn't have a very good facility for doing this unfortunately. |
After getting Telegraf installed as a Windows Service earlier today, I noticed that the win_perf_counters plugin was generating a large amount of difficult to query series. So, I built this alternate input plugin that works to minimize the number of series generated and simplify queries. From the README.md,
I'm open to suggestions for improving it further.
Required for all PRs: