telegraf should be able to report its version #5958

Closed
nisley opened this issue Jun 5, 2019 · 20 comments · Fixed by #6216
Labels
feature request Requests for new plugin and for new features to existing plugins
Comments

@nisley

nisley commented Jun 5, 2019

Feature Request

As an operator, I should be able to query InfluxDB (or the backend datastore) to understand how many telegraf agents of each version are reporting in.

Proposal:

Current behavior:

Desired behavior:

Use case:

If I have 4,000 telegraf agents on 1.X and I'm performing an upgrade to 1.Y, I would like to use InfluxDB to check whether all 4,000 agents are running 1.Y. If they are not all on 1.Y, I want to know how many, and which ones, are still on 1.X.

@nisley nisley changed the title from "telegraf should be able to report it's version" to "telegraf should be able to report its version" Jun 5, 2019
@danielnelson danielnelson added the feature request label Jun 5, 2019
@danielnelson danielnelson added this to the 1.12.0 milestone Jun 5, 2019
@poulhs

poulhs commented Jun 5, 2019

It could be achieved with a tag that is "hard-coded" in the configuration, but that feels wrong: it would depend on configuration that may not be changed when telegraf is upgraded. One could also implement it with an exec plugin that parses the output of telegraf --version... argh...

Wouldn't this fit nicely in [[inputs.internal]]?

@danielnelson
Contributor

Yeah, I'm thinking we add it as a tag to the internal_agent measurement:

internal_agent,host=example.org,version=1.10.4 gather_errors=0i,metrics_dropped=0i,metrics_gathered=1i,metrics_written=0i 1559775442000000000

Probably the best way to get this info today for most is by looking at the User-Agent header from the InfluxDB output.
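For anyone consuming that output outside InfluxDB, here is a minimal sketch of pulling the proposed version tag out of such a line and tallying agents per version. It is only an illustration: `parse_tags`, `count_versions`, and the second sample host are made up, and the parser assumes there are no escaped spaces or commas in the measurement or tags.

```python
from collections import Counter


def parse_tags(line):
    """Return (measurement, tags) for one line-protocol point.

    Simplified on purpose: assumes no escaped spaces or commas
    in the measurement name, tag keys, or tag values.
    """
    series = line.split(" ", 1)[0]  # e.g. "internal_agent,host=...,version=..."
    measurement, *tag_pairs = series.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    return measurement, tags


def count_versions(lines):
    """Count internal_agent points per version tag value."""
    counts = Counter()
    for line in lines:
        measurement, tags = parse_tags(line)
        if measurement == "internal_agent":
            counts[tags.get("version", "unknown")] += 1
    return counts


# The first line is the example from this comment; the second host is made up.
SAMPLE = [
    "internal_agent,host=example.org,version=1.10.4 gather_errors=0i,metrics_dropped=0i,metrics_gathered=1i,metrics_written=0i 1559775442000000000",
    "internal_agent,host=other.example.org,version=1.11.0 gather_errors=0i,metrics_dropped=0i,metrics_gathered=1i,metrics_written=0i 1559775442000000000",
]
```

Calling `count_versions(SAMPLE)` gives one agent per version here; points without a version tag are counted under "unknown".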

@Esity

Esity commented Jun 6, 2019

@danielnelson Can we add a flag for this one inside the config?

[[inputs.internal]]
  ## If true, collect telegraf memory stats.
  # collect_memstats = true
  ## If true, collect telegraf version number as a tag
  # collect_version = true

@poulhs

poulhs commented Jun 6, 2019

It would be preferable to have the version as a field value, so that one can compare...
(so a tag would be sufficient if it were possible to do "SELECT host,version WHERE version < '7.9.13'" to get a report of all versions before a given version)

@Esity

Esity commented Jun 6, 2019

Are you trying to report on this? If so, get a CMDB tool to do that. If you want to show internal metrics for telegraf and group by the version, then the tag makes perfect sense

@danielnelson
Contributor

Can we add a flag for this one inside the config?

I wasn't planning to add a flag; I usually only add one if it will reduce the collection time significantly. If you want to exclude it, you could just use tagexclude.

version < "7.9.13"

This would be nice, especially if it could handle semantic versions correctly. I don't think we can even do lexical string comparisons with InfluxQL, but you can use a regular expression with a tag:

select * from internal_agent where version =~ /^7\..*/
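If the comparison has to respect version ordering, one option is to do it client-side on the tag values returned by the query. This is a rough sketch, not something InfluxQL or Telegraf provides: `version_key` and `older_than` are hypothetical helpers, and they assume plain dotted numeric versions with no pre-release suffix.

```python
def version_key(version):
    """Turn "1.10.4" into (1, 10, 4) so versions compare numerically.

    Assumes plain dotted numeric versions; suffixes such as "-rc1"
    are not handled and would raise ValueError.
    """
    return tuple(int(part) for part in version.split("."))


def older_than(versions, limit):
    """Return the versions strictly below limit, oldest first."""
    limit_key = version_key(limit)
    older = [v for v in versions if version_key(v) < limit_key]
    return sorted(older, key=version_key)


# A plain string comparison sorts "7.10.0" before "7.9.13",
# because "1" < "9" at the first differing character.
# The tuple comparison orders them correctly.
```

For example, `older_than(["7.10.0", "7.9.2", "7.9.13", "6.0.1"], "7.9.13")` keeps only 6.0.1 and 7.9.2.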

@poulhs

poulhs commented Jun 6, 2019

Afaik, the (so-called) "regular expression" is expanded into a list of possible combinations and is de facto version = 7.0 OR version = 7.0.1 OR and so forth.

And one has to select field values, but I don't need any of the other field values to determine which versions are out there (if version is a tag, one can't do "SELECT version FROM internal_agent").

And yes, we do use a CMDB that tells us most of it, but there are still too many corner cases...

But now the developers have my input, and I don't feel like starting a flame war here, so I'll let them decide and will be happy with a tag if that is what they think is best.

happy debugging ;-)

@nicgrobler
Contributor

I wrote a plug-in that simply exports the version to an ENV on startup... then sends influx this info as a tag... only on startup (not per collection interval). This way you can see when the agent was started, and its version.
Never bothered submitting a PR as I didn’t think anyone else needed this...

Also wasn’t sure that my ‘solution’ was acceptable.

If you think it is worth it, I can open a PR for this?

@danielnelson
Contributor

@nicgrobler Sure, it would be helpful to take a look at. I do think that if we go the route of having Telegraf report state change events, we should probably add a few more events, like reload and shutdown, and possibly some additional information. We should probably think a bit more about what that might look like.

@CHANDU677493

Hi Nisley and all,

How can I get the telegraf version number from our existing domain servers where telegraf is already installed? Can you please provide a quick solution here?

@poulhs

poulhs commented Dec 18, 2020

Some configuration like this:

[[inputs.internal]]
  ## If true, collect telegraf memory stats.
  collect_memstats = true
  interval = "120s"

or telegraf --version
should do the trick.
I think the configuration is supported from telegraf >= 1.12

@CHANDU677493

CHANDU677493 commented Dec 18, 2020 via email

@poulhs

poulhs commented Dec 18, 2020

Can I have the perfect configuration file to get the telegraf version ?

Well, your perfect may differ from my perfect (mine may even differ between the servers I configure...)
But in my perfect world, we use /etc/telegraf/telegraf.d/inputs_internal.cfg, which telegraf includes:

[[inputs.internal]]
  ## If true, collect telegraf memory stats.
  collect_memstats = true
  interval = "120s"

@CHANDU677493

CHANDU677493 commented Dec 18, 2020 via email

@poulhs

poulhs commented Dec 18, 2020

https://docs.influxdata.com/telegraf/v1.16/ to get a better understanding of telegraf
and https://docs.influxdata.com/telegraf/v1.16/administration/configuration/ to understand how to configure telegraf...

@CHANDU677493

CHANDU677493 commented Dec 18, 2020 via email

@poulhs

poulhs commented Dec 18, 2020

OK, then can you tell me where the telegraf configuration lives on your existing domain servers where telegraf is already installed?

Or just log in to all the servers and issue telegraf --version
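If logging in by hand is too tedious, the same check can be scripted on each server. The sketch below is only an illustration: `parse_version` and `local_telegraf_version` are made-up helper names, and the parser assumes that the command prints a dotted X.Y.Z number somewhere in its output, which I have not checked against every release.

```python
import re
import subprocess

# Assumption: the output contains a dotted X.Y.Z version number somewhere.
VERSION_RE = re.compile(r"(\d+\.\d+\.\d+)")


def parse_version(output):
    """Return the first X.Y.Z number found in output, or None."""
    match = VERSION_RE.search(output)
    return match.group(1) if match else None


def local_telegraf_version():
    """Run `telegraf --version` on this machine and parse the result.

    Raises FileNotFoundError if telegraf is not on the PATH and
    subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        ["telegraf", "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_version(result.stdout)
```

How the script gets run on every server (ssh loop, configuration management, and so on) is left out, since that depends on the environment.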

@poulhs

poulhs commented Dec 18, 2020

To enable the internal input plugin, find the lines

  # # Collect statistics about itself
  # [[inputs.internal]]
  #   ## If true, collect telegraf memory stats.
  #   # collect_memstats = true

and change them to:

  # # Collect statistics about itself
  [[inputs.internal]]
  #   ## If true, collect telegraf memory stats.
  #   # collect_memstats = true

Yes, that was "remove the # comment marker".
Then restart telegraf.
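If the same change has to be made on many servers, the edit can be scripted. This is a small sketch of the uncommenting step only, operating on the configuration text; `enable_internal` is a hypothetical helper, and it assumes the header appears exactly as a commented-out `[[inputs.internal]]` line, as in the default configuration shown above.

```python
import re

# Matches a line that is only a commented-out [[inputs.internal]] header,
# with optional leading indentation.
COMMENTED_HEADER = re.compile(
    r"^([ \t]*)#[ \t]*(\[\[inputs\.internal\]\])[ \t]*$",
    re.MULTILINE,
)


def enable_internal(config_text):
    """Uncomment the first '# [[inputs.internal]]' header.

    Everything else, including the commented option lines below
    the header, is left untouched.
    """
    return COMMENTED_HEADER.sub(r"\1\2", config_text, count=1)


SAMPLE = (
    "# # Collect statistics about itself\n"
    "# [[inputs.internal]]\n"
    "#   ## If true, collect telegraf memory stats.\n"
    "#   # collect_memstats = true\n"
)
```

`enable_internal(SAMPLE)` returns the same text with only the header line uncommented; writing the file back and restarting telegraf are still separate steps.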

@CHANDU677493

CHANDU677493 commented Dec 18, 2020 via email

@poulhs

poulhs commented Dec 19, 2020

You failed to post an image, so I don't even know your operating system (although it sounds like a non-Unix-like variant).
I am not trying to be rude, so please don't be offended by the next questions; they are the kind of questions I was asked in the past and still ask myself whenever I come across similar problems.
I tried to guide you as far as I can. Do you know where the configuration files are located on your systems? Do you know if they are controlled by any kind of configuration management tool? Have you read any of the documentation on telegraf configuration? I suggest that you try the InfluxDB community on Slack; there is a good telegraf channel.

To get the telegraf version, just supply the version option on the telegraf command in a text-based shell.
You may also gather the version (and other stuff) as metrics (passed on to your InfluxDB (or whatever you are using)) by enabling inputs.internal in the usual way (similar to how you would make any configuration change to telegraf).
