Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

textfile: fix duplicate metrics error #738

Merged
merged 1 commit into from
Dec 6, 2017
Merged

textfile: fix duplicate metrics error #738

merged 1 commit into from
Dec 6, 2017

Conversation

liwei
Copy link
Contributor

@liwei liwei commented Nov 15, 2017

The textfile gatherer should only be added to gatherer list once.

Closes #704

Copy link
Member

@SuperQ SuperQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks!

Copy link
Contributor

@fpletz fpletz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applying this PR fixed these annoying errors for us.

@WilliButz
Copy link

This actually fixes a critical issue for us, as the machines which have the textfile collector enabled run out of memory and start swapping heavily after a few hours because of the steadily growing log messages. I hope this will be included in the next patch release :)

@@ -34,6 +35,7 @@ import (

var (
textFileDirectory = kingpin.Flag("collector.textfile.directory", "Directory to read text files with metrics from.").Default("").String()
addOnce sync.Once
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: with this being at the package level, addOnce is pretty generic. How about textfileAddOnce?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks reasonable, i will fix and resubmit.

The textfile gatherer should only be added to gatherer list once.

Signed-off-by: Li Wei <[email protected]>
@juliusv
Copy link
Member

juliusv commented Nov 24, 2017

Hmm... At first I wondered "Why is NewTextFileCollector() even called multiple times?". Then I saw that the introduction of request-param-based collector filtering changed the node exporter to re-instantiate every collector on every scrape. Wouldn't it be better to instantiate all the collectors only once at startup and then only filter the registration per scrape? (/cc @siavashs)

Of course, the fact that this constructor modifies the global default gatherer is somewhat of a hack that's necessary because the node exporter's collector module interface doesn't allow a collector to have a method that gets called to extend the default gatherer (which in the case of the textfile collector is necessary because we need to set client-side custom timestamps on the metrics). I wonder if introducing that to the collector.Collector interface is a good idea just for this single one-off collector though.

There's something broken about the current state though (with or without this PR): if the constructor registers the textfile output with the Prometheus default gatherer, and

gatherers := prometheus.Gatherers{
prometheus.DefaultGatherer,
registry,
}
adds the default gatherer to every scrape, then we don't properly filter out the textfile collector when it isn't included in the passed HTTP collect[] filters.

I think maybe that makes the case for introducing that extra custom gatherer method to the collector.Collector interface indeed, instead of stacking one hack ontop of another?

@brian-brazil
Copy link
Contributor

I think maybe that makes the case for introducing that extra custom gatherer method to the collector.Collector interface indeed, instead of stacking one hack ontop of another?

That'd be yet another hack in my opinion. The correct solution is to make the text file collector a normal collector.

@juliusv
Copy link
Member

juliusv commented Nov 24, 2017 via email

@brian-brazil
Copy link
Contributor

brian-brazil commented Nov 24, 2017 via email

@tarokkk
Copy link

tarokkk commented Dec 6, 2017

Is there any info on this PR? Will it be merged or refactored? This bug totally crash the textfile collector feature. We had to rollback this release (although we patched up flags etc according to the 0.15.0 releas) on all machine because node_exporter ate up Memory and started to crash machines...

@SuperQ
Copy link
Member

SuperQ commented Dec 6, 2017

@tarokkk It's not recommended to deploy master branch unless you know what you are doing. Please use the latest official release v0.15.2 that does not contain the collector refactor that triggers this bug.

We are still working out if this is the correct change to eliminate the collection bug with the textfile collector.

@brian-brazil
Copy link
Contributor

👍

@SuperQ SuperQ merged commit 1e9bb4e into prometheus:master Dec 6, 2017
@SuperQ
Copy link
Member

SuperQ commented Dec 6, 2017

Sorry for the delay, we're going to take this solution for now and think about longer term refactoring.

Thanks for fixing it.

@juliusv
Copy link
Member

juliusv commented Dec 8, 2017

@brian-brazil Sorry, the textfile module doesn't actually set custom client-side timestamps. I remembered that wrong. I guess the reason it uses the injection hook is because it's the easiest way to go from text format -> protobuf -> text format? But do we strictly need that, or can we take the protobufs and make normal prometheus.Metric objects out of it (that then ironically get converted back into protobufs and then back into text, haha). Then we could use the normal Update() method to send the metrics.

@brian-brazil
Copy link
Contributor

I guess the reason it uses the injection hook is because it's the easiest way to go from text format -> protobuf -> text format?

Yes, that's why I originally implemented it this way.

But do we strictly need that, or can we take the protobufs and make normal prometheus.Metric objects out of it (that then ironically get converted back into protobufs and then back into text, haha). Then we could use the normal Update() method to send the metrics.

That's what I'd like to do.

@liwei liwei deleted the bug-704 branch December 18, 2017 01:34
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
The textfile gatherer should only be added to gatherer list once.

Signed-off-by: Li Wei <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants