Telegraf 1.5.3 and 1.6 Logparser quit logging after logfile is recreated/rotated #3906

Closed
zoom-scott-sancetta opened this issue Mar 19, 2018 · 9 comments
Labels: area/tail, bug (unexpected problem or unintended behavior)

Comments

zoom-scott-sancetta commented Mar 19, 2018

Bug report

Running Telegraf 1.5.3 or 1.6.0 on Red Hat Linux, using the logparser plugin.
Every night certain logfiles are rotated via org.apache.log4j.DailyRollingFileAppender.
Our Telegraf is watching these logs, and when they get renamed, Telegraf stops collecting updates. Maybe Telegraf is holding on to the file handle? I have tried both watch_method = "poll" and watch_method = "inotify".
This was reported and apparently fixed in Telegraf 1.3 (issue #2967).

Relevant telegraf.conf:

[[inputs.logparser]]
name_override = "PartnerAPI_Error_Events"
files = [ "/zoom/logs/events.log" ]
from_beginning = false
watch_method = "inotify"
[inputs.logparser.grok]
...

System info:

Telegraf 1.6.0 on Red Hat Linux

danielnelson added the bug (unexpected problem or unintended behavior) and regression (something that used to work, but is now broken) labels on Mar 19, 2018
danielnelson added this to the 1.6.0 milestone on Mar 19, 2018
zoom-scott-sancetta changed the title from "Telegraf 1.6 Logparser quits logging after logfile is recreated" to "Telegraf 1.5.3 and 1.6 Logparser quit logging after logfile is recreated/rotated" on Mar 21, 2018

danielnelson (Contributor) commented

Possibly related to #2847

dgnorton (Contributor) commented

Same or similar issue reported in the influxdb issue tracker: influxdata/influxdb#9664

bolek2000 commented

After migration to telegraf 1.8 and tail plugin with grok data_format this issue seems to be resolved for me.
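
For readers following along, a migration like the one described above might look roughly like the following. This is only a sketch, not the configuration actually used in this thread: the file path is borrowed from the original report, and the grok pattern is a hypothetical placeholder, since the real patterns were never posted.

```toml
# Sketch only: tail input using the grok data format in place of inputs.logparser.
[[inputs.tail]]
  # Path taken from the original report; substitute your own log file.
  files = ["/zoom/logs/events.log"]
  from_beginning = false
  # "inotify" or "poll", the same two choices tried in the original report.
  watch_method = "inotify"
  data_format = "grok"
  # Hypothetical placeholder pattern; replace with patterns matching your log format.
  grok_patterns = ["%{GREEDYDATA:message}"]
```

Whether this avoids the rotation problem is not guaranteed; it is simply the setup reported as working in the comment above.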

bolek2000 commented Mar 23, 2019

I still see this in 1.10 (with inotify and epoll), but not every time; the behavior is somewhat random. Sometimes it also ingests only some of the metrics.

sjwang90 (Contributor) commented

@bolek2000 Is this a consistent problem in the latest Telegraf?

bolek2000 commented

I am not sure whether this problem is solved, because we built a workaround that is still in place. We have our own nginx access log preprocessing script, which sorts time slices of buffered logs chronologically and then appends the sorted lines to a new logfile, which is ingested by the inputs.tail plugin. Whenever we rotate or truncate the logfile, we run systemctl reload telegraf to make sure the agent/plugin recognises the rotation. I'm not sure if I will have the time to test this again.

pprkut commented Oct 21, 2020

We have integrated the telegraf restarting into an icinga check. Over a set of 100 servers, every day some of them need telegraf to be restarted because it stopped parsing the logs. We're on telegraf 1.15.3 now.

pprkut commented Nov 3, 2020

Updated to 1.16.0 and still seeing log messages like

E! [inputs.logparser] Error stopping tail on file /var/log/secure

So the fixes in 1.16.0 at least don't fix this particular issue.

MyaLongmire (Contributor) commented

Closing this as logparser is deprecated.
