Timeout exception caused fatal error in grok filter #95
Comments
Are you running Logstash in debug mode in that production environment? It seems the uncaught exception happens while logging an event in debug mode: https://github.com/logstash-plugins/logstash-filter-grok/blob/v3.2.2/lib/logstash/filters/grok.rb#L302
No, I am not running in debug mode AFAIK. Please advise me on how to verify if logstash runs in debug mode or not. I believe that message is logged due to a timeout here: https://github.com/logstash-plugins/logstash-filter-grok/blob/v3.2.2/lib/logstash/filters/grok.rb#L304 For some reason, the timeout exception is initialized with an empty |
looking at the rest of the stack trace you showed:
@logger.debug? and @logger.debug("Event now: ", :event => event) which is really strange.
Well yes but I don't think that line 302 is ever reached when this occurs since a TimeoutException is raised. When the exception is logged it fails on line 304:
The reason for this seems to be that |
in the normal code path,
rescue ::LogStash::Filters::Grok::TimeoutException => e
  # These fields aren't present at the time the exception was raised
  # so we add them here.
  # We could store this metadata in the @threads_to_start_time hash
  # but that'd come at a perf cost and this works just as well.
  e.grok = grok
  e.field = field
  e.value = value
  raise e
end
I understand that the other rescue block in grok.rb:304 should be setting |
Previously we could see errors as in logstash-plugins#95 due to some very esoteric race conditions where Thread#raise would raise outside of the rescue context. This patch changes the mechanism to be setting Thread.interrupt which is more robust.
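The race described in that commit message can be illustrated with plain Ruby threads. This is only a sketch under stated assumptions, not the plugin's code: `SketchTimeout` and `deliver_timeout` are hypothetical names, and queues are used to park the worker at a known spot so the outcome is deterministic. It shows that `Thread#raise` is handled only if it arrives while the target thread is still inside the `begin`/`rescue` block; if it arrives a moment later, nothing catches it.

```ruby
require "thread"

# Hypothetical exception type, standing in for the plugin's timeout error.
class SketchTimeout < StandardError; end

# Raises SketchTimeout in a worker thread from the outside.
# late: false -> the worker is still inside its rescue context.
# late: true  -> the worker has already left the rescue context.
def deliver_timeout(late:)
  ready = Queue.new
  gate  = Queue.new # never filled; pop just parks the worker

  worker = Thread.new do
    Thread.current.report_on_exception = false # keep stderr quiet
    outcome = begin
      unless late
        ready << true
        gate.pop # parked inside the rescue context
      end
      :finished
    rescue SketchTimeout
      :rescued
    end
    if outcome == :finished
      ready << true
      gate.pop # parked outside the rescue context
    end
    outcome
  end

  ready.pop                                 # wait until the worker is parked
  worker.raise(SketchTimeout, "timed out")  # asynchronous raise from outside
  begin
    worker.value                            # re-raises if the worker died
  rescue SketchTimeout
    :escaped
  end
end
```

Under these assumptions, `deliver_timeout(late: false)` returns `:rescued` and `deliver_timeout(late: true)` returns `:escaped`. In the second case the exception never passes through the rescue block that would normally fill in its fields.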
Thanks so much for reporting this! #96 should fix it once merged :)
Thanks! I will try out the fix when the next version is released.
Did this fix make the 5.0 release? I'm seeing a similar issue with the 5.0 release. https://discuss.elastic.co/t/grok-terminating-logstash-5-0/64307/2 Edit: it looks like the fix was made against v3.2.3, which is the version bundled with LS 5.0, so it appears to still be happening.
@sjivan thanks for reporting this. This looks like a separate issue involving interrupts during mutexes. It looks like an interrupt during a Mutex raises an unexpected exception type. I'll work on a patch.
Moving this to #97
I am trying out the upcoming 5.0.0 elasticsearch stack and have stumbled upon a problem that I have not encountered before.
The problem is that Logstash fails with a fatal error, undefined method 'pattern' for nil:NilClass>. Before the fatal error occurred, a number of exceptions (7) of type PeriodicPoller: exception occurred, starting at 2016-10-12T18:48:34,436 (~50 minutes before the fatal error).
The latest entries in the log look like this:
The elasticsearch logs have no indication that anything has gone wrong, and kibana can continue to use it to generate graphs etc.
Looking at the grok filter code, it seems that the @grok variable is set to nil in lib/logstash/filters/grok/timeout_exception.rb, causing the message method to throw a NoMethodError.
If you need help with reproducing the problem, I can try to make a smaller data set/config.
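To make the suspected failure concrete, here is a minimal sketch of how an exception whose message method dereferences an unset @grok would fail when logged. It is not the plugin's actual source: `SketchGrok`, `SketchTimeoutException`, `describe`, and the message wording are hypothetical stand-ins, assuming only that the message is built from `@grok.pattern`.

```ruby
# Hypothetical stand-in for a compiled grok object with a pattern.
SketchGrok = Struct.new(:pattern)

# Hypothetical stand-in for the timeout exception; fields are filled in
# after construction, so they start out as nil.
class SketchTimeoutException < StandardError
  attr_accessor :grok, :field, :value

  def message
    # Fails with NoMethodError when @grok was never assigned.
    "Timeout executing grok '#{@grok.pattern}' against field '#{@field}'"
  end
end

# Mimics a logger asking the exception for its message.
def describe(exception)
  exception.message
rescue NoMethodError => e
  "logging failed: #{e.class}"
end

bare = SketchTimeoutException.new # fields never set

filled = SketchTimeoutException.new
filled.grok = SketchGrok.new('%{IP:client}')
filled.field = "message"
```

With these assumptions, `describe(filled)` returns the formatted timeout message, while `describe(bare)` hits the NoMethodError path, which matches the shape of the error reported above.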