streams: Panic on collecting failed records #41
Comments
mumoshu added a commit to mumoshu/awsbeats that referenced this issue on Jul 11, 2018
mumoshu added a commit that referenced this issue on Jul 12, 2018
* fix(streams): Panic on collecting failed records
  Fixes #41
* fix(ci): Fix travis build failures due to recent backward-incompatible changes in libbeat
  Fixes https://travis-ci.org/s12v/awsbeats/builds/402503570
* test(streams): Add a test-case for the fix
Merged
mumoshu pushed a commit that referenced this issue on Jul 25, 2018
These are similar fixes to what has been done in the streams plugin. Please see the commit d0db8d4, as it looks like the same problem from #41, but I think this is a slightly neater way of handling it. We've been running this branch for 3 weeks now and it's gone from crashing very often to not crashing at all.

Closes #39

Changelog:

* Properly format JSON for Firehose
  This was already done for streams in a086eea.
* Fix panic on Firehose ack/retry
  Not entirely sure what the problem is, but we've seen panics from multiple places in the code. Mostly copying changes that were made to the streams client in ce91e04 and hoping it helps.
* Fix nil dereference in Firehose failed responses
  The test fails with the old function and passes with the new one. I haven't seen the actual responses from the API, but I suspect that when some records failed and others passed, only the ones that failed have an ErrorCode. So it needs to check if `r.ErrorCode != nil` before checking the value. It seems the `aws.StringValue` helper function does that and also removes the need for the other nil check.
I started to see persistent panics like the below in my production deployment:
This indicates that we're receiving records with unexpected structures from Kinesis streams. I have no reference for the concrete specification of Kinesis records, so I'm unable to properly "fix" it.
In the meantime, I'd like to make awsbeats degrade gracefully: give up retrying failed records with unexpected structures, and just leave a log message to help further investigation.