# s3_key_format $INDEX increments per output definition, not per tag/stream of logs #675
@SoamA Sorry, I'm not completely understanding this... can you explain it with an example s3_key_format that has $INDEX, along with the full S3 key names and tag values? And explain the file fragment part some more, thanks! Is the issue that, for each value of $TAG, you want sequential indexes? I should note this bug: #653
Yes, here's the Fluent Bit config.
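Roughly, the relevant OUTPUT section looks like the sketch below; the bucket, prefix, and tag values here are hypothetical placeholders rather than our real ones:

```
[OUTPUT]
    Name            s3
    Match           sparkeventlogs.*
    bucket          my-spark-logs
    region          us-east-1
    s3_key_format   /eventlogs/$TAG[1]/events_$INDEX
    upload_timeout  1m
```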
Note that we're relying on the `$INDEX` variable in `s3_key_format` here.
In the target S3 bucket, this produces the following:
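(Keys are illustrative, reusing the placeholder names from the sketch above; the indexes match the jumps described below.)

```
s3://my-spark-logs/eventlogs/app-001/events_138
s3://my-spark-logs/eventlogs/app-001/events_139
s3://my-spark-logs/eventlogs/app-001/events_141
```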
and
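(again illustrative, for the second stream)

```
s3://my-spark-logs/eventlogs/app-002/events_140
s3://my-spark-logs/eventlogs/app-002/events_143
```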
This is not desirable: in the first case there's a jump from 139 to 141, and in the second a jump from 140 to 143. What we really want is:
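(again with illustrative names)

```
s3://my-spark-logs/eventlogs/app-001/events_1
s3://my-spark-logs/eventlogs/app-001/events_2
s3://my-spark-logs/eventlogs/app-001/events_3
```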
and
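(illustrative)

```
s3://my-spark-logs/eventlogs/app-002/events_1
s3://my-spark-logs/eventlogs/app-002/events_2
```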
i.e. each file upload has its own $INDEX counter, as opposed to a single counter shared amongst multiple file uploads.
and
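(hypothetically, per-stream consecutive indexes that don't start at 1)

```
s3://my-spark-logs/eventlogs/app-001/events_138
s3://my-spark-logs/eventlogs/app-001/events_139
s3://my-spark-logs/eventlogs/app-001/events_140
```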
would also work, since the indexes would still be consecutive within each file. Let me know if that helps clarify the problem.
I think I get it. Are you running on k8s? You have multiple tags processed by a single S3 output, and the $INDEX numbers should be sequential within a tag/stream of logs. Currently it just increments over time within the S3 output. I'll have to take this as a feature request, which I probably won't be able to prioritize soon, sorry. @SoamA you can help by submitting a feature request via AWS Support.

As a short-term workaround, I wonder if there's some way you could have multiple S3 outputs, one for each tag, so that each one has its own $INDEX. Is that possible? How many tags do you have? Could you do some sort of metadata rewrite_tag scheme to change the tags to a small set of meaningful values? (I can help with this if you explain your architecture more.)
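To sketch that workaround, assuming hypothetically that the tags reduce to a small fixed set like sparkeventlogs.app-001 and sparkeventlogs.app-002, you could pin one S3 output per tag, so each OUTPUT maintains its own $INDEX counter:

```
[OUTPUT]
    Name           s3
    Match          sparkeventlogs.app-001
    bucket         my-spark-logs
    region         us-east-1
    s3_key_format  /eventlogs/app-001/events_$INDEX

[OUTPUT]
    Name           s3
    Match          sparkeventlogs.app-002
    bucket         my-spark-logs
    region         us-east-1
    s3_key_format  /eventlogs/app-002/events_$INDEX
```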
Hey @PettitWesley,
Yes, we're on EKS.
Yes, will do. Stay tuned!
Here's the relevant INPUT part of the fluent-bit conf:
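Sketching it as a tail input (the actual plugin, path, and tag in our config may differ; these are placeholder values):

```
[INPUT]
    Name              tail
    Tag               sparkeventlogs.*
    Path              /var/log/spark/eventlogs/*
    Refresh_Interval  5
    Read_from_Head    true
```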
Spark driver processes on an EKS host are configured to write their event logs to a local directory, which Fluent Bit reads. While the Spark driver process is actively generating the logs, the files carry Spark's `.inprogress` suffix.
Submitted feature request in AWS support ticket https://support.console.aws.amazon.com/support/home?region=us-east-1#/case/?displayId=12990900451.
### Describe the question/issue
When the S3 output plugin in Fluent Bit is configured to use the `$INDEX` feature and it's uploading two files to S3 at the same time from the same host, this is what can happen (roughly paraphrasing the Fluent Bit log output):
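Schematically (these are not actual Fluent Bit log lines; the file names and indexes match the description just below), the interleaving looks like:

```
upload part of file1  ->  key ends in file1_1   ($INDEX = 1)
upload part of file1  ->  key ends in file1_2   ($INDEX = 2)
upload part of file2  ->  key ends in file2_3   ($INDEX = 3)
upload part of file1  ->  key ends in file1_4   ($INDEX = 4)
upload part of file2  ->  key ends in file2_5   ($INDEX = 5)
```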
As a result, `file1` is uploaded to S3 as `file1_1`, `file1_2`, `file1_4`, and `file2` is uploaded to S3 as `file2_3`, `file2_5`. In other words, the index in the S3 key fragment for each file does not increase by one, and we cannot even assume it starts at 1 for each file. Is there a way of modifying this behavior so that the index is guaranteed to increase by one for each file being uploaded?

The reason I ask is that we have services consuming these files from S3 (such as the Spark History Server) that expect such an incremental sequence (increasing by 1) embedded in the filenames of the files they consume. If this behavior is not present, they throw an error on the assumption that a fragment is missing.
### Configuration

Using AWS for Fluent Bit 2.31.11.
**Fluent Bit Configuration File**
The full config file is contained in the `fluent-bit-crash-repro.tar` provided as part of our discussions in [all versions] exec input SIGSEGV/crash due unitialized memory [fix in 2.31.12] #661.

**Full Config Map and pod configuration**
Can provide if necessary, but again, it's the same as what was described in [all versions] exec input SIGSEGV/crash due unitialized memory [fix in 2.31.12] #661.
### Fluent Bit Log Output

### Fluent Bit Version Info

### Cluster Details

### Application Details
### Steps to reproduce issue

Use the S3 output plugin. The upload pattern (`s3_key_format`) for the OUTPUT has to include `$INDEX`. Use Fluent Bit to upload two files simultaneously that match this pattern.
### Related Issues