
s3_key_format $INDEX increments per output definition, not per tag/stream of logs #675

Open
SoamA opened this issue Jun 8, 2023 · 5 comments
Labels
enhancement Feature request or enhancement on existing features

Comments

SoamA commented Jun 8, 2023

Describe the question/issue

When the S3 output plugin in Fluent Bit is configured to use the INDEX feature and it's uploading two files to S3 at the same time from the same host, this is what can happen (roughly paraphrasing the Fluent Bit log output):

Successfully uploaded object file1_1
Successfully uploaded object file1_2
Successfully uploaded object file2_3
Successfully uploaded object file1_4
Successfully uploaded object file2_5

As a result, file1 is uploaded to S3 as file1_1, file1_2, file1_4 and file2 is uploaded to S3 as file2_3, file2_5 - in other words, the index in the S3 fragment names for each file does not increase by one. We cannot even assume it starts at 1 for each file. Is there a way to modify this behavior so that the index is guaranteed to increase by one for each file being uploaded?

The reason I ask is that we have services consuming these files from S3 (such as the Spark History Server) that expect such an incremental sequence (increasing by 1) embedded in the filenames they consume. If it isn't present, they throw an error, assuming a fragment is missing.

Configuration

Using AWS For Fluent Bit 2.31.11.

  [OUTPUT]
      Name                            s3
      Match                           sel.*
      region                          us-east-1
      bucket                          mybucket
      total_file_size                 10M
      s3_key_format                   /spark-event-logs/mycluster/eventlog_v2_$TAG[1]/events_$INDEX_$TAG[1]_$UUID
      s3_key_format_tag_delimiters    ..
      store_dir                       /home/ec2-user/buffer
      upload_timeout                  7m
      log_key                         log
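
For reference (illustrative, using the tag format from the INPUT config shown later in this thread): with a tag like sel.spark-3cc4886822bd405c80b6a16718547ad4 and the delimiter above, $TAG[1] expands to spark-3cc4886822bd405c80b6a16718547ad4, so an uploaded key looks like:

    /spark-event-logs/mycluster/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/events_137_spark-3cc4886822bd405c80b6a16718547ad4_ys1Iws3p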

Steps to reproduce issue

Use S3 plugin output. Upload pattern for OUTPUT has to include INDEX. Use Fluent Bit to upload two files simultaneously that match this pattern.


PettitWesley commented Jun 8, 2023

@SoamA Sorry, I'm not completely understanding this... can you explain it with an example s3_key_format that has $INDEX, along with the full S3 key names and tag values? And can you explain the file fragment part some more? Thanks!

Is the issue that for each value of $TAG, you want sequential indexes?

I should note this bug: #653


SoamA commented Jun 8, 2023

Yes, here's the Fluent Bit config:

      s3_key_format                   /spark-event-logs/mycluster/eventlog_v2_$TAG[1]/events_$INDEX_$TAG[1]_$UUID

Note that we're relying on the INDEX feature to embed a numerically increasing sequence in the filenames. Here's what the Fluent Bit upload log looks like:

2023-06-07T18:03:44.102855211Z stderr F [2023/06/07 18:03:44] [ info] [output:s3:s3.3] Successfully uploaded object /spark-event-logs/adhoc/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/events_137_spark-3cc4886822bd405c80b6a16718547ad4_ys1Iws3p
2023-06-07T18:10:46.119749737Z stderr F [2023/06/07 18:10:46] [ info] [output:s3:s3.3] Successfully uploaded object /spark-event-logs/adhoc/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/events_138_spark-3cc4886822bd405c80b6a16718547ad4_69gehrN4
2023-06-07T18:14:04.134622779Z stderr F [2023/06/07 18:14:04] [ info] [output:s3:s3.3] Successfully uploaded object /spark-event-logs/adhoc/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/events_139_spark-3cc4886822bd405c80b6a16718547ad4_q2UV8ypa
2023-06-07T18:15:02.838638712Z stderr F [2023/06/07 18:15:02] [ info] [output:s3:s3.3] Successfully uploaded object /spark-event-logs/adhoc/eventlog_v2_spark-efd980675cd84f99814cd5ce20c9f17b/events_140_spark-efd980675cd84f99814cd5ce20c9f17b_Xdzd0OL5
2023-06-07T18:15:19.086118196Z stderr F [2023/06/07 18:15:19] [ info] [output:s3:s3.3] Successfully uploaded object /spark-event-logs/adhoc/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/events_141_spark-3cc4886822bd405c80b6a16718547ad4_f8W0jfZO
2023-06-07T18:23:02.915445149Z stderr F [2023/06/07 18:23:02] [ info] [output:s3:s3.3] Successfully uploaded object /spark-event-logs/adhoc/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/events_142_spark-3cc4886822bd405c80b6a16718547ad4_9cgENQzg
2023-06-07T18:32:02.955507888Z stderr F [2023/06/07 18:32:02] [ info] [output:s3:s3.3] Successfully uploaded object /spark-event-logs/adhoc/eventlog_v2_spark-efd980675cd84f99814cd5ce20c9f17b/events_143_spark-efd980675cd84f99814cd5ce20c9f17b_5RdoPafS

In the target S3 bucket, this produces the following:

s3://mybucket/spark-event-logs/adhoc/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/:
   events_137_spark-3cc4886822bd405c80b6a16718547ad4_ys1Iws3p
   events_138_spark-3cc4886822bd405c80b6a16718547ad4_69gehrN4
   events_139_spark-3cc4886822bd405c80b6a16718547ad4_q2UV8ypa
   events_141_spark-3cc4886822bd405c80b6a16718547ad4_f8W0jfZO
   events_142_spark-3cc4886822bd405c80b6a16718547ad4_9cgENQzg

and

s3://mybucket/spark-event-logs/adhoc/eventlog_v2_spark-efd980675cd84f99814cd5ce20c9f17b/:
   events_140_spark-efd980675cd84f99814cd5ce20c9f17b_Xdzd0OL5
   events_143_spark-efd980675cd84f99814cd5ce20c9f17b_5RdoPafS

This is not desirable: in the first case there's a jump from 139 to 141, and in the second there's a jump from 140 to 143. What we really want is:

s3://mybucket/spark-event-logs/adhoc/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/:
   events_001_spark-3cc4886822bd405c80b6a16718547ad4_ys1Iws3p
   events_002_spark-3cc4886822bd405c80b6a16718547ad4_69gehrN4
   events_003_spark-3cc4886822bd405c80b6a16718547ad4_q2UV8ypa
   events_004_spark-3cc4886822bd405c80b6a16718547ad4_f8W0jfZO
   events_005_spark-3cc4886822bd405c80b6a16718547ad4_9cgENQzg

and

s3://mybucket/spark-event-logs/adhoc/eventlog_v2_spark-efd980675cd84f99814cd5ce20c9f17b/:
   events_001_spark-efd980675cd84f99814cd5ce20c9f17b_Xdzd0OL5
   events_002_spark-efd980675cd84f99814cd5ce20c9f17b_5RdoPafS

i.e. each file upload gets its own INDEX counter, as opposed to a single counter shared amongst multiple file uploads.
It actually doesn't even have to start at 001; any starting number works as long as the sequence increases by one. So

s3://mybucket/spark-event-logs/adhoc/eventlog_v2_spark-3cc4886822bd405c80b6a16718547ad4/:
   events_137_spark-3cc4886822bd405c80b6a16718547ad4_ys1Iws3p
   events_138_spark-3cc4886822bd405c80b6a16718547ad4_69gehrN4
   events_139_spark-3cc4886822bd405c80b6a16718547ad4_q2UV8ypa
   events_140_spark-3cc4886822bd405c80b6a16718547ad4_f8W0jfZO
   events_141_spark-3cc4886822bd405c80b6a16718547ad4_9cgENQzg

and

s3://mybucket/spark-event-logs/adhoc/eventlog_v2_spark-efd980675cd84f99814cd5ce20c9f17b/:
   events_145_spark-efd980675cd84f99814cd5ce20c9f17b_Xdzd0OL5
   events_146_spark-efd980675cd84f99814cd5ce20c9f17b_5RdoPafS

would also work. Let me know if that helps in clarifying the problem.
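
To make the request concrete, here's a minimal sketch (plain Python, purely illustrative; the function names are hypothetical and this is not Fluent Bit's actual implementation) contrasting the current shared counter with the per-tag counter described above:

    from collections import defaultdict

    # Current behavior (roughly): one counter shared by the whole [OUTPUT].
    shared_index = 0

    def next_index_shared(tag):
        global shared_index
        shared_index += 1
        return shared_index

    # Requested behavior: one counter per tag/stream.
    per_tag_index = defaultdict(int)

    def next_index_per_tag(tag):
        per_tag_index[tag] += 1
        return per_tag_index[tag]

    uploads = ["file1", "file1", "file2", "file1", "file2"]
    print([f"{t}_{next_index_shared(t)}" for t in uploads])
    # -> ['file1_1', 'file1_2', 'file2_3', 'file1_4', 'file2_5']  (gaps within each file)
    print([f"{t}_{next_index_per_tag(t)}" for t in uploads])
    # -> ['file1_1', 'file1_2', 'file2_1', 'file1_3', 'file2_2']  (sequential per file)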

PettitWesley commented Jun 8, 2023

I think I get it. Are you running on k8s?

You have multiple tags processed by a single S3 output, and the $INDEX numbers should be sequential within a tag/stream of logs. Currently it just increments up in time within the S3 output.

I'll have to take this as a feature request, which I probably won't be able to prioritize soon, sorry. @SoamA you can help by submitting a feature request via AWS Support.

For a short-term workaround, I wonder if there's some way you could have multiple S3 outputs, one for each tag, so each one has its own $INDEX. Is that possible? How many tags do you have?

Could you do some sort of metadata rewrite_tag scheme to change the tags to be a small set of meaningful values? (I can help with this if you explain your architecture more).
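
To make that concrete, a sketch of the multiple-outputs idea, assuming (hypothetically) the relevant tags were known in advance; the tag values sel.app-one / sel.app-two and the split store_dir paths below are made up purely for illustration:

  # Hypothetical: if the set of tags were small and fixed, one [OUTPUT] per
  # tag would give each stream its own $INDEX counter. Separate store_dir
  # paths are used here to keep each output's buffer state apart.
  [OUTPUT]
      Name                            s3
      Match                           sel.app-one
      region                          us-east-1
      bucket                          mybucket
      total_file_size                 10M
      s3_key_format                   /spark-event-logs/mycluster/eventlog_v2_$TAG[1]/events_$INDEX_$TAG[1]_$UUID
      s3_key_format_tag_delimiters    ..
      store_dir                       /home/ec2-user/buffer/app-one
      upload_timeout                  7m
      log_key                         log

  [OUTPUT]
      Name                            s3
      Match                           sel.app-two
      region                          us-east-1
      bucket                          mybucket
      total_file_size                 10M
      s3_key_format                   /spark-event-logs/mycluster/eventlog_v2_$TAG[1]/events_$INDEX_$TAG[1]_$UUID
      s3_key_format_tag_delimiters    ..
      store_dir                       /home/ec2-user/buffer/app-two
      upload_timeout                  7m
      log_key                         log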

@PettitWesley PettitWesley changed the title INDEX behavior in S3 Output Plugin when uploading multiple files simultaneously s3_key_format $INDEX increments per output definition, not per tag/stream of logs Jun 8, 2023
@PettitWesley PettitWesley added the enhancement Feature request or enhancement on existing features label Jun 8, 2023

SoamA commented Jun 8, 2023

Hey @PettitWesley,

> I think I get it. Are you running on k8s?

Yes, we're on EKS.

> You have multiple tags processed by a single S3 output, and the $INDEX numbers should be sequential within a tag/stream of logs. Currently it just increments up in time within the S3 output.
>
> I'll have to take this as a feature request, which I probably won't be able to prioritize soon, sorry. @SoamA you can help by submitting a feature request via AWS Support.

Yes, will do. Stay tuned!

> For a short-term workaround, I wonder if there's some way you could have multiple S3 outputs, one for each tag, so each one has its own $INDEX. Is that possible? How many tags do you have?
>
> Could you do some sort of metadata rewrite_tag scheme to change the tags to be a small set of meaningful values? (I can help with this if you explain your architecture more.)

Here's the relevant INPUT part of the fluent-bit conf:

  [INPUT]
      Name                tail
      Tag                 sel.<spark_internal_app_id>
      Path                /var/log/containers/eventlogs/*\.inprogress
      DB                  /var/log/sel_spark.db
      multiline.parser    docker, cri
      Mem_Buf_Limit       10MB
      Skip_Long_Lines     On
      Refresh_Interval    10
      Tag_Regex           (?<spark_internal_app_id>spark-[a-z0-9]+)
      Buffer_Chunk_Size   1MB
      Buffer_Max_Size     5MB
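
For context (illustrative, using one of the file names listed below): with this Tag and Tag_Regex, a file named /var/log/containers/eventlogs/spark-3cc4886822bd405c80b6a16718547ad4.inprogress gets the tag sel.spark-3cc4886822bd405c80b6a16718547ad4, which is what $TAG[1] in the S3 output's key format later extracts.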

Spark driver processes on an EKS host are configured to output their event logs to /var/log/containers/eventlogs/. Each log is a single file. They look like:

$ ls -al /var/log/containers/eventlogs/
total 106464
drwxrwxrwx 2 root  root     4096 Jun  8 21:18 .
drwxr-xr-x 3 root  root     8192 Jun  8 21:14 ..
-rw-rw---- 1 spark root 41258305 Jun  7 18:30 spark-3cc4886822bd405c80b6a16718547ad4
-rw-rw---- 1 spark root 26285083 Jun  8 21:18 spark-5d6e74c026e44ae594311dd03d2da5bc
-rw-rw---- 1 spark root 41316254 Jun  7 23:16 spark-c1b075c4bf3b491d85e8d2159b141731
-rw-rw---- 1 spark root   135360 Jun  7 18:08 spark-efd980675cd84f99814cd5ce20c9f17b

When the Spark driver process is actively generating a log, the file has an .inprogress suffix. Once the job has completed running, the .inprogress suffix is removed. So in Fluent Bit, TAG[1] matches the Spark application ID in the event log file name (e.g. spark-c1b075c4bf3b491d85e8d2159b141731 and spark-efd980675cd84f99814cd5ce20c9f17b from the directory listed above). Because the alphanumeric portion of the ID is randomly generated by Spark, I don't think we could meaningfully map the tags to anything smaller, sadly, but I'm open to suggestions!


SoamA commented Jun 9, 2023

Submitted a feature request via AWS support ticket https://support.console.aws.amazon.com/support/home?region=us-east-1#/case/?displayId=12990900451.
