out_s3: use retry_limit in fluent-bit to replace MAX_UPLOAD_ERROR … #6475
Conversation
plugins/out_s3/s3.c
Outdated
"failed to flush chunk tag=%s, create_time=%s"
"(out_id=%d)",
tag, create_time_str, ctx->ins->id);
So I know that in the original design, I said we want to match the format of the normal retry messages as much as possible... and we didn't want to add the "retry in X seconds" since we don't have a way of calculating the retry time... but note that I still had:
[ warn] [engine] failed to flush chunk tag=xxxxx, create_time=2022-08-18T21:34:42+0000, retry issued: input=forward.1 > output=s3.0 (out_id=0)
I think the "retry issued" part is important to let the user know clearly that we will retry.
plugins/out_s3/s3.c
Outdated
return -1;
}

/* data was sent successfully- delete the local buffer */
	s3_store_file_delete(ctx, chunk);
remove stray tab
plugins/out_s3/s3.c
Outdated
if (chunk->failures > ctx->ins->retry_limit) {
    less_than_limit = FLB_FALSE;
}
s3_retry_warn(ctx, tag, chunk->input_name, create_time, less_than_limit);
if (less_than_limit == FLB_FALSE) {
    s3_store_file_unlock(chunk);
    return FLB_RETRY;
}
else {
    s3_store_file_delete(ctx, chunk);
    return FLB_ERROR;
}
Are you sure this logic is correct?
So if chunk failures is greater than the retry limit:
- You retry.
If failures is less than the retry limit:
- You delete the file.
Oh good catch... I didn't realize this logic is inverted... @Claych please fix
plugins/out_s3/s3.c
Outdated
if (less_than_limit == FLB_TRUE) {
    flb_plg_warn(ctx->ins,
                 "failed to flush chunk tag=%s, create_time=%s, "
                 "retry issues: (out_id=%d)",
retry issued
plugins/out_s3/s3.c
Outdated
if (tmp_upload->upload_errors > ctx->ins->retry_limit) {
    tmp_upload->upload_state = MULTIPART_UPLOAD_STATE_COMPLETE_IN_PROGRESS;
    flb_plg_error(ctx->ins, "Upload for %s has reached max upload errors",
                  tmp_upload->s3_key);
    s3_retry_warn(ctx, tmp_upload->tag, tmp_upload->input_name,
                  tmp_upload->init_time, FLB_FALSE);
I am wondering if we actually need a message here anymore... and if we should actually be deleting the upload here...
@Claych Here are my thoughts on each type of failure and how we should handle them.
… update s3 warn output messages with function s3_retry_warn() Signed-off-by: Clay Cheng <[email protected]>
Signed-off-by: Clay Cheng [email protected]
Enter [N/A] in the box if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
[OUTPUT]
Name s3
Match *
bucket clay-bucket-5-s3-test
region us-east-1
total_file_size 60M
auto_retry_requests true
use_put_object off
upload_chunk_size 5M
Test result when connected with s3:
Test results when disconnected from s3:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.