[pkg/stanza] For recombine operator, in some cases, the waiting time is much longer than forceFlushTimeout #20451
Comments
@yutingcaicyt, are you willing to look into this? Let me know and I will assign it.
Possibly related to #18089
Yeah, I'm willing to resolve it. Please assign it to me, thanks.
It's not related to #18089. The description of force_flush_period does not match its actual behavior: in the 'flushLoop', forceFlushTimeout is compared with timeSinceLastEntry rather than timeSinceFirstEntry. I think comparing with timeSinceFirstEntry is more reasonable, because the forceFlushTimeout period is normally more than enough time to merge the logs that belong together.
When using the recombine operator, the actual behavior differs from the description of force_flush_period: the actual timeout can be much longer than force_flush_period. This change fixes that. To bring the actual timeout closer to "force_flush_period", the ticker period is set to 1/5 of "force_flush_period", so that entries are force-flushed after waiting at most 6/5 of "force_flush_period".
Component(s)
pkg/stanza
What happened?
Description
When I collect logs from the stdout of a pod, the stream may contain several kinds of logs: some start with a fixed format and some do not. In this case, the aggregation time in the recombine operator is sometimes much longer than forceFlushTimeout.
Steps to Reproduce
config:
combine_field: body
force_flush_period: 5s
is_first_entry: body matches "^[0-9]+-[0-9]+-[0-9]+"
source_identifier: attributes["log.file.path"]
Send the first log. It starts with the fixed format and matches "is_first_entry":
2023-03-29 xxxxxxxxxxfirst-logxxxxxxxxxx
Wait 4s, then send the second log. It does not start with the fixed format (its source may differ from that of the first log, but both are printed to stdout):
xxxxxxxxxxsecond-logxxxxxxxxxx
Wait 4s, then send the third log:
xxxxxxxxxxthird-logxxxxxxxxxx
Wait 4s, then send the fourth log:
xxxxxxxxxx4th-logxxxxxxxxxx
Keep sending a log like this every 4 seconds.
Expected Result
The logs are flushed within 5s (the value of forceFlushTimeout).
Actual Result
These logs stay in the recombine operator forever!
Collector version
v0.73.0
Environment information
any environment
OpenTelemetry Collector configuration
Log output
No response
Additional context
No response