Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

output: Prevent flushing threads from consuming too much CPU. #1901

Merged
merged 1 commit into from
Mar 19, 2018

Conversation

fujimotos
Copy link
Member

@fujimotos fujimotos commented Mar 16, 2018

I believe this patch fixes the CPU spikes issue reported by #1636 and #1889
(possibly also #1665).

Problem

If one flush thread sets the retry state, this makes other threads
wait until the next retry time. This behaviour itself is fine and ok.

However, in this case the interval variable has a meaningless value
because state.next_clock is behind the system time. And since the
sleep call gets skipped in such a case:

while @output_flush_threads_running
  current_clock = Fluent::Clock.now
  interval = state.next_clock - current_clock
  ...
  sleep interval if interval > 0
end

it effectively make the waiting threads to consume the resource
aggressively with busy loops.

Solution

This patch makes a flush thread to wait in a sleep state until the
next retry time, insteads of busy looping

Note

This bug seems to exist from v0.14 since this commit 6c5ba1a.

This fixes the CPU spike issue reported by several users.

If one flush thread sets the retry state, this makes other threads
wait until the next retry time. This behaviour itself is fine and ok.

However, in this case the `interval` variable has a meaningless value
because `state.next_time` is behind the system time. And since the
sleep call gets skipped in such a case (output.rb#L1343):

    sleep interval if interval > 0

it effectively makes the waiting threads to consume the resource
aggressively with busy loops.
@fujimotos
Copy link
Member Author

Note: How to reproduce this issue

You can reproduce this issue relatively trivially. Try the following settings
(optionally with a debug print statement in plugin/output.rb#L1334).

<source>
  @type dummy
  tag debug.log
  dummy {"a":"b"}
</source>

# Forward to a non-existent server.
<match debug.**>
  @type forward
  <server>
    host localhost
    port 9999
  </server>
  <buffer>
    flush_interval 3
    flush_thread_count 3
  </buffer>
</match>

Also, you can use this config to check this patch actually fixes the problem

@repeatedly
Copy link
Member

Thanks! I will check it.

@repeatedly repeatedly self-assigned this Mar 19, 2018
@repeatedly repeatedly added bug Something isn't working v1 labels Mar 19, 2018
@repeatedly repeatedly merged commit 9c58346 into fluent:master Mar 19, 2018
@repeatedly
Copy link
Member

Confirmed. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working v1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants