
wip - filter_ratelimit: Add record rate limiter #433

Closed
wants to merge 2 commits

Conversation

@jsravn commented Nov 15, 2017

Add a filter to rate limit records per configurable bucket field. The
benefit is that the overall logging infrastructure can be protected
from a rogue logging source (e.g. an application that spams a high
volume of messages in a short period of time).

Rate limiting is implemented using a token bucket algorithm: n tokens
are added per second, with a total burst limit of q. This limits records
to n per second on average, with a maximum of q in any single second.

Configuration

Bucket_Key [filename-field]

Field to use for grouping messages into a rate limited bucket. Rate limits
apply to the bucket. Buckets are independent from each other.

Records_Per_Second 10

Average number of records per second allowed.

Records_Burst 20

Maximum number of records allowed in a single second.

Initial_Records_Burst 100

Maximum number of records to allow at startup. Useful when a large volume
of log messages is expected at startup, e.g. from new log files.

Max_Buckets 256

Expected number of active buckets. If set too low, rate limiting will not
function well.
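
Putting these options together, a minimal example configuration could look like the following (a sketch: the filter name "ratelimit" matches the -F ratelimit used in the benchmarks below, and the values are illustrative):

[FILTER]
    Name                  ratelimit
    Match                 *
    Bucket_Key            filename
    Records_Per_Second    10
    Records_Burst         20
    Initial_Records_Burst 100
    Max_Buckets           256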

@jsravn commented Nov 15, 2017

I didn't notice #410, but I don't think it's complete yet.

Anyways, we have a requirement in our production systems to limit logs on a per pod basis (in Kubernetes), so we can protect our ES cluster and other pods from a rogue application that starts spamming logs heavily.

As Bucket_Field is configurable as shown above, this filter can be used with the tail plugin as well by setting Path_Key and setting Bucket_Field to match. (Maybe it should be called Bucket_Key instead 🤔.) So this can be used generically in many use cases.

It uses the token bucket algorithm which is simple to code and reason about.
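
For reference, a minimal standalone sketch of the per-bucket token bucket check (illustrative C only, not the filter's actual code; struct and function names are hypothetical):

#include <time.h>

/* One rate-limit bucket; refilled at 'rate' tokens per second, capped at 'burst'. */
struct bucket {
    double tokens;        /* tokens currently available                  */
    double rate;          /* Records_Per_Second: tokens added per second */
    double burst;         /* Records_Burst: maximum token capacity       */
    struct timespec last; /* time of the previous check                  */
};

/* Returns 1 if the record may pass, 0 if it should be dropped. */
static int bucket_allow(struct bucket *b)
{
    struct timespec now;
    double elapsed;

    clock_gettime(CLOCK_MONOTONIC, &now);

    /* Refill tokens in proportion to the elapsed time, capped at burst */
    elapsed = (now.tv_sec - b->last.tv_sec) +
              (now.tv_nsec - b->last.tv_nsec) / 1e9;
    b->tokens += elapsed * b->rate;
    if (b->tokens > b->burst) {
        b->tokens = b->burst;
    }
    b->last = now;

    /* Spend one token per record; drop when the bucket is empty */
    if (b->tokens >= 1.0) {
        b->tokens -= 1.0;
        return 1;
    }
    return 0;
}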

I had to modify the hashtable implementation to store any arbitrary object, so I could store the bucket data for quick retrieval. My work on this also motivated #431 to limit total hash table size.

Let me know what you think, I will try this out in our own systems as well to get some feedback.

@jsravn commented Nov 15, 2017

Changed it to Bucket_Key to fit the other plugins better.

@jsravn commented Nov 15, 2017

I'll rename "Events" to "Records" as well to match the other plugins.

@jsravn changed the title from "filter_ratelimit: Add message rate limiter" to "filter_ratelimit: Add record rate limiter" on Nov 15, 2017
@jsravn commented Nov 15, 2017

Quick benchmarks (updated using null output):

single bucket

baseline:

$ time timeout 60 bin/fluent-bit -f 1 -i dummy -p "rate=100000" -o null 2>/dev/null
Fluent-Bit v0.12.8
Copyright (C) Treasure Data

timeout 60 bin/fluent-bit -f 1 -i dummy -p "rate=100000" -o null 2> /dev/null  11.26s user 8.49s system 30% cpu 1:04.17 total

with filter:

$ time timeout 60 bin/fluent-bit -f 1 -i dummy -p "rate=100000" -F ratelimit -p "match=*" -p "bucket_key=message" -p "records_per_second=100000" -p "records_burst=100000" -o null 2>/dev/null
Fluent-Bit v0.12.8
Copyright (C) Treasure Data

timeout 60 bin/fluent-bit -f 1 -i dummy -p "rate=100000" -F ratelimit -p  -p   16.48s user 5.03s system 33% cpu 1:04.16 total

Around a 3% increase.

@jsravn force-pushed the rate-limit branch 2 times, most recently from c3cac52 to e13d732 on November 15, 2017 at 16:44
@jsravn commented Nov 15, 2017

I'm thinking Initial_Records_Burst is the wrong approach here. What I want to do is allow all initial log messages on startup when using the tail plugin. I don't want to throttle in these cases, otherwise the filter will ratelimit those messages incorrectly.

I think a better approach may be some time delay before rate limiting kicks in. That should let the tail plugin finish gathering all prior logs and send them through (I suppose even that isn't perfect due to backpressure, but I think it will be a safer approach). So I could introduce Initial_Delay_Seconds, and rate limiting will remain disabled until that time has passed.
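
Roughly along these lines (a sketch only; the variable and function names are hypothetical):

#include <time.h>

/* Rate limiting stays disabled until Initial_Delay_Seconds have elapsed
 * since the filter was initialized. */
static time_t start_time;            /* set once in the filter's init callback */
static int    initial_delay_seconds; /* value of Initial_Delay_Seconds         */

static int ratelimit_enabled(void)
{
    return (time(NULL) - start_time) >= initial_delay_seconds;
}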

@jsravn commented Nov 15, 2017

Also, I'm thinking the throttling of the dropped-messages log line should be time based - in my testing it's still really easy to cause log spam with it.

@jsravn commented Nov 15, 2017

Ok made some improvements in the last commit:

ratelimit: various improvements

Replace Initial_Record_Burst with Initial_Delay that delays ratelimiting
for the specified number of seconds. This provides a better mechanism to
avoid throttling logs at startup.

Throttle the dropped record log message by time, so log volume of the
filter is independent of incoming log volume. Add an option
Log_Period_Seconds so the logging period is configurable.

Ensure records_burst is >= records_per_second.

Improve test coverage.

@onorua commented Nov 16, 2017

Regarding Initial_Records_Burst - on the contrary, I was thinking of doing some sort of "slow start" that would prevent killing the ES cluster in case of a fluent-bit mass upgrade/restart. We have around 250k msg/sec, and if we don't throttle for at least the first several seconds, it may lead to unexpected results.

I'm not sure what will happen if you have a limit of 5 msg per second but get 7 messages every 3rd second. Will you drop the overflow if there is no burst, or keep it?

@onorua commented Nov 16, 2017

I've experimented with this plugin; here is the problem I've found - or maybe I did not understand how to configure it, please advise:

  1. When the flow is constantly larger than the threshold, it behaves like a fuse - it starts dropping all of the messages instead of acting like a leaky bucket.

config

[FILTER]
    Match *
    Bucket_Key tag
    Records_Per_Second 65
    Records_Burst 70
    Initial_Delay_Seconds 1
    Log_Period_Seconds 0

@jsravn commented Nov 17, 2017

Thanks for testing it @onorua.

This plugin is token based (not leaky bucket), with a drop policy when it runs out of tokens. My idea is that we rely on normal backpressure if there is a network limitation.

When the flow is constantly larger than the threshold, it behaves like a fuse - it starts dropping all of the messages instead of acting like a leaky bucket.

Yeah that's as expected, if you exceed "burst" it will drop records for that bucket.

Can you let me know what you had in mind instead? My goal here is to prevent single log sources from causing disruption to other sources. Maybe it doesn't match what you want. Do you want a single rate limit and to rely on fluent-bit's normal backpressure mechanism?

@jsravn commented Nov 17, 2017

Regarding Initial_Records_Burst - on the contrary, I was thinking of doing some sort of "slow start" that would prevent killing the ES cluster in case of a fluent-bit mass upgrade/restart. We have around 250k msg/sec, and if we don't throttle for at least the first several seconds, it may lead to unexpected results.

Can you rely on normal network backpressure here? ES should slow down, and fluent-bit will naturally slow down the number of messages sent.

I'm not sure what will happen if you have a limit of 5 msg per second but get 7 messages every 3rd second. Will you drop the overflow if there is no burst, or keep it?

If there is no burst allowance, it will drop the excess messages once the bucket's tokens are used up.

@jsravn commented Nov 17, 2017

Maybe we need a better policy than dropping records, but I need to think on it. One problem at least is that fluent-bit can't distinguish different log flows (afaict), so it can't slow down messages for just a single log file. Maybe some sort of fair queuing would work with a leaky bucket rate limit, but it's quite a different approach. And we'd still need a way of pausing log ingestion per log source.

@jsravn commented Nov 17, 2017

I'm thinking we could modify the tail input plugin to have a buffer per file rather than a global buffer. Then we could have a leaky bucket which is keyed on the Path_Key for instance, and rely on the backpressure to stop file ingestion per file.

@onorua commented Nov 17, 2017

In our case we use fluent-bit as a forwarder, which means it has no relation to tail whatsoever. What we really need is a leaky bucket, which would keep leaking even if it is overflowing 100% of the time.
There are basically 3 algorithms:

  1. GCRA
  2. Token bucket, which is GCRA with a queue added
  3. Sliding window, which is used mostly for batch processing

What I have discovered is that for forwarding purposes neither GCRA nor token bucket works well, because the client fluent-bit/fluentd sends data in batches every several seconds (the default is 5, I believe).

Because you consider fluent-bit only for ingesting, while in our case it does ingesting + forwarding, we have slightly different expectations as well as a different approach.

@jsravn commented Nov 19, 2017

Hmm I forgot about the flushing behaviour. I wonder what the purpose of that is, over sending records as soon as possible. I think I see what you want now though - some sort of traffic shaping on the output. For me it's more about policy enforcement. So probably different things.

I think if this filter could "pause" ingestion/forwarding instead of dropping, then you could sort of get that traffic shaping, although you'd still have bursts from the flushes.

@jsravn commented Nov 24, 2017

@onorua I'm thinking now that I'll be happy with Mem_Buf_Limit - which will throttle logs anyway. E.g. if your flush period is 5s and Mem_Buf_Limit is 5Mi, you'll throttle logs to 1Mi/s.
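
For example, something along these lines on the input side (a sketch; the path and sizes are illustrative, not a recommendation):

[SERVICE]
    Flush          5

[INPUT]
    Name           tail
    Path           /var/log/containers/*.log
    Mem_Buf_Limit  5MB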

@jsravn changed the title from "filter_ratelimit: Add record rate limiter" to "wip - filter_ratelimit: Add record rate limiter" on Nov 27, 2017
@edsiper commented May 2, 2018

Mem_Buf_Limit and the new filter_throttle address the need.

Closing the issue for now.

@edsiper closed this on May 2, 2018