rate-limiting plugin works randomly (in redis/cluster mode) #4379

Closed
WojciechBaran-TomTom opened this issue Mar 11, 2019 · 13 comments · Fixed by #5873
Labels
task/needs-investigation Requires investigation and reproduction before classifying it as a bug or not.

Comments

@WojciechBaran-TomTom

Summary

The rate-limiting plugin fails to throttle when configured in redis/cluster mode.
It seems that under load the Kong node does not decrement the remaining limit and/or does not respond with the X-RateLimit-* headers.

Steps To Reproduce

  1. Create a test service/route pair with a rate-limiting plugin running in redis or cluster mode
  2. Perform N synchronous GET requests against that endpoint
  3. Check if all the responses contain X-RateLimit-* headers
  4. Check if the remaining limit has been decreased by N

Additional Details & Logs

  • Kong version 1.0.2

It seems that the code defined outside of RateLimitingHandler:access is sometimes just not run.
This affects both RateLimitingHandler:header_filter (which results in no rate-limiting headers being sent in the response) and RateLimitingHandler:log (which results in no limit being consumed by the request).
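
For reference, a simplified sketch of the phase layout involved (pieced together from the handler snippets quoted later in this thread; not the exact plugin source):

local RateLimitingHandler = {}   -- stand-in for the plugin handler table

function RateLimitingHandler:access(conf)
  -- checks current usage against the configured limits (may return 429 here),
  -- then defers the Redis/cluster counter increment to the log phase:
  kong.ctx.plugin.timer = function()
    -- schedules the increment in a background timer
  end
end

function RateLimitingHandler:header_filter(conf)
  -- adds the X-RateLimit-* headers to the response
end

function RateLimitingHandler:log(_)
  if kong.ctx.plugin.timer then
    kong.ctx.plugin.timer()
  end
end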

@h3xar0n
Contributor

h3xar0n commented Mar 11, 2019

Hi @wbarantt, could you please share the following:

  • the environment in which you're running Kong
  • kong.conf values
  • Kong error logs

@WojciechBaran-TomTom
Author

It's just a stock Kong with no custom configuration.
I have configured per-consumer rate-limiting on my test route and started the GET traffic.

No errors in the logs.
If I break the Redis host/port configuration, it fails with 500 or returns 200 (depending on the fault-tolerant setting), so it does not seem to be a problem with Kong<>Redis communication.
It works fine if there's no other traffic (going through another route), but once the number of requests increases, the plugin execution seems to become disordered.

@p0pr0ck5
Contributor

@wbarantt we'd still like to know a bit more about your Kong configuration :) Are you deploying a cluster of Kong nodes behind an LB? Can you please share the exact plugin config you've enabled? And how many distinct consumers are sending requests to Kong when you observe this behavior?

TBH this sounds a bit like #3329, but we'd love to learn more about your use case to get a better understanding of what's going on. Thanks!

@p0pr0ck5 p0pr0ck5 added the pending author feedback Waiting for the issue author to get back to a maintainer with findings, more details, etc... label Mar 12, 2019
@WojciechBaran-TomTom
Author

This is my whole test config:
kong_config.txt

This is how I test it:
for i in $(seq -w 1 20); do echo -e "RUN:\t=====\t#${i}\t====="; curl -sD - -o /dev/null -H "Host: httpbin.org" http://localhost:8000/httpbin_01/headers; done

This is what I get with no other traffic coming to this kong node:
RUN: ===== #11 =====
HTTP/1.1 429 Too Many Requests
Date: Wed, 13 Mar 2019 11:26:47 GMT
Content-Type: application/json; charset=utf-8
Connection: keep-alive
WWW-Authenticate: Key realm="kong"
Content-Length: 37
X-RateLimit-Remaining-second: 8
X-RateLimit-Limit-second: 10
X-RateLimit-Remaining-minute: 0
X-RateLimit-Limit-minute: 10
Server: kong/1.0.2

When I turn on the regular traffic, Kong starts to skip requests and I get something like this:
RUN: ===== #17 =====
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 281
Connection: keep-alive
WWW-Authenticate: Key realm="kong"
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Date: Wed, 13 Mar 2019 10:44:55 GMT
Server: nginx
X-RateLimit-Remaining-second: 9
X-RateLimit-Limit-second: 10
X-RateLimit-Remaining-minute: 0
X-RateLimit-Limit-minute: 10
X-Kong-Upstream-Latency: 72
X-Kong-Proxy-Latency: 1
Via: kong/1.0.2

RUN: ===== #18 =====
HTTP/1.1 429 Too Many Requests
Date: Wed, 13 Mar 2019 10:44:57 GMT
Content-Type: application/json; charset=utf-8
Connection: keep-alive
WWW-Authenticate: Key realm="kong"
Content-Length: 37
X-RateLimit-Remaining-second: 10
X-RateLimit-Limit-second: 10
X-RateLimit-Remaining-minute: 0
X-RateLimit-Limit-minute: 10
Server: kong/1.0.2

So 7 additional requests have passed through Kong...

Here's the full log:
kong_log.txt
Some responses are missing the X-RateLimit-* headers, and some of them were not counted.

@p0pr0ck5 p0pr0ck5 added task/needs-investigation Requires investigation and reproduction before classifying it as a bug or not. and removed pending author feedback Waiting for the issue author to get back to a maintainer with findings, more details, etc... labels Mar 20, 2019
@anujjalan

We are facing a similar issue after an upgrade.

We were earlier on version 0.14.0, where this used to work fine, but after our upgrade to 1.0.2 the rate-limiting numbers are not right and we suspect the plugin is not even getting executed for many concurrent calls.

We did not find any errors related to Redis or the plugin, but still only 60% of requests were counted as part of rate limiting.

@qiao-meng-zefr

I am getting the same issue on Kong 1.0.3.

In one of my tests I made a total of 40 calls (with 20 workers) to Kong; the rate limit was set to 10 per hour at the consumer level. So the expected result is that only 10 calls go through and the remaining 30 get a 429 response.

However, all calls got a 200 response. 10 responses have the X-RateLimit-* headers; it looks like only the X-RateLimit-* headers are missing from the other 30 calls. All 40 calls have the other Kong headers such as X-Kong-Upstream-Latency, etc.

@ganeshs

ganeshs commented Apr 22, 2019

I'm facing a similar issue after upgrading from 0.14.0 to 1.0.2. The plugin seems to be skipped for a certain percentage of requests.

@ganeshs

ganeshs commented May 4, 2019

I did some debugging and narrowed the issue down to the piece of code below:

function RateLimitingHandler:log(_)
  if kong.ctx.plugin.timer then
    kong.ctx.plugin.timer()
  end
end

I configured the rate-limiting plugin to use Redis with a threshold of 1000 requests per minute. In the test run, I fired 2000 requests at a concurrency of 100; the Redis counter was incremented only 367 times instead of 1000.

kong.ctx.plugin.timer is nil for the rest of them. I suspect it's getting cleared somewhere else in the flow. As per the documentation, kong.ctx.plugin is accessible within the plugin instance at the request level. The same test with 0.14.1 works fine.
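
One rough way to confirm this at runtime (a hypothetical debug snippet, not part of the plugin) is to log from the log phase whenever the closure set in access is missing:

function RateLimitingHandler:log(_)
  local timer = kong.ctx.plugin.timer
  if not timer then
    -- the closure set in :access is gone; this is the symptom described above
    kong.log.warn("rate-limiting: kong.ctx.plugin.timer missing in log phase")
    return
  end
  timer()
end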

I had the key-auth, rate-limiting, prometheus and syslog plugins configured for my route. When I disabled syslog and prometheus, rate-limiting worked fine. When I enable either of them, it doesn't.

@hbagdi @thibaultcha I would really appreciate it if you could give some pointers on the changes related to kong.ctx.plugin in the 1.x release. I'd be more than happy to fix this and submit a PR. Thanks

@ganeshs

ganeshs commented May 4, 2019

Btw, this change was introduced in this commit:
0ad4aa3

The 0.14.1 version was incrementing the Redis counter in the access phase. In 1.x, it's done in the log phase.

After making the fix below, I can confirm that rate-limiting works as before:

--kong.ctx.plugin.timer = function()
    local ok, err = timer_at(0, increment, conf, limits, identifier, current_timestamp, 1)
    if not ok then
      kong.log.err("failed to create timer: ", err)
    end
--end
--function RateLimitingHandler:log(_)
--  if kong.ctx.plugin.timer then
--    kong.ctx.plugin.timer()
--  end
--end

@bungle
Member

bungle commented May 5, 2019

@ganeshs Yes, that is a good workaround. We need to check what could be overwriting the context value (or why it is being cleared, or perhaps never set in the first place).

@xuanit

xuanit commented Jul 22, 2019

I'm facing the same issue. Is there a plan to fix this in Kong?

@ordith

ordith commented Sep 26, 2019

We are also facing this issue. Is there any update on when this might be addressed?

@rainest
Contributor

rainest commented Jan 21, 2020

#5460 should likely fix this. An unknown issue clears the plugin ctx contents, and rate-limiting increments counters based on data stored in it.

#5459 may address the underlying issue with ctx, but we do not have definitive proof of that issue's exact cause yet. As such, #5460 sidesteps any problem with ctx entirely by moving all rate-limiting functionality into the access phase.
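
Roughly, that means doing the increment from the access phase instead of stashing a closure in kong.ctx.plugin for the log phase. A minimal sketch of that idea (not the actual patch), reusing the names from the snippets above, where timer_at is a local alias for ngx.timer.at:

function RateLimitingHandler:access(conf)
  -- ... usage lookup and 429 handling as before ...

  -- increment the counters right away instead of deferring to :log
  local ok, err = timer_at(0, increment, conf, limits, identifier, current_timestamp, 1)
  if not ok then
    kong.log.err("failed to create timer: ", err)
  end
end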

thibaultcha added a commit that referenced this issue May 8, 2020
* Store namespace keys in `ngx.ctx` to ensure all dynamically generated
  namespaces are isolated on a per-request basis.
* Introduce a new global API (which is a private API for use by core
  only) to explicitly delete a namespace.

A note on the benefits of this new implementation:

* By using a table to keep track of namespaces in `ngx.ctx`, we ensure
  that users of `ngx.ctx` cannot access the table namespaces, thus
  properly isolating it and avoiding polluting `ngx.ctx` for core and/or
  plugins. We can think of it as a double-pointer reference to the
  namespace. Each request gets a pre-allocated table for 4 namespaces
  (of course not limited to 4), with the assumption that most instances
  do not execute more than 4 plugins using `kong.ctx.plugin` at a time.
* We ensure that namespace key references cannot be `nil` to ensure a
  safe usage from the core.
* All namespace key references are still weak when assigned to a
  namespace, thus tying the lifetime of the namespace to that of its key
  reference. Similarly to the previous implementation, this is done to
  ensure we avoid potential memory leaks. That said, the
  `kong.ctx.plugin` namespace does not use the plugin's conf for that
  purpose anymore (See 40dc146), which
  alleviates such concerns in the current usage of this API.
* All tables allocated for namespace key references and namespaces keys
  themselves will be released when `ngx.ctx` will be GC'ed.
* We also ensure that `kong.ctx.*` returns `nil` when `ngx.ctx` is not
  supported ("init" phase).

Co-Authored-By: tdelaune <[email protected]>

Fix #4379
Fix #5853
See #5851
See #5459
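
An illustrative sketch of the bookkeeping described above (hypothetical field and function names; not the actual Kong implementation):

-- A private table inside ngx.ctx maps each namespace key to its namespace
-- table, so code that reads ngx.ctx directly cannot reach other namespaces.
local function get_namespaces(nctx)
  local namespaces = nctx.KONG_NAMESPACES   -- hypothetical field name
  if not namespaces then
    namespaces = {}                          -- pre-sized for ~4 namespaces in the description above
    nctx.KONG_NAMESPACES = namespaces
  end
  return namespaces
end

local function set_namespace(nctx, key, namespace)
  assert(key ~= nil, "namespace key cannot be nil")
  get_namespaces(nctx)[key] = namespace
end

local function del_namespace(nctx, key)
  get_namespaces(nctx)[key] = nil
end
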
thibaultcha added a commit that referenced this issue May 14, 2020
thibaultcha added a commit that referenced this issue May 14, 2020