-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simpler and more powerful @ratelimit decorator #48
Comments
Since you are going ahead with some major changes, I'd like to request a feature that'll ease it's integration with captcha libraries i.e. the ability to reset all counters for a given set of keys. If a request has Oh and a big 👍 for the ideas above. I've found the library to be quite simple & elegant. Would love to see it evolve but retain the clean structure. |
@getub I saw your other issue, don't worry! I believe something like this makes that easier because we are reducing the number of counters that need to be cleared. |
+1 to all suggestions. The IP address is the most widely used key for ratelimiting. Should finding the IP address be more elaborate? This isn't my expertise, but I'm sharing a few links if it helps. There are a few good answers on SO:
And there's django-ipware: |
IP address is way, way too environment-specific when dealing with load balancers or reverse proxies. And getting it wrong is dangerous, especially here, where it could allow attackers to completely circumvent rate limiting. This is the same reason django core dropped The correct solution is really writing middleware that sets REMOTE_ADDR based on however your servers are set up. If people want to deal with it on a use-by-use basis that's fine, this still supports that (and using e.g. This comes up periodically in issues or pull reqs and the answer isn't going to change, I should move it to the README.Sent from my phone, please excuse the occasional typo. On Tue, Sep 2, 2014 at 12:46 AM, addremove [email protected]
|
Fair point & good to know. On a side note, what's a good library one could use to get IP address? |
I don't have an answer to that, because it's so specific. It's a very trivial piece of middleware, but the interesting bit is completely dependent on how your servers are wired up. For example, I use class ProxyMiddleware(object):
def process_request(self, request):
# This line is the important part but is environment-specific:
request.META['REMOTE_ADDR'] = request.META['HTTP_X_CLIENT_IP'] But that's because I know that I can trust If you use something generic, that's going to handle multiple things, like how ipware above looks very flexible—read: complex—and easy to misconfigured, compared to a 3-line middleware class. |
I went and looked at how I'm using django-ratelimit in fjord/Input and the above reconceptualization looks good to me. I know a bunch of people have hit problems with ip addresses and proxies, so I'm +1 on making that clearer in the docs and I think having some guidance on writing a middleware would be pretty helpful, too. That's definitely a better idea than the "key" example I added (pretty sure it was me) in the docs. I definitely like how this makes group-based throttle rates much easier. That'll help SUMO a lot. |
I added something to the docs, though I intentionally skip giving any details of calculating the real IP address. I don't want to put anything copy/pastable there because I'm afraid people will copy and paste it. |
You could purposefully make it "invalid python" so if they copy/pasted it, it fails. Something like:
I don't know if that gets around your concerns. |
It already is invalid, no RHS to the assignment, so that should work. I don't even want to particularly imply that the IP is ready to use as-is in a header. If it's |
@willkg Here do you mean Django/permission groups, or groups of endpoints (i.e. the |
Aw dang, I broke travis. |
I assumed that the two "group" things were just different ways of grouping users into rate buckets. Now I'm not sure if I'm misreading or not. |
The @ratelimit(group='my-lists', ...)
def my_items(...):
@ratelimit(group='my-lists', ...)
def my_other_items(...): still share a counter, because without the |
Ahh... ok. I was talking about the example in the "callable rates" section. Given that's an example and not a "feature", I think there isn't an inherent ambiguity here. Though if you think there is, maybe rename the ratelimit argument to "viewgroup"? |
Eh, I think your point is well-made, given that nothing in the docs really talks about Django groups or permission at all, it should be fine. And it's shorter. |
Fixed in #49 |
I have a new, hopefully better, idea of how ratelimit can work, internally, so I want to write it down here, and get @willkg's opinion and sanity check (also also if anyone else has comments) before I dive into it.
Problems
Multiple cache keys per decorator
Right now, each
@ratelimit
decorator can create several keys, all of which are then treated with the same expiration (and see below about that). Except, if you stack decorators, they may generate the same key (e.g. for the IP address) and then things just stop making any sense at all.So what we're doing is updating counters but in a really non-intuitive way, and we break stacking, which seems to be a natural way people try to use the library. It also costs us atomicity because we can't use
cache.incr
.Good actors can get stuck in sliding windows
Because we push back the TTL on every increment, once a client gets ratelimited, they are stuck until they wait the full period to reset, so if the limit is
1/h
and they wait 59 minutes, they then have to wait another hour, not just one minute, because they jumped the gun a little.Per-method (or group) is a pain
Implementing per-method or method-group rate limits would require something
keys=lambda r: 'group'+r.META['REMOTE_ADDR']
everywhere.Solution
This is a big, backwards-incompatible change. Fortunately, it's pre-1.0, so whatever. This would, hopefully, be a step toward something we'd call 1.0.
One counter per decorator
The biggest change: each decorator should result in a single counter (cache key, whatever). So
Would result in exactly one counter, that uses a combination of the method name, the key value, and the rate (and probably the current time, but hang on) probably MD5ed together to get a name.
Then, if you wanted to do, say, a burst limit, you could do:
Since the rate is part of the key, these two get incremented independently. And if the next method had:
the default behavior would be to ratelimit these views independently.
Fixed windows
I've come around to the view that each attempt shouldn't completely reset the clock on the timer. We should be creating windows. The window needs to be staggered somehow by key, so that we don't, for example, open all the flood gates every hour on the hour. (We can definitely skip staggering that for seconds and possibly for minutes.)
So, for example, if the rate is
100/h
then we'd do something like:Then we append the counter name/cache key with this value. The value should be different for every key and should change every
period
, but its staggered within theperiod
according to the wall clock.We can even somehow return this to the view to allow it to send
Retry-After
if we want.The new signature
It should be simpler, and better reflect that each decorator is an individual counter, while still providing shortcuts for common use cases:
Most of this should be straightforward, but:
group
defaults to the dotted name of the method (e.g.myapp.views.myview
). That limits each view individually, but you can set it to, e.g.group='myviewgroup'
to count a number of views together.rate
works as now ('X/Yu'
whereX
andY
are integers andu
is a unit from{s,m,h,d}
) orrate
is a callable (or a dotted path to a callable) that is passed the group and the request and returns either a rate string, or a tuple:(limit, period-in-secods)
. I think this is a better method of handlingskip_if
because the callable could returnNone
for "no limit" (or0
for "never allow). And, it opens up a whole new thing that would be, I think, very useful (see below).key
is one of a few well-known strings, a callable, or a dotted path to a callable. Callables would get the group and the request. Well-known strings would include at least:'ip'
-request.META['REMOTE_ADDR]
get:X
-request.GET['X']
post:X
-request.POST['X']
field:X
-d = request.POST if request.method == 'POST' else request.GET; d['X']
header:what-ever
-request.META['HTTP_WHAT_EVER']
user
-request.user
user_or_ip
-request.user if request.user.is_authenticated() else request.META['REMOTE_ADDR']
(very common use case, nice to have)method
works as now, a method or a list of methods, orNone
for allblock
works as now,True
to raise aRatelimited
exception,False
to annotate the request.Generating the counter name/cache key
We combine all of this to get a key that we increment:
We don't worry about expiring it. We just do
limited = cache.incr(cache_key) > limit
and call it good. The values age out of the LRU.Current use cases
Login forms
A very common form to protect right now is a login form, which can be done with one decorator:
You'll need two decorators to provide the same (one-IP/many-users and many-IPs/one-user) protections, but you get more control. If you expect users to be behind NAT, you can allow a higher single-IP rate, while still preventing dictionary attacks against a username.
New stuff
This opens up some cool stuff. Stacking now works as intended/expected, like in the burst rate examples above. But there's more that could happen in subsequent versions:
Callable
rates
A pretty trivial use case for callable rates is customizing them by user or user type.
A cool thing to build in would be per-user limiting, e.g.:
(Or something, maybe better supporting anonymous users with
user=0
or similar. And of course with caching.)Definitions in settings
Once
group
is a thing, it becomes easy to do something like this:The decorator (or helper methods) could pull these settings unless they're specifically overridden by the invocation.
That makes it less error-prone to use a helper like
is_ratelimited
and also change things (like temporary remove limits) in settings without messing with actual source modules.The text was updated successfully, but these errors were encountered: