Simpler and more powerful @ratelimit decorator #48

Closed · jsocol opened this issue Sep 1, 2014 · 18 comments

@jsocol (Owner) commented Sep 1, 2014

I have a new, hopefully better, idea of how ratelimit can work internally, so I want to write it down here and get @willkg's opinion and sanity check (and anyone else's comments, too) before I dive into it.

Problems

Multiple cache keys per decorator

Right now, each @ratelimit decorator can create several keys, all of which are then treated with the same expiration (and see below about that). Except, if you stack decorators, they may generate the same key (e.g. for the IP address) and then things just stop making any sense at all.

So what we're doing is updating counters but in a really non-intuitive way, and we break stacking, which seems to be a natural way people try to use the library. It also costs us atomicity because we can't use cache.incr.
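
For a concrete picture of the stacking problem, this is the kind of thing people try with the current keyword-style decorator (a sketch of the problem, not a recommendation; ip= and rate= are the existing arguments that appear later in this thread):

from ratelimit.decorators import ratelimit

# Both decorators derive an IP-based key, so they can end up sharing one
# cache entry even though the intended rates differ.
@ratelimit(ip=True, rate='10/s')
@ratelimit(ip=True, rate='100/h')
def some_view(request):
    ...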

Good actors can get stuck in sliding windows

Because we push back the TTL on every increment, once a client gets ratelimited, they are stuck until they wait out the full period with no attempts. So if the limit is 1/h and they retry after 59 minutes, they reset the window and have to wait another full hour, not just one more minute, because they jumped the gun a little.

Per-method (or group) is a pain

Implementing per-method or method-group rate limits would require something like keys=lambda r: 'group' + r.META['REMOTE_ADDR'] everywhere.

Solution

This is a big, backwards-incompatible change. Fortunately, it's pre-1.0, so whatever. This would, hopefully, be a step toward something we'd call 1.0.

One counter per decorator

The biggest change: each decorator should result in a single counter (cache key, whatever). So

def user_or_ip(r):
    u = r.user
    return u.get_username() if u.is_authenticated() else r.META['REMOTE_ADDR']

@ratelimit(key=user_or_ip, rate='100/h')
def some_view(request):

Would result in exactly one counter (cache key), whose name combines the method name, the key value, and the rate (and probably the current time window, but hang on), MD5ed together.

Then, if you wanted to do, say, a burst limit, you could do:

@ratelimit(key=user_or_ip, rate='10/s')
@ratelimit(key=user_or_ip, rate='100/h')
def some_view(request):

Since the rate is part of the key, these two get incremented independently. And if the next method had:

@ratelimit(key=user_or_ip, rate='10/s')
@ratelimit(key=user_or_ip, rate='100/h')
def another_view(request):

the default behavior would be to ratelimit these views independently.

Fixed windows

I've come around to the view that each attempt shouldn't completely reset the clock on the timer. We should be creating windows. The window needs to be staggered somehow by key, so that we don't, for example, open all the flood gates every hour on the hour. (We can definitely skip staggering that for seconds and possibly for minutes.)

So, for example, if the rate is 100/h then we'd do something like:

import time
import zlib

def k(value, period=3600):
    ts = int(time.time())
    # Start of the current period, plus a per-key stagger derived from the key value.
    m = ts - (ts % period) + (zlib.crc32(value) % period)
    if m < ts:
        return m + period  # the staggered boundary already passed; use the next one
    return m

Then we append this value to the counter name/cache key. The value should be different for every key and should change every period, but it's staggered within the period according to the wall clock.

We can even somehow return this to the view to allow it to send Retry-After if we want.
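
For example, a minimal sketch of that Retry-After idea, assuming the window boundary computed by k() above were exposed to the view (which the library does not do today):

import time

def retry_after_seconds(window_end):
    # window_end is the staggered end-of-window timestamp returned by k().
    return max(0, window_end - int(time.time()))

# e.g. in a 429 handler:
# response['Retry-After'] = str(retry_after_seconds(window_end))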

The new signature

It should be simpler, and better reflect that each decorator is an individual counter, while still providing shortcuts for common use cases:

@ratelimit(
    group=None,
    key=None,
    rate=None,
    method=['POST'],
    block=False)

Most of this should be straightforward, but:

  • group defaults to the dotted name of the method (e.g. myapp.views.myview). That limits each view individually, but you can set it to, e.g. group='myviewgroup' to count a number of views together.
  • rate works as now ('X/Yu' where X and Y are integers and u is a unit from {s,m,h,d}), or rate is a callable (or a dotted path to a callable) that is passed the group and the request and returns either a rate string or a tuple: (limit, period-in-seconds). I think this is a better way of handling skip_if, because the callable could return None for "no limit" (or 0 for "never allow"). And it opens up a whole new thing that would be, I think, very useful (see below).
  • key is one of a few well-known strings, a callable, or a dotted path to a callable. Callables would get the group and the request (there's a sketch of both callables after this list). Well-known strings would include at least:
    • 'ip' - request.META['REMOTE_ADDR']
    • get:X - request.GET['X']
    • post:X - request.POST['X']
    • field:X - d = request.POST if request.method == 'POST' else request.GET; d['X']
    • header:what-ever - request.META['HTTP_WHAT_EVER']
    • user - request.user
    • user_or_ip - request.user if request.user.is_authenticated() else request.META['REMOTE_ADDR'] (very common use case, nice to have)
  • method works as now, a method or a list of methods, or None for all
  • block works as now, True to raise a Ratelimited exception, False to annotate the request.
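
As a sketch of the proposed callable forms (the two-argument signature is the proposal above; the function names and the staff exemption are just illustrative):

from ratelimit.decorators import ratelimit  # proposed new signature, not the current one

def client_key(group, request):
    # Key callable: gets the group and the request, returns the key value.
    user = request.user
    return user.get_username() if user.is_authenticated() else request.META['REMOTE_ADDR']

def client_rate(group, request):
    # Rate callable: return a rate string, a (limit, seconds) tuple, or None for "no limit".
    return None if request.user.is_staff else '100/h'

@ratelimit(group='myapp.some_view', key=client_key, rate=client_rate, method=['POST'], block=True)
def some_view(request):
    ...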

Generating the counter name/cache key

We combine all of this to get a key that we increment:

cache_key = PREFIX + md5(group + rate + key_value + window)

We don't worry about expiring it. We just do limited = cache.incr(cache_key) > limit and call it good. The values age out of the LRU.
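
A minimal sketch of that check, assuming the pieces above (the prefix, the joining, and the add() guard are implementation details, not settled; Django's cache.incr raises ValueError for a missing key, so it has to be seeded first):

import hashlib

from django.core.cache import cache

PREFIX = 'rl:'  # hypothetical prefix

def is_limited(group, rate, key_value, window, limit):
    parts = '%s%s%s%s' % (group, rate, key_value, window)
    cache_key = PREFIX + hashlib.md5(parts.encode('utf-8')).hexdigest()
    # Seed the counter if it doesn't exist yet; the values just age out of the LRU.
    cache.add(cache_key, 0)
    return cache.incr(cache_key) > limit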

Current use cases

Login forms

A very common form to protect right now is a login form, which can be done with one decorator:

# old
@ratelimit(ip=True, field='username', rate='5/m')

# new
@ratelimit(key='ip', rate='10/m')
@ratelimit(key='field:username', rate='5/m')

You'll need two decorators to provide the same (one-IP/many-users and many-IPs/one-user) protections, but you get more control. If you expect users to be behind NAT, you can allow a higher single-IP rate, while still preventing dictionary attacks against a username.

New stuff

This opens up some cool stuff. Stacking now works as intended/expected, like in the burst rate examples above. But there's more that could happen in subsequent versions:

Callable rates

A pretty trivial use case for callable rates is customizing them by user or user type.

def get_rate_limit(group, request):
    if request.user.is_authenticated():
        return '1000/h'
    return '100/h'

A cool thing to build in would be per-user limiting, e.g.:

from django.conf import settings
from django.db import models


class UserRateLimit(models.Model):
    group = models.CharField(max_length=100, db_index=True)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, null=True)  # null for the default rate
    rate = models.CharField(max_length=20)

    @classmethod
    def get_for_user(cls, group, request):
        user = request.user
        if not user.is_authenticated():
            rate = cls.objects.get(group=group, user=None)
            return rate.rate
        try:
            rate = cls.objects.get(group=group, user=user)
            return rate.rate
        except cls.DoesNotExist:
            default = cls.objects.get(group=group, user=None)
            return default.rate

(Or something, maybe better supporting anonymous users with user=0 or similar. And of course with caching.)
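
That caching could be as simple as something like this (a sketch only; the cache key format and the five-minute timeout are arbitrary choices):

from django.core.cache import cache

def cached_rate_for_user(group, request):
    user = request.user
    user_part = user.pk if user.is_authenticated() else 'anon'
    cache_key = 'userratelimit:%s:%s' % (group, user_part)
    rate = cache.get(cache_key)
    if rate is None:
        rate = UserRateLimit.get_for_user(group, request)
        cache.set(cache_key, rate, 300)  # cache for five minutes
    return rate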

Definitions in settings

Once group is a thing, it becomes easy to do something like this:

# settings.py
RATE_LIMIT_CONFIG = {
    '*': {  # global default
        'key': 'user_or_ip',
        'rate': '10/m',
    },
    'some_group': {
        'key': 'ip',
        'rate': '100/m',
    },
}

The decorator (or helper methods) could pull these settings unless they're specifically overridden by the invocation.

That makes it less error-prone to use a helper like is_ratelimited, and it also lets you change things (like temporarily removing limits) in settings without touching the actual source modules.
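
A hypothetical helper for that lookup might look like this (RATE_LIMIT_CONFIG and the '*' fallback are from the sketch above; nothing here is an existing django-ratelimit API):

from django.conf import settings

def resolve_config(group, **overrides):
    config = getattr(settings, 'RATE_LIMIT_CONFIG', {})
    resolved = dict(config.get('*', {}))    # global defaults
    resolved.update(config.get(group, {}))  # per-group settings
    # Explicit decorator kwargs win, but ignore ones left at None.
    resolved.update((k, v) for k, v in overrides.items() if v is not None)
    return resolved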

@getub commented Sep 1, 2014

Since you are going ahead with some major changes, I'd like to request a feature that'll ease its integration with captcha libraries, i.e. the ability to reset all counters for a given set of keys. If a request has the limited attribute set to True, perhaps it could have another attribute, limited_by, that specifies the key or sequence of keys which were used to limit the request. Then if you want to reset them after a captcha is correctly solved, you simply call reset(keys=limited_by) and redirect back to the original view.

Oh and a big 👍 for the ideas above. I've found the library to be quite simple & elegant. Would love to see it evolve but retain the clean structure.

@jsocol (Owner) commented Sep 1, 2014

@getub I saw your other issue, don't worry! I believe something like this makes that easier because we are reducing the number of counters that need to be cleared.

@bitcity (Contributor) commented Sep 2, 2014

+1 to all suggestions.

The IP address is the most widely used key for ratelimiting. Should finding the IP address be more elaborate? This isn't my expertise, but I'm sharing a few links if it helps.

There are a few good answers on SO:

  • http://stackoverflow.com/questions/3003145/how-to-get-the-client-ip-address-in-php
  • http://stackoverflow.com/questions/1634782/what-is-the-most-accurate-way-to-retrieve-a-users-correct-ip-address-in-php

And there's django-ipware: https://github.com/un33k/django-ipware

@jsocol (Owner) commented Sep 2, 2014

IP address is way, way too environment-specific when dealing with load balancers or reverse proxies. And getting it wrong is dangerous, especially here, where it could allow attackers to completely circumvent rate limiting. 

This is the same reason django core dropped SetRemoteAddrFromForwardedFor middleware. 

The correct solution is really writing middleware that sets REMOTE_ADDR based on however your servers are set up. If people want to deal with it on a use-by-use basis that's fine, this still supports that (and using e.g. 'header:x-forwarded-for' makes it easy). 

This comes up periodically in issues or pull reqs and the answer isn't going to change, I should move it to the README. 


@bitcity (Contributor) commented Sep 2, 2014

Fair point & good to know. On a side note, what's a good library one could use to get the IP address?

@jsocol (Owner) commented Sep 2, 2014

I don't have an answer to that, because it's so specific. It's a very trivial piece of middleware, but the interesting bit is completely dependent on how your servers are wired up. For example, I use X-Client-IP to forward the client address from nginx to Django, instead of dealing with weird X-Forwarded-For rules, so I have something like:

class ProxyMiddleware(object):
    def process_request(self, request):
        # This line is the important part but is environment-specific:
        request.META['REMOTE_ADDR'] = request.META['HTTP_X_CLIENT_IP']

But that's because I know that I can trust X-Client-IP. (Even in my dev VMs I run runserver behind nginx.)

If you use something generic that tries to handle multiple headers, like the various ways X-Forwarded-For can work, plus X-Cluster-Client-IP and whatever else, it opens up trivial header spoofing, since most web servers will forward X- headers unless you tell them not to.

ipware above looks very flexible (read: complex) and easy to misconfigure, compared to a 3-line middleware class.

@willkg (Collaborator) commented Sep 2, 2014

@rlr @mythmon I think this refactoring gives us some interesting possibilities for SUMO contributors. Definitely worth looking at.

@willkg (Collaborator) commented Sep 2, 2014

I went and looked at how I'm using django-ratelimit in fjord/Input and the above reconceptualization looks good to me.

I know a bunch of people have hit problems with ip addresses and proxies, so I'm +1 on making that clearer in the docs and I think having some guidance on writing a middleware would be pretty helpful, too. That's definitely a better idea than the "key" example I added (pretty sure it was me) in the docs.

I definitely like how this makes group-based throttle rates much easier. That'll help SUMO a lot.

@jsocol (Owner) commented Sep 2, 2014

I added something to the docs, though I intentionally skip giving any details of calculating the real IP address. I don't want to put anything copy/pastable there because I'm afraid people will copy and paste it.

@willkg (Collaborator) commented Sep 2, 2014

You could purposefully make it "invalid python" so if they copy/pasted it, it fails. Something like:

class ProxyMiddleware(object):
    def process_request(self, request):
        # This line is the important part but is environment-specific:
        request.META['REMOTE_ADDR'] = request.META[<KEY WHERE YOUR IP ADDRESS REALLY IS>]

I don't know if that gets around your concerns.

@jsocol (Owner) commented Sep 2, 2014

It already is invalid, no RHS to the assignment, so that should work. I don't even want to particularly imply that the IP is ready to use as-is in a header. If it's X-Forwarded-For or X-Cluster-Client-IP it may take more processing than just copying it.

@jsocol (Owner) commented Sep 2, 2014

I definitely like how this makes group-based throttle rates much easier. That'll help SUMO a lot.

@willkg Here do you mean Django/permission groups, or groups of endpoints (i.e. the group= kwarg)? I suppose it makes both easier (or possible) but now I'm worried that name is too close for clarity.

@jsocol (Owner) commented Sep 2, 2014

Aw dang, I broke travis.

@willkg (Collaborator) commented Sep 2, 2014

I assumed that the two "group" things were just different ways of grouping users into rate buckets. Now I'm not sure if I'm misreading or not.

@jsocol (Owner) commented Sep 2, 2014

The group= kwarg to the @ratelimit decorator groups views, not users, so

@ratelimit(group='my-lists', ...)
def my_items(...):

@ratelimit(group='my-lists', ...)
def my_other_items(...):

still share a counter; without the group= kwarg they would not, under the new scheme. I think this makes more sense in general: individual API endpoints or views default to separate rate limits unless you opt in to sharing them.

@willkg (Collaborator) commented Sep 2, 2014

Ahh... ok. I was talking about the example in the "callable rates" section. Given that's an example and not a "feature", I think there isn't an inherent ambiguity here.

Though if you think there is, maybe rename the ratelimit argument to "viewgroup"?

@jsocol (Owner) commented Sep 2, 2014

Eh, I think your point is well-made: given that nothing in the docs really talks about Django groups or permissions at all, it should be fine. And it's shorter.

@jsocol jsocol changed the title Reconceptualize Simpler and more powerful @ratelimit decorator Sep 2, 2014
@jsocol jsocol modified the milestone: 0.5 Sep 10, 2014
@jsocol (Owner) commented Oct 26, 2014

Fixed in #49
