This repository has been archived by the owner on Feb 21, 2023. It is now read-only.

When load testing the pools with timeouts, once the test is done the pool doesn't recover and keeps failing #231

Closed
argaen opened this issue May 16, 2017 · 5 comments

@argaen
Contributor

argaen commented May 16, 2017

Hi, following an issue opened in aio-libs/aiocache#196, I've been investigating concurrency issues in both aiomcache and aioredis.

TLDR: When load testing the pools with timeouts, once the test is done the pool doesn't recover and keeps failing.

I've been working on the aiomcache implementation to fix this (aio-libs/aiomcache#46), and it now performs well enough: the share of 200 responses is very high, and the server keeps working after the load test (ab -n 5000 -c 100 http://127.0.0.1:8080/).

Here are some numbers:

aiomcache

Concurrency Level:      100
Time taken for tests:   5.345 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      935000 bytes
HTML transferred:       180000 bytes
Requests per second:    935.44 [#/sec] (mean)
Time per request:       106.901 [ms] (mean)
Time per request:       1.069 [ms] (mean, across all concurrent requests)
Transfer rate:          170.83 [Kbytes/sec] received

aioredis

Concurrency Level:      100
Time taken for tests:   10.578 seconds
Complete requests:      5000
Failed requests:        4843
   (Connect: 0, Receive: 0, Length: 4843, Exceptions: 0)
Non-2xx responses:      4843
Total transferred:      741280 bytes
HTML transferred:       5652 bytes
Requests per second:    472.68 [#/sec] (mean)
Time per request:       211.559 [ms] (mean)
Time per request:       2.116 [ms] (mean, across all concurrent requests)
Transfer rate:          68.44 [Kbytes/sec] received

Even worse, in my case all requests keep failing after the test has finished.

This is the script I've been using for the load testing:

import uuid
import logging
import asyncio
import aiocache

from aiohttp import web

logger = logging.getLogger(__name__)


class CacheManager:
    def __init__(self):
        # Swap the two lines below to test aiomcache instead of aioredis.
        # self.cache = aiocache.MemcachedCache(pool_size=4)
        self.cache = aiocache.RedisCache(pool_max_size=4)

    async def get(self, key):
        # The 0.1 s timeout is what triggers cancellations under load.
        return await self.cache.get(key, timeout=0.1)

    async def set(self, key, value):
        return await self.cache.set(key, value, timeout=0.1)


async def handler_get(req):
    try:
        data = await req.app['cache'].get('testkey')
        if data:
            return web.Response(text=data)
        # Cache miss: store a fresh value and return it.
        data = str(uuid.uuid4())
        await req.app['cache'].set('testkey', data)
        return web.Response(text=data)
    except asyncio.TimeoutError:
        # On timeout, try to repopulate the key and answer with a non-2xx
        # status so that ab counts the request as failed.
        data = str(uuid.uuid4())
        await req.app['cache'].set('testkey', data)
        return web.Response(status=404)


if __name__ == '__main__':
    app = web.Application()
    app['cache'] = CacheManager()
    app.router.add_route('GET', '/', handler_get)
    web.run_app(app)

Requirements:

-e git@github.com:argaen/aiocache.git@f7fa8e71508203fa108acd89369e7d13e0403b4d#egg=aiocache
-e git@github.com:aio-libs/aiomcache.git@ff4dbc18145fd3e99c1623879fa3c506616510fa#egg=aiomcache
aioredis==0.3.1
aiohttp==2.0.7

I have aiocache in the middle because it lets me swap easily between aioredis and aiomcache.

Now some extra observations:

  • aioredis behaves well when there is no task cancellation
  • aioredis performs "better" when the pool size is 1 (and I can keep querying the server after the load test). With anything higher than 1, it fails and never recovers.

If it's OK, I will work on fixing it once you or someone else confirms the same problem I'm seeing.

@popravich
Contributor

Yes, I followed the aiocache/aiomcache issue discussion, but I need to take a deeper look at this from the aioredis side.

@popravich popravich changed the title Pool behaves badly under high load with timeouts When load testing the pools with timeouts, once the test is done the pool doesn't recover and keeps failing May 18, 2017
@popravich
Contributor

OK, I've found the issue that causes the loop to get stuck: it is related to an asyncio.Lock deadlock, python/cpython#1031.
pool.acquire() takes a lock in order to get an existing connection or create a new one.
When the lock is released, the next lock waiter is woken up; however, if that waiter's task is cancelled at the same time (by a timeout), no further lock waiters are woken up and we end up with a deadlock.
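The scenario described above can be probed with plain asyncio, without aioredis. The following is only a minimal sketch: the helper names (worker, lock_recovers) and the timing values are made up for illustration, and it does not reproduce the pool code itself. It runs many lock acquisitions that are cancelled by a short timeout and then checks whether the lock can still be taken. On an interpreter that contains the fix for python/cpython#1031 the final acquire is expected to succeed; on an affected interpreter it may hang until the probe timeout.

```python
import asyncio


async def worker(lock, hold, timeout):
    """Try to take the lock within `timeout`; report whether it succeeded."""
    try:
        await asyncio.wait_for(lock.acquire(), timeout)
    except asyncio.TimeoutError:
        # The pending acquire() was cancelled by the timeout.
        return False
    try:
        await asyncio.sleep(hold)  # simulate work done while holding the lock
    finally:
        lock.release()
    return True


async def lock_recovers(n_workers=200, hold=0.002, timeout=0.01):
    """Run a burst of timed-out acquisitions, then probe the lock once more."""
    lock = asyncio.Lock()
    results = await asyncio.gather(
        *(worker(lock, hold, timeout) for _ in range(n_workers))
    )
    try:
        # If a wake-up was lost during the burst, this is where it shows.
        await asyncio.wait_for(lock.acquire(), 1.0)
    except asyncio.TimeoutError:
        return False, results
    lock.release()
    return True, results


if __name__ == "__main__":
    recovered, results = asyncio.run(lock_recovers())
    print("lock usable after burst:", recovered)
    print("acquired / timed out:", sum(results), "/", len(results) - sum(results))
```

Most workers are expected to time out because the lock is held longer in total than the timeout allows, which is the same pressure the load test puts on the pool.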

@argaen
Contributor Author

argaen commented May 18, 2017

Ahhh good catch there. Thanks for having a look at it!

@jiamo

jiamo commented Jun 5, 2017

Do we need to wait for a Python stdlib upgrade? Any suggestions for fixing it in a production environment?

pfreixes added a commit to pfreixes/aioredis that referenced this issue Jun 7, 2017
@pfreixes
Contributor

pfreixes commented Jun 8, 2017

A pain in the ass, but finally there is a patch inside aioredis: #241 /cc @popravich
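For readers who want to see the general idea behind such a patch, here is a minimal sketch of a lock whose cancelled waiter passes the wake-up on to the next waiter. It is not the code from #241: the class name CancelSafeLock and the demo function are hypothetical, and the sketch assumes Python 3.7+ for asyncio.get_running_loop and asyncio.run.

```python
import asyncio
import collections


class CancelSafeLock:
    """Illustrative lock: a cancelled waiter hands its wake-up to the next one."""

    def __init__(self):
        self._locked = False
        self._waiters = collections.deque()

    async def acquire(self):
        # Fast path: lock is free and nobody is queued ahead of us.
        if not self._locked and not self._waiters:
            self._locked = True
            return True
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append(fut)
        try:
            try:
                await fut
            finally:
                self._waiters.remove(fut)
        except asyncio.CancelledError:
            # We may have been woken and cancelled in the same loop iteration.
            # If the lock is free, pass the wake-up on instead of losing it.
            if not self._locked:
                self._wake_up_first()
            raise
        self._locked = True
        return True

    def release(self):
        if not self._locked:
            raise RuntimeError("Lock is not acquired.")
        self._locked = False
        self._wake_up_first()

    def _wake_up_first(self):
        # Wake the first waiter that has not already been resolved or cancelled.
        for fut in self._waiters:
            if not fut.done():
                fut.set_result(True)
                break


async def demo():
    """Release the lock and cancel the woken waiter before it gets to run."""
    lock = CancelSafeLock()
    await lock.acquire()
    first = asyncio.ensure_future(lock.acquire())
    second = asyncio.ensure_future(lock.acquire())
    await asyncio.sleep(0)  # let both waiters enqueue themselves
    lock.release()          # wakes `first`
    first.cancel()          # ...which is cancelled before it resumes
    got_lock = await asyncio.wait_for(second, 1.0)
    return got_lock, first.cancelled()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The demo follows the sequence from the diagnosis earlier in the thread: without the hand-off in the CancelledError branch, the second waiter would never be woken.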

popravich pushed a commit that referenced this issue Jun 21, 2017
* #231 Patch for the cpython issue with Lock

* Make Lock compatible with 3.3 and 3.4

* Changed test to make it compatible with python 3.3 version

* Flake8 issue

* Increase coverage by reducing scope, test just the bugfix for 3.6

* One unique version of Lock.acquire function

* Use create_future provided by aioredis
popravich pushed a commit that referenced this issue Jun 21, 2017
* Make Lock compatible with 3.3 and 3.4

* Changed test to make it compatible with python 3.3 version

* Flake8 issue

* Increase coverage by reducing scope, test just the bugfix for 3.6

* One unique version of Lock.acquire function

* Use create_future provided by aioredis
popravich added a commit that referenced this issue Jun 21, 2017
Squashed commit of the following:

commit 71ea399
Author: Alexey Popravka <[email protected]>
Date:   Wed Jun 21 10:56:12 2017 +0300

    fix typo

commit 98ee8be
Author: Alexey Popravka <[email protected]>
Date:   Wed Jun 21 10:47:36 2017 +0300

    Bump version: 0.3.1 → 0.3.2

commit 013ec54
Author: Alexey Popravka <[email protected]>
Date:   Wed Jun 21 10:47:22 2017 +0300

    update CHANGES

commit b2cfb7b
Author: Alexey Popravka <[email protected]>
Date:   Wed Jun 21 10:28:19 2017 +0300

    fix timeout passing in pool

commit 201451d
Author: Pau Freixes <[email protected]>
Date:   Wed Jun 21 08:46:50 2017 +0200

    * #231 Patch for the cpython issue with Lock

    * Make Lock compatible with 3.3 and 3.4

    * Changed test to make it compatible with python 3.3 version

    * Flake8 issue

    * Increase coverage by reducing scope, test just the bugfix for 3.6

    * One unique version of Lock.acquire function

    * Use create_future provided by aioredis

commit 2ca2f08
Author: argaen <[email protected]>
Date:   Thu Jun 8 20:40:44 2017 +0200

    Create only one task for closing

commit e7a0063
Author: argaen <[email protected]>
Date:   Mon Jun 5 16:26:22 2017 +0200

    close_waiter is lazily created

commit e38be72
Author: Pau Freixes <[email protected]>
Date:   Thu May 11 22:07:33 2017 +0200

    Removed print

commit f77a06b
Author: Pau Freixes <[email protected]>
Date:   Thu May 11 17:07:55 2017 +0200

    Some small MR issues fixed

commit eaede3d
Author: Pau Freixes <[email protected]>
Date:   Wed May 10 16:40:24 2017 +0200

    Support for connection timeout param

    Related to #184, it allows aioredis to configure a limited
    time that will be used trying to open a connection, if it is
    reached a `asyncio.TimeoutError` will be raised

    By default any timeout is configured.

commit 6b13dd1
Author: Alexey Popravka <[email protected]>
Date:   Mon Mar 6 10:36:37 2017 +0200

    tiny styling fix

commit 864a390
Author: marijngiesen <[email protected]>
Date:   Fri Mar 3 10:25:20 2017 +0100

    Fixed flake8 errors

commit b36a17a
Author: marijngiesen <[email protected]>
Date:   Fri Mar 3 01:38:42 2017 +0100

    Fixed typo

commit d93e25d
Author: marijngiesen <[email protected]>
Date:   Fri Mar 3 01:36:37 2017 +0100

    Added ZREVRANGEBYLEX command

commit bab6a0b
Author: Alexey Popravka <[email protected]>
Date:   Tue May 9 22:15:29 2017 +0300

    update spelling wordlist

commit e2bae24
Author: Alexey Popravka <[email protected]>
Date:   Tue May 9 22:11:40 2017 +0300

    Bump version: 0.3.0 → 0.3.1

commit 1c4327c
Author: Alexey Popravka <[email protected]>
Date:   Tue May 9 22:11:35 2017 +0300

    update changes

commit 3b779a8
Author: Alexey Popravka <[email protected]>
Date:   Mon Mar 27 10:42:04 2017 +0300

    fix pubsub Receiver missing iter() method (fixes #203)
popravich added a commit that referenced this issue Nov 2, 2017
commit 1fcdcff
Author: Alexey Popravka <[email protected]>
Date:   Wed Oct 25 11:41:24 2017 +0300

    Bump version: 0.3.3 → 0.3.4

commit 56e7c34
Author: Alexey Popravka <[email protected]>
Date:   Wed Oct 25 11:36:25 2017 +0300

    update changes.txt

commit 4171376
Author: Alexey Popravka <[email protected]>
Date:   Wed Oct 25 11:26:06 2017 +0300

    fix test

commit aaae7b8
Author: Alexey Popravka <[email protected]>
Date:   Wed Oct 25 11:14:42 2017 +0300

    add integrational test to compare v0.x version with v1.x

commit 2a6cf14
Author: Alexey Popravka <[email protected]>
Date:   Wed Oct 25 10:39:23 2017 +0300

    fix time command when connection-wide encoding is set (fixes #266)

commit ac47208
Author: Alexey Popravka <[email protected]>
Date:   Fri Jun 30 10:54:12 2017 +0300

    spelling

commit 7a26c16
Author: Alexey Popravka <[email protected]>
Date:   Fri Jun 30 10:25:33 2017 +0300

    bump version to v0.3.3

commit 1082256
Merge: 71ea399 718f731
Author: Alexey Popravka <[email protected]>
Date:   Fri Jun 30 10:19:20 2017 +0300

    Merge pull request #257 from pfreixes/fix_lock_critical_bug_v03

    Fix critical bug with patched Lock

commit 718f731
Author: Pau Freixes <[email protected]>
Date:   Thu Jun 29 22:44:53 2017 +0200

    Fix critical bug with patched Lock

@popravich popravich added this to the v1.0 milestone Nov 9, 2017