timeit support for coroutines #121

Closed
Tinche opened this issue Jan 12, 2022 · 19 comments

@Tinche
Contributor

Tinche commented Jan 12, 2022

Hello!

I think pyperf is an amazing project and I use timeit to benchmark essentially all the libraries I work on (attrs, cattrs, incant...).

I wish I could use it to benchmark async functions, though. Right now I benchmark asyncio.run(my_coro), but since asyncio.run is so costly there's a ton of noise in the signal.

I think essentially pyperf could detect a coroutine was passed in, spawn an event loop and just await it in a loop.

@corona10
Member

@vstinner Do you have any ideas?

@vstinner
Member

I don't know how to do that.

cc @methane

@methane
Contributor

methane commented Jan 31, 2022

@Tinche Do you really mean timeit, and not bench_func()?

bench_func() receives a function, so it may be possible to detect whether the function is a coroutine.
On the other hand, timeit receives an expression, not a function, so it is difficult to detect that a coroutine was passed in.
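
For what it's worth, detecting a coroutine function object is the easy part; the statement-string case is what's hard. A minimal sketch of such detection (hypothetical, not pyperf's API), using the standard inspect module:

import inspect

async def coro_func():
    pass

def plain_func():
    pass

print(inspect.iscoroutinefunction(coro_func))   # True
print(inspect.iscoroutinefunction(plain_func))  # False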

@methane
Contributor

methane commented Jan 31, 2022

Would you give us some examples?

@Tinche
Contributor Author

Tinche commented Jan 31, 2022

I use timeit all the time in the terminal and I've never used bench_func(), so probably timeit. Maybe it could be a flag or a different command?

Here's an example. I have a project, https://github.com/Tinche/incant/, that does function composition (mostly for dependency injection), and I want to measure how efficient it is. It supports both functions and coroutines. Functions I can benchmark easily; coroutines I need to benchmark using asyncio.run, and that has a ton of noise since it does a lot of unrelated work.

Note that these coroutines I'm benchmarking are technically async, but they either do not await anything or they await sleep(0).

I usually prepare the function being tested in a file and then do something like:

pyperf timeit -g -s "from asyncio import run; from test import main" "run(main())"

So since I need to have a separate file anyway, bench_func() could work too. The CLI interface is sooo nice though ;)

That said, maybe there's a way to run a coroutine without involving an event loop? Just iterate over it until it's done or something like that? I'm not proficient in that part of Python.
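
(In fact, a coroutine that never truly suspends can be driven by hand with send(None), no event loop involved. A small sketch of the idea, not anything pyperf does; it stops working as soon as the coroutine awaits something that needs a running loop, such as asyncio.sleep() with a nonzero delay:)

def run_without_loop(coro):
    # Advance the coroutine until it finishes. Works only if it never
    # suspends on something that requires a running event loop.
    try:
        while True:
            coro.send(None)
    except StopIteration as exc:
        return exc.value

async def main():
    return 42

print(run_without_loop(main()))  # 42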

@methane
Contributor

methane commented Jan 31, 2022

Would you try this?

pyperf timeit -g -s "import asyncio; loop=asyncio.get_event_loop(); from test import main" \
  "loop.run_until_complete(main())"

or

pyperf timeit -g -s "import asyncio, test" \ 
  "asyncio.get_event_loop().run_until_complete(test.main())"

With this, one loop is used repeatedly instead of creating and destroying a loop for each main() execution.
Does this reduce your "noise"?

@Tinche
Contributor Author

Tinche commented Feb 1, 2022

It does work and helps a little. If it's too hard to do otherwise in pyperf, I will accept this as the answer ;)

@methane
Contributor

methane commented Feb 3, 2022

What "little" means? It reduce your noise only little? If so, it means this feature request will have only little benefit.

If you just meant "I don't want to write this timeit", I'm sorry. But it is very difficult.
Again, timeit receives statements, not function. So timeit can not distinguish async code automatically.

I will consider about adding bench_async_func() or bench_func() supports async func.
And I will consider adding --async option to timeit later.

@Tinche
Contributor Author

Tinche commented Feb 3, 2022

Well, it reduces the running time by a lot, so it reduces noise by a lot.

I have a generated coroutine that I'm benchmarking. This coroutine awaits several other coroutines inside.

asyncio.run: Mean +- std dev: 529 us +- 42 us
loop.run_until_complete: Mean +- std dev: 185 us +- 12 us

So the difference was noise introduced by asyncio.run. Hence, a big improvement. Dunno how much more it can be improved by logic inside pyperf.

@Tinche
Contributor Author

Tinche commented Feb 3, 2022

Offtopic: heh, for comparison's sake, if I change the test so they are all ordinary functions, not async def functions, it takes 1 microsecond. I wasn't aware asyncio/the event loop adds so much overhead.

@vstinner
Member

vstinner commented Feb 3, 2022

So the difference was noise introduced by asyncio.run

Each call to asyncio.run() creates a fresh event loop and then closes it. Moreover, it also shuts down asynchronous generators and the default asyncio executor (thread pool).
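
Roughly, every asyncio.run() call does something like this (a simplified sketch of CPython's asyncio.runners module; details vary between versions):

import asyncio

def run(coro):
    loop = asyncio.new_event_loop()    # a fresh event loop on every call
    try:
        asyncio.set_event_loop(loop)
        return loop.run_until_complete(coro)
    finally:
        try:
            # teardown work that a bare run_until_complete() skips
            loop.run_until_complete(loop.shutdown_asyncgens())
            loop.run_until_complete(loop.shutdown_default_executor())
        finally:
            asyncio.set_event_loop(None)
            loop.close()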

@methane
Contributor

methane commented Feb 3, 2022

I usually prepare the function being tested in a file and then do something like:

Since you already write a script for the test, I don't think the timeit command is so important for you.
If #124 is merged, you can just add a few lines to your test code:

if __name__ == '__main__':
    import pyperf
    pyperf.Runner().bench_async_func('main', main)
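
(Such a script then accepts the usual pyperf Runner command-line options, so it can be run like any other pyperf benchmark, e.g. with -o to write the results to a JSON file.)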

@Tinche
Contributor Author

Tinche commented Feb 3, 2022

@methane Thanks a lot! Trying your branch, the time is now: main: Mean +- std dev: 123 us +- 7 us. Looks like we got rid of all the overhead.

@vstinner
Member

vstinner commented Feb 3, 2022

Fixed by #124 thanks to @methane.

@vstinner vstinner closed this as completed Feb 3, 2022
@vstinner
Member

vstinner commented Feb 3, 2022

I closed the issue because it seems like the idea of adding an --async option to pyperf timeit was abandoned. But I'm open to this idea if someone wants to write a PR for that!

@vstinner
Member

vstinner commented Feb 3, 2022

I was curious and compared doc/examples/bench_async_func.py between Python 3.6 and 3.10, since the pyperf implementation is different (Python 3.6 doesn't have asyncio.run()):

$ python3 -m pyperf compare_to py36.json py310.json 
Mean +- std dev: [py36] 1.33 ms +- 0.02 ms -> [py310] 1.32 ms +- 0.02 ms: 1.01x faster

Using an asyncio sleep of 1 ms, there is no significant difference: for me, it confirms that the pyperf implementation is correct ;-) The accuracy is good. We don't measure the time spent to create and close the event loop.

@vstinner
Member

vstinner commented Feb 3, 2022

(A) Benchmark on asyncio.run() with bench_func() on a coroutine func() which does nothing:

import asyncio
import pyperf

async def func():
    pass

def bench():
    asyncio.run(func())

runner = pyperf.Runner()
runner.bench_func('bench', bench)

(B) Benchmark on loop.run_until_complete() with bench_func() on a coroutine func() which does nothing:

import asyncio
import pyperf

async def func():
    pass

def bench(loop):
    loop.run_until_complete(func())

runner = pyperf.Runner()
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
runner.bench_func('bench', bench, loop)

(C) Benchmark pyperf 2.3.1 new bench_async_func() method on a coroutine func() which does nothing:

import asyncio
import pyperf

async def func():
    pass

runner = pyperf.Runner()
runner.bench_async_func('bench', func)

Results on Python 3.10:

asyncio_run_py310
=================

bench: Mean +- std dev: 139 us +- 5 us

run_until_complete-py310
========================

bench: Mean +- std dev: 16.7 us +- 0.4 us

bench_async_func-py310
======================

bench: Mean +- std dev: 128 ns +- 2 ns

+-----------+-------------------+--------------------------+-------------------------+
| Benchmark | asyncio_run_py310 | run_until_complete-py310 | bench_async_func-py310  |
+===========+===================+==========================+=========================+
| bench     | 139 us            | 16.7 us: 8.33x faster    | 128 ns: 1087.31x faster |
+-----------+-------------------+--------------------------+-------------------------+

The std dev is way better using bench_async_func()!

  • asyncio.run(): +- 5 us (5000 ns)
  • loop.run_until_complete(): +- 0.4 us (400 ns)
  • bench_async_func(): +- 2 ns (2 ns)
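
The huge gap makes sense if the timing happens inside a single run_until_complete() call, so that starting and stopping the loop stays outside the measured region. A sketch of that idea (my own illustration, not pyperf's actual source):

import asyncio
import time

async def _bench_coro(func, loops):
    # Time only the awaits; the event loop is already running here.
    t0 = time.perf_counter()
    for _ in range(loops):
        await func()
    return time.perf_counter() - t0

def bench_async(func, loops=100_000):
    loop = asyncio.new_event_loop()
    try:
        dt = loop.run_until_complete(_bench_coro(func, loops))
    finally:
        loop.close()
    return dt / loops    # mean time per call

async def func():
    pass

print("%.0f ns per call" % (bench_async(func) * 1e9))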

@vstinner
Member

vstinner commented Feb 3, 2022

I think essentially pyperf could detect a coroutine was passed in, spawn an event loop and just await it in a loop.

I don't think that detecting whether the argument looks like a coroutine is a good idea. It requires importing asyncio, which is a "heavy" module (high startup time). I strongly prefer having a separate API (method) for that.

@vstinner
Member

vstinner commented Feb 3, 2022

This function is now part of the just-released pyperf 2.3.1.
