Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-103793: Defer formatting task name #103767

Merged
merged 14 commits into from
Apr 29, 2023

Conversation

itamaro
Copy link
Contributor

@itamaro itamaro commented Apr 24, 2023

The default task name is Task-<counter> (if no name is passed in during Task creation). This is initialized in Task.__init__ (C impl) using string formatting, which can be quite slow. Actually using the task name in real world code is not very common, so this is wasted init.

Let's defer this string formatting to the first time the name is read (in get_name impl), so we don't need to pay the string formatting cost if the task name is never read.

Fixes gh-103793

performance analysis

in a full run of pyperformance, this is 1.00x faster overall, and up to 5% faster on async benchmarks.

full report in this gist.

@itamaro itamaro changed the title gh-NNNNN: Defer formatting task name gh-103793: Defer formatting task name Apr 24, 2023
The default task name is "Task-<counter>" (if no name is passed in during Task creation).
This is initialized in `Task.__init__` (C impl) using string formatting, which can be quite slow.
Actually using the task name in real world code is not very common, so this is wasted init.

Let's defer this string formatting to the first time the name is read (in `get_name` impl),
so we don't need to pay the string formatting cost if the task name is never read.
@itamaro itamaro marked this pull request as ready for review April 24, 2023 23:24
@jbower-fb
Copy link
Contributor

jbower-fb commented Apr 25, 2023

Strictly speaking, it's probably not necessary to store the task number in Task.__init__(). We could just call _task_name_counter() on lazy creation of a name. The only downside to this is tasks would end up being numbered slightly differently after this change for the same programs. However, I doubt anyone except possibly one or two easily fixed tests are relying on this. In general I imagine the number is actually non-deterministic.

Not storing the name on construction saves memory and will likely make everything a tiny bit faster.

@itamaro
Copy link
Contributor Author

itamaro commented Apr 25, 2023

We could just call _task_name_counter() on lazy creation of a name

that's clever! I like it :)

made this change to the PR. CI is still running, but at least on my local test run it didn't break any tests.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts.

Comment on lines 107 to 108
self._name = f'Task-{_task_name_counter()}'
# optimization: defer task name formatting to first get_name
self._name = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to optimize the .py version (which is normally not run since there is a C accelerator version)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we definitely don't need to optimize the python version. I assumed we wanted to maintain parity between the versions, but maybe it doesn't matter to this extent (or my assumption was wrong and parity is not a goal).

I will revert the changes to the python version.

Modules/_asynciomodule.c Outdated Show resolved Hide resolved
@kumaraditya303
Copy link
Contributor

Strictly speaking, it's probably not necessary to store the task number in Task.init(). We could just call _task_name_counter() on lazy creation of a name. The only downside to this is tasks would end up being numbered slightly differently after this change for the same programs. However, I doubt anyone except possibly one or two easily fixed tests are relying on this. In general I imagine the number is actually non-deterministic.

No, tasks numbers are assigned as they are allocated, its a little debugging aid and should not be changed.

Copy link
Contributor

@kumaraditya303 kumaraditya303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my suggestion on issue.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@itamaro
Copy link
Contributor Author

itamaro commented Apr 26, 2023

I have made the requested changes; please review again

Apologies about the review request spam - I don't know why GitHub decided to do that 🤔 (and I don't have permissions to clean up the reviewers list)

@bedevere-bot
Copy link

Thanks for making the requested changes!

@kumaraditya303: please review the changes made to this pull request.

Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nice speed win

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but did you re-run the benchmark, now that we have a PyLong object creation?

And did you run the buildbots with the latest version to check for leaks?

@gvanrossum gvanrossum dismissed kumaraditya303’s stale review April 29, 2023 01:08

We implemented something even simpler than a tagged pointer.

@gvanrossum gvanrossum added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Apr 29, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit 592d44b 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Apr 29, 2023
@gvanrossum
Copy link
Member

(Dismissing Kumar's review for him, he told me he's busy for the next few weeks. Setting the refleaks-buildbots label, those are sufficient for me.)

@gvanrossum gvanrossum merged commit 85c7bf5 into python:main Apr 29, 2023
@itamaro
Copy link
Contributor Author

itamaro commented Apr 29, 2023

I did rerun the benchmarks, it was 5-7% faster on the async benchmarks. I'll rerun again on main vs the parent commit and share the full report.

@gvanrossum
Copy link
Member

Thanks! (And just because I work on weekends doesn’t mean I expect you to.)

@itamaro
Copy link
Contributor Author

itamaro commented Apr 30, 2023

pyperformance async benchmarks 4-7% faster

3.12-base.20230429.1.json
=========================

Performance version: 1.0.6
Python version: 3.12.0a7+ (64-bit) revision fbf3596c3e
Report on Linux-5.15.0-1033-aws-x86_64-with-glibc2.31
Number of logical CPUs: 72
Start date: 2023-04-29 16:39:30.287074
End date: 2023-04-29 16:50:17.995978

3.12-defer.20230429.1.json
==========================

Performance version: 1.0.6
Python version: 3.12.0a7+ (64-bit) revision 85c7bf5bce
Report on Linux-5.15.0-1033-aws-x86_64-with-glibc2.31
Number of logical CPUs: 72
Start date: 2023-04-29 16:12:36.207389
End date: 2023-04-29 16:22:51.562679

+-------------------------------+---------------------------+----------------------------+--------------+-----------------------+
| Benchmark                     | 3.12-base.20230429.1.json | 3.12-defer.20230429.1.json | Change       | Significance          |
+===============================+===========================+============================+==============+=======================+
| async_tree_cpu_io_mixed       | 867 ms                    | 835 ms                     | 1.04x faster | Significant (t=6.62)  |
+-------------------------------+---------------------------+----------------------------+--------------+-----------------------+
| async_tree_io                 | 1.47 sec                  | 1.39 sec                   | 1.06x faster | Significant (t=21.03) |
+-------------------------------+---------------------------+----------------------------+--------------+-----------------------+
| async_tree_memoization        | 735 ms                    | 696 ms                     | 1.06x faster | Significant (t=21.10) |
+-------------------------------+---------------------------+----------------------------+--------------+-----------------------+
| async_tree_none               | 611 ms                    | 570 ms                     | 1.07x faster | Significant (t=13.44) |
+-------------------------------+---------------------------+----------------------------+--------------+-----------------------+

microbenchmark ~10% faster

python -m pyperf timeit -s 'import asyncio' -s 'async def coro(): pass' -s 'async def make_tasks(): return [asyncio.Task(coro()) for _ in range(10000)]' 'asyncio.run(make_tasks())'

with this PR:

Mean +- std dev: 27.5 ms +- 0.6 ms

on parent commit:

Mean +- std dev: 30.5 ms +- 0.9 ms

@gvanrossum
Copy link
Member

Excellent—thanks!

@itamaro itamaro deleted the defer-task-name-formatting branch April 30, 2023 01:02
carljm added a commit to carljm/cpython that referenced this pull request May 1, 2023
* main: (26 commits)
  pythongh-104028: Reduce object creation while calling callback function from gc (pythongh-104030)
  pythongh-104036: Fix direct invocation of test_typing (python#104037)
  pythongh-102213: Optimize the performance of `__getattr__` (pythonGH-103761)
  pythongh-103895: Improve how invalid `Exception.__notes__` are displayed (python#103897)
  Adjust expression from `==` to `!=` in alignment with the meaning of the paragraph. (pythonGH-104021)
  pythongh-88496: Fix IDLE test hang on macOS (python#104025)
  Improve int test coverage (python#104024)
  pythongh-88773: Added teleport method to Turtle library (python#103974)
  pythongh-104015: Fix direct invocation of `test_dataclasses` (python#104017)
  pythongh-104012: Ensure test_calendar.CalendarTestCase.test_deprecation_warning consistently passes (python#104014)
  pythongh-103977: compile re expressions in platform.py only if required (python#103981)
  pythongh-98003: Inline call frames for CALL_FUNCTION_EX (pythonGH-98004)
  Replace Netlify with Read the Docs build previews (python#103843)
  Update name in acknowledgements and add mailmap (python#103696)
  pythongh-82054: allow test runner to split test_asyncio to execute in parallel by sharding. (python#103927)
  Remove non-existing tools from Sundry skiplist (python#103991)
  pythongh-103793: Defer formatting task name (python#103767)
  pythongh-87092: change assembler to use instruction sequence instead of CFG (python#103933)
  pythongh-103636: issue warning for deprecated calendar constants (python#103833)
  Various small fixes to dis docs (python#103923)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Defer string formatting in asyncio Task creation
8 participants