Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline bytecode caches #90997

Closed
brandtbucher opened this issue Feb 24, 2022 · 32 comments
Closed

Inline bytecode caches #90997

brandtbucher opened this issue Feb 24, 2022 · 32 comments
Assignees
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@brandtbucher
Copy link
Member

brandtbucher commented Feb 24, 2022

BPO 46841
Nosy @markshannon, @corona10, @pablogsal, @brandtbucher, @neonene, @iritkatriel, @penguin-wwy
PRs
  • bpo-46841: Use *inline* caching for BINARY_OP #31543
  • bpo-46841: Add (undocumented) _co_quickened attribute for code object. #31552
  • bpo-46841: Move the cache for LOAD_GLOBAL inline. #31575
  • bpo-46841: Use inline caching for UNPACK_SEQUENCE #31591
  • bpo-46841: Inline cache for BINARY_SUBSCR. #31618
  • bpo-46841: Use inline caching for COMPARE_OP #31622
  • bpo-46841: Use inline caching for attribute accesses #31640
  • bpo-46841: Improve the failure stats for COMPARE_OP #31663
  • bpo-46841: Fix error message hacks in GET_AWAITABLE #31664
  • bpo-46841: Fix BINARY_OP's handling of inline caches #31671
  • bpo-46841: Use inline caching for calls #31709
  • bpo-46841: Don't use an oparg counter for STORE_SUBSCR #31742
  • bpo-46841: Add a _Py_SET_OPCODE macro #31780
  • bpo-46841: Update adaptive.md for inline caching #31817
  • bpo-46841: Quicken code in-place #31888
  • bpo-46841: Don't scan backwards in bytecode. #31901
  • bpo-46841: Don't jump during throw()  #31968
  • gh-90997: bpo-46841: Disassembly of quickened code #32099
  • bpo-46841: remove no-longer-used macro UPDATE_PREV_INSTR_OPARG #32100
  • bpo-46841: Use a bytes object for _co_code_adaptive #32205
  • bpo-46841: Avoid unnecessary allocations in code object comparisons #32222
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brandtbucher'
    closed_at = None
    created_at = <Date 2022-02-24.02:17:14.449>
    labels = ['interpreter-core', 'deferred-blocker', '3.11', 'performance']
    title = 'Inline bytecode caches'
    updated_at = <Date 2022-04-01.11:29:28.677>
    user = 'https://github.com/brandtbucher'

    bugs.python.org fields:

    activity = <Date 2022-04-01.11:29:28.677>
    actor = 'Mark.Shannon'
    assignee = 'brandtbucher'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2022-02-24.02:17:14.449>
    creator = 'brandtbucher'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46841
    keywords = ['patch']
    message_count = 29.0
    messages = ['413875', '413901', '413987', '413988', '414188', '414194', '414261', '414277', '414343', '414349', '414358', '414359', '414364', '414371', '414372', '414378', '414391', '414461', '414490', '414523', '414547', '414696', '414763', '414955', '415307', '415676', '415969', '416482', '416483']
    nosy_count = 7.0
    nosy_names = ['Mark.Shannon', 'corona10', 'pablogsal', 'brandtbucher', 'neonene', 'iritkatriel', 'penguin_wwy']
    pr_nums = ['31543', '31552', '31575', '31591', '31618', '31622', '31640', '31663', '31664', '31671', '31709', '31742', '31780', '31817', '31888', '31901', '31968', '32099', '32100', '32205', '32222']
    priority = 'deferred blocker'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue46841'
    versions = ['Python 3.11']

    Linked PRs

    @brandtbucher
    Copy link
    Member Author

    ...as discussed in faster-cpython/ideas#263.

    My plan is for this initial PR to lay the groundwork, then to work on porting over the existing opcode caches one-by-one. Once that's done, we can clean up lots of the "old" machinery.

    @brandtbucher brandtbucher added the 3.11 only security fixes label Feb 24, 2022
    @brandtbucher brandtbucher self-assigned this Feb 24, 2022
    @brandtbucher brandtbucher added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage 3.11 only security fixes labels Feb 24, 2022
    @brandtbucher brandtbucher self-assigned this Feb 24, 2022
    @brandtbucher brandtbucher added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Feb 24, 2022
    @markshannon
    Copy link
    Member

    We need to decide what to do about dis.

    I don't think we should have a show_cache option, as the caches are meaningless junk without quickening (maybe we should drop the CACHE opcode, and just use zeroes).

    Instead we should have a show_quickened option, to show the quickened form, which we need to make clear is very much implementation defined.
    E.g. Cinder might show the machine code as well.

    That way, we can present the cache information as extra data on the quickened form, rather than junk instructions.

    @markshannon
    Copy link
    Member

    New changeset 0f41aac by Brandt Bucher in branch 'main':
    bpo-46841: Use inline caching for BINARY_OP (GH-31543)
    0f41aac

    @markshannon
    Copy link
    Member

    Making this a release blocker, as we really cannot leave this half finished for the release.

    Shouldn't be a problem, as we'll have it done in a week or so.

    @markshannon
    Copy link
    Member

    New changeset 424ecab by Brandt Bucher in branch 'main':
    bpo-46841: Use inline caching for UNPACK_SEQUENCE (GH-31591)
    424ecab

    @markshannon
    Copy link
    Member

    New changeset 4558af5 by Mark Shannon in branch 'main':
    bpo-46841: Move the cache for LOAD_GLOBAL inline. (GH-31575)
    4558af5

    @markshannon
    Copy link
    Member

    New changeset 7820a58 by Brandt Bucher in branch 'main':
    bpo-46841: Use inline caching for COMPARE_OP (GH-31622)
    7820a58

    @markshannon
    Copy link
    Member

    New changeset 3b0f1c5 by Mark Shannon in branch 'main':
    bpo-46841: Use inline cache for BINARY_SUBSCR. (GH-31618)
    3b0f1c5

    @neonene
    Copy link
    Mannequin

    neonene mannequin commented Mar 2, 2022

    UNPACK_SEQUENCE's slowdown is already filed?

    https://speed.python.org/timeline/#/?exe=12&ben=unpack_sequence&env=4&revs=50&equid=off&quarts=on&extr=on

    I hit the gap at 424ecab on Windows.

    @pablogsal
    Copy link
    Member

    This is marked as a release blocker so I am holding the alpha release on this. Is there anything we can do to unblock this issue?

    @markshannon
    Copy link
    Member

    Is there some way to mark something as not blocking an alpha release, but blocking a beta release?

    Everything is working at the moment, but not so efficiently.

    @markshannon
    Copy link
    Member

    We should be done with this by early next week, if you can wait.

    @pablogsal
    Copy link
    Member

    Is there some way to mark something as not blocking an alpha release, but blocking a beta release?

    "Deferred blocker"

    @markshannon
    Copy link
    Member

    Good to know, although "deferred blocker" is somewhat vague about when it is deferred until.

    OOI, does it become a "blocker" again once you've done the alpha release, or what stops it being deferred past the beta or even the final release?

    @markshannon
    Copy link
    Member

    It's not an UNPACK_SEQUENCE slowdown, it's a silly benchmark ;)

    https://github.com/python/pyperformance/blob/main/pyperformance/data-files/benchmarks/bm_unpack_sequence/run_benchmark.py#L6

    What I *think* is happening is that the inline cache takes the size of the function (in code units) from about 4800 to about 5200, crossing our threshold for quickening (currently set to 5000).

    When we quicken in-place, there will be no need for a threshold and this issue will disappear. We should probably up the threshold for now, just to keep the charts looking good.

    @pablogsal
    Copy link
    Member

    OOI, does it become a "blocker" again once you've done the alpha release, or what stops it being deferred past the beta or even the final release?

    Check out the devguide:

    https://devguide.python.org/triaging/#priority

    The issue will not hold up the next release, n. It will be promoted to a release blocker for the following release, n+1.

    But in any case, I normally promote them to release blockers by hand and all of them become full blockers in the beta.

    @iritkatriel
    Copy link
    Member

    New changeset 2f49b97 by Irit Katriel in branch 'main':
    bpo-46841: remove no-longer-used macro UPDATE_PREV_INSTR_OPARG (GH-32100)
    2f49b97

    @markshannon
    Copy link
    Member

    New changeset bd2e47c by Brandt Bucher in branch 'main':
    bpo-46841: Avoid unnecessary allocations in code object comparisons (GH-32222)
    bd2e47c

    @markshannon
    Copy link
    Member

    New changeset ae9de82 by Brandt Bucher in branch 'main':
    bpo-46841: Use a bytes object for _co_code_adaptive (GH-32205)
    ae9de82

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    pablogsal pushed a commit that referenced this issue May 6, 2022
    * Move CACHE handling into _unpack_opargs
    
    * Remove auto-added import
    
    * blurb add
    @pablogsal
    Copy link
    Member

    ⚠️ This issue has been updated from 'deferred-blocker' to 'release blocker' as we are past beta1. This issue will block the next release (Python 3.11.0 beta 2). ⚠️

    @iritkatriel
    Copy link
    Member

    Is there anything left to do here?

    @markshannon
    Copy link
    Member

    No. This is done.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants