Micro-optimize the LOAD_FAST opcode #92763

sweeneyde · 2022-05-13T04:41:23Z

The most common opcode before:

TARGET_LOAD_FAST:
    frame->prev_instr = next_instr++;
    PyObject *value = frame->localsplus[oparg];
    if (value == NULL) { goto unbound_local_error; }
    value->ob_refcnt++;
    *stack_pointer++ = value;
    _Py_CODEUNIT word = *next_instr;
    opcode = word & 255;
    oparg = word >> 8;
    opcode |= cframe.use_tracing;
    goto *opcode_targets[opcode];

The most common opcode after:

TARGET_LOAD_FAST_KNOWN_QUICK:
    next_instr++;
    PyObject *value = frame->localsplus[oparg];
    value->ob_refcnt++;
    *stack_pointer++ = value;
    _Py_CODEUNIT word = *next_instr;
    opcode = word & 255;
    oparg = word >> 8;
    goto *opcode_targets[opcode];

In particular:

The write to frame->prev_instr is removed.
The NULL-check and branch are removed.
The memory read of cframe.use_tracing and the |= into opcode are removed.

None of these was particularly significant in isolation, but together they accounted for an approximately 1% speedup in pyperformance.
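The NULL-check removal can be sketched with a toy interpreter. This is only an illustration under stated assumptions, not the code in this PR: `Instr`, `run`, and the `OP_*` opcodes are hypothetical names, the loop uses a portable `switch` instead of computed gotos, and reference counting and tracing are left out. The assumption is that the compiler emits `OP_LOAD_KNOWN` only for locals it has proven to be bound, so that handler can skip the check that `OP_LOAD_CHECKED` still performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes, not CPython's real opcode numbers. */
enum { OP_LOAD_CHECKED, OP_LOAD_KNOWN, OP_ADD, OP_RETURN };

typedef struct {
    unsigned char opcode;
    unsigned char oparg;
} Instr;

#define STACK_MAX 16

/* Runs `code` against `locals`, where a NULL slot models an unbound local.
 * Returns 0 and stores the top of the stack in *result on OP_RETURN,
 * -1 if a checked load hits an unbound local, -2 on an unknown opcode. */
int run(const Instr *code, long *const *locals, long *result)
{
    long stack[STACK_MAX];
    long *sp = stack;

    for (const Instr *pc = code; ; pc++) {
        switch (pc->opcode) {
        case OP_LOAD_CHECKED: {
            /* General load: the slot may be unbound, so test and branch. */
            const long *value = locals[pc->oparg];
            if (value == NULL) {
                return -1;
            }
            assert(sp < stack + STACK_MAX);
            *sp++ = *value;
            break;
        }
        case OP_LOAD_KNOWN:
            /* Assumed proven bound at compile time: no NULL check. */
            assert(sp < stack + STACK_MAX);
            *sp++ = *locals[pc->oparg];
            break;
        case OP_ADD:
            /* Pop two values and push their sum. */
            assert(sp - stack >= 2);
            sp--;
            sp[-1] += sp[0];
            break;
        case OP_RETURN:
            assert(sp > stack);
            *result = sp[-1];
            return 0;
        default:
            return -2;
        }
    }
}
```

In this sketch, a program that loads two bound locals with `OP_LOAD_KNOWN` and adds them never executes a NULL test, while a load that might be unbound keeps using `OP_LOAD_CHECKED` and reports the error.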

sweeneyde added 13 commits May 10, 2022 02:13

implement LOAD_FAST_KNOWN

3627bb5

fix leak and add stack effect

ac0ad1b

add failing test

721e6ed

Attempt exception handling

42ef42f

Parameters are guaranteed to be initialized

a8e94f8

revert to LOAD_FAST on lineno setter and LocalsToFast

038a85d

Port over superinstructions

3243981

remove debugging code so non-debug builds work.

0ddc721

NOTRACE_DISPATCH() and TARGET_SAFE

dea4a83

Fix and remove debugging code

df0abe9

Merge remote-tracking branch 'upstream/main' into microopt

566ff81

Add more test cases

3e3ca26

Update test_dis

0ea6c86

bedevere-bot added the awaiting core review label May 13, 2022

sweeneyde added 3 commits May 13, 2022 00:52

Fix asan and leaky symbols

f2c5e1f

Merge remote-tracking branch 'upstream/main' into microopt

749030c

Use make_cfg_traversal_stack

0818033

sweeneyde closed this Jun 3, 2022
