Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-109039: Branch prediction for Tier 2 interpreter #109038

Merged
merged 17 commits into from
Sep 11, 2023

Conversation

gvanrossum
Copy link
Member

@gvanrossum gvanrossum commented Sep 6, 2023

  • Add cache entries to bytecodes.c and update them (but don't use them yet)
  • Make tests pass
  • Use cache entries for branch prediction
  • Add new tests
  • Initialize cache entries to 0x5555 (0b_0101_0101_0101_0101)
  • Buildbots
  • Benchmark

@gvanrossum gvanrossum changed the title Branch prediction for Tier 2 interpreter gh-109039: Branch prediction for Tier 2 interpreter Sep 6, 2023
#if ENABLE_SPECIALIZATION
next_instr->cache = (next_instr->cache << 1) | flag;
#endif
JUMPBY(oparg * flag);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need also a SKIP_OVER the cache?

I'm guessing that could cause the assert(frame->prev_instr == instr); in _Py_call_instrumentation_jump to fail for the instrumented jumps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That skip over the cache is already generated -- see the corresponding code in generated_cases.c.h.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the "next_instr += 1;"? In all other cases there is an explicit SKIP_OVER(INLINE_CACHE_ENTRIES_...) in bytecodes.c.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those SKIP_OVER() calls are always followed by a DISPATCH() call (or maybe a goto).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Can we make the code generator emit SKIP_OVER(X) instead of next_instr += x;?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, though IIRC Mark at some point objected to emitting macros. So I'd rather keep the status quo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markshannon What is the reason not to emit macros?

A reason to emit them is so that they are implemented in one place, so if their implementation changes you only change there. Do we want to change the code generator (and to remember that we need to) every time the implementation of a macro like SKIP_OVER changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I don't expect SKIP_OVER() to ever change. In hand-written code the macro expresses the intent better. But in generated code it just obscures what happens. I had to go to some lengths to change PEEK() and POKE() calls in the generated code to using stack_pointer[x] instead; I don't want to go back. If you still disagree, try engaging @markshannon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you still disagree,

More like trying to understand than disagreeing.

try engaging @markshannon.

Yes I directed my previous comment to him.

This is needed so branch prediction can work.
Alas, this goes untested (how to test it?).

In INSTRUMENTED_POP_JUMP_IF_NOT_NONE, rename flag to nflag
to emphasise that its sense is reversed (this is the only op
that jumps if the flag is false, because there's no
Py_IsNotNone() function). (Alternatively, we could have changed
the sense of the flag, but that would have been more work.)
@gvanrossum gvanrossum marked this pull request as ready for review September 8, 2023 00:44
@gvanrossum
Copy link
Member Author

Benchmark is running, will post results here. I think I've addressed all actionable review comments. Please review.

@gvanrossum
Copy link
Member Author

gvanrossum commented Sep 8, 2023

Benchmark is neutral(*), which is a good thing (it means adding a cache entry to the branch instructions didn't slow anything down).


(*) Or possibly some benchmarks crashed. I'm going to run buildbots to be sure.

@gvanrossum gvanrossum added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 8, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit 1850988 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 8, 2023
family_member_names.update(family.members)
for instr in self.instrs.values():
if (
instr.name not in family_member_names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to exclude family members from this table?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's tradition -- the table only contains the data for family heads and is always consulted after looking up the deoptimized opcode in _PyOpcode_Deopt.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exclude family members from this table?

That's tradition

At first glance I thought this was some reference to a hypothetical family in the midst of conflict. 🙂

@gvanrossum
Copy link
Member Author

Hm. Many buildbots fail on test_sys_settrace, but I can't (yet) reproduce it. Must be about build flags.

This time by disabling the optimizer.
@AlexWaygood
Copy link
Member

AlexWaygood commented Sep 8, 2023

Hm. Many buildbots fail on test_sys_settrace, but I can't (yet) reproduce it. Must be about build flags.

Unlikely that it's to do with this PR, as it's already happening on main -- see #109052 and #109143

@gvanrossum
Copy link
Member Author

Unlikely that it's to do with this PR, as it's already happening on main -- see #109052 and #109143

Thanks, I'll not worry about it then.

Buildbots are beginning to lose their value for me. :-(

@gvanrossum
Copy link
Member Author

I'll fix the conflict, then merge this.

@gvanrossum gvanrossum enabled auto-merge (squash) September 11, 2023 17:36
@gvanrossum gvanrossum disabled auto-merge September 11, 2023 17:45
@gvanrossum
Copy link
Member Author

(Sorry, several reviewers got a review request because I changed the bytecode magic number. Please ignore.)

@gvanrossum gvanrossum merged commit bcce5e2 into python:main Sep 11, 2023
@gvanrossum gvanrossum deleted the count-branches branch September 11, 2023 18:21
@markshannon
Copy link
Member

This introduced a regression in branch and jump monitoring, as the target is off by one.
The line numbers in test_monitoring should be unchanged from 3.12.

@gvanrossum
Copy link
Member Author

This introduced a regression in branch and jump monitoring, as the target is off by one. The line numbers in test_monitoring should be unchanged from 3.12.

Okay, can you give me a hint on what went wrong? I haven't been following how the instrumentation works in detail, and I have no idea which bits are being tested by the tests I modified, or what I should fix. (If it's involved, please open a new issue and CC me.)

@markshannon
Copy link
Member

It is fixed in #111486, so nothing to worry about. Just putting it here for the record.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants