Each instruction is two codewords, and consists of "opcode, oparg, 0, 0" #100106
Cool work. So the doubling of the instruction size only costs us 1%. That means if we can realize the removal of LOAD/STORE_FAST and LOAD_CONST we should be able to gain quite a bit.
Do you envision we could do a gradual transition to the register world, where some instructions use registers and others still use the stack?
I think so. A register can be an index into the stack, and some opcodes can just push and pop as before. This makes the transition incremental.
Sounds good. Maybe we should add that to faster-cpython/ideas#485 (or one of the other issues about registers?)
Time to start making one simple instruction use an extra oparg? Without even optimizing LOAD/STORE -- we could just tackle UNARY_NEGATIVE and give it a second oparg that designates the destination, and make the compiler write the bytecode like that.
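A minimal sketch of the incremental idea discussed in this thread, assuming a register is simply an index into the same array that backs the value stack. Everything here is hypothetical and for illustration only: `frame_t`, `NSLOTS`, the helper names, and in particular the semantics of the register form (first oparg as source slot, second oparg as destination slot) are assumptions, not the PR's actual code.

```c
#include <assert.h>

#define NSLOTS 16  /* hypothetical frame size, chosen for the example */

/* Hypothetical frame: one array is both the value stack and the register file. */
typedef struct {
    long slots[NSLOTS];
    int sp;  /* number of live stack entries */
} frame_t;

/* Stack-style helpers, unchanged by the transition. */
void frame_push(frame_t *f, long v) {
    assert(f->sp < NSLOTS);
    f->slots[f->sp++] = v;
}

long frame_pop(frame_t *f) {
    assert(f->sp > 0);
    return f->slots[--f->sp];
}

/* Stack form of UNARY_NEGATIVE: pop the operand, push the result. */
void unary_negative_stack(frame_t *f) {
    frame_push(f, -frame_pop(f));
}

/* Register form (assumed semantics): src and dst are slot indices,
   and the stack pointer is left alone. */
void unary_negative_reg(frame_t *f, int src, int dst) {
    assert(src >= 0 && src < NSLOTS);
    assert(dst >= 0 && dst < NSLOTS);
    f->slots[dst] = -f->slots[src];
}
```

Because both forms address the same slots, a compiler could emit the register form for one opcode while every other opcode keeps pushing and popping. With src and dst both set to the slot currently on top of the stack, the register form leaves the frame in the same state as the stack form.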
@@ -230,6 +230,9 @@ extern "C" {
#define NB_INPLACE_TRUE_DIVIDE 24
#define NB_INPLACE_XOR 25

/* number of codewords for opcode+oparg(s) */
#define OPSIZE 2
I guess for now we're not contemplating the size depending on the opcode. Probably just as well.
Yeah, it won’t be hard to change this macro if we decide to do that.
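For illustration, the fixed-size layout named in the PR title ("opcode, oparg, 0, 0") could be sketched roughly as follows. This assumes a codeword is a two-byte unit; `codeword_t`, `emit_instr`, `instr_opcode`, `instr_oparg`, and `next_instr` are hypothetical names invented for the example and are not taken from the CPython sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* number of codewords for opcode+oparg(s), as in the diff */
#define OPSIZE 2

/* Hypothetical stand-in for a 16-bit code unit: two bytes per codeword. */
typedef struct {
    uint8_t b0;
    uint8_t b1;
} codeword_t;

/* Emit one instruction as "opcode, oparg, 0, 0". Returns codewords written. */
size_t emit_instr(codeword_t *code, uint8_t opcode, uint8_t oparg) {
    assert(code != NULL);
    code[0].b0 = opcode;
    code[0].b1 = oparg;
    /* The remaining codewords are reserved for future opargs; zero for now. */
    for (size_t i = 1; i < OPSIZE; i++) {
        code[i].b0 = 0;
        code[i].b1 = 0;
    }
    return OPSIZE;
}

uint8_t instr_opcode(const codeword_t *instr) { return instr[0].b0; }
uint8_t instr_oparg(const codeword_t *instr) { return instr[0].b1; }

/* Every instruction has the same size, so stepping is a fixed stride. */
const codeword_t *next_instr(const codeword_t *instr) { return instr + OPSIZE; }
```

Since all stepping goes through `OPSIZE` in this sketch, moving to a per-opcode size would mean replacing the constant stride in `next_instr` with a lookup keyed on the opcode.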
I made a new PR with this stuff on today's version of main: #100276.
This emits "opcode, oparg, 0, 0" for each instruction.
Still debugging some test failures related to line numbers/tracing etc. But this works well enough to benchmark with pyperformance: