-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stitching it all together #621
Comments
(I've filled in some missing detail and corrected some typos.) This makes sense. The API change (to pass a pointer to a pointer) is simple. The rest can be elaborated over time. From python/cpython#108866 I understand that you want Where do you imagine the calls to |
Yes,
|
I am not ready yet to consider tail calls. Not until we've officially given up on being able to support other C compilers than clang and GCC.
I'm not sure how a function is "attached" to an exit. I guess this requires translation to machine code (i.e., copy-and-patch)? In the Tier 2 interpreter, the best I can imagine doing is for the exit stubs to use different special uops instead of the current I also have some conceptual problem understanding how to code with tail calls. But that's just me, I will be working on it. |
The executor is attached to the exit. Executors are Python objects. The function pointer is in the executor. |
In light of want we've learned so far, this is how I think the all the parts of the tier 2 optimizer should work. Optimizer interfaceThe VM will call the optimizer when a back edge (or a RESUME, once implemented) counter reaches the given threshold. Region selectionThe optimizer should build a linear sequence or tree (with possible jumps back to the top) in a pre-allocated buffer in the interpreter object. Micro opsThe micro-ops in the selected region should contain as much information as necessary for optimization and efficient execution. OptimizationUsing partial evaluation, peephole optimizers or some other magic, the uops in the buffer should be transformed into something that will run faster. Making executor(s)An executor, or graph of executors, should be created from the buffer. |
I am struggling to extract action items from @markshannon's last comment. I think I have the following:
|
First of all let's rename Here are the definitions:
Note that neither of the above definitions depend on internal interpreter state, so are independent of the tier or whether compiling or interpreting. |
Okay, those definitions are helpful. To summarize, SET_IP sets it from the oparg, SAVE_CURRENT_IP from current instruction.
Hm, "the currently executing instruction" feels problematic. In Tier 1 this would (currently) be Also, the definition doesn't tell us when SAVE_CURRENT_IP should be used. (In practice, in the current code, IIRC it is used to sync from |
By "currently executing instruction", I mean "currently executing instruction as if executing in tier 1". Ideally, |
Okay, so then In Tier 1, the IP is implicitly saved at the start of each instruction, via If it was explicitly part of the instruction, we could drop it for instructions that don't call Now, there's an explicit Let me just copy all this into a new issue. |
We are currently building the region selector, optimizer and components to execute the optimized regions, but we lack an architecture link the regions together and give us good performance.
The general idea is that once we are executing in tier 2, we should try and stay in tier 2.
So exits should be able to adapt as they becomes hot.
We solve this problem with the Fundamental theorem of software engineering.
We want something that looks like an executor, but that takes a pointer to the pointer to the executor.
Something like:
Note the additional indirection. This moves the memory load from the caller to the callee, making it a bit slower, but allows the callee executor to modify which executor the caller calls next time.
On exiting from tier 2 execution, instead of returning to the tier 1 interpreter, we should jump to an adaptive stub which checks for hotness and dispatches accordingly.
For side exits with additional micro-ops to exit, the code for such a side-exit stub would look like:
For an end exit (or a side exit with no attached uops) the code looks like this:
The text was updated successfully, but these errors were encountered: