Split micro-ops that have different behavior depending on low bit of oparg. #115457
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage)
Splitting these micro-ops will improve performance by reducing the number of branches, the size of code generated, and the number of holes in the JIT stencils. There is no real downside; the increase in complexity at runtime is negligible and there isn't much increased complexity in the tooling.
Taking _LOAD_ATTR_INSTANCE_VALUE as an example, as it is dynamically the most common: it can be split into two micro-ops, one for each value of the low bit of oparg.
Each of these is simpler, thus smaller and faster than the base version.
We can always choose one of the two split versions when projecting the trace, so we don't need an implementation of the base version at all. This means that the tier 2 interpreter and stencils aren't much bigger than before.
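As a rough illustration of the idea, here is a toy model in plain C. It is not CPython's micro-op DSL or its actual implementation. Every name in it (Stack, push, load_attr_base, load_attr_0, load_attr_1, project_load_attr) is hypothetical. It assumes the only oparg-dependent behavior is whether an extra NULL is pushed after the attribute.

```c
#include <stddef.h>

/* Toy operand stack. All names in this sketch are hypothetical. */
typedef struct {
    const void *items[8];
    int top;
} Stack;

static void push(Stack *s, const void *v) {
    s->items[s->top++] = v;
}

/* Base version: a single body that branches on the low bit of oparg
 * every time it executes. */
static void load_attr_base(Stack *s, const void *attr, int oparg) {
    push(s, attr);
    if (oparg & 1) {
        push(s, NULL);
    }
}

/* Split versions: each body is straight-line code and needs no oparg. */
static void load_attr_0(Stack *s, const void *attr) {
    push(s, attr);
}

static void load_attr_1(Stack *s, const void *attr) {
    push(s, attr);
    push(s, NULL);
}

typedef void (*uop_fn)(Stack *, const void *);

/* The low bit is tested once, when the trace is projected, rather than
 * on every execution of the micro-op. */
static uop_fn project_load_attr(int oparg) {
    return (oparg & 1) ? load_attr_1 : load_attr_0;
}
```

In this model, calling project_load_attr(oparg) once and then invoking the returned function leaves the stack in the same state as load_attr_base with that oparg, so the base version never needs to appear in a projected trace.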
Linked PRs