Increase fraction of code executed by tier 2. #118093
By my count we're currently hovering around 54% of code executed in tier two (our benchmarks run about 266 billion tier one instructions on normal builds and 122 billion instructions on JIT builds). I've identified a few strategies for improving this (based on stats and tracing through how we execute a bunch of the benchmarks) and will start landing PRs soon. No magic bullets here, just chipping away at things:
My motivation for this is to make JIT improvements more pronounced. We currently spend less than 10% of our time in the JIT (vs ~25% of our time in tier one), which means that we need to improve the performance of JIT code by over 10% just to see a 1% improvement on the benchmarks. My (probably ambitious) goal is to get the fraction of code executed in tier two up to around 80% (meaning, in the neighborhood of 25%-30% of the total time spent running the benchmarks) in the next couple of weeks. Then the improvements can be easier to measure and iterate on.
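A rough sketch of the arithmetic behind that claim, assuming a simple Amdahl-style model in which only the time spent in JIT code changes and everything else stays constant. The function names are made up for illustration and are not part of CPython or its stats tooling:

```python
def overall_time_saved(jit_time_fraction: float, jit_time_reduction: float) -> float:
    """Fraction of total benchmark time saved when only JIT code gets faster.

    jit_time_fraction:  share of total run time spent in JIT code (0..1)
    jit_time_reduction: relative reduction of that JIT time (0..1)
    """
    return jit_time_fraction * jit_time_reduction


def required_jit_reduction(jit_time_fraction: float, target_overall_saving: float) -> float:
    """Relative JIT-time reduction needed to reach a target overall saving."""
    return target_overall_saving / jit_time_fraction


# With 10% of the time in the JIT, a 10% JIT improvement moves the total by only 1%.
print(f"{overall_time_saved(0.10, 0.10):.1%}")

# With 25%-30% of the time in the JIT, the same 1% overall needs a much smaller JIT win.
for share in (0.25, 0.30):
    print(f"{share:.0%} in JIT -> need {required_jit_reduction(share, 0.01):.1%}")
```

Under this model, any share below 10% pushes the required JIT improvement above 10%, which is why raising the share makes JIT work easier to measure.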
It's also worth noting that our stats are currently broken on benchmarks that use C extensions or spawn subprocesses. So the actual numbers may vary a bit right now, but probably aren't heavily biased one way or another.
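For reference, the ~54% figure can be reproduced from the two instruction counts quoted earlier. This is only a sketch: it assumes that every tier one instruction that disappears on JIT builds is work that moved to tier two, and the helper name is hypothetical:

```python
def tier_two_fraction(tier_one_normal: float, tier_one_jit: float) -> float:
    """Estimate the share of work executed in tier two.

    tier_one_normal: tier one instructions executed on a normal build
    tier_one_jit:    tier one instructions still executed on a JIT build
    Assumes the difference between the two is work now done in tier two.
    """
    return 1.0 - tier_one_jit / tier_one_normal


fraction = tier_two_fraction(266e9, 122e9)
print(f"{fraction:.1%}")  # 54.1%
```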
…123140) * Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it * Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.
…_GENERAL` (GH-123212) Specialize classes without vectorcall as CALL_NON_PY_GENERAL
According to stats and profiling only about 40% of bytecode instructions are executed by tier 2 and the remaining 60% by tier 1.
With the expected improvements to the JIT and tier 2 optimizer, we expect tier 2 (with JIT) to be significantly faster than tier 1.
It therefore makes sense to get the fraction of instructions executed by tier 2 up from 40% to nearer 90%.
To do that we need to:
Linked PRs
- BINARY_OP_INPLACE_ADD_UNICODE #122253
- LOAD_ATTR_PROPERTY #122283
- … DEOPT_IFs into EXIT_IFs #122998
- CALL_KW #123006
- CALL_FUNCTION_EX #123034
- … CALL_ALLOC_AND_ENTER_INIT suitable for tier 2. #123140
- CALL_NON_PY_GENERAL #123212