-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spill single-def variable at definition to avoid further spilling #54345
Conversation
14e83e5
to
e21876e
Compare
Below is the summary of diffs. Detail diffs can be seen here. Code Size
Perf Score
Instruction Count
|
@dotnet/jit-contrib |
FYI - @danmoseley |
outerloop failures are #54469 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, nice catch!
one question, I looked at the regressions, they mostly look like:
-Generating: N089 ( 3, 3) [000140] DA---------- * STORE_LCL_VAR long V12 tmp10 d:2 rdx REG rdx
+Generating: N089 ( 3, 3) [000140] DA---------z * STORE_LCL_VAR long V12 tmp10 d:2 rdx REG rdx
+IN000b: mov qword ptr [V12 rsp+20H], rdx
V12 in reg rdx is becoming live [000140]
Live regs: 00000002 {rcx} => 00000006 {rcx rdx}
Live vars: {V06} => {V06 V12}
genIPmappingAdd: ignoring duplicate IL offset 0xc
Generating: N091 (???,???) [000196] ------------ IL_OFFSET void IL offset: 0xc REG NA
Generating: N093 ( 1, 1) [000122] ------------ t122 = LCL_VAR long V12 tmp10 u:2 rdx REG rdx $401
Generating: N095 ( 1, 1) [000123] -c---------- t123 = CNS_INT long 0 REG NA $100
/--* t122 long
+--* t123 long
Generating: N097 ( 3, 3) [000124] J------N---- * EQ void REG NA $245
-Not emitting compare due to flags being already set
+IN000c: test rdx, rdx
Generating: N099 ( 5, 5) [000125] ------------ * JTRUE void REG NA
-IN000b: je L_M36406_BB08
+IN000d: je L_M36406_BB08
and if I understand correctly mov
does not change SZ
flag, so can we add a special condition to AreFlagsSetToZeroCmp
to look at the instruction before the previous instruction (why does not English language have a work for this?) if the previous is such spill? Will it make this change all improvements no regressions?
This diff in benchmark looks like a potential measurable regression:
232 ( 3.51% of base) : 878.dasm - System.Number:NumberToStringFormat(byref,byref,System.ReadOnlySpan
1[Char],System.Globalization.NumberFormatInfo)`
unsigned char lvSpillAtSingleDef : 1; // variable has a single def (as determined by LSRA interval scan) | ||
// and is spilled making it candidate to spill right after the | ||
// first (and only) definition. | ||
// Note: We cannot reuse lvSingleDefRegCandidate because it is set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it mean that lvSingleDefRegCandidate==true
does not imply that the variable has a single def for lvSpillAtSingleDef
purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. lvSingleDefRegCandidate
is only used so we can decide if we should enregister EH-var or not (which happens in earlier phase). Things change after that and we can't rely on that information. The best safe place I thought about checking for single-def is while building intervals. I wanted to do it for EH-var as well, but unfortunately we need that information before LSRA.
I do not see that diff. Here are the diffs that I see. May be my branch didn't have #53214 and you compared it with main? |
I was comparing your branch top with sandreenko@87cd70c (the last change not from your on your branch), strange.
|
Here are the numbers for aspnet
Detail diffs: https://gist.github.com/kunalspathak/b92c9221560a6c0c9136d6419f50fd03 |
Thanks to @sandreenko for brining me the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good to me. Agree with Sergey that some refactoring to capture the common predicate chains would make this more readable/maintainable.
Seems like some interesting avenues for follow up too:
- more aggressive CSE policy (since CSEs are usually single def)
- reconsider which weights we use for ref positions (global/local...?)
- decide if spilling at def is sub-optimal if all uses are cheap
There were some errors in the environment when I collected above numbers. Here are fresh set (no major changes) Code size
Perf score
Instruction count
|
I did some investigation the regressions related to |
9b676ab
to
98203d3
Compare
@AndyAyersMS , @sandreenko - I have addressed the review feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one more thing I'd like you to look at.
If a single-def variable is decided to get spilled in its lifetime, then spill it at the firstRefPosition RefTypeDef so the value of the variable is always valid on the stack. Going forward, no more spills will be needed for such variable or no more resolutions (reg to stack) will be needed for such single-def variables.
3ea3629
to
649c6c4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Hello @kunalspathak! Because this pull request has the p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (
|
Failures are infrastructure related. |
Just found out a related issue - #7994. |
If a variable is a single-def and it was ever decided to be spilled during register allocation, spill it at its first and only definition so we can skip further spilling throughout the method since the value on the stack is already up-to-date.
This is accomplished in following way:
buildIntervals()
, we will markinterval->isSingleDef=true
if it is a single-def.firstRefPosition->singleDefSpill = true
indicating that it should be spilled right after the first ref position. This concept is similar to our writeThru EH-vars.singleDefSpill
and mark correspondingvarDsc->lvSpillAtSingleDef=true
varDsc->lvSpillAtSingleDef
to decide if we should spill a variable and include it in GC pointers scanning (since it is always on stack).As a positive side-effect, during spilling, we would prefer to spill a register assigned to "spill at singledef" interval because cost of spilling it is lower than other intervals.
This gives significant performance improvement in scenarios that involve common sub-expression elimination (CSE). In CSEs, we define a temp variable once and use it at multiple places. We would eliminate the spilling of these temporary variables throughout the method. Below is an example diff of GenericArraySortHelper.InsertionSort. The method involves nested loop and we CSE some temps that we spill immediately at their definition. That helps us eliminate the spilling of those variables inside the loop.
Similar improvement can be seen for "this" argument which is always a single-def.
Diff of
InsertionSort
: https://www.diffchecker.com/bZEOJkJ6Contributes to #6761 and #6825.
Credit: Optimized Interval Splitting in a Linear Scan Register Allocator paper by Christian and Hanspeter, section 4.3.