-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First stab at Microsoft.DotNet.ILCompiler package #64
Conversation
@jkotas This now produces something. The current problems with this:
The package "looks" okay. One of the problems with the non-runtime-specific package is that it has runtime-specific cruft in it, but that is probably only hurting size. I don't know if this is something the rest of the infra papers over before publishing. I assume RID specific package build publishing works otherwise. |
@MichalStrehovsky the results you're seeing are because the infrastructure is producing a framework pack (the .NET Core 3.0+ method of consuming the runtime) instead of a runtime-specific package (the .NET Core 2.x and previous way). Try setting That should enable building the RID-specific NuGet packages the way the lineup package expects. |
Works like charm. Thank you @jkoritzinsky! |
CoreRT repo at commit b48a58e6facdda7f50f9256e0d9d6a209473132f.
This is based off the crossgen2 package.
c35b397
to
61e4a74
Compare
Along with #71 and #72, this produces a workable NuGet package. I say workable, because it requires passing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
This PR has a bunch of refactoring for the Azure Pipelines build process. The primary change is to call build.ps1 instead of cmake directly. This removes the dependency on some of the old script files (which are now deleted).
# Local heap optimizations on Arm64 1. When not required to zero the allocated space for local heap (for sizes up to 64 bytes) - do not emit zeroing sequence. Instead do stack probing and adjust stack pointer: ```diff - stp xzr, xzr, [sp,#-16]! - stp xzr, xzr, [sp,#-16]! - stp xzr, xzr, [sp,#-16]! - stp xzr, xzr, [sp,#-16]! + ldr wzr, [sp],#-64 ``` 2. For sizes less than one `PAGE_SIZE` use `ldr wzr, [sp], #-amount` that does probing at `[sp]` and allocates the space at the same time. This saves one instruction for such local heap allocations: ```diff - ldr wzr, [sp] - sub sp, sp, #208 + ldr wzr, [sp],#-208 ``` Use `ldp tmpReg, xzr, [sp], #-amount` when the offset not encodable by post-index variant of `ldr`: ```diff - ldr wzr, [sp] - sub sp, sp, #512 + ldp x0, xzr, [sp],#-512 ``` 3. Allow non-loop zeroing (i.e. unrolled sequence) for sizes up to 128 bytes (i.e. up to `LCLHEAP_UNROLL_LIMIT`). This frees up two internal integer registers for such cases: ```diff - mov w11, #128 - ;; bbWeight=0.50 PerfScore 0.25 -G_M44913_IG19: ; gcrefRegs=00F9 {x0 x3 x4 x5 x6 x7}, byrefRegs=0000 {}, byref, isz stp xzr, xzr, [sp,#-16]! - subs x11, x11, #16 - bne G_M44913_IG19 + stp xzr, xzr, [sp,#-112]! + stp xzr, xzr, [sp,#16] + stp xzr, xzr, [sp,#32] + stp xzr, xzr, [sp,#48] + stp xzr, xzr, [sp,#64] + stp xzr, xzr, [sp,#80] + stp xzr, xzr, [sp,#96] ``` 4. Do zeroing in ascending order of the effective address: ```diff - mov w7, #96 -G_M49279_IG13: stp xzr, xzr, [sp,#-16]! - subs x7, x7, #16 - bne G_M49279_IG13 + stp xzr, xzr, [sp,#-80]! + stp xzr, xzr, [sp,#16] + stp xzr, xzr, [sp,#32] + stp xzr, xzr, [sp,#48] + stp xzr, xzr, [sp,#64] ``` In the example, the zeroing is done at `[initialSp-16], [initialSp-96], [initialSp-80], [initialSp-64], [initialSp-48], [initialSp-32]` addresses. The idea here is to allow a CPU to detect the sequential `memset` to `0` pattern and switch into write streaming mode.
This is based off the crossgen2 package.
We still don't have the BuildIntegration files, but I'm keen to see how spectacularly this is going to fail in the CI.