Introduce TaskParameterEventArgs #6155
Fixes #6007
Fixes #5211
Fixes #3966
Potentially fixes #3577 (at least the hope is to significantly reduce allocations)
Potentially fixes #2200
Partially fixes #2168
I've noticed that the ref API didn't regenerate and discovered that this functionality got disabled accidentally. After this fix merges I'll need to rebase and regenerate the public API:
Here are some stats from a binlog produced from this PR and an equivalent binlog from master:
Note how we have 34,149 strings totalling 7 MB, and master has 38,934 strings totalling 334 MB. Turns out if you crumble those extra ~4,800 strings down they basically disappear. Binlog size here goes from 11.6 MB to 3.6 MB. An equivalent binlog from 16.8 (no dedup at all) is ~22 MB. Uncompressed stream inside the binlog goes from 341 MB to 18 MB. Also note how the largest string goes from 8.6 MB to 43 KB. This completely avoids the Large Object Heap at 85 KB threshold.
I got some more perf numbers.
This PR appears to be slower than 16.8 for /m and no log:
@KirillOsenkov If you compare against the msbuild binaries shipping with VS, those are also ngened and use profile guided optimizations. Best to compare against your own built msbuild from the 16.8 branch.
src/Build/BackEnd/Components/RequestBuilder/IntrinsicTasks/ItemGroupLoggingHelper.cs
@KirillOsenkov regarding the newly added ton of small string allocations for all the items and their metadata, how about not deserializing
That would hold on to a byte[] and I guess when LogTaskInputs is set and this whole machinery runs, we're guaranteed to consume every event via a logger. So might as well read it eagerly, plus most strings will be interned by InterningBinaryReader.
Some decent perf improvements from the last 5 commits:
There are a few outstanding questions with this PR that I'd like to resolve, and I'll respond separately when it's ready for proper review. It's getting closer though. Just need to decide whether it's worth the risk of exposing a couple internals for raw speed.
We also see fewer copies of items during enumeration, since we now bypass the enumerator Proxy that DeepClones items.
I apologize for measuring 16.8 again, but please indulge me, it brings me great joy: ;) This is 16.8.5: Check that LOH
This should now be ready for review.
I'm starting to have concerns about how node packet serialization works in this PR. The whole problem starts because we want to shoehorn the implementation of serialization into the declarations assembly (MS.B.Framework). I think it might be cleaner if we move the serialization out to Microsoft.Build. This way SmallDictionary, TaskItemData and potentially even IMetadataContainer can move out to Microsoft.Build as well. There won't be a need for static constructors then.
This is needed to ensure the static constructor runs.
Nah, I've just tried and it won't solve the Message problem.
src/Build/BackEnd/Components/RequestBuilder/IntrinsicTasks/ItemGroupLoggingHelper.cs
Rename SmallDictionary -> ArrayDictionary.
LGTM! Thanks for this!
Child node "19" exited prematurely.
error
#2164
Instead of just logging a BuildMessageEventArgs with a list of all items and metadata concatenated into a large string (often 5 MB in size or more), it keeps a structured representation of items and metadata. TaskParameterEventArgs inherits from BuildMessageEventArgs and the Message implementation materializes the large string on demand. However, when only the BinaryLogger is present, the Message is never accessed, thus saving on allocations. The Message is also never sent across the nodes nor written into the binlog.
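To illustrate the on-demand Message idea, here is a minimal sketch in Python rather than the actual C# implementation. The class name `TaskParameterEvent`, the `write_structured` helper, the `materialize_count` counter and the text layout of the message are all assumptions made up for this example. It only shows the pattern: the event holds structured items, and the large string is built the first time someone asks for it.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical item shape for this sketch: (item spec, metadata name -> value).
Item = Tuple[str, Dict[str, str]]


class TaskParameterEvent:
    """Holds structured items; builds the text message only when it is read."""

    def __init__(self, item_type: str, items: List[Item]) -> None:
        self.item_type = item_type
        self.items = items
        self._message: Optional[str] = None
        self.materialize_count = 0  # instrumentation for the example only

    @property
    def message(self) -> str:
        # Build the large string on first access, then reuse the cached copy.
        if self._message is None:
            self.materialize_count += 1
            lines = [f"{self.item_type}="]
            for spec, metadata in self.items:
                lines.append(f"    {spec}")
                for name, value in metadata.items():
                    lines.append(f"        {name}={value}")
            self._message = "\n".join(lines)
        return self._message


def write_structured(event: TaskParameterEvent) -> List[str]:
    """Stand-in for a structured logger: reads the fields, never the message."""
    out = [event.item_type]
    for spec, metadata in event.items:
        out.append(spec)
        for name, value in metadata.items():
            out.extend((name, value))
    return out
```

A consumer that only calls `write_structured` never triggers the string build, while a text logger that reads `message` pays for it once.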
TaskParameterEventArgs is instantiated in 5 locations: ItemGroup Include and Remove inside targets, task inputs, and two cases for task outputs.
Storing smaller strings in the binlog results in very significant savings from string deduplication. A 22 MB binlog goes down to 3.5 MB in size. We're also seeing build speed improvements from 33 seconds to 30 seconds. There is also a significant reduction in memory allocations, since we no longer need to allocate the large strings and send them across the nodes.
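A toy sketch of why smaller strings deduplicate better, in Python and not the binlog's real format. `StringTable`, `log_concatenated` and `log_structured` are hypothetical names for this example; the assumption is simply that each distinct string is stored once and referenced by index afterwards.

```python
from typing import Dict, List


class StringTable:
    """Stores each distinct string once and hands out an index for it."""

    def __init__(self) -> None:
        self._index: Dict[str, int] = {}
        self.strings: List[str] = []

    def intern(self, value: str) -> int:
        idx = self._index.get(value)
        if idx is None:
            idx = len(self.strings)
            self._index[value] = idx
            self.strings.append(value)
        return idx

    def stored_chars(self) -> int:
        return sum(len(s) for s in self.strings)


def log_concatenated(table: StringTable, events: List[List[str]]) -> List[int]:
    # Old style: one big string per event; any difference makes it unique.
    return [table.intern(";".join(items)) for items in events]


def log_structured(table: StringTable, events: List[List[str]]) -> List[List[int]]:
    # New style: one small string per item; repeated items share an entry.
    return [[table.intern(item) for item in items] for items in events]


events = [["a.cs", "b.cs", "c.cs"], ["a.cs", "b.cs", "d.cs"]]
big, small = StringTable(), StringTable()
log_concatenated(big, events)
refs = log_structured(small, events)
print(big.stored_chars(), small.stored_chars())  # prints "28 16"
```

The two concatenated strings differ by a single item, so neither deduplicates; the structured form stores `a.cs` and `b.cs` once and reuses their indexes.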
This also shares some extension methods for reading and writing things between the node packet serialization and binary logger. It also shares a new internal type, TaskItemData, used as a holder for deserialized items. The actual ProjectItemInstance.TaskItem is too heavyweight for this.
Binary logger format version goes all the way to 11. The viewer already supports the new format. An additional benefit is that the viewer no longer has to parse the large text messages to recover structure, so it is now more reliable when reading multi-line properties, items and metadata values.
I've replayed the binlog produced by this version and diffed with the old binlog and all information appears to be preserved.