Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Avoid relying on LINQ in product code for improved performance #10803

Open
kasperk81 opened this issue Oct 13, 2024 · 4 comments
Open
Labels
needs-more-info Issues that need more info to continue investigation. stale For issues that haven't had activity in some time. triaged

Comments

@kasperk81
Copy link
Contributor

There are approximately 200 occurrences of LINQ usage in MSBuild's non-test and non-sample source code. While LINQ provides clean and expressive syntax, it is not considered high-performance, as noted in discussions such as dotnet/runtime#19382 (comment) and dotnet/runtime#76205. LINQ introduces overhead in certain performance-sensitive scenarios, which can be suboptimal for a product like MSBuild.

The Challenge

Replacing LINQ across MSBuild's codebase is not a trivial task. From my experience, it took five pull requests to remove LINQ from the microsoft/kiota's dependency repositories. A direct replacement would involve significant effort, especially when considering that LINQ expressions are often tightly integrated into the logic.

Proposed Solution: LinqGen

A more practical and less invasive solution is to replace LINQ with LinqGen, an MIT-licensed, source-generated variant of LINQ. LinqGen provides the same expressive syntax as LINQ but eliminates the performance penalties and code bloat associated with runtime LINQ by generating optimized code at compile time using Roslyn source generators.

This approach offers several advantages:

  • Minimal Code Changes: LinqGen can often serve as a direct replacement for LINQ, reducing the need for large-scale refactoring.
  • Improved Performance: LinqGen addresses the performance issues tied to LINQ's runtime overhead, making it more suitable for a high-performance application like MSBuild.
  • Mitigating Code Bloat: Source generation with LinqGen ensures that only the necessary code is generated, avoiding the code bloat commonly associated with LINQ in performance-critical paths.
  • NativeAOT Compatibility: Though not a primary goal for MSBuild, it's worth noting that LinqGen offers better compatibility with NativeAOT. This was one of the key reasons I chose this approach for microsoft/kiota.

Conclusion

By switching LINQ usage in MSBuild's product code to LinqGen, we can achieve better performance without invasive changes to the codebase. This change will help optimize MSBuild's efficiency while maintaining readable and expressive code. Additionally, it provides the side benefit of improved NativeAOT compatibility, should that ever become a consideration in the future.

@rainersigwald
Copy link
Member

We have made many changes to remove LINQ from hot paths over the years, and I think we should continue down that path: if something is causing a problem and it has LINQ, we should expect to remove it as step 1 in the problem-solving journey. I don't see much benefit in doing it preemptively everywhere, though. Adopting LinqGen doesn't seem likely to be helpful today, since we've already done a bunch of LINQ-on-hot-path removals.

@kasperk81
Copy link
Contributor Author

LINQ-on-hot-path removals

File system I/O can lead to resource leaks (such as handles) within the operating system. This is particularly noticeable on macOS and Linux when running in containers. These environments can become strained, especially when using LINQ, which abstracts away the underlying resource management. While LINQ offers a more expressive and concise syntax, it inadvertently leads to increased resource consumption, making it harder for developers to track and manage resource usage effectively.

In particular, I believe removing usage of LINQ from this dozen will improve the resource consumption:

src/Build/BuildCheck/Infrastructure/EditorConfig/EditorConfigFile.cs
src/Build/Evaluation/Profiler/ProfilerResultPrettyPrinter.cs
src/Build/FileSystem/DirectoryCacheFileSystemWrapper.cs
src/Build/Logging/ProfilerLogger.cs
src/Build/Utilities/EngineFileUtilities.cs
src/Deprecated/Conversion/ProjectFileConverter.cs
src/Framework/FileClassifier.cs
src/Framework/Profiler/ProfilerResult.cs
src/Shared/FileMatcher.cs
src/Shared/FileUtilities.cs
src/Tasks/GetSDKReferenceFiles.cs
src/Tasks/ResolveManifestFiles.cs
src/Utilities/TrackedDependencies/FileTracker.cs

@rainersigwald
Copy link
Member

I'd love to see evidence in the form of profiles or other quantitative measures of impact here. Without that this feels likely to introduce bugs without noticeable impact.

@maridematte maridematte added needs-more-info Issues that need more info to continue investigation. triaged labels Oct 15, 2024
@dotnet-policy-service dotnet-policy-service bot added the stale For issues that haven't had activity in some time. label Nov 14, 2024
Copy link
Contributor

This issue is marked as stale because feedback has been requested for 30 days with no response. Please respond within 14 days or this issue will be closed due to inactivity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-more-info Issues that need more info to continue investigation. stale For issues that haven't had activity in some time. triaged
Projects
None yet
Development

No branches or pull requests

3 participants