support for incremental recompilation of projects #8044
Replies: 41 comments 12 replies
-
These issues and PRs may be related: Incremental builder re-write - #8040
Does anybody understand how much of this functionality IncrementalBuild already accomplishes?
-
In the context of build and not tooling, I don't know enough about the pros/cons of this specific mechanism to warrant an opinion on it. I would need to learn more about it and talk with the C# team to understand their perspective.
To increase build/compile time performance, there are many alternatives that are much cheaper to do. We are addressing them now. With each iteration, we get a little better. How we are addressing performance today:
For build times specifically, we have a potential candidate in the list:
Our
-
Incremental builds are already used in F#/C#/etc. tooling with MSBuild's up-to-date checker. It's a heuristic that can often fail since MSBuild is not deterministic, though. Should MSBuild ever support a deterministic mode, then this would be more reliable and act as a basis for lots of interesting experiences. We've seen 30% or more of total build time in large solutions be attributed to MSBuild tasks getting run when they probably shouldn't be. Tools like ReSharper Build can help with this a bit but they're still not perfect. After spending some time looking at builds for larger solutions I honestly think addressing this directly in MSBuild is more impactful.
Incremental recompilation of source files is an interesting proposal but I'm not sure it helps total build time and I'm also not completely convinced that it's the source of tooling performance woes when making changes across files and wanting to see those changes reflected.
F# does not have any heuristics to see if a change in a dependent project requires that everything in a dependency chain needs to be typechecked again. C# does have these heuristics and it's why (in most cases) seeing changes reflected anywhere in your codebase is generally quite fast. Doing this for F# would be very impactful from a tooling standpoint, but I also think it's bordering on a research area too. It's something we've spoken about a lot but we don't have a definitive solution. Part of the issue is that F# tooling today does not have the necessary pieces needed to safely and efficiently calculate if something needs to be typechecked again. @TIHan's work with the Compilation Model prototype adds some of the necessary infrastructural changes to allow something like that.
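
To make the "up-to-date check is a heuristic" point concrete, here is a minimal sketch of a timestamp-based check in Python. It is not MSBuild's actual implementation; the function name `is_up_to_date` and the file names are assumptions chosen for illustration. It only shows the general shape: a target is skipped when every output exists and is at least as new as every input.

```python
import os
import tempfile
from pathlib import Path


def is_up_to_date(inputs, outputs):
    """Return True when every output exists and is at least as new as every input."""
    if not outputs or not all(Path(o).exists() for o in outputs):
        return False
    if not inputs:
        return True
    newest_input = max(os.path.getmtime(i) for i in inputs)
    oldest_output = min(os.path.getmtime(o) for o in outputs)
    return oldest_output >= newest_input


def demo():
    """Build once, then touch the source and check again."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "Library.fs")   # hypothetical input
        dll = Path(tmp, "Library.dll")  # hypothetical output
        src.write_text("module Library")
        dll.write_bytes(b"")
        os.utime(src, (1000, 1000))     # source older than output
        os.utime(dll, (2000, 2000))
        fresh = is_up_to_date([src], [dll])
        os.utime(src, (3000, 3000))     # source edited after the build
        stale = is_up_to_date([src], [dll])
        return fresh, stale
```

A check like this is only as reliable as the declared inputs and outputs: a task that reads an undeclared file, or writes different bytes for the same inputs, defeats it, which is the failure mode described above.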
-
@cartermp that sounds like something that the FSharp.Data.Adaptive model would be suited to, at least in the overall shape of the problem as described. Modeling values/bindings/scopes as adaptive values and chaining them together in more reactive ways, etc.
-
Is this tracked anywhere? When I search on Google, everything says that MSBuild has supported deterministic builds since VS2017. (Or possibly VS2015.)
-
@cartermp, this ticket is about building and running code after it has changed. Right now, however small the change, the rebuild time of a single project is identical to a cold build, and for projects of significant size that time is much larger than for C# / VB.NET projects of similar size in lines of code. I see all the good things happening in terms of optimizing F# tooling (keeping type checking and advanced editor support up to date after edits) and don't want to detract from this work. Changing code in many places before a rebuild is a valid but specific use case: the time taken to make such changes offsets the build time, so a somewhat long compile is fine there. The use cases I'm referring to are those where Microsoft has invested in making it easy and fast to run a solution / the code after changes (pressing F5 in Visual Studio), generally a mix of those:
In many other ecosystems, things like hot reloading are important to make the workflow smooth, enabling small edits and seeing the effect quickly; see https://github.com/hasura/awesome-live-reloading for examples.
I hope my explanation helps change your view a bit. If you need an extreme example on a small codebase, take the FSharpPlus project: after changing anything in the test project (not the library), building that project alone takes close to 2 minutes (i7-2860QM with hyperthreading off; edit: 1 min 30 on an i7-7500U with hyperthreading on); maybe you can get 1 minute on a top-specced CPU. It can be argued that FSharpPlus is particularly taxing on the compiler, but it is important to understand that even vanilla F# codebases will exhibit those issues. I assume the main reasons are:
Is it type checking and code editing you are referring to? AFAIK, when you change a C# file in a project, it will rebuild the whole project and all the dependent projects (unless using the tooling mentioned); the heuristics you mention apply to the editor tooling. I understand that incremental rebuild, or edit and continue support (probably related to some degree), will require a lot of engineering effort. I just wanted to make an explicit request about it, and I perfectly understand if it doesn't fit the current roadmap. The great thing about incremental rebuild versus other approaches (which shouldn't be dismissed) is that it works for any kind of codebase. I could think of hypothetical ways to optimize repeated, time-consuming overload resolutions by not performing the full resolution when the arguments are rigid, but I'm sure that won't yield results as homogeneous as proper incremental rebuild. It will also pave the way for type-checking features that may otherwise be put aside out of concern that they could drastically increase compile time. A simple approach to this feature would be to cache the files that are depended upon for each compilation unit, and treat each compilation unit conceptually as a separate project, but this won't scale as well as a fine-grained approach at symbol level in non-recursive scope, with optimizations such as looking at API surface changes and many others. Thanks @cartermp and all others for the feedback so far.
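
The "simple approach" in the last paragraph (cache per compilation unit, keyed on the files it depends on) can be sketched as follows. This is an illustration in Python, not a design for fsc: `cache_key`, `units_to_recompile`, and the `deps` mapping are hypothetical names, and the dependency graph is assumed to be acyclic and already known.

```python
import hashlib


def cache_key(name, sources, deps, memo=None):
    """Hash a unit's own source together with the keys of everything it depends on."""
    memo = {} if memo is None else memo
    if name not in memo:
        h = hashlib.sha256(sources[name].encode("utf-8"))
        for dep in sorted(deps.get(name, ())):
            h.update(cache_key(dep, sources, deps, memo).encode("ascii"))
        memo[name] = h.hexdigest()
    return memo[name]


def units_to_recompile(sources, deps, cache):
    """Return the units whose key differs from the cached one, and refresh the cache."""
    memo = {}
    stale = []
    for name in sources:
        key = cache_key(name, sources, deps, memo)
        if cache.get(name) != key:
            stale.append(name)
            cache[name] = key
    return stale
```

Because the key covers the full text of every dependency, any edit to a file invalidates all of its dependents, even when its API surface is unchanged. That is the coarseness the comment contrasts with a symbol-level approach.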
-
The compilers (fsc and csc) are deterministic; however, MSBuild loads and runs many other build tasks that are not verified to be deterministic (and many of which we know are not). So a better way to phrase this is that the F# build toolchain is not deterministic at this time and therefore incremental compilation is a best-effort heuristic.
-
Roslyn/MSBuild added support for reference assemblies; it would be great if fsc also supported this. Here is a good writeup from the MSBuild side: dotnet/project-system#2254
This is not directly related to the incremental build of a single assembly, but having this would make it possible to skip a lot of redundant recompilations.
-
Reference assembly support is another good (also challenging) thing to add. Another benefit is a significant reduction in memory usage - for very large projects it could be upwards of 10% or less.
Incremental compilation is very different from the other techniques discussed here. I think it's best to think of it as entirely separate from everything else with its own set of benefits. It's probably achievable on a per-module basis once we know what the dependency graph looks like. I'm just not sure how expensive that would be to implement. Certainly worth doing though.
-
What is language-specific about generating reference assemblies? The spec says these are just normal assemblies with various parts taken out. Everything appears to be straightforward operations on IL. Is the reason that it's faster if you generate both at once than generating the
Does https://github.com/ImperialPlugins/ReferenceAssemblyGenerator look any good as a `standard IL assembly -> reference assembly` function?
-
A compiler needs to know how to read and write them, and also flow the right information to the right places depending on the operation. This is not necessarily straightforward work in a compiler, though the spec is now very stable so it won’t take as long to implement correctly as it did for C# and VB.
On Sun, Jan 12, 2020, at 08:25, Charles Roddie wrote:
> What is language-specific about reference assemblies? The spec <https://github.com/dotnet/roslyn/blob/master/docs/features/refout.md#definition-of-ref-assemblies> says these are just normal assemblies with various parts taken out. Everything appears to be straightforward operations on IL so it's not clear to me why C# and VB have specific support.
> Does https://github.com/ImperialPlugins/ReferenceAssemblyGenerator look any good as a `standard IL assembly -> reference assembly` function?
-
OK, but am I right that in theory a reference assembly can be generated from a standard CIL assembly?
-
In theory, the open source project you link could produce reference assemblies (though I cannot speak to its quality). This doesn't buy us anything, though, since the compiler still needs to know how to read them.
-
FSharp has special metadata that would need to be preserved, such as what is required for inline functions. So doing this would take a little bit of work to figure out how inline functions should interact with reference assemblies.
@0x53A I don't understand where a claim that reference assemblies will improve incremental compilation comes from though. The compiler would still need to do the same amount of work to determine if the public surface area has changed or not. This could be elided if fsi files were being used but that can be done today. In the C# compiler reference files / skeleton assemblies are about error recovery and library distribution size, not performance.
-
Sounds like this would be a great improvement. The discussion in #7077 suggests to me that interdependencies between F# projects are the cause of a lot of existing performance problems, and also related to interop inconsistencies (F# to F# easier than F# to C#), and being able to analyze some compiled version of an F# project - or even better to take as an input a reference assembly of any .Net project, would fix that. So although I don't understand the details, these issues all seem very related.
If the reference assembly file has stayed the same, the public surface area is the same, and most changes that keep the public surface area the same should keep the reference assembly the same. So surely you just need to compare file hashes?
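
The hash comparison suggested here could look like the sketch below. It is written in Python for illustration only; `file_hash`, `dependents_need_rebuild`, and the file name `Library.ref.dll` are hypothetical, and the sketch assumes the reference assembly is produced deterministically, so that an unchanged public surface yields byte-identical output.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path):
    """SHA-256 of the file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def dependents_need_rebuild(ref_assembly, stored_hash):
    """Dependents rebuild only when the reference assembly's bytes changed."""
    return file_hash(ref_assembly) != stored_hash


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        ref = Path(tmp, "Library.ref.dll")       # hypothetical file name
        ref.write_bytes(b"public surface v1")
        stored = file_hash(ref)
        ref.write_bytes(b"public surface v1")    # rebuilt, same surface
        unchanged = dependents_need_rebuild(ref, stored)
        ref.write_bytes(b"public surface v2")    # surface changed
        changed = dependents_need_rebuild(ref, stored)
        return unchanged, changed
```

Note that this only decides whether downstream projects rebuild; the project that was edited still has to be compiled to produce the reference assembly in the first place.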
-
Also, I don't know if this is a language barrier thing, but since the error is an FS error, can you please update the word "storage" to say "memory"? I don't know if this is a UK vs. US thing and Brits say storage to refer to memory, but I'm most used to referring to OOM as memory and not storage.
-
@jzabroski Sorry if this wasn't clear. What I'm saying is that the problem you're seeing is not related to the actual build/compile of F# code due to how that works. It's likely due to VS just going OOM. F# tooling does use a lot of memory, but there's no memory leak that we're aware of. It's unfortunately not too uncommon to see process memory limits getting reached with massive solutions and/or extensions like R# also getting involved. But none of this would really be helped by incremental compilation since we just shell out to an MSBuild process when building your codebase.
-
OK, but this feels a bit like "hot potato" - is there a way I can pinpoint the amount of memory used by R#, F#, etc. within devenv.exe? I used to have the Microsoft Premier Support memory dump instructions and know how to roll them up using tools, but honestly I can't keep so many technologies in my head these days. None of my other coworkers are running into this problem and we're all running identical Amazon Workspace instances with identical VS 2019 versions and identical solutions, all with ReSharper running. I'm not trying to push this onto you to fix, but if it is R#, for example, I'd like to file a bug report with them instead.
-
I'll defer to #9175 where it sounds like you have what you're after :)
-
Thanks - please feel free to hide these comments as off-topic. Thanks for your help.
-
Hi @cartermp - I actually suspect this problem may be in F# after all, and has nothing to do with the fact I'm using ReSharper. It seems like there is string constant folding somewhere in some compiler (C# or F#) that is only triggered when the changed project graph includes a Resources assembly. I found a C# bug fix by Neal Gafter for string constant folding (based on my guess as to what could cause exponential memory explosion on incremental compiles), but my code stack is using his fix and I still have this memory explosion issue. I haven't nailed it down just yet, but also nobody from Microsoft Visual Studio replied to my crash reports. Is it possible F# uses string constant folding logic similar to the one prior to Neal's fix? I'm almost done porting my F# code to C#, so it doesn't really affect me, but I hope this helps make a better product. Thank you.
-
Hey @jzabroski, I apologize that we haven't looked at the crash reports. We take your issues seriously, and we want to have a better product. While we don't know if it is really caused by constant folding with strings, you have at least highlighted a problem that we should fix. In F#, we do constant folding with strings; I implemented it last year when I was optimizing string concatenations to use the I did the naive way of doing constant folding and it looks like that can bite us. The easiest solution is just not to do constant folding with strings, or just limit the amount.
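
For readers unfamiliar with the idea, the "limit the amount" option can be illustrated with a toy folder over a tiny expression tree. This is a Python sketch, not the F# compiler's code: the tuple encoding (`("concat", left, right)`, `("var", name)`), the function name `fold`, and the `max_len` cap are all assumptions made for the example.

```python
def fold(expr, max_len=1024):
    """Fold concatenations of string literals, but never build a literal longer than max_len.

    An expression is a str literal, ("var", name), or ("concat", left, right).
    """
    if isinstance(expr, str) or expr[0] == "var":
        return expr
    _, left, right = expr
    left, right = fold(left, max_len), fold(right, max_len)
    both_literals = isinstance(left, str) and isinstance(right, str)
    if both_literals and len(left) + len(right) <= max_len:
        return left + right
    # Leave the concatenation in the tree when folding is impossible or too large.
    return ("concat", left, right)
```

Without the cap, every folded node materializes a new, larger literal at compile time; the cap bounds that cost and leaves oversized concatenations to happen at run time instead.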
-
@TIHan Glad my intuition was correct and helpful in pinpointing the problem. I never once took any offense at any discussion I've had with Microsoft. I'm easy going. Glad I can help improve F#. I see a bright future for .NET where we can all have a peaceful ecosystem of awesome language features. Rock on, Will.
-
Just going to throw this into the mix for inspiration (although it more specifically deals with incremental parsing) https://diekmann.co.uk/diekmann_phd.pdf
-
@cartermp / @TIHan, seeing that there are tracked tickets/experiments on the incremental builder, a thought came to me on how to better scope a potential first iteration of incremental recompilation. The caching would happen at source-file granularity, with work on the infrastructure around cached outputs per file:
Related: #10445 #10217 #10211 #7077. If any of those tickets could invest some effort in moving in the direction of this issue, I think it would help in eventually reaching its goal.
-
from https://github.com/dotnet/msbuild/blob/main/documentation/specs/static-graph.md
If this applies to MSBuild as a whole, it probably applies to the process it spends the most time in.
-
Hi, I work on Flow, Facebook's JS type checker. @smoothdeveloper pointed me to this issue because I recently spoke about our own solution to this problem on a livestream/podcast. It is important to note that our problem is far simpler than yours: we only need to type check code, not compile it. Flow has several "lazy" modes that it can start in. When you run Flow in a lazy mode we:
... and that's it. Flow starts a server when you run it that uses a file watcher to listen for updates. The server is also responsible for things like:
So, suppose after starting a lazy server you edit some file
Now, we ensure that every exported value in each module is sufficiently annotated such that the signatures can be extracted with a purely syntactic analysis. This allows us to quickly extract a signature from each file in dependency order and then, like in #11152, we check everything in parallel. The big property we are taking advantage of in this approach that may not be viable in F# is that every exported value is annotated enough that we can extract signature files without requiring developers to actually write them.
OK, so now you're done with your first edit, Flow has lazily loaded all the necessary information to check your changes, and Flow has performed those checks. On future edits to that file, all of the dependencies are already loaded, so type checking is just a matter of checking that one file and the files that depend on it. We have another optimization here that lets us skip checking the dependent files: we hash signatures of files and only check their dependents if the signatures have changed.
Keeping hashes stable is extremely non-trivial. Even things as seemingly innocuous as whitespace changes need to be taken into account: if a file downstream of the file you're editing refers to a value in your file, you want to make sure that the error message is not pointing to a stale location! We handled that by making our representation of locations more abstract, but that is just one technical challenge to solve. Keeping the hashes of types stable in the first place can also be a challenge, and we've fixed many bugs where meaningful changes did not change the hash.
If you have any questions about any part of our approach please ask!
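
As an analogy for the signature-hashing step (not Flow's implementation), the sketch below uses Python's own `ast` module to extract annotated function signatures purely syntactically and hash them. The function name `signature_hash` is an assumption, and `ast.unparse` requires Python 3.9 or later. Source positions and function bodies are deliberately left out of the hash, so whitespace and implementation edits do not invalidate dependents.

```python
import ast
import hashlib


def signature_hash(source):
    """Hash only the annotated signatures of top-level functions in `source`."""
    parts = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            args = ast.unparse(node.args)  # e.g. "w: int, h: int"
            returns = ast.unparse(node.returns) if node.returns else ""
            parts.append(f"{node.name}({args}) -> {returns}")
    # Sort so that reordering definitions does not change the hash.
    return hashlib.sha256("\n".join(sorted(parts)).encode("utf-8")).hexdigest()
```

A dependent file would be rechecked only when this hash differs from the one recorded at the previous check. The bugs mentioned above, where a meaningful change left the hash unchanged, correspond to anything that affects dependents but is missing from `parts`.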
-
Thanks for the insight!
The project dependency graph is already handled by MSBuild, so unchanged projects should not be rebuilt. As has been discussed earlier in this thread, making F# projects support being declared as reference assemblies (which has the effect that only the public surface area is tracked via MSBuild) could significantly improve some scenarios. Within a project, the semantics of F# mean that computing the dependency graph is trivial: all files below the one being edited are affected. As you have pointed out, a dependency graph approach will win you performance if you are editing leaf nodes, but doesn't speed things up on "core" files, which for F# are files at the "top" of the file order.
Due to the way type inference works, even if a file has not changed, an upstream change can still modify semantics, so this is not an applicable optimization for F#. There is the concept of fsi files though: these are a public contract for types and make types explicit instead of inferred. If a programmer creates fsi files we will only re-run typechecking on dependent types based upon these files changing. It's a perf win, but it requires developers to add these files themselves and forgo type inference on their public surface area.
We have done most of the work for this elsewhere; hopefully we can leverage this in the F# codebase in the future. As you say, it is a non-trivial amount of work.
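
The file-order rule from this comment ("all files below the one being edited are affected") and the fsi exception can be sketched in a few lines. This is a Python illustration under stated assumptions: `files_to_recheck` and the flag `has_unchanged_signature` are hypothetical names, and the flag stands for "the edited file has an fsi file and that fsi file did not change".

```python
def files_to_recheck(compile_order, edited, has_unchanged_signature=False):
    """Return the files that need typechecking after `edited` changes.

    `compile_order` is the project's file list, top to bottom.
    """
    index = compile_order.index(edited)
    if has_unchanged_signature:
        # Dependents only see the signature, so only the edited file is rechecked.
        return [edited]
    # Without a signature file, every file below the edited one may be affected.
    return compile_order[index:]
```

For a project ordered `["Core.fs", "Domain.fs", "App.fs"]`, editing `App.fs` rechecks one file while editing `Core.fs` rechecks all three, which is why edits to "core" files gain nothing from this rule alone.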
-
For future readers (like me): Support for reference assemblies has been added via #12334
-
Great to see that support for reference assemblies has been added! What about incremental build support at the module-level? Has any progress been made on that front?
-
Is your feature request related to a problem? Please describe.
Incremental recompilation is not simple, yet I believe its absence is a significant impediment to the adoption of F# for large codebases, which are prevalent in large applications, a space where F# definitely has a role to play.
Many language toolchains support this (C/C++, Rust, Scala to name a few); C#, F# and VB.NET are notable exceptions. F# is the most significantly impacted, as it doesn't gobble lines of code nearly as fast as the C# and VB.NET compilers, and C# and VB.NET have some support for edit and continue.
In large dotnet projects, tools like ReSharper or Rider already support incremental recompilation at the solution/project level, shaving off a lot of the time spent building the solution: even when a project that is depended upon changes, they won't recompile the depending projects unless there is a top-level API surface change.
I've seen criticism and frustration around those issues. I can reasonably say this single improvement would change the experience of every single F# developer and help with adoption in places with large codebases that rely mostly on fast compile times or incremental compilation.
There is even prior art in the F# landscape, AFAIU, with Fable.
Describe the solution you'd like
When I change a .fs file, only code that directly depends on it in the project should be recompiled. The rest could be cached/linked in the output assembly.
Describe alternatives you've considered
practical: work with .fsx and only pull in the subset of what you are working with, but that workflow doesn't translate well depending on the type of project / structure of the code.
hypothetical: one project per file + ReSharper Build / hacking the MSBuild & fsc task infrastructure to rely on netmodules and ILMerge those; this won't work well / be supported with existing tooling.
Additional context
https://blog.jetbrains.com/dotnet/2015/10/15/introducing-resharper-build/
https://www.scala-sbt.org/1.x/docs/Understanding-Recompilation.html
fable-compiler/Fable#1648
https://doc.rust-lang.org/edition-guide/rust-2018/the-compiler/incremental-compilation-for-faster-compiles.html