Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReadyToRun: Tailcalls not supported #5857

Open
KevinRansom opened this issue May 14, 2016 · 22 comments
Open

ReadyToRun: Tailcalls not supported #5857

KevinRansom opened this issue May 14, 2016 · 22 comments
Labels
area-ReadyToRun-coreclr enhancement Product code improvement that does NOT require public API changes/additions tenet-performance Performance related issue
Milestone

Comments

@KevinRansom
Copy link
Member

Compiling F# code with crossgen I see:

Not implemented (Exception from HRESULT: 0x80004001 (E_NOTIMPL)) while compiling method [email protected]
ReadyToRun: Explicit tailcalls not supported

We should ensure that .tail is supported.

Kevin

theme:tail-call

@dsyme
Copy link

dsyme commented May 16, 2016

So I'm wondering why this is this not supported? NGEN has always supported tailcalls (at least in recent years). And there have always been many tests to ensure CLR support for tailcalls didn't regress :/

Do we have any info on the ramifications of this, or what's needed to support it?

Thanks

@jkotas
Copy link
Member

jkotas commented May 16, 2016

The fundamental problem with slow tailcalls is that we need more sane way to implement them.

  • For R2R on Windows, it is needed to have a reasonable ABI that we can maintain version resiliency for. (We will fallback to JIT for them.)
  • For Unix (https://github.com/dotnet/coreclr/issues/2556), it is needed to avoid writing 1000+ lines of dynamically emitted assembly code that will be a giant bug farm. (They will be turned into regular calls on Unix.)

cc @janvorli since he is going to look into this.

@jkotas
Copy link
Member

jkotas commented May 16, 2016

We may be able to enable the fast tailcalls for R2R, but it requires teaching delay load fixups about them (the delay load fixups are not able to reliably find the indirection cell used by the tailcall - tried it in dotnet/coreclr#4984).

@dsyme
Copy link

dsyme commented May 19, 2016

@jkotas Thanks for the details. I have to agree that it's really, really critical for F# on .NET Core to get this sorted out - both "crossgen" and tailcalls on Unix.

Among other things, many parts of the F# compiler toolchain itself depend on indirect tailcalls being taken, especially when processing large inputs.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@msftgits msftgits added this to the Future milestone Jan 30, 2020
@jkotas jkotas changed the title ReadyToRun: Explicit tailcalls not supported ReadyToRun: Tailcalls not supported Mar 4, 2020
@jkotas jkotas modified the milestones: Future, 5.0 Mar 4, 2020
@jkotas jkotas added the tenet-performance Performance related issue label Mar 4, 2020
@jkotas
Copy link
Member

jkotas commented Mar 4, 2020

We may be able to enable the fast tailcalls for R2R, but it requires teaching delay load fixups about them

We should look into fixing this for .NET 5. It is becoming more of a problem.

@VSadov
Copy link
Member

VSadov commented Mar 4, 2020

Is tailcalling fcalls any easier than regular methods? Or it is largely the same problem?

@jkotas
Copy link
Member

jkotas commented Mar 4, 2020

It is same problem.

VSadov added a commit to VSadov/runtime that referenced this issue Mar 4, 2020
VSadov added a commit that referenced this issue Mar 4, 2020
* Managed `LdelemaRef` and `StelemRef`

* support ARM64 and OSX

* removed old JIT_Stelem_Ref implementation

* removed JIT_Ldelema_Ref

* Attributes for stack/debug hiding.

* PR feedback

* Avoid needing static initializer.

* couple small tweaks

* Avoid PreStub indirection in array element JIT helpers.

* Initialize `gcCoverLock` a bit earlier

* Added comment about revisiting after #5857 is fixed
@davidwrighton
Copy link
Member

@mangod9 I think this is too risky for .NET 5. We should fix put it on the feature schedule for .NET 6.

@mangod9
Copy link
Member

mangod9 commented Aug 10, 2020

yup will move.

@mangod9 mangod9 modified the milestones: 5.0.0, 6.0.0 Aug 10, 2020
@dsyme
Copy link

dsyme commented Aug 10, 2020

@cartermp Just checking, how signficant a hit is this, e.g. for F# compiler startup etc.?

@davidwrighton
Copy link
Member

@dsyme I expect its fairly significant, but I don't think we would have time to ensure the right quality of fix before we fork for .NET 5.0. I hope that we can address some of these somewhat niche, and fairly small feature items early in the .NET 6 timeframe so that we can rely on the longer release process to validate the work. OTOH, if @cartermp can provide evidence that this is crucial for the release, we can address the problem.

@dsyme
Copy link

dsyme commented Aug 10, 2020

@KevinRansom @cartermp Does this regress anything in the F# experience of .NET overall or is it status quo?

Just checking we're not about to encounter 6sec compiler startups except where we're already encountering them (we obviously haven't previously encountered these when using NGEN, and I think .NET Core compiler only gets hit on first use?)

@KevinRansom
Copy link
Member Author

Nothing has regressed, however, thanks to the tailcall work, done over the last year. We have a chance at getting this fixed eventually. I agree with David a feature over the next release to wire up the tailcall work to ready to run would be outstanding.

@SirBogman
Copy link

Are you sure there won't be a difference between the compiler startup time for .NET 5 with ready to run and .NET 4.8 with NGEN? Because .NET 5 won't have NGEN, correct?

@cartermp
Copy link

Yes, I think nothing has regressed compared to how it has always been for F# and .NET Core. So getting this into a good place would be great, but feels like a .NET 6 kind of thing (or at least some release after 5 is out).

Since F# is already crossgen'd I don't think we're in a decent place.

@KevinRansom
Copy link
Member Author

@SirBogman , F# on the coreclr will perform as it always has, on the desktop there will still be a net48 build. F# on the sdk is crossgened, which obviously builds on ready to run. So there is a small startup cost while the coreclr compiles the dynamic il generated to support the tailcall code, however, we have been paying that cost for the last 5 years, so it's bearable.

I hope this clears things up

Kevin

@dsyme
Copy link

dsyme commented Aug 11, 2020

Thanks all for confirming

@mangod9 mangod9 modified the milestones: 6.0.0, 7.0.0 Jul 26, 2021
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Jul 31, 2021
@TIHan
Copy link
Contributor

TIHan commented Sep 20, 2021

This is still very much a problem - the startup time cost on the F# CoreCLR compiler is significant dotnet/fsharp#11907
This effectively has caused us to revert the change that would allow Visual Studio to use the F# CoreCLR compiler by default.

The other option for us is to build a compiler server.

jakobbotsch added a commit that referenced this issue Oct 7, 2021
* Support fast tailcalls in R2R

Partially addresses #5857

* Support ARM64

* Run jit-format

* Fix non-R2R ARM build

* Fix recursive-call-to-loop optimization with non-standard args

* Implement new delay load helper for fast tailcalls

* Minor changes and fix build break

* Switch to a define for tailcall register

* Fix x86

* Implement DefType.IsUnsafeValueType

* Emit rex.jmp for tailcall jumps on x64

* Refactor non standard args + refix recursive tailcall opt

* Set nonStandardArgKind for stack args also

* Regenerate JIT interface

* Improve recursion-to-loop decision about which args need to be reassigned

* Use INS_tail_i_jmp for func token indir

* More efficient arm64 VSD fast tailcalls, fix some bad merging

Also fix logic since we removed tailcall register. For ARM64 that means
we now use the single temp reg that is available for R2R/VSD calls, and
in LSRA we ensure this goes into a volatile register.

* Take R2R indirection into account for tail call profitability

On x64 an R2R indirected direct call normally disassembles the call site
to determine the indirection cell, so it is more expensive to do tail
calls in this scenario as the indirection cell needs to be passed in a
register. Take this into account: if there is no tail. prefix, do normal
calls, and otherwise use tail calls.

* Disallow tailcalls via JIT helper in R2R builds

* Revert "Take R2R indirection into account for tail call profitability"

This reverts commit 863ad4d.

Let's not divert on the behavior here. It is not clear that having a
smaller call site is better than tail calling albeit with a larger call
site.

* Add SPMI support, clean up mcPackets enum

* Take necessary conditions into account in canTailCall

Do not do implicit tailcalls when
* The caller is the entry point
* The caller is marked NoInline
* The callee requires security object
@AndyAyersMS
Copy link
Member

@jakobbotsch can we close this or should we keep it open to track possible support for helper-assisted tail calls?

@jakobbotsch
Copy link
Member

I think it's ok to keep this open to track helper-assisted tail calls in R2R.

@mangod9
Copy link
Member

mangod9 commented Jul 13, 2023

Is there more work required for this? Or is it ok to close?

@jakobbotsch
Copy link
Member

jakobbotsch commented Jul 24, 2023

We still don't support helper-based tailcalls in R2R that this issue is actually about, but those are quite rare. I will move this to future.

@jakobbotsch jakobbotsch modified the milestones: 8.0.0, Future Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-ReadyToRun-coreclr enhancement Product code improvement that does NOT require public API changes/additions tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests