Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve stack slot reuse #109204

Open
ilovepi opened this issue Sep 18, 2024 · 5 comments
Open

Improve stack slot reuse #109204

ilovepi opened this issue Sep 18, 2024 · 5 comments
Assignees
Labels
enhancement Improving things as opposed to bug fixing, e.g. new or missing feature llvm:codegen

Comments

@ilovepi
Copy link
Contributor

ilovepi commented Sep 18, 2024

When compared to GCC and MSVC, Clang/LLVM is not doing a good job of reusing stack space. Most of that boils down to inefficiencies/inaccuracies in the way we do stack coloring. Today, LLVM supports two kinds of stack coloring. One focuses on local variables(StackColoring), and the other on register spills(StackSlotColoring). For the most part, the two transforms use the same algorithm, but are implemented separately for historical reasons. In the stack slot coloring code, there is a comment indicating that we should unify the two (

// TODO: In the future we plan to improve stack coloring in the following ways:
). However, after discussing this idea at the Dev Meeting last year, I’m not sure how practical it is to actually do that.

In our discussion it was noted that several developers had already to solve this and failed. Though it seems straightforward to unify the two passes, in practice this has proven more difficult to achieve, since there are many subtle interactions with other parts of code generation. One of the key issues people remembered was that there are some long standing issues with the accuracy of lifetime insertion, and previous attempts to unify these passes has been difficult because of it. The inaccuracy here wasn’t just a conservative approximation, but would sometimes be wrong, leading to miscompilation/stack corruption. Keeping the two transforms separate has kept things working for now.

Given that the above approach seems to require more significant effort, we should probably focus on improving the quality of the lifetime annotations in the short term, and start discussing how we can improve the architecture/analysis to do better in the long-term. To start, #68746 points out that currently, we do a poor job annotating the lifetimes of temporaries. This is made more complex by some recent changes at the language level. #68746 (comment) outlines some steps we can take to alleviate that. I believe that enough progress has been made in clang to revive https://reviews.llvm.org/D74094#4647616, though @AaronBallman would have to confirm that.

We also have some other open issues around slot reuse, such as #57725, and several more going further back (though I'm not sure if they are still relevant). I think these should be reevaluated, and that we should try to determine their root cause, should they persist.

This is something I'd like to start looking at more seriously in the next few months, and I'd like to coordinate efforts here w/ the rest of the community.

A few questions I have are:

  1. if anyone else has looked at this (IIRC @preames gave me a lot of the background context at last year's dev meeting)?
  2. if they're aware of other's working on this?
  3. if this is a priority for anyone else?
  4. are there other approaches we should consider?

cc'ing some relevant folks: @petrhosek @nickdesaulniers @topperc @hiraditya

@ilovepi ilovepi added the enhancement Improving things as opposed to bug fixing, e.g. new or missing feature label Sep 18, 2024
@efriedma-quic
Copy link
Collaborator

See #51838 and #45725 for past discussion of the general brokenness of lifetime.start/lifetime.end. (Nothing relevant has changed since then.)

@hiraditya
Copy link
Collaborator

IMO it'll be better to just write a global stack coloring algorithm and then remove the other two.

@ahatanak
Copy link
Collaborator

ahatanak commented Nov 6, 2024

@nickdesaulniers do you have any plans to reapply the patch in https://reviews.llvm.org/D74094? It seems like it's at least safe to implement the more conservative optimization in which the temporaries are kept alive until the end of the full expression if I understand the discussion correctly.

@nickdesaulniers
Copy link
Member

I personally do not; I've moved on from core LLVM work to maintaining llvm-libc; llvm's C runtime.

@ilovepi has expressed interest in picking up the torch, so I'll add him to the assignees (I'll leave myself on there so that I can quickly find this issue again in the future, since it's kind of my "white whale" I guess.)

#43598 (comment) was the last thing I found, which is something I think we may need to fix in https://reviews.llvm.org/D74094. Specifically, C has different rules than C++ (IIRC) for the lifetime of temporaries that contain arrays. Perhaps that's what was missed that led to the revert.


The two largest issue for Clang's relatively poor stack usage in my experience was:

  1. no lifetimes emitted for compound literals: no lifetime markers emitted for compound literals #68746
  2. no lifetimes emitted for temporaries created when passing aggregates by value

Those 2 issues seemed like lower hanging fruit to me than reworking/combining multiple internal stack slot coloring passes.

@ilovepi
Copy link
Contributor Author

ilovepi commented Dec 18, 2024

I can confirm I'm happy to take this up at the beginning of next year.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improving things as opposed to bug fixing, e.g. new or missing feature llvm:codegen
Projects
None yet
Development

No branches or pull requests

6 participants