Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mono][interp] Fix short branches #58806

Merged
merged 2 commits into from
Oct 12, 2021
Merged

[mono][interp] Fix short branches #58806

merged 2 commits into from
Oct 12, 2021

Conversation

BrzVlad
Copy link
Member

@BrzVlad BrzVlad commented Sep 8, 2021

In interp, long branches have 4 byte signed offset while short branches have a 2 byte signed offset. Ideally we would want to always emit short branches if possible due to improved performance. The problem is that we emit a short or long branch early in the codegen process, there is no way to know whether a branch is long or short at that time and no further computations were done later to determine whether the branch is really long or short. Before this commit we arbitrarily decided that all branches in methods with IL size of less than 25000 are short and, in some cases, completely ignored that long branches actually happen in practice.

This commit makes it such that we always emit long opcodes at the beginning of the codegen process, also serving as a simplification since optimizations operating on IR code don't need to care about both short and long versions of branches. Later on, we will end up converting all these long branches to short branches when emitting the final method code. We achieve this by doing a quick preliminary code iteration and computing conservative native offsets (assuming all branches are long). Since the final code will be equal in size or smaller, we have the guarantee that if a branch is short between the conservative offsets, it will surely be short also in the final code. With this approach we guarantee correctness and we can fail to shorten only a negligible amount of branches.

Since the super instruction pass generates some branching instructions that are supported only for the short version, we need to run a computation of conservative offsets beforehand. These iterations can later be optimized out, by prioritizing general compilation time over marginal improvements in code quality for very rare gigantic methods.

Fixes #57363

@ghost
Copy link

ghost commented Sep 8, 2021

Tagging subscribers to this area: @BrzVlad
See info in area-owners.md if you want to be subscribed.

Issue Details

In interp, long branches have 4 byte signed offset while short branches have a 2 byte signed offset. Ideally we would want to always emit short branches if possible due to improved performance. The problem is that we emit a short or long branch early in the codegen process, there is no way to know whether a branch is long or short at that time and no further computations were done later to determine whether the branch is really long or short. Before this commit we arbitrarily decided that all branches in methods with IL size of less than 25000 are short and, in some cases, completely ignored that long branches actually happen in practice.

This commit makes it such that we always emit long opcodes at the beginning of the codegen process, also serving as a simplification since optimizations operating on IR code don't need to care about both short and long versions of branches. Later on, we will end up converting all these long branches to short branches when emitting the final method code. We achieve this by doing a quick preliminary code iteration and computing conservative native offsets (assuming all branches are long). Since the final code will be equal in size or smaller, we have the guarantee that if a branch is short between the conservative offsets, it will surely be short also in the final code. With this approach we guarantee correctness and we can fail to shorten only a negligible amount of branches.

Since the super instruction pass generates some branching instructions that are supported only for the short version, we need to run a computation of conservative offsets beforehand. These iterations can later be optimized out, by prioritizing general compilation time over marginal improvements in code quality for very rare gigantic methods.

Fixes #57363

Author: BrzVlad
Assignees: -
Labels:

area-Codegen-Interpreter-mono

Milestone: -

@marek-safar
Copy link
Contributor

marek-safar commented Sep 16, 2021

@vargaz please review

@BrzVlad
Copy link
Member Author

BrzVlad commented Sep 16, 2021

wasm failures are fixed by #59225

In interp, long branches have 4 byte signed offset while short branches have a 2 byte signed offset. Ideally we would want to always emit short branches if possible due to improved performance. The problem is that we emit a short or long branch early in the codegen process, there is no way to know whether a branch is long or short at that time and no further computations were done later to determine whether the branch is really long or short.  Before this commit we arbitrarily decided that all branches in methods with IL size of less than 25000 are short and, in some cases, completely ignored that long branches actually happen in practice.

This commit makes it such that we always emit long opcodes at the beginning of the codegen process, also serving as a simplification since optimizations operating on IR code don't need to care about both short and long versions of branches. Later on, we will end up converting all these long branches to short branches when emitting the final method code. We achieve this by doing a quick preliminary code iteration and computing conservative native offsets (assuming all branches are long). Since the final code will be equal in size or smaller, we have the guarantee that if a branch is short between the conservative offsets, it will surely be short also in the final code. With this approach we guarantee correctness and we can fail to shorten only a negligible amount of branches.

Since the super instruction pass generates some branching instructions that are supported only for the short version, we need to run a computation of conservative offsets beforehand.
@vargaz
Copy link
Contributor

vargaz commented Oct 12, 2021

Wouldn't be better to merge all the long branches into 1 instruction, they are probably pretty rare ?

@BrzVlad
Copy link
Member Author

BrzVlad commented Oct 12, 2021

@vargaz Yeah, however it requires a bit more changes. We should do that at some point.

@BrzVlad BrzVlad merged commit 383a479 into dotnet:main Oct 12, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Nov 11, 2021
@thaystg
Copy link
Member

thaystg commented Jul 26, 2022

@BrzVlad Can we backport to .net6 this PR? We have this issue related to it: #63581

@thaystg
Copy link
Member

thaystg commented Jul 28, 2022

/backport to release/6.0

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[interp] Short branch offset can overflow with very large methods
4 participants