In the Regex source generator, we special-case alternations where every branch begins with a different char and prefer to generate a switch statement. For example, "abc|def" yields a switch with a case for 'a' and a case for 'd': the implementation sees that all branches begin with a different character, so there's no need to fall through from one case to the next.

However, while the alternation itself doesn't require backtracking (since the branches begin with different characters, at most one of them can match), a branch expression may itself employ backtracking (e.g. if the branch contains a loop). In that case we currently don't apply the switch optimization, because the backtracking results in a label being output in the branch's code, and that label needs to be visible to later code. If the label were declared inside the braces/scope of the switch, the generated code would fail to compile.

Even though we can't emit a switch in that case (a switch lets the C# compiler generate whatever efficient code it's able to, especially for larger alternations), we could still avoid cascading from branch to branch: once a branch has been selected by its first char, failing to match that branch should fail the whole alternation rather than cascading to try the next branch.
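To make the difference concrete, here is a minimal hand-written sketch, not actual source generator output. It assumes a hypothetical pattern "a\w*c|d\w*f" matched at the start of the input, and every name in it (AlternationSketch, MatchLoopBranch, MatchCascading, MatchDispatch) is made up for illustration. It also approximates \w with a simple letter/digit/underscore check. The cascading version tries the second branch after the first one fails; the dispatching version picks a branch by its first char and lets that branch's failure fail the whole alternation.

```csharp
using System;

// Illustration only: hand-written matchers, NOT code emitted by the Regex source generator.
public static class AlternationSketch
{
    // Rough approximation of \w (an assumption made for this sketch).
    private static bool IsWordChar(char c) => char.IsLetterOrDigit(c) || c == '_';

    // Matches <first>\w*<last> at the start of s using a greedy loop plus backtracking.
    // Returns the match length, or -1 on failure. 'attempts' counts how many branches ran.
    private static int MatchLoopBranch(string s, char first, char last, ref int attempts)
    {
        attempts++;
        if (s.Length == 0 || s[0] != first) return -1;

        int end = 1;
        while (end < s.Length && IsWordChar(s[end])) end++;  // greedy \w*

        // Backtrack: give characters back to the loop one at a time until <last> matches.
        for (int pos = end; pos >= 1; pos--)
        {
            if (pos < s.Length && s[pos] == last) return pos + 1;
        }
        return -1;
    }

    // Cascading: a failed first branch falls through to the second branch,
    // even though the second branch cannot match input that starts with 'a'.
    public static int MatchCascading(string s, out int branchesTried)
    {
        branchesTried = 0;
        int length = MatchLoopBranch(s, 'a', 'c', ref branchesTried);
        if (length >= 0) return length;
        return MatchLoopBranch(s, 'd', 'f', ref branchesTried);
    }

    // Dispatching without a switch: choose the branch by its first char;
    // if that branch fails, the whole alternation fails.
    public static int MatchDispatch(string s, out int branchesTried)
    {
        branchesTried = 0;
        if (s.Length == 0) return -1;
        if (s[0] == 'a') return MatchLoopBranch(s, 'a', 'c', ref branchesTried);
        if (s[0] == 'd') return MatchLoopBranch(s, 'd', 'f', ref branchesTried);
        return -1;
    }

    public static void Main()
    {
        foreach (string input in new[] { "abcc", "dxf", "axxx" })
        {
            int cascading = MatchCascading(input, out int triedCascading);
            int dispatch = MatchDispatch(input, out int triedDispatch);
            Console.WriteLine($"{input}: cascading={cascading} ({triedCascading} tried), dispatch={dispatch} ({triedDispatch} tried)");
        }
    }
}
```

Both versions return the same result for every input; they differ only in how many branches they attempt. In real generated code the branches would be emitted inline with labels rather than as method calls, which is where the scoping problem with switch comes from.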
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.