Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Less RegEx construction on recursive method #16351

Merged
merged 5 commits into from
Dec 5, 2023
Merged

Conversation

Thorium
Copy link
Contributor

@Thorium Thorium commented Nov 28, 2023

Just a minor code-change, there is no reason to rebuild new RegEx classes on each recursion in buildObjMessageL

@Thorium Thorium requested a review from a team as a code owner November 28, 2023 14:35
Copy link
Contributor

@brianrourkeboll brianrourkeboll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Thorium
Copy link
Contributor Author

Thorium commented Nov 29, 2023

Yes, a concurrent dictionary with single-threaded add-lock is always nice. :-)

@brianrourkeboll
Copy link
Contributor

brianrourkeboll commented Nov 29, 2023

Yes, a concurrent dictionary with single-threaded add-lock is always nice. :-)

Fair enough, although I guess (as you know) it all depends how often the outer function is called, how deeply the inner one recurses, whether the static Regex cache is likely to be modified/accessed concurrently on other threads, etc...

Edit: Also, is that illFormedBracketPattern really just matching unescaped curly braces? If so, a simple function like this is more than an order of magnitude faster anyway:

let containsUnescapedCurlyBrace (txt : string) =
    let txt = txt.AsSpan ()
    let mutable i = txt.Length
    let mutable isMatch = false
    while not isMatch && i > 0 do
        i <- txt.Slice(0, i).LastIndexOfAny ('{', '}')
        if txt[max (i - 1) 0] <> '\\' then
            isMatch <- true
    isMatch
Benchmarks
open System
open System.Text.RegularExpressions
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running

let illFormedBracketPattern = @"(?<!\\){|(?<!\\)}"

let containsUnescapedCurlyBrace (txt : string) =
    let txt = txt.AsSpan ()
    let mutable i = txt.Length
    let mutable isMatch = false
    while not isMatch && i > 0 do
        i <- txt.Slice(0, i).LastIndexOfAny ('{', '}')
        if txt[max (i - 1) 0] <> '\\' then
            isMatch <- true
    isMatch

type Benchmarks () =
    let mutable reusedRegex = null

    [<GlobalSetup>]
    member _.SetUp () = reusedRegex <- Regex illFormedBracketPattern

    [<Benchmark>]
    member _.IsMatch_Static () = Regex.IsMatch (@"\{\}", illFormedBracketPattern)

    [<Benchmark>]
    member _.IsMatch_ReusedInstance () = reusedRegex.IsMatch @"\{\}"

    [<Benchmark>]
    member _.IsMatch_NewEveryTime () = (Regex illFormedBracketPattern).IsMatch @"\{\}"

    [<Benchmark>]
    member _.LastIndexOfAny () = containsUnescapedCurlyBrace @"\{\}"

ignore (BenchmarkRunner.Run<Benchmarks> ())
| Method                 | Mean         | Error      | StdDev     |
|----------------------- |-------------:|-----------:|-----------:|
| IsMatch_Static         |   136.074 ns |  0.5089 ns |  0.4761 ns |
| IsMatch_ReusedInstance |   137.080 ns |  0.4396 ns |  0.4112 ns |
| IsMatch_NewEveryTime   | 1,864.311 ns | 11.8064 ns | 10.4660 ns |
| LastIndexOfAny         |     8.857 ns |  0.0464 ns |  0.0362 ns |

@Thorium
Copy link
Contributor Author

Thorium commented Nov 30, 2023

I think your code is even better than the numbers:
As far as I understand, BenchmarkDotNet is by default running single threaded?

So if you look just the numbers, you may miss the fact of locks inside RegEx matching, which may affect real-life performance: suddenly your 20 core laptop (or server) running x parallel threads is limited to a single core execution speed. Meanwhile one can argue it's a tiny locking time per execution, if there are enough of those in parallel, it may cause bigger picture issues. Similar reasons why Tasks were introduced to .NET in the first place.

@Thorium
Copy link
Contributor Author

Thorium commented Nov 30, 2023

I don't know if that is ok, but with this pattern RegEx returns true and your code false: @"{\}"

@brianrourkeboll
Copy link
Contributor

I don't know if that is ok, but with this pattern RegEx returns true and your code false: @"{\}"

Yeah, there was a bug when '{' or '}' was the first char 🙃. I updated the snippet. But @T-Gro's suggestion is probably a good compromise.

@Thorium
Copy link
Contributor Author

Thorium commented Nov 30, 2023

CI: I merged the latest main to here, even though there is a trimming-failure-test in main.

@T-Gro
Copy link
Member

T-Gro commented Nov 30, 2023

CI: I merged the latest main to here, even though there is a trimming-failure-test in main.

This means FSharp.Core now has a different size, which was expected. You have to update the check.ps1 file at ~\tests\AheadOfTime\Trimming

@psfinaki psfinaki merged commit 1c4245c into dotnet:main Dec 5, 2023
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants