Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using stackalloc for string.Split #6266

Closed
jamesqo opened this issue Jul 6, 2016 · 25 comments
Closed

Consider using stackalloc for string.Split #6266

jamesqo opened this issue Jul 6, 2016 · 25 comments
Assignees
Labels
area-System.Runtime enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors
Milestone

Comments

@jamesqo
Copy link
Contributor

jamesqo commented Jul 6, 2016

Right now, string.Split allocates a new int array for both the char and string versions. We might want to consider using stackalloc here instead for strings of lengths under a certain threshold.

Alternatively, we could introduce some type of ArrayPool-like API to mscorlib and use that, or just forego allocating anything altogether and simply count the length needed for the resulting array on our first pass.

@gkhanna79
Copy link
Member

CC @danmosemsft @AlexGhiondea

@AlexGhiondea
Copy link
Contributor

Sounds like a great idea for an enhancement. Anyone interested in picking this up?

@4real
Copy link

4real commented Mar 24, 2017

I'm looking into it, seems like a nice introduction to coreclr.

@danmoseley
Copy link
Member

@karelz can you please give @4real this issue

@4real whatever you do will need a fair bit of performance measurement to find and prove the best approach on various inputs. We usually use Benchmark.NET.

@karelz
Copy link
Member

karelz commented Mar 24, 2017

@4real I sent you the invite, please ping us when you accept.

@danmosemsft I thought you have same admin permissions as I do ... we should fix that ;-)

@4real
Copy link

4real commented Mar 25, 2017

@karelz Accepted invite!

@danmosemsft Thanks, I'll have a look at Benchmark.NET. I take it that sufficient proof is empirical evidence a new solution is significantly faster and there being a line of reasoning as to why that is? Should I write a performance test for this issue specifically? (Apologies if this is covered by the documentation, I haven't processed it all yet.)

@danmoseley
Copy link
Member

danmoseley commented Mar 25, 2017

@4real there aren't really hard rules but generally we're looking for

  1. parity or faster across all interesting ranges of inputs
  2. improvement is worth any increase in code complexity and maintenance cost. This depends on how heavily used/critical the API is. We have taken and will continue to take quite hairy changes eg in Encoding or StringBuilder because those show up in key performance scenarios. In something that's less performance critical we are more wary of accepting code complexity. Of course lower in the stack, the worse it is if we have a bad impact on some scenario. Sometimes changes take a long time to get right and often they don't work out and have to be aborted.
  3. no affect on functional behavior and acceptably low risk of breaking something

Of course there are plenty of perf changes where it's a no brainer and sometimes we don't even bother measuring (eg just removing redundant code).

@danmoseley
Copy link
Member

Incidentally for profiling you can use the Visual Studio 2017 profiler but many of us use PerfView as it's owned by our team. It's very powerful and has good tutorials.

@danmoseley danmoseley assigned 4real and unassigned danmoseley Mar 25, 2017
@4real
Copy link

4real commented Mar 25, 2017

Thanks, I'll have a look at PerfView too.

From what I can tell, there are XUnit.Performance tests in the codebase for the GC and JIT, but not mscorlib. So instead, I will create a Benchmark.NET performance test for string.Split which will not be included in the commit, but for which I will post the results of here - does this seem like the correct approach? I will however create a functional test to show correctness of changes to string.Split if deemed not covered by existing tests.

@danmoseley
Copy link
Member

@4real correct, that's what we normally do -- have a throwaway perf test -- of course you could archive it in a gist with a link in the PR in case we want it again.

There are lots of XUnit.Performance tests in the repo and they are run regularly but those are to catch regressions and there is a certain cost to maintaining them (watching results). So we could consider adding one if it was an important scenario that might regress. Normally we don't. In fact we may delete some of those tests -- they need an audit.

@danmoseley
Copy link
Member

@4real still planning to take a look? just realized I never checked back. it would be nice to optmize this.

@lkts
Copy link

lkts commented Nov 30, 2017

I would like to give it a try if nobody is looking

@4real
Copy link

4real commented Nov 30, 2017

Hello, feel free to take it over, I've just realized how much time this has been resting. My attention has been shifted elsewhere unfortunately and I wouldn't start working on this within reasonable time.

@karelz
Copy link
Member

karelz commented Nov 30, 2017

@cod7alex I sent you Collaborator invite - that will allow to assign the issue to you (GH limitation). Ping me when you accept.
ProTip: Accepting will automatically subscribe you to all notifications. There's plenty (500+ per day). We recommend to disable them and use just notifications for your mentions and explicit subscriptions to issues.

@danmoseley danmoseley assigned lkts and unassigned 4real Nov 30, 2017
@danmoseley
Copy link
Member

@cod7alex to restate the above, we typically use PerfView to profile, Benchmark.NET to prove the win (post results here).

To run corefx tests (where most tests are) against changes made in coreclr (where string is) look at https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/developer-guide.md#testing-with-private-coreclr-bits to see the flag to force this to happen.

If the benchmark results and change is satisfactory we will want to check in benchmark tests.Currently we still use xunit.perf for checked-in tests (that may change) - see existing tests in corefx\src\System.Runtime\tests\Performance\Perf.String.cs. This may be helpful
https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/performance-tests.md

I think we need better docs on using Benchmark.NET against private coreclr changes, based on https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/dogfooding.md. Hopefully @ViktorHofer can throw up some notes he has. We can also help answer questions.

@lkts
Copy link

lkts commented Nov 30, 2017

thank you @danmosemsft

@ViktorHofer
Copy link
Member

ViktorHofer commented Nov 30, 2017

BenchmarkDotNet instructions not yet reviewed by someone else available here: dotnet/corefx#25612

@lkts
Copy link

lkts commented Dec 2, 2017

@ViktorHofer this guide is very helpful, i was able to run benchmark.
First thing i have done was to use stackalloc for int array when string length is less than 512.
I must have done something wrong, will attach benchmarks later.

@lkts
Copy link

lkts commented Dec 6, 2017

Can someone help me with Spans? If i have the code like the following, can i pass Span to function and fill it and what is the way to do it if yes? Is it even a good idea?

private unsafe Span<int> CreateSeparatorList(int length)
{
    if (Length < 512)
    {
        int* stackBuffer = stackalloc int[length];
        return new Span<int>(stackBuffer, length);
    }
    return new int[length];
}

@Rattenkrieg
Copy link
Contributor

@cod7alex this is explicitly forbidden by design of Span. Memory it points will be reclaimed when you leave CreateSeparatorList's activation record. Spans (actually anything stacalloced) can only be passed from callers to callees.

@lkts
Copy link

lkts commented Dec 6, 2017

thanks @Rattenkrieg, I am not really familiar with stackalloc. So it means I should inline this everywhere it is needed and return the filled span from methods?

@Rattenkrieg
Copy link
Contributor

@cod7alex if you know desired size in advance you can allocate span in top level method and pass it by ref where it will be populated.
if you will ever hit unexplainable performance degradation caused by stackalloc this may shed some light: #6122 dotnet/coreclr#8534 #4384

@lkts
Copy link

lkts commented Dec 7, 2017

@Rattenkrieg many thanks

@jamesqo
Copy link
Contributor Author

jamesqo commented Dec 7, 2017

@cod7alex By the way, you don't have to stackalloc a temporary ptr and then create a span from that. You can just stackalloc a Span directly. https://github.com/dotnet/corefx/pull/24212/files

@lkts
Copy link

lkts commented Dec 7, 2017

@jamesqo yes, i have found that information recently. Thank you.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Runtime enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors
Projects
None yet
Development

No branches or pull requests

10 participants