This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Use stackalloc in string.Split #15435

Merged: 6 commits, Feb 5, 2018

lkts commented Dec 8, 2017

Uses Span and stackalloc for strings that are not large, which avoids allocating int arrays.

Benchmarks:

|                    Method | Mean after | Mean before |  Mean diff |  Allocated after | Allocated before | Allocated diff |
|-------------------------- |-----------:|------------:|-----------:|-----------------:|-----------------:|---------------:|
|         SplitCharLength20 |   201.6 us |  213.80 us  |    5.71%   |     210.94 KB    |     312.50 KB    |     32.50%     |
|        SplitCharLength200 | 1,199.6 us | 1194.10 us  |   -0.46%   |    1546.88 KB    |    2351.56 KB    |     34.21%     |
|        SplitCharLength600 | 3,367.1 us | 3276.00 us  |   -2.78%   |    6882.81 KB    |    6882.81 KB    |       0%       |
|       SplitStringLength20 |   159.4 us |  181.80 us  |   12.32%   |     132.81 KB    |     234.38 KB    |     43.34%     |
|      SplitStringLength200 |   767.9 us |  844.40 us  |    9.06%   |     835.94 KB    |    1640.63 KB    |     49.05%     |
|      SplitStringLength600 | 2,160.8 us | 2253.70 us  |    4.12%   |    4765.63 KB    |    4765.63 KB    |       0%       |
|  SplitStringArrayLength20 |   308.5 us |  347.90 us  |   11.33%   |     210.94 KB    |     414.06 KB    |     49.06%     |
| SplitStringArrayLength200 | 1,838.7 us | 1848.60 us  |    0.54%   |    1195.31 KB    |    2804.69 KB    |     57.38%     |
| SplitStringArrayLength600 | 5,171.7 us | 5155.00 us  |   -0.32%   |    8117.19 KB    |    8117.19 KB    |       0%       |

Benchmark code:
https://gist.github.com/cod7alex/f123c3e662d21c3327399aa71d338485

Closes https://github.com/dotnet/coreclr/issues/6136

@danmosemsft @stephentoub

@@ -1160,7 +1162,7 @@ private String[] SplitInternal(ReadOnlySpan<char> separators, int count, StringS
return new String[] { this };
}

- int[] sepList = new int[Length];
+ Span<int> sepList = Length < StackallocStringLengthLimit ? stackalloc int[Length] : new int[Length];
Member

It would be nice to avoid the new allocation here every time even for larger Lengths.

We may want to introduce internal ValueListBuilder to help with it. It would be similar to https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/System/Text/ValueStringBuilder.cs, but it would be much simpler without all the string specific handling (e.g. it should just have a single Append method that takes T).

Author

Should I create an issue for it?

Member

It would be nice to fix this as part of #6136 - the use of ArrayPool is mentioned there as one of ideas. It should also help with cases where your current fix makes things a bit slower.

Author

okay, i will take a look

Author

lkts commented Dec 10, 2017

@jkotas what API do you expect for it? We can't return the array to the pool while returning the result, the way ValueStringBuilder does.
See initial version https://gist.github.com/cod7alex/d1299959de8af5a18e9107c304ee1343

Member

Yes, the shape should be a bit different to make it work for this use case. E.g. there can be an AsReadOnlySpan() method that returns the current list as ReadOnlySpan<T>, and there should be a Dispose() method that returns the array back to the pool if there is any.

Member

Also, if you think that it would be better to give this some other name than ValueListBuilder - feel free to change it.
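
The shape discussed in this thread could look roughly like the sketch below. It assumes a ref struct that starts on a caller-supplied span (for example a stackalloc'ed one) and rents from ArrayPool<T>.Shared only when it has to grow. The field names, the doubling growth policy, and the minimum size of 4 are illustrative assumptions, not the code that was merged.

```csharp
using System;
using System.Buffers;

// Hypothetical sketch of a list builder that lives on the stack.
public ref struct ValueListBuilder<T>
{
    private Span<T> _span;        // current storage: caller's span or a pooled array
    private T[] _arrayFromPool;   // non-null only after the first Grow()
    private int _pos;             // number of items appended so far

    public ValueListBuilder(Span<T> initialSpan)
    {
        _span = initialSpan;
        _arrayFromPool = null;
        _pos = 0;
    }

    public int Length => _pos;

    public void Append(T item)
    {
        int pos = _pos;
        if (pos >= _span.Length)
            Grow();

        _span[pos] = item;
        _pos = pos + 1;
    }

    // Read-only view over the items appended so far.
    public ReadOnlySpan<T> AsReadOnlySpan() => _span.Slice(0, _pos);

    // Returns the rented array to the pool, if one was ever rented.
    public void Dispose()
    {
        T[] toReturn = _arrayFromPool;
        if (toReturn != null)
        {
            _arrayFromPool = null;
            ArrayPool<T>.Shared.Return(toReturn);
        }
    }

    private void Grow()
    {
        // Rent a larger array, copy the existing items, then release the old rental.
        T[] array = ArrayPool<T>.Shared.Rent(Math.Max(_span.Length * 2, 4));
        _span.CopyTo(array);

        T[] toReturn = _arrayFromPool;
        _span = _arrayFromPool = array;
        if (toReturn != null)
            ArrayPool<T>.Shared.Return(toReturn);
    }
}
```

A caller would typically create the builder over a stackalloc'ed span, append, read the result through AsReadOnlySpan(), and call Dispose() once the result has been consumed.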

- int[] sepList = new int[Length];
- int[] lengthList;
+ Span<int> sepList = Length < StackallocStringLengthLimit ? stackalloc int[Length] : new int[Length];
+ Span<int> lengthList = singleSeparator
Member

would it be cheaper to do Span<int> lengthList = new Span<int>() here and put Length < StackallocStringLengthLimit ? stackalloc int[Length] : new int[Length]; inside the else clause below? That saves a branch, but adds newing up a struct, so I don't know.

Member

The most expensive part in the current code is the variable-sized stackalloc. Fixed-size stackalloc is much cheaper.

Member

Curious why that is? Both cases zero the memory right, or does the compiler determine it doesn't need to?

Member

> Both cases zero the memory right

No. We have zero-init for stackalloc disabled in CoreLib.


Along the lines of @danmosemsft's comment, why not just keep the existing structure of the code, which would avoid having to check singleSeparator twice? e.g.:

Span<int> sepList = Length < StackallocStringLengthLimit ? stackalloc int[Length] : new int[Length];
Span<int> lengthList;
int defaultLength;
int numReplaces;

if (singleSeparator)
{
    lengthList = default;
    defaultLength = separator.Length;
    numReplaces = MakeSeparatorList(separator, sepList);
}
else
{
    lengthList = Length < StackallocStringLengthLimit ? stackalloc int[Length] : new int[Length];
    defaultLength = 0;
    numReplaces = MakeSeparatorList(separators, sepList, lengthList);
}

Author

@justinvp it does not compile; the error is "A result of a stackalloc expression of type 'Span<int>' cannot be used in this context because it may be exposed outside of the containing method".


Try making it Span<int> lengthList = stackalloc int[0];, e.g.:

Span<int> sepList = Length < StackallocStringLengthLimit ? stackalloc int[Length] : new int[Length];
Span<int> lengthList = stackalloc int[0];
int defaultLength;
int numReplaces;

if (singleSeparator)
{
    defaultLength = separator.Length;
    numReplaces = MakeSeparatorList(separator, sepList);
}
else
{
    lengthList = Length < StackallocStringLengthLimit ? stackalloc int[Length] : new int[Length];
    defaultLength = 0;
    numReplaces = MakeSeparatorList(separators, sepList, lengthList);
}

Reference: dotnet/corefx#25426 (comment)

Try making this Span temp = stackalloc int[0];

Otherwise the local is classified as returnable and later you cannot mix it with stackallocated ones

stackalloc int[0]; is basically a noop, but will have effect of marking the temp as not returnable.

Author

that works

@AndyAyersMS
Member

Do we have data on the distribution of string lengths seen by split?

If very short strings are common we might special case the 32 bytes or less case too, as current jit will optimize these further (#14623). If you can show the distribution peaks at a value not too much larger than 32 we could perhaps increase the threshold to extend the range of the jit optimization. I left it at 32 because larger values started causing local offset code bloat and I did not know of any cases that would benefit.

@@ -1270,7 +1272,7 @@ private String[] SplitInternal(String separator, String[] separators, Int32 coun
// the original string will be returned regardless of the count.
//

- private String[] SplitKeepEmptyEntries(Int32[] sepList, Int32[] lengthList, Int32 defaultLength, Int32 numReplaces, int count)
+ private string[] SplitKeepEmptyEntries(Span<int> sepList, Span<int> lengthList, int defaultLength, int numReplaces, int count)


Change sepList and lengthList to be typed as ReadOnlySpan<int> instead of Span<int> since the spans are not modified in this method? The call sites wouldn't need to change as there's an implicit conversion from Span<T> to ReadOnlySpan<T>.

Author

done, thanks
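
To illustrate the suggestion in this thread, here is a small sketch of a method that only reads its input and therefore takes ReadOnlySpan<int>, called with a writable Span<int>. The class and method names are hypothetical; the implicit Span<T> to ReadOnlySpan<T> conversion is the part the review comment relies on.

```csharp
using System;

public static class SpanConversionExample
{
    // The callee only reads, so it asks for a read-only view.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++)
            total += values[i];
        return total;
    }

    public static int SumFromStack()
    {
        Span<int> buffer = stackalloc int[3];
        buffer[0] = 1;
        buffer[1] = 2;
        buffer[2] = 3;

        // No cast is needed: Span<int> converts implicitly to ReadOnlySpan<int>.
        return Sum(buffer);
    }
}
```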

@@ -1308,7 +1310,7 @@ private String[] SplitKeepEmptyEntries(Int32[] sepList, Int32[] lengthList, Int3


// This function will not keep the Empty String
- private String[] SplitOmitEmptyEntries(Int32[] sepList, Int32[] lengthList, Int32 defaultLength, Int32 numReplaces, int count)
+ private string[] SplitOmitEmptyEntries(Span<int> sepList, Span<int> lengthList, int defaultLength, int numReplaces, int count)


Change sepList and lengthList to be typed as ReadOnlySpan<int> instead of Span<int> since the spans are not modified in this method?

else
{
Grow();
Append(item);
Member

It is better to implement like this without recursive call:

if (pos >= _span.Length)
    Grow();

_span[pos] = item;
_pos = pos + 1;

Author

definitely, thank you

Member

> It is better to implement like this without recursive call

Why? I could be misremembering, but I thought I'd looked at the asm for both and the fast path was better this way. Maybe not.

Member

The code generators tend to convert recursive calls like these into loops.

If you have verified by looking at the disassembly that this gives the best code, it is fine.

Member

I could be wrong and would need to look again.

@lkts
Author

lkts commented Dec 12, 2017

@jkotas I have made changes only for the char parameter version so far. Am I moving in the right direction?
Benchmarks:

|             Method |       Mean |     Error |    StdDev |     Gen 0 |  Allocated |
|------------------- |-----------:|----------:|----------:|----------:|-----------:|
|  SplitCharLength20 |   198.8 us |  1.642 us |  1.536 us |   68.6035 |  210.94 KB |
| SplitCharLength200 | 1,131.7 us | 10.376 us |  9.706 us |  501.9531 | 1546.88 KB |
| SplitCharLength600 | 3,217.5 us | 22.944 us | 21.461 us | 1468.7500 | 4515.63 KB |

Looked a bit at string versions and they seem to be not so easy to change due to logic around lengthList.

@@ -1162,23 +1162,31 @@ private String[] SplitInternal(ReadOnlySpan<char> separators, int count, StringS
return new String[] { this };
}

- Span<int> sepList = Length < StackallocStringLengthLimit ? stackalloc int[Length] : new int[Length];
- int numReplaces = MakeSeparatorList(separators, sepList);
+ Span<int> initialSpan = Length < StackallocStringLengthLimit ? stackalloc int[Length] : stackalloc int[StackallocStringLengthLimit];
Member

You can always allocate the full buffer here: stackalloc int[StackallocStringLengthLimit].

}
}
}
break;
}

return foundCount;
Member

Nit: I think it may be a bit easier to understand if the AsReadOnlySpan call is done by the caller.

@jkotas
Member

jkotas commented Dec 12, 2017

@cod7alex Yes, I think you are on the right path.

@danmoseley
Member

> Do we have data on the distribution of string lengths seen by split?

@billwert for this question

@lkts
Author

lkts commented Dec 16, 2017

The latest version has two problematic benchmarks, see below (I have added some new tests and increased the iteration count to better see the trend). Not sure what is the reason to be honest, for the second case it may be Grow(), I guess.

| Method | Mean after | Mean before | Mean diff (%) | Allocated after | Allocated before | Allocated diff (%) |
|------- |-----------:|------------:|--------------:|----------------:|-----------------:|-------------------:|
| SplitCharLength20 | 2.023 ms | 2.082 ms | 2.83 | 2109.38 KB | 3.05 MB | 32.46 |
| SplitCharLength200 | 11.159 ms | 11.388 ms | 2.01 | 15468.75 KB | 22.96 MB | 34.21 |
| SplitCharLength600 | 31.034 ms | 31.345 ms | 0.99 | 45156.25 KB | 67.21 MB | 34.39 |
| SplitStringLength20 | 1.934 ms | 1.814 ms | -6.62 | 1328.13 KB | 2.29 MB | 43.36 |
| SplitStringLength200 | 7.912 ms | 8.231 ms | 3.88 | 8359.38 KB | 16.02 MB | 49.04 |
| SplitStringLength600 | 20.456 ms | 23.353 ms | 12.41 | 23984.38 KB | 46.54 MB | 49.67 |
| SplitStringLength600RemoveEmpty | 21.132 ms | 23.097 ms | 8.51 | 26640.63 KB | 49.13 MB | 47.05 |
| SplitStringArrayLength20 | 3.330 ms | 3.466 ms | 3.92 | 2109.38 KB | 4.04 MB | 49.01 |
| SplitStringArrayLength200 | 17.739 ms | 18.109 ms | 2.04 | 11953.13 KB | 27.39 MB | 57.38 |
| SplitStringArrayLength600 | 49.494 ms | 52.430 ms | 5.60 | 33828.13 KB | 79.27 MB | 58.33 |
| SplitStringArrayLength600RemoveEmpty | 49.959 ms | 50.931 ms | 1.91 | 38828.13 KB | 84.15 MB | 54.94 |
| SplitCharLength1000NoMatches | 6.073 ms | 7.319 ms | 17.02 | 312.5 KB | 38.68 MB | 99.21 |
| SplitCharLength1000AllCharsAreSeparators | 69.571 ms | 63.715 ms | -9.19 | 78437.5 KB | 114.97 MB | 33.37 |

@jkotas
Member

jkotas commented Dec 16, 2017

> Not sure what is the reason to be honest

This is an allocation-heavy benchmark. It is not unusual to see fluctuations like these in allocation-heavy benchmarks. The GC kicks in at random spots. These random spots change as you change the allocation pattern. It may result in slowing down innocent victims.

To prove this theory, you can try running the individual tests in a different order, or changing between workstation and server GC.


int foundCount = 0;
int sepListCount = sepList.Length;
int currentSepLength = separator.Length;

fixed (char* pwzChars = &_firstChar)
Member

This unsafe code should not be necessary now that you have simplified the loop condition. for (int i = 0; i < s.Length; i++) { s[i] is a pattern that the JIT bounds-check elimination is pretty good at recognizing.
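
As a rough illustration of the pattern named in the comment above, the sketch below scans a string with a plain indexed loop instead of a fixed pointer. The helper name and signature are hypothetical (this is not MakeSeparatorList itself), and it assumes the caller passes an indices span at least as long as the string.

```csharp
using System;

public static class SeparatorScanExample
{
    // Hypothetical helper: writes the index of every occurrence of `separator`
    // into `indices` and returns how many were found.
    // Assumes indices.Length >= value.Length; otherwise the span indexer throws.
    public static int FindSeparators(string value, char separator, Span<int> indices)
    {
        int found = 0;

        // Loop shape from the review comment: i runs from 0 to value.Length
        // and is used directly as the index into value.
        for (int i = 0; i < value.Length; i++)
        {
            if (value[i] == separator)
                indices[found++] = i;
        }

        return found;
    }
}
```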


namespace System
{
public partial class String
{
private const int StackallocStringLengthLimit = 512;
Member

512 feels a bit on the high side. Some of the methods allocate twice the size, and they will have 4kB+ stack frame.

It may be worth looking at the right size for this - check what is the smallest size that helps microbenchmark results.


namespace System
{
public partial class String
{
private const int StackallocStringLengthLimit = 512;
Member

This does not have much to do with StringLength anymore. Rename it?

@lkts
Author

lkts commented Dec 18, 2017

Case with 1000 separators is still slower with suggested experiments. PerfView CPU stacks:
before:

| Name | Exc % | Exc | Inc % | Inc |
|----- |------:|----:|------:|----:|
| stringsplitbenchmarks!StringSplitBenchmarks.Program.Main() | 0,0 | 6 | 98,8 | 21 769 |
| system.private.corelib.il!String.SplitInternal | 0,3 | 74 | 98,8 | 21 763 |
| system.private.corelib.il!String.SplitKeepEmptyEntries | 0,4 | 86 | 98,3 | 21 654 |
| system.private.corelib.il!System.String.Substring(int32,int32) | 0,4 | 81 | 97,8 | 21 545 |

after:

| Name | Exc % | Exc | Inc % | Inc |
|----- |------:|----:|------:|----:|
| stringsplitbenchmarks!StringSplitBenchmarks.Program.Main() | 0,0 | 3 | 98,9 | 23 317 |
| system.private.corelib.il!String.SplitInternal | 0,0 | 2 | 98,8 | 23 306 |
| system.private.corelib.il!String.SplitKeepEmptyEntries | 0,1 | 26 | 93,2 | 21 972 |
| system.private.corelib.il!System.String.Substring(int32,int32) | 0,2 | 52 | 92,5 | 21 823 |
| system.private.corelib.il!String.MakeSeparatorList | 0,1 | 20 | 5,3 | 1 246 |
| system.private.corelib.il!System.Buffers.ValueListBuilder`1[System.Int32].Grow() | 0,0 | 1 | 5,1 | 1 201 |
| system.private.corelib.il!System.Span`1[System.Int32].TryCopyTo(value class System.Span`1) | 0,0 | 0 | 1,9 | 448 |
| system.private.corelib.il!System.Buffers.ConfigurableArrayPool`1[System.Int32].Rent(int32) | 0,7 | 170 | 1,4 | 330 |
| system.private.corelib.il!System.Buffers.ConfigurableArrayPool`1[System.Int32].Return(!0[],bool) | 0,8 | 190 | 1,4 | 330 |
| system.private.corelib.il!System.Span.CopyTo(!!0&,!!0&,int32) | 0,8 | 177 | 1,3 | 307 |

@jkotas
Member

jkotas commented Dec 19, 2017

The problem is that the MakeSeparatorList is 2x slower for SplitCharLength1000AllCharsAreSeparators with this change. I do not think that the copy is the problem. The problem seems to be in the ValueListBuilder.Append method. This method is causing the hot loop in MakeSeparatorList to have about 2x more instructions.

Try to tweak the implementation of this method. E.g. making it look like the similar method in ValueStringBuilder may help:

Also, changing if (pos < _span.Length) to if ((uint)pos < (uint)_span.Length) may help because it should allow JIT to eliminate the bounds check.
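
A small sketch of the unsigned comparison mentioned above, using a hypothetical TryAppend helper rather than the actual ValueListBuilder.Append. Casting both sides to uint turns a negative position into a very large value, so one comparison covers both "pos >= 0" and "pos < Length"; whether the JIT then drops the indexer's own bounds check is the reviewer's expectation, not something this example demonstrates.

```csharp
using System;

public static class BoundsCheckExample
{
    // Hypothetical helper: stores item at buffer[pos] and advances pos
    // when there is room; returns false when the buffer is full.
    public static bool TryAppend(Span<int> buffer, ref int pos, int item)
    {
        int p = pos;

        // Single unsigned compare: false for p < 0 and for p >= buffer.Length.
        if ((uint)p < (uint)buffer.Length)
        {
            buffer[p] = item;
            pos = p + 1;
            return true;
        }

        return false;
    }
}
```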

@lkts
Author

lkts commented Dec 25, 2017

@jkotas Sorry, my results were from some weird test version of the code; I missed that. The cause really is Append, and changing it helps to get better results, but it is still slower than before (by the way, the uint trick does not change anything).

In my opinion, it is not possible to achieve the same performance for this test case (SplitCharLength1000AllCharsAreSeparators) with this approach, since we cannot eliminate the if statement, which is additional logic on each append compared to a plain array insert.

What I think is possible is to slightly tweak the first version of the PR so that it gets a stackalloc'ed Span or an array from the pool depending on string length, instead of just allocating an array for big strings.

@jkotas
Member

jkotas commented Dec 25, 2017

> it is still slower than before

How much slower? I think it is acceptable to take some regression for this corner case with this change.

@lkts
Author

lkts commented Dec 26, 2017

Actually, I don't see the improvement when running a clean benchmark with just that test; sorry about the confusion once more. Moving to the same layout as the list builder gives the same results (~8% slower), or there are very small diffs like 1 ms.

@@ -0,0 +1,61 @@
using System.Diagnostics;
Member

Could you please add the standard license header?

@jkotas
Member

jkotas commented Dec 26, 2017

I think it is fine to take the 8% regression for the case of long strings that are all separators here.

Could you please address the other feedback? It looks fine to me otherwise.

using System.Diagnostics;
using System.Runtime.CompilerServices;

namespace System.Buffers
Member

Nit: System.Collections.Generic may be more appropriate namespace for this.

@@ -614,6 +614,7 @@
<Compile Include="$(BclSourcesRoot)\mscorlib.Friends.cs" />
</ItemGroup>
<ItemGroup>
<Compile Include="shared\System\Buffers\ValueListBuilder.cs" />
Member

Could you please add this to src\mscorlib\shared\System.Private.CoreLib.Shared.projitems instead. (More details: https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/README.md)

@jkotas
Member

jkotas commented Jan 31, 2018

You may again just be seeing noise from the GC. Would you mind pushing your latest changes to GitHub so we can take a look?

@lkts
Author

lkts commented Jan 31, 2018

Yes, sure

@jkotas
Member

jkotas commented Feb 2, 2018

@dotnet-bot test this please

@jkotas
Member

jkotas commented Feb 2, 2018

@dotnet-bot test Ubuntu x64 Checked corefx_baseline please
@dotnet-bot test Windows_NT x64 Checked corefx_baseline please

Member

jkotas left a comment

Thanks!

There is a ~10% regression in the corner case of a longer string of separators. I think it is a good trade-off given the other improvements.

@jkotas
Member

jkotas commented Feb 2, 2018

@dotnet-bot test Ubuntu x64 Checked corefx_baseline please
@dotnet-bot test Windows_NT x64 Checked corefx_baseline please

@lkts
Author

lkts commented Feb 2, 2018

Thanks @jkotas and guys for the reviews and help. Do I need to do anything about the test failures? They look unrelated to me.

@jkotas
Member

jkotas commented Feb 2, 2018

@dotnet-bot test Ubuntu x64 Checked corefx_baseline please
@dotnet-bot test Windows_NT x64 Checked corefx_baseline please

@jkotas
Member

jkotas commented Feb 2, 2018

@cod7alex The failures are infrastructure issues. Nothing required from you.

@jkotas
Member

jkotas commented Feb 3, 2018

@dotnet-bot test this please

@jkotas
Member

jkotas commented Feb 4, 2018

@dotnet-bot test Ubuntu x64 Checked corefx_baseline please
@dotnet-bot test Windows_NT x64 Checked corefx_baseline please

@jkotas jkotas merged commit d062934 into dotnet:master Feb 5, 2018
@lkts lkts deleted the string-split-stackalloc branch February 5, 2018 18:09
dotnet-bot pushed a commit to dotnet/corefx that referenced this pull request Feb 5, 2018
* Use stackalloc in string.Split

* Added initial usage of ValueListBuilder

* Added usage of list builder to string separator Split overloads

Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>
jkotas pushed a commit to dotnet/corefx that referenced this pull request Feb 5, 2018
* Use stackalloc in string.Split

* Added initial usage of ValueListBuilder

* Added usage of list builder to string separator Split overloads

Signed-off-by: dotnet-bot-corefx-mirror <[email protected]>
dotnet-bot pushed a commit to dotnet/corert that referenced this pull request Feb 11, 2018
* Use stackalloc in string.Split

* Added initial usage of ValueListBuilder

* Added usage of list builder to string separator Split overloads

Signed-off-by: dotnet-bot <[email protected]>
jkotas pushed a commit to dotnet/corert that referenced this pull request Feb 12, 2018
* Use stackalloc in string.Split

* Added initial usage of ValueListBuilder

* Added usage of list builder to string separator Split overloads

Signed-off-by: dotnet-bot <[email protected]>
@jamesqo

jamesqo commented Feb 19, 2018

Nice job! ❤️

@jefffischer-episerver

Hi, @lkts & @jkotas,

I know I'm a little late to the party, but my original suggestion when reporting this issue was to break down Split into two main paths:

  • Single Delimiter (~80% scenario)
  • Multi-Delimiter (~20% scenario)

The latter scenario is where int arrays seem to be necessary. Adding int array allocations to the single-delimiter path (the ~80-95% scenario) seems only to add unnecessary allocation to the most common scenario, consequently slowing it.

I've run into many scenarios doing ETL-type operations, arguably better done elsewhere, that have required parsing entire files of delimited data where the char length can easily be >1k and you're running through MB of data. The int arrays remaining in this single-delimiter path is going to be expensive no matter what.

Thoughts on branching up front based on array length of 1 with an alternate int-array-free path?

Thanks,
Jeff

@danmoseley
Member

@jefffischer-episerver my suggestion is to open a new issue. It's easy to miss comments on closed issues.

@jkotas
Member

jkotas commented Feb 10, 2020

> The int arrays remaining in this single-delimiter path is going to be expensive no matter what.

FWIW, there are no int arrays on the single-delimiter path after this change. The single-delimiter path will use stack allocated Span<int> after this change that should be about as good as your suggestion.

@jefffischer-episerver

Thanks, @jkotas, unless I'm mistaken I don't believe that's the case.

It should be unnecessary to pass through the string twice in the single-delimiter path. The call to MakeSeparator should be eliminated from that path and both SplitOmitEmptyEntries and SplitKeepEmptyEntries should be overloaded to accept a single string delimiter, instead.

These overloads should do the unsafe string parsing, rather than MakeSeparator.

Currently, there is about a 10x (the string size) Gen 0 memory allocation, we should look to bring that down to ~1x + some extra allocations (the string size) in the single-delimiter path. Maybe Span speeds things up, but it's not doing anything to Gen 0 mem allocations.

@jkotas
Member

jkotas commented Feb 10, 2020

As @danmosemsft suggested, please open a new issue on this. There should not be any 10x (the string size) Gen 0 memory allocation the way it is implemented right now.

@jefffischer-episerver

jefffischer-episerver commented Feb 10, 2020

Apologies, I see that it's dependent on the density of delimiters. Catching up on .NET core changes.

private void MakeSeparatorList(string[] separators, ref ValueListBuilder<int> sepListBuilder, ref ValueListBuilder<int> lengthListBuilder)

Given a high density of delimiters per line and an int (4 bytes) allocated for each delimited char, the numbers aren't 10x, but higher than they truly need to be.

Thanks for the recommendation on opening another issue.

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Use stackalloc in string.Split

* Added initial usage of ValueListBuilder

* Added usage of list builder to string separator Split overloads


Commit migrated from dotnet/coreclr@d062934
8 participants