Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary allocations in Hash task. #7162

Merged
merged 11 commits into from
Jan 21, 2022

Conversation

AR-May
Copy link
Member

@AR-May AR-May commented Dec 20, 2021

Fixes #7086

Context

Hash.Execute() allocates a string which gets to the large object heap. This could be avoided without changing the resulting hash function.

Changes Made

Hash function is rewritten.

Testing

Unit tests & manual testing

@AR-May AR-May marked this pull request as draft December 20, 2021 19:20
@AR-May
Copy link
Member Author

AR-May commented Dec 21, 2021

Benchmark results

Method itemsN Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated Code Size
Hash_old 1 4.160 us 0.0135 us 0.0113 us 0.3815 - - 1.58 KB 0.05 KB
Hash_new 1 3.909 us 0.0201 us 0.0168 us 0.5188 - - 2.13 KB 0.05 KB
Hash_old 50 21.633 us 0.4320 us 0.4802 us 4.6997 - - 19.36 KB 0.05 KB
Hash_new 50 19.812 us 0.2839 us 0.2370 us 0.5188 - - 2.22 KB 0.05 KB
Hash_old 100 38.248 us 0.2357 us 0.1841 us 9.0332 - - 37.3 KB 0.05 KB
Hash_new 100 35.920 us 0.3721 us 0.3299 us 0.5493 - - 2.38 KB 0.05 KB
Hash_old 200 74.621 us 1.3318 us 1.2458 us 17.9443 1.9531 - 73.71 KB 0.05 KB
Hash_new 200 68.415 us 0.4927 us 0.4608 us 0.4883 - - 2.3 KB 0.05 KB
Hash_old 500 175.176 us 1.9376 us 1.7177 us 43.9453 9.7656 - 182.67 KB 0.05 KB
Hash_new 500 165.182 us 1.3158 us 1.0273 us 0.4883 - - 2.07 KB 0.05 KB
Hash_old 1000 457.556 us 3.5648 us 2.7831 us 90.8203 90.8203 90.8203 364.31 KB 0.05 KB
Hash_new 1000 323.996 us 2.5561 us 2.2659 us 0.4883 - - 2.18 KB 0.05 KB
Hash_old 2000 805.439 us 6.8509 us 6.0732 us 230.4688 230.4688 230.4688 732.84 KB 0.05 KB
Hash_new 2000 656.162 us 7.2575 us 6.4336 us - - - 2.38 KB 0.05 KB


var hashStringSize = sha1.HashSize;
int maxItemStringSize = encoding.GetMaxByteCount(Math.Min(ComputeMaxItemSpecLength(ItemsToHash), maxInputChunkLength));
byte[] bytesBuffer = System.Buffers.ArrayPool<byte>.Shared.Rent(maxItemStringSize);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I am not completely sure that it is a good idea to rent a buffer from a pool.
We might just use allocations instead here and it will not be much worse.
What do you think of using System.Buffers in a task? Could it lead to any problems? For example, for users who would like to use this task somehow?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They'd have to JIT something extra if they hadn't already...it looks like we use System.Buffers in InterningBinaryReader.cs and System.Buffers.Binary in NodeProviderOutOfProcBase.cs. I'm not sure if System.Buffers.Binary is a separate assembly, but if it is, maybe not?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If System.Buffers is present (net6+), the BCL already uses it in many places, so I guess it's JITted already

@AR-May AR-May marked this pull request as ready for review December 21, 2021 17:10
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved

var hashStringSize = sha1.HashSize;
int maxItemStringSize = encoding.GetMaxByteCount(Math.Min(ComputeMaxItemSpecLength(ItemsToHash), maxInputChunkLength));
byte[] bytesBuffer = System.Buffers.ArrayPool<byte>.Shared.Rent(maxItemStringSize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They'd have to JIT something extra if they hadn't already...it looks like we use System.Buffers in InterningBinaryReader.cs and System.Buffers.Binary in NodeProviderOutOfProcBase.cs. I'm not sure if System.Buffers.Binary is a separate assembly, but if it is, maybe not?

{
foreach (var item in ItemsToHash)
string itemSpec = IgnoreCase ? ItemsToHash[i].ItemSpec.ToUpperInvariant() : ItemsToHash[i].ItemSpec;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible benefit of char-by-char copying and capitalizing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I do not yet have a good idea how I can make an efficient char-by-char in place capitalization, without extra allocations and in a way that code is short and clear. I am not sure this question is worth looking further, because when msbuild is using this task, these itemSpec are just file paths and not huge strings.

src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
@Therzok
Copy link
Contributor

Therzok commented Dec 21, 2021

They'd have to JIT something extra if they hadn't already...it looks like we use System.Buffers in InterningBinaryReader.cs and System.Buffers.Binary in NodeProviderOutOfProcBase.cs. I'm not sure if System.Buffers.Binary is a separate assembly, but if it is, maybe not?

Won't the BCL already use arraypool, hence it'd be jitted already?

@AR-May
Copy link
Member Author

AR-May commented Dec 22, 2021

Won't the BCL already use arraypool, hence it'd be jitted already?

I am not sure about that. Do you know anybody who we could ask and figure this out?

@Forgind
Copy link
Member

Forgind commented Dec 23, 2021

Won't the BCL already use arraypool, hence it'd be jitted already?

I am not sure about that. Do you know anybody who we could ask and figure this out?

@danmoseley maybe?

@danmoseley
Copy link
Member

Won't the BCL already use arraypool, hence it'd be jitted already?

I am not sure about that. Do you know anybody who we could ask and figure this out?

@danmoseley maybe?

Yes the core libraries use System.Buffers.ArrayPool<T>.Shared extensively, for byte and char. I would not expect MSBuild to add more jitting by using it for those.

Eventually I expect MSBuild could make use of ValueStringBuilder instead of StringBuilder in many places, and thus avoid pooling/allocating the StringBuilder as well. That's internal right now for safety reasons but there's discussion about making it public.

Copy link
Member

@ladipro ladipro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably sha1.TransformBlock has a constant cost which led to introducing the intermediate buffer. Do you have numbers on the perf characteristics of TransformBlock to back it up?

In other words, I am wondering how the code would perform without sha1Buffer, so if you called TransformBlock always, regardless of how small the incoming buffer is.

src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
@AR-May
Copy link
Member Author

AR-May commented Jan 3, 2022

Presumably sha1.TransformBlock has a constant cost which led to introducing the intermediate buffer. Do you have numbers on the perf characteristics of TransformBlock to back it up?

In other words, I am wondering how the code would perform without sha1Buffer, so if you called TransformBlock always, regardless of how small the incoming buffer is.

I definitely tried to write this without additional buffer in the first place, but it works longer in this case and makes me think about how to trade CPU for memory improvements. Adding this buffer I found a way not to think of it too much: both CPU and memory improves in this PR. I would need more time to repeat these measurements to show you numbers, but maybe you could check comments under previous PR about Hash function. There are such measurements too and they coincide with what I got. Their Hash1 is my Hash_old and Hash2 is the version with TransformBlock without any buffer. I suppose those measurements would be enough to get an insight how the code would perform.

Copy link
Member

@ladipro ladipro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Two random thoughts:

  1. Would wrapping sha1Buffer with a MemoryStream simplify the code or would it make it worse?
  2. Do existing unit tests cover all code paths?

@AR-May
Copy link
Member Author

AR-May commented Jan 3, 2022

  1. Would wrapping sha1Buffer with a MemoryStream simplify the code or would it make it worse?
  2. Do existing unit tests cover all code paths?
  1. Hm, it might make the code more readable right, but i am do not have time for it right now - I would need to run it through benchmark too, I suppose. I can make a follow up pr for that later. I suppose current version is readable enough.
  2. They do go through all the code paths, but not good enough, as for me. I suppose I can make some improvements there when I have more time, as a follow up. There is no long strings in the tests. Also, see the commit "Fix bug.". It could be a bug if the constants were different but worked correctly with current constants. This code path could not be hit in a wrong way when chunk size is less then sha1 buffer size. I caught it in my additional manual tests when I modified the sizes of the buffer and checked for random strings that output for old and new hash coincides. I suppose I might add constants modification to the tests for more coverage.

@AR-May
Copy link
Member Author

AR-May commented Jan 3, 2022

I ran the tests again: the failure was not related to my changes. Now the tests pass.

@Forgind
Copy link
Member

Forgind commented Jan 3, 2022

  1. Would wrapping sha1Buffer with a MemoryStream simplify the code or would it make it worse?
  2. Do existing unit tests cover all code paths?
  1. Hm, it might make the code more readable right, but i am do not have time for it right now - I would need to run it through benchmark too, I suppose. I can make a follow up pr for that later. I suppose current version is readable enough.
  2. They do go through all the code paths, but not good enough, as for me. I suppose I can make some improvements there when I have more time, as a follow up. There is no long strings in the tests. Also, see the commit "Fix bug.". It could be a bug if the constants were different but worked correctly with current constants. This code path could not be hit in a wrong way when chunk size is less then sha1 buffer size. I caught it in my additional manual tests when I modified the sizes of the buffer and checked for random strings that output for old and new hash coincides. I suppose I might add constants modification to the tests for more coverage.

Are you planning to incorporate any of these changes into this PR, or do you want it merged as-is and perhaps tackle some in a follow-up? (If follow-up, maybe make an issue for that?)

Copy link
Member

@ladipro ladipro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I synced with @AR-May offline. I think it would be better to incorporate some of the changes into this PR once she gets back from vacation.

src/Tasks/Hash.cs Outdated Show resolved Hide resolved
array[i] = new string[inputSize];
for (int j = 0; j < array[i].Length; j++)
{
array[i][j] = $"Item{i}{j}";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 1_000_000 iterations, is it? What is usual duration if this unit test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's 1 sec. It is less than the HashTaskDifferentImputSizesTest takes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The duration of these tests worries me though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 sec is too much, IMHO, can we get it under 100ms?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is most unnecessary test of all. Maybe drop it at all?

Copy link
Member Author

@AR-May AR-May Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I would like to save the test with different sizes of input (HashTaskDifferentInputSizesTest) despite it takes 3 sec, cause it tests a dangerous place where one can easily make a mistake.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on trying to make the tests fast. I'd say 3 seconds is too much.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, got it. I will remove this test as not very necessary one and I will improve HashTaskDifferentInputSizesTest. So far I got ~0.5 sec on my machine instead of 3.

src/Tasks.UnitTests/Hash_Tests.cs Outdated Show resolved Hide resolved
src/Tasks.UnitTests/Hash_Tests.cs Outdated Show resolved Hide resolved
src/Tasks.UnitTests/Hash_Tests.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
src/Tasks/Hash.cs Outdated Show resolved Hide resolved
array[i] = new string[inputSize];
for (int j = 0; j < array[i].Length; j++)
{
array[i][j] = $"Item{i}{j}";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on trying to make the tests fast. I'd say 3 seconds is too much.

@Forgind Forgind added the merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now. label Jan 13, 2022
@ladipro ladipro merged commit f69c8fb into dotnet:main Jan 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now. performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hash.Execute() allocates a string which gets to the large object heap.
6 participants