Hash.Execute() allocates a string which gets to the large object heap. #7086
Comments
You might use `HashCode.Combine`. It's in the box for .NET Core 2.1+, and it's available in a NuGet package for .NET Framework, so no multi-targeting is necessary.
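For readers unfamiliar with the type, here is a minimal sketch of what that suggestion could look like when the number of items is not known up front. The class and method names are hypothetical and are not taken from MSBuild. Note that the result is only a 32-bit `int`, and, as far as I know, `System.HashCode` mixes in a per-process random seed, so the value should not be expected to match across processes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not MSBuild code: folds any number of items into one
// 32-bit hash code without building an intermediate joined string.
public static class HashCodeSketch
{
    public static int CombineItems(IEnumerable<string> items)
    {
        var hash = new HashCode();
        foreach (string item in items)
        {
            // Add() calls item.GetHashCode() and mixes it into the running state.
            hash.Add(item);
        }
        return hash.ToHashCode();
    }
}
```

`HashCode.Combine(a, b, ...)` covers the case of a small fixed number of values; the instance `Add` / `ToHashCode` pair shown here is the variant for sequences.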
Thank you for the suggestion. I looked at |
If you're relying on different objects to have different hash codes, that's not hashing, and hash codes should not be used for this. You need some kind of unique identifier. If your goal is just to have a measurably small rate of collisions, hopefully a failure only results in lower efficiency rather than incorrect behavior. If that is your goal, then yes, you would want to use a sufficiently large cryptographic hash rather than a regular hash code, which has no particular guarantees and could choose to generate poorly distributed codes for efficiency reasons.
As far as I'm aware, all the hash code generation in the core libraries is stable except for `String`. We randomize the hash codes of strings by default, to make DoS attacks more difficult. Of course, it is possible to generate your own hash codes for strings if you need stable ones.
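As an illustration of "generate your own hash codes for strings", here is a minimal sketch of a stable, non-cryptographic string hash. It uses 32-bit FNV-1a over the UTF-16 code units, which is my choice for the example and not something proposed in this thread; the class name is hypothetical. It gives the same value on every run, but it offers no collision guarantees beyond those of an ordinary 32-bit hash.

```csharp
// Hypothetical helper, not MSBuild code: a process-independent string hash.
public static class StableStringHash
{
    private const uint FnvOffsetBasis = 2166136261u;
    private const uint FnvPrime = 16777619u;

    public static uint Fnv1a(string text)
    {
        unchecked
        {
            uint hash = FnvOffsetBasis;
            foreach (char c in text)
            {
                // Feed each UTF-16 code unit as two bytes, low byte first.
                hash = (hash ^ (byte)(c & 0xFF)) * FnvPrime;
                hash = (hash ^ (byte)(c >> 8)) * FnvPrime;
            }
            return hash;
        }
    }
}
```

Unlike `string.GetHashCode()`, this does not depend on any per-process seed, so the value can be persisted and compared across runs.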
Well, using a hash for such goals is the current behavior, and the initial goal of this issue was just to remove unnecessary LOH allocations in the hash computation rather than to rethink the whole approach. Thinking about how the hash function is used, I agree that such usage is not common and might indeed be dangerous. A collision, as far as I know, will result in a build error, and the behavior of MSBuild would thus be incorrect. cc @rainersigwald
The cost of a collision here is silent incorrect underbuild, which is pretty bad as build errors go but was deemed to be acceptable for this case, especially since we shouldn't get any particularly adversarial input. As you say, the workaround is to do a full build. We're looking at options to improve this, for example #7043. For now I think you're on a fine track @AR-May.
Fixes #7086

### Context
`Hash.Execute()` allocates a string which gets to the large object heap. This could be avoided without changing the resulting hash function.

### Changes Made
Hash function is rewritten.

### Testing
Unit tests & manual testing
I noticed that `Hash.Execute()` sometimes allocates a big string. This happens because we first join all the items into one string and then apply the hashing algorithm.
It would be nice to hash the items one by one, or to use a fixed-length buffer to break this big string into chunks.
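A minimal sketch of the fixed-buffer idea follows. It is not the actual MSBuild implementation: the class name, the SHA-256 algorithm, the `'\n'` separator, and the 1024-character chunk size are all assumptions made for illustration. Each item is encoded through a small reusable buffer and fed to an incremental hasher, so no joined string is ever allocated, and both buffers stay far below the 85,000-byte large object heap threshold.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not MSBuild code.
public static class ChunkedHashSketch
{
    private const int CharChunk = 1024; // assumed chunk size, in UTF-16 chars

    public static string HashItems(IEnumerable<string> items, char separator = '\n')
    {
        using (IncrementalHash hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256))
        {
            Encoding encoding = Encoding.UTF8;
            // A stateful encoder carries a surrogate pair split across two chunks.
            Encoder encoder = encoding.GetEncoder();
            char[] chars = new char[CharChunk];
            byte[] bytes = new byte[encoding.GetMaxByteCount(CharChunk)];

            foreach (string item in items)
            {
                for (int offset = 0; offset < item.Length; offset += CharChunk)
                {
                    int count = Math.Min(CharChunk, item.Length - offset);
                    item.CopyTo(offset, chars, 0, count);
                    int written = encoder.GetBytes(chars, 0, count, bytes, 0, flush: false);
                    hash.AppendData(bytes, 0, written);
                }

                // Terminate every item so that ["ab", "c"] and ["a", "bc"] differ.
                chars[0] = separator;
                int sepBytes = encoder.GetBytes(chars, 0, 1, bytes, 0, flush: false);
                hash.AppendData(bytes, 0, sepBytes);
            }

            // Flush anything the encoder is still holding (e.g. a lone surrogate).
            int tail = encoder.GetBytes(chars, 0, 0, bytes, 0, flush: true);
            if (tail > 0)
            {
                hash.AppendData(bytes, 0, tail);
            }

            byte[] digest = hash.GetHashAndReset();
            var hex = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex.ToString();
        }
    }
}
```

Because the byte stream fed to the hasher is the same as the UTF-8 encoding of the joined string, the digest should match what hashing the concatenation in one shot would produce, which is the property the issue asks to preserve.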
Additional info:
There was an attempt to improve this function already: #5560