-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Format/Parse binary from/to BigInteger #85392
Format/Parse binary from/to BigInteger #85392
Conversation
Tagging subscribers to this area: @dotnet/area-system-numerics Issue DetailsFix #83619
|
} | ||
|
||
// each byte is typically eight chars | ||
var sb = new ValueStringBuilder(stackalloc char[Math.Min(charCount, 512)]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe
var sb = (charCount <= 512)
? new ValueStringBuilder(stackalloc char[charCount])
: new ValueStringBuilder(charCount);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, didn't know this pattern can mark sb
as 'unsafe to return' for stackalloc before, thanks !
…t 'unsafe to return' so that can assign stackalloced.
…ion span, so to save extra buffer allocation and copy in ValueStringBuilder.
@@ -1002,6 +1096,103 @@ internal static char ParseFormatSpecifier(ReadOnlySpan<char> format, out int dig | |||
} | |||
} | |||
|
|||
private static string? FormatBigIntegerToBin(bool targetSpan, BigInteger value, int digits, Span<char> destination, out int charsWritten, out bool spanSuccess) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike existing FormatBigIntegerToHex()
, this FormatBigIntegerToBin()
is implemented by calculating required char length before format, this can:
- avoids from extending ValueStringBuilder's capacity during append char;
- for targetSpan flow, formatted chars can write to wanted destination span directly rather than allocate buffer in ValueStringBuilder and copy to destination at the end
Please give advice if not proper, thanks !
@@ -440,11 +440,11 @@ private static unsafe bool ParseNumber(ref char* str, char* strEnd, NumberStyles | |||
int digEnd = 0; | |||
while (true) | |||
{ | |||
if (char.IsAsciiDigit(ch) || (((options & NumberStyles.AllowHexSpecifier) != 0) && char.IsBetween((char)(ch | 0x20), 'a', 'f'))) | |||
if ((options & NumberStyles.AllowBinarySpecifier) == 0 ? char.IsAsciiDigit(ch) || ((options & NumberStyles.AllowHexSpecifier) != 0 && char.IsBetween((char)(ch | 0x20), 'a', 'f')) : char.IsBetween(ch, '0', '1')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would that last bit of the ternary be better as : (char == '0' || char == '1)
instead of the char.IsBetween(ch, '0', '1')
? Seems we're really driving toward optimal performance :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also struggled on these 2, but selected to use char.IsBetween() based on its implementation because it can always just use one check but need 2 arithmetic operation..
Np, changed to (char == '0' || char == '1)
now, thanks.
{ | ||
state |= StateDigits; | ||
|
||
if (ch != '0' || (state & StateNonZero) != 0 || (bigNumber && ((options & NumberStyles.AllowHexSpecifier) != 0))) | ||
if (ch != '0' || (state & StateNonZero) != 0 || (bigNumber && (options & (NumberStyles.AllowHexSpecifier | NumberStyles.AllowBinarySpecifier)) != 0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we hoist this options
test and the ones on line 443 to outside the loop so they are only bool
flags by the time we're inside the loop?
Perhaps insert around line 440...
const bool allowBinary = (options & NumberStyles.AllowBinarySpecifier) != 0 ;
const bool allowHex = (options & NumberStyles.AllowHexSpecifier) != 0 ;
const bool allowLeadingZero = allowBinary || allowHex;
Then inside the loop, at line 443 looks like
if (allowBinary
? (char == '0' || char == '1)
: char.IsAsciiDigit(ch) || (allowHex && char.IsBetween((char)(ch | 0x20), 'a', 'f')))
and line 447 becomes
if (ch != '0' || (state & StateNonZero) != 0 || (bigNumber && allowLeadingZero))
The same thing could be done for options tested elsewhere inside the loop if performance testing makes that wise...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also not sure what's going on with the bigNumber &&
here. This flag is set to indicate that we're backed by a StringBuilder, not that we're parsing a particular kind of number...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
love all the new tests :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we hoist this
options
test
Got your point. As you found all the existing options tests code were inside whole loop, and during run time some option tests may happen only when data is met, so not sure whether we should hoist 'options' test which will turn to be called once but always happen. That is why I did not change this convention :)
I am also not sure what's going on with the
bigNumber &&
here.
I am also not sure, so that I just added Binary stuff behind it like Hex :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Splitting it apart for readability is generally helpful. The logic is getting quite verbose to have on a single line IMO.
For the primitive types, we have it split apart into some early checks that go to TryParseBinaryInteger*Style
: https://source.dot.net/#System.Private.CoreLib/src/libraries/System.Private.CoreLib/src/System/Number.Parsing.cs,437
The check for "IsValidChar" can then be specialized by use of the generics and made a lot cheaper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tannergooding Fixed. After introducing Hex/Bin parser types, here simplifies not only prev line 447, but also prev line 443 which involved hex/binary checks previously. Please check, thanks.
…ions for previous Hex and new Binary.
{ | ||
public static bool IsValidChar(char c) => char.IsAsciiDigit(c); | ||
|
||
public static bool IsHexBinary() => false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: IsHexOrBinary
-or- maybe IsHexOrBinaryParser
and rename IDigitValidator
to IDigitParser
, to more closely match the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Longer term I'd like to see this move even closer to the IHexOrBinaryParser
logic we have in corelib, but that can be done in a future PR and isn't necessary as part of this.
Having the names and general logic mostly start to line up helps with that, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks.
src/libraries/System.Runtime.Numerics/src/System/Numerics/BigNumber.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Runtime.Numerics/src/System/Numerics/BigNumber.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Runtime.Numerics/src/System/Numerics/BigNumber.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Runtime.Numerics/src/System/Numerics/BigNumber.cs
Outdated
Show resolved
Hide resolved
…ern similar as HexNumberToBigInteger's
@tannergooding all comments are fixed, please check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
It would be good to get a secondary review from someone. CC. @stephentoub since you've helped on similar bits for the corelib side (let me know if you're busy and I can find someone else).
@@ -511,6 +515,112 @@ private static ParsingStatus HexNumberToBigInteger(ref BigNumberBuffer number, o | |||
} | |||
} | |||
|
|||
private static ParsingStatus BinNumberToBigInteger(ref BigNumberBuffer number, out BigInteger result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can we expand "Bin" to "Binary" in the name? I realize this is likely an attempt to match the conciseness of "Hex", but "Bin" and "Big" are so close that this keeps making me do a double-take to know which it was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed method names, also for other 'Bin' related ones.
// each byte is typically eight chars | ||
sb = charsIncludeDigits > 512 | ||
? new ValueStringBuilder(charsIncludeDigits) | ||
: new ValueStringBuilder(stackalloc char[512].Slice(0, charsIncludeDigits)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why slice it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is unnecessary, removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
Hi @tannergooding I have also addressed @stephentoub 's comments. |
Hi @tannergooding any chance to have a look at status of this approved PR ? thanks ! |
Fix #83619
Hi, currently PR is just in draft stage and in developing, will mark it ready for review when ready, thanks.
Would need final binary format definition firstly.
This is marked as ready for review now, thanks.