BigInt Fixes for Issues in Violet Tests #262

mgriebling · 2023-04-19T21:40:36Z

This update fixes all known issues published by @LiarPrincess (see various pull requests for details)
Improves the LosslessStringConvertible init() run time by 35X (radix 10) and 310X (radix 16).
Improves the CustomStringConvertible toString() run time by 20X (radix 10) and 90X (radix 16).
Test suite has been updated to include tests for all known issues.
Latest update removes Foundation import and fixes additional problems noted by @LiarPrincess.

…de faster performance. Removed unnecessary _digits and _digitRadix. Fixed a cut/paste problem in "^=" function. Added test cases for string conversion performance and logical operation tests.

…d into a negative. Fixed division/remainder signs after a multi-word division.

…ows.

…han the default).

…ersions by 35X to 310X times over original code. New implementation for CustomStringConvertible toString(); speeds conversions by 20X to 90X times over original code.

stephentyrone · 2023-04-26T16:14:17Z

Sources/BigIntModule/BigInt.swift

@@ -9,6 +9,8 @@
 //
 //===----------------------------------------------------------------------===//

+import Foundation


Numerics can't have a dependency on Foundation for layering reasons. Is it actually used anywhere in this change, or did it just sneak in on accident?

I needed the log() function to do a quick digit count.
I'll see if this can be done another way.

Violet does it differently (without log). It uses magic numbers from V8 (google javascrtipt thinge). It is a really good approximation for the upper bound.

Unfortunately it is an upper bound, so sometimes it has a tiny overallocation. This is why after the filling of the buffer there is a if index != -1 check (with comment 'Check for invalid capacity estimation'). If we overallocate then we have to move the whole result (because we are filling from the back). Overallocation is not a problem (because it is tiny), the shift may be because of performance.

Btw. just so we are clear: I confirm that for radix 10 (most common case) this overallocation happens, which means that sometime we have to shift. But even with this penalty the performance is much better than the current main branch.

Btw2. There is a way to avoid this shifting. In Violet if your platform does not have String(unsafeUninitializedCapacity: capacity, initializingUTF8With: initializer) then we implement our own shim. At the end of this shim we call String(cString: ptr) // Borrows 'ptr' to create owned copy.. To avoid shifting we could just increment the ptr so that the buffer starts exactly where the digits start. Though, obviously if your platform has String(unsafeUninitializedCapacity: capacity, initializingUTF8With: initializer) then you have to correctly fill the provided buffer. Anyway it does not really matter...

Thanks @LiarPrincess for all your comments. Hope you get better soon.
@stephentyrone, I've removed the Foundation dependency by using a table.

LiarPrincess · 2023-04-26T17:56:01Z

Just for reference, original Violet code:

This is XsProMax version (the one without the inlined Int32), according to #256 [BigInt tests][No merge] rabbit How NOT to write performance tests it is slightly faster than standard Violet.

Violet versions have more comments than this PR, so it may be easier to read.

I don't want to start a formal review, because I have not compiled your code, but:

in public func toString(radix: Int, uppercase: Bool) -> String

What if UInt.bitWidth = 64 and radix = 8?
Then bitsPerChar = 3, so a single word contains 64/3=21 full groups. What happens to the remaining bit? It should be 'remembered' and used with high bits of the next word. I think that the current implementation just forgets it, but I may be wrong.

See the radixIsPowerOfTwo_charCanUseMultipleWords in original Violet implementation.
in public func toString(radix: Int, uppercase: Bool) -> String

The if #available(macOS 11.0, *) checks could be a little bit more robust like in Violet. Is there anything wrong with Violet version? I think that at least iOS should be also covered with fast path.
in public init?(_ description: String, radix: Int = 10)

There is also string.withUTF8 which would allow you to use non-generic code.
In Violet there is a comment why generic version is chosen and what is the performance trade-off.
Is this just a copy-paste from Violet or conscious decision?
You report that string parsing when radix is a power of 2 is 310x faster than the old version. According to my performance tests Violet goes up to 345x, so it is basically the same.

On the other hand, decimal parsing can be 190x faster and you report 35x. Is that because addition and multiplication in Violet are faster (4x and 1.4x respectively) or is there some other reason? You can always drop into UnsafeBuffer for performance. Or even write specialized AddMulBuffer, similar how Violet has a DivBuffer in toString.

Btw. Violet performance tests are for both positive and negative numbers. You only test positive and with 2 complement negation means another pass. Though it is not that bad, because in the same operation we do division which is even more expensive.

It would be nice to run all of the Violet tests from all of the PRs on this code to check if there is something else hiding. I am still under covid, and with my mental-potato-state I would do more harm than good. Even writing this comment took me >1h. I may be able to do this during the weekend.

mgriebling · 2023-04-26T21:01:40Z

in public func toString(radix: Int, uppercase: Bool) -> String
What if UInt.bitWidth = 64 and radix = 8?
Then bitsPerChar = 3, so a single word contains 64/3=21 full groups. What happens to the remaining bit? It should be 'remembered' and used with high bits of the next word. I think that the current implementation just forgets it, but I may be wrong.
See the radixIsPowerOfTwo_charCanUseMultipleWords in original Violet implementation.

Yes, mea culpa, I thought I reviewed the different binary cases and all divided evenly into 64.
As you point out, radix 8 (and 32) are oddballs with 3 (and 5) bits.
I don't have covid as an excuse -- just getting old. 😉 I'll fix the code.

in public func toString(radix: Int, uppercase: Bool) -> String
The if #available(macOS 11.0, *) checks could be a little bit more robust like in Violet. Is there anything wrong with Violet version? I think that at least iOS should be also covered with fast path.

No, I was focused on the performance. I'm sure Apple has a tool to do the availability checks for us.

in public init?(_ description: String, radix: Int = 10)
There is also string.withUTF8 which would allow you to use non-generic code.
In Violet there is a comment why generic version is chosen and what is the performance trade-off.
Is this just a copy-paste from Violet or conscious decision?

The most common radix in use is 10, so, while there are performance improvements in binary radices, it didn't seem worth all the code.

You report that string parsing when radix is a power of 2 is 310x faster than the old version. According to my performance tests Violet goes up to 345x, so it is basically the same.
On the other hand, decimal parsing can be 190x faster and you report 35x. Is that because addition and multiplication in Violet are faster (4x and 1.4x respectively) or is there some other reason? You can always drop into UnsafeBuffer for performance. Or even write specialized AddMulBuffer, similar how Violet has a DivBuffer in toString.
Btw. Violet performance tests are for both positive and negative numbers. You only test positive and with 2 complement negation means another pass. Though it is not that bad, because in the same operation we do division which is even more expensive.

I agree, I would love to spent all my days making this faster, but this seems quite good compared to what it was.

LiarPrincess · 2023-04-27T06:44:45Z

Hmm... I can't compile the code from this PR:

/mnt/Storage/Programming/TEST/swift-numerics/Sources/BigIntModule/BigInt.swift:387:20: error: value of type 'UnsafeMutablePointer<UInt8>' has no member 'update'
            dstPtr.update(from: srcPtr, count: count)
            ~~~~~~ ^~~~~~

More or less here:

if index != -1 {
  count = stringMaxSize - index - 1
  let dstPtr = buffer.baseAddress!
  let srcPtr = dstPtr.advanced(by: index+1)
  dstPtr.update(from: srcPtr, count: count) // <-----
  dstPtr[count] = 0
}

In Violet this code looks like this:

// Check for invalid capacity estimation
if index != -1 {
  count = capacity - index - 1
  let dstPtr = ptr.baseAddress!
  let srcPtr = dstPtr.advanced(by: index + 1)
  dstPtr.assign(from: srcPtr, count: count) // <-----
  // For some reason we have to do this on macOS,
  // otherwise the tests will fail (Swift 5.5.2).
  dstPtr[count] = 0
}

So, the assign(from:count:) was changed to update(from:count:). From what I see assign(from:count:) is now depreciated, and renamed to update(from:count:). Internally it is the same thing:

@inlinable
@_alwaysEmitIntoClient
@available(*, deprecated, renamed: "update(from:count:)")
@_silgen_name("_swift_se0370_UnsafeMutablePointer_assign_from_count")
public func assign(from source: UnsafePointer<Pointee>, count: Int) {
  update(from: source, count: count)
}

Platforms:

Ubuntu 22.04.1 LTS 64-bit + Swift 5.7.3 (swift-5.7.3-RELEASE) from Swift.org

macOS BigSur (11.7.4) + Xcode 13.2.1

swift --version
swift-driver version: 1.26.21 Apple Swift version 5.5.2 (swiftlang-1300.0.47.5 clang-1300.0.29.30)
Target: x86_64-apple-macosx11.0

In both cases I am using swift test --enable-test-discovery to run.

Btw. If you want to have swift-numerics with all of the Violet PRs merged then you can use this ultra sophisticated script. Note that those PRs only add tests and do not modify the sources of the actual BigInt, so you can just copy the test files to your build without any conflicts. Or we can just wait, I may find some time over the weekend to look more closely.

git clone https://github.com/LiarPrincess/swift-numerics.git
cd swift-numerics

git fetch origin biginteger:biginteger
git fetch origin 0-Node:0-Node
git fetch origin 1-Init-from-int:1-Init-from-int
git fetch origin 2-String:2-String
git fetch origin 3-Comparable-equatable-hashable:3-Comparable-equatable-hashable
git fetch origin 4-Unary-PLUS-MINUS-INVERT:4-Unary-PLUS-MINUS-INVERT
git fetch origin 5-Binary-ADD-SUB:5-Binary-ADD-SUB
git fetch origin 6-Binary-MUL-DIV-MOD:6-Binary-MUL-DIV-MOD
git fetch origin 7-Binary-AND-OR-XOR:7-Binary-AND-OR-XOR
git fetch origin 8-Binary-SHIFTS:8-Binary-SHIFTS
git fetch origin 9-Int-properties:9-Int-properties
git fetch origin 10-Copy-on-write:10-Copy-on-write
git fetch origin 11-Property-based-tests:11-Property-based-tests
git fetch origin 12-Apple:12-Apple
git fetch origin 13-Init-from-float:13-Init-from-float
git fetch origin 13-Performance:13-Performance
git fetch origin 15-Codable:15-Codable

git checkout biginteger

git merge --commit --no-edit 0-Node
git merge --commit --no-edit 1-Init-from-int
git merge --commit --no-edit 2-String
git merge --commit --no-edit 3-Comparable-equatable-hashable
git merge --commit --no-edit 4-Unary-PLUS-MINUS-INVERT
git merge --commit --no-edit 5-Binary-ADD-SUB
git merge --commit --no-edit 6-Binary-MUL-DIV-MOD
git merge --commit --no-edit 7-Binary-AND-OR-XOR
git merge --commit --no-edit 8-Binary-SHIFTS
git merge --commit --no-edit 9-Int-properties
git merge --commit --no-edit 10-Copy-on-write
git merge --commit --no-edit 11-Property-based-tests
git merge --commit --no-edit 12-Apple
git merge --commit --no-edit 13-Init-from-float
git merge --commit --no-edit 13-Performance
git merge --commit --no-edit 15-Codable

LiarPrincess · 2023-04-27T07:53:36Z

Oki, when I change update(from:count:) to assign(from:count:) I can run the tests. Not sure what is going on...

Then if we run the full Violet test suite we will get a bunch of failures in BitWidthTests and TrailingZeroBitCountTests. Those are expected, I marked them as a 'rabbit Discussion' in #252 [BigInt tests] rabbit Int properties.

The only other failure is in InitFromBinaryFloatingPoint754Tests:

  // swift test --enable-test-discovery --filter=BigIntTests.InitFromBinaryFloatingPoint754Tests/test_float80_spiderman
  // Skipping the: #if (arch(i386) || arch(x86_64)) && !os(Windows) && !os(Android)
  func test_float80_spiderman() {
    let d: Float80 = 9223372036854775807.5
    let expected: BigInt = BigInt("9223372036854775807")!

    let result = BigInt(d)
    if result != expected {
      print("d:        \(d)")
      print("expected: \(expected)")
      print("result:   \(result)")
    }
  }

This will print:

d:        9223372036854775807.5
expected: 9223372036854775807
result:   9223372036854775807

Btw. 9223372036854775807.5 is a special number for Float80, because UInt64(1) << Float80.significandBitCount = 9223372036854775808 which is the 1st number where spacing between numbers (away from 0) is 1. In other words 9223372036854775807.5 is the highest number where the next number is 0.5 away.

print(Float80.significandBitCount) // 63 - well *teknikly* there is a hidden bit, but whatever...
let int = UInt64(1) << Float80.significandBitCount
let d = Float80(int)
print("int:             ", int)        // 9223372036854775808
print("Float80:         ", d)          // 9223372036854775808.0
print("Float80.nextDown:", d.nextDown) // 9223372036854775807.5 - goes down by 0.5 - our test case
print("Float80.nextUp:  ", d.nextUp)   // 9223372036854775809.0 - goes up by 1

I do not think this matters... it is just a fancy floating point trick. This does not happen for Float and Double.

Though, I don't think it has to be fixed in this PR. @mgriebling already did a lot of work. It can wait until we merge Violet tests (*if ever*), then we will deal with this failure.

Added an additional check to isValidPowerOfTwoRadix so odd numbers of bits (except radix 2) don't use the fast path in toString. Now use a table lookup to determine the radix number bits. Added a check for iOS for StringLiteralType. Tested code under iOS and macOS under simulator and x86

stephentyrone · 2023-04-27T12:38:12Z

// well teknikly there is a hidden bit, but whatever...

A digression: Float80 doesn't have a hidden significand bit; the leading bit is stored explicitly, unlike all other common IEEE 754 formats (https://en.wikipedia.org/wiki/Extended_precision).

significandBitCount is defined to be the number of fractional bits in the significand (i.e., it does not include the leading bit, which is implicit in every other format)--its the number of bits that are actually stored in the representation of Float16, Float, and Double (https://developer.apple.com/documentation/swift/binaryfloatingpoint/significandbitcount).

For fixed-width floating-point types, this is the actual number of fractional significand bits.

And we follow this pattern for Float80 even though the extra bit is stored in that format:

Note that Float80.significandBitCount is 63, even though 64 bits are used to store the significand in the memory representation of a Float80 (unlike other floating-point types, Float80 explicitly stores the leading integral significand bit, but the BinaryFloatingPoint APIs provide an abstraction so that users don’t need to be aware of this detail).

LiarPrincess · 2023-04-27T13:16:42Z

Yep, I know.

This is the reason why we can just be generic over 'floating point' and not 'floating point with a special case for Float80'.

In the comment I wanted to point that 'there is a hidden bit' unlike in other types, though I may have worded that incorrectly/imprecisely. Maybe // well *teknikly* there IS a hidden bit, but whatever... would be better (uppercase IS).

Also, there are usual checks for the existence of Float80. This code snippet does not contain them, because I wanted to simplify. Obviously, the Violet PR (#258 [BigInt tests] skull Init from float 754) contains the #if thingie.

Anyway, the reason why this code snippet does not do what I would expect is:

result.words:   [9223372036854775807, 0]
expected.words: [9223372036854775807]

Then == checks if the words are equal -> they are not -> we enter if body. I may be wrong somewhere, but if the String representations are equal then I would expect the == to hold (for BigInt).

Btw. if we change == definition to:

  @inlinable
  public static func == (lhs: BigInt, rhs: BigInt) -> Bool {
    print("lhs: \(lhs.words)")
    print("rhs: \(rhs.words)")
    return lhs.words == rhs.words
  }

We will get (just look at the last line):

/mnt/Storage/Programming/TEST/swift-numerics (biginteger*) swift test --enable-test-discovery --filter=BigIntTests.InitFromBinaryFloatingPoint754Tests/test_initFromBinaryFloatingPoint754_Float80 > out.txt
warning: 'swift-numerics': found 4 file(s) which are unhandled; explicitly declare them as resources or exclude from the target
    /mnt/Storage/Programming/TEST/swift-numerics/Tests/BigIntTests/Helpers/StringTestCases.generated.py
    /mnt/Storage/Programming/TEST/swift-numerics/Tests/BigIntTests/Performance/PerformanceTests.generated.py
    /mnt/Storage/Programming/TEST/swift-numerics/Tests/BigIntTests/Nodejs/NodeTests.generated.js
    /mnt/Storage/Programming/TEST/swift-numerics/Tests/BigIntTests/Nodejs/README.md

Building for debugging...
[12/12] Linking swift-numericsPackageTests.xctest
Build complete! (6.22s)
error: malformed

I have not tested this extensively, but any print statement (for example print("ala")) would do this. I did clean and rm -rf .build. I don't think this matters that much for BigInt, it is not like we *actually* want to print those values on every ==.

Ubuntu platform that I mentioned above.

LiarPrincess · 2023-05-05T14:25:08Z

Ugh...

I forgot to mention that the compilation failure (rename from UnsafeMutablePointer.assign(from:count:) to UnsafeMutablePointer.update(from:count:)) was caused by SE-0370: Pointer Family Initialization Improvements and Better Buffer Slices. This change was specifically mentioned in 'Source compatibility' and 'Alternatives considered'.

I can't find in the proposal the exact version of Swift/stdlib where change was implemented (so that we can #if on it). It is also not mentioned in the code:

@inlinable
@_alwaysEmitIntoClient
@available(*, deprecated, renamed: "update(from:count:)")
@_silgen_name("_swift_se0370_UnsafeMutablePointer_assign_from_count")
public func assign(from source: UnsafePointer<Pointee>, count: Int) {
  update(from: source, count: count)
}

I can confirm that:

✅ compilation succeeds:
- Ubuntu 22.04.1 LTS 64-bit + Swift 5.8 (swift-5.8-RELEASE-ubuntu22.04) from Swift.org
❌ compilation fails:
- Ubuntu 22.04.1 LTS 64-bit + Swift 5.7.3 (swift-5.7.3-RELEASE) from Swift.org
- macOS BigSur (11.7.4) + Xcode 13.2.1

Swift 5.8 Release notes mention SE-0370 in 'Swift Evolution Appendix'. Though, I can't 100% confirm that #if swift(>=5.8) is all we need.

LiarPrincess · 2024-06-26T14:48:32Z

The whole Vapor/Fibonacci/BigInt thingie on the forum reminded me of this PR. Just so that we are clear: some parts of this PR are uncomfortably similar to Violet code, and I’m not sure if I’m fine with this.

mgriebling added 4 commits April 18, 2023 09:42

Added improved LosslessStringConvertible init() with order of magnitu…

d73aef0

…de faster performance. Removed unnecessary _digits and _digitRadix. Fixed a cut/paste problem in "^=" function. Added test cases for string conversion performance and logical operation tests.

Fixed a negate() corner case issue where a negative integer was turne…

4009679

…d into a negative. Fixed division/remainder signs after a multi-word division.

Fixes issue with FloatingPoint overflow and other integer test overfl…

5e55107

…ows.

Fix corner-case overflow with Int.min and -1 division.

67b17c2

This was referenced Apr 19, 2023

Improve String init() execution time and fix Violet bugs #263

Open

[BigInt] Using tests from “Violet - Python VM written in Swift” #242

Open

mgriebling added 2 commits April 20, 2023 17:01

Added new CustomStringConvertible implementation (20X to 70X faster t…

01928d0

…han the default).

New implementation for LosslessStringConvertible init?(); speeds conv…

696ea95

…ersions by 35X to 310X times over original code. New implementation for CustomStringConvertible toString(); speeds conversions by 20X to 90X times over original code.

stephentyrone reviewed Apr 26, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BigInt Fixes for Issues in Violet Tests #262

BigInt Fixes for Issues in Violet Tests #262

mgriebling commented Apr 19, 2023 •

edited

Loading

stephentyrone Apr 26, 2023 •

edited

Loading

mgriebling Apr 26, 2023

LiarPrincess Apr 26, 2023 •

edited

Loading

mgriebling Apr 27, 2023

LiarPrincess commented Apr 26, 2023 •

edited

Loading

mgriebling commented Apr 26, 2023 •

edited

Loading

LiarPrincess commented Apr 27, 2023 •

edited

Loading

LiarPrincess commented Apr 27, 2023 •

edited

Loading

stephentyrone commented Apr 27, 2023

LiarPrincess commented Apr 27, 2023 •

edited

Loading

LiarPrincess commented May 5, 2023

LiarPrincess commented Jun 26, 2024

BigInt Fixes for Issues in Violet Tests #262

Are you sure you want to change the base?

BigInt Fixes for Issues in Violet Tests #262

Conversation

mgriebling commented Apr 19, 2023 • edited Loading

stephentyrone Apr 26, 2023 • edited Loading

Choose a reason for hiding this comment

mgriebling Apr 26, 2023

Choose a reason for hiding this comment

LiarPrincess Apr 26, 2023 • edited Loading

Choose a reason for hiding this comment

mgriebling Apr 27, 2023

Choose a reason for hiding this comment

LiarPrincess commented Apr 26, 2023 • edited Loading

mgriebling commented Apr 26, 2023 • edited Loading

LiarPrincess commented Apr 27, 2023 • edited Loading

LiarPrincess commented Apr 27, 2023 • edited Loading

stephentyrone commented Apr 27, 2023

LiarPrincess commented Apr 27, 2023 • edited Loading

LiarPrincess commented May 5, 2023

LiarPrincess commented Jun 26, 2024

mgriebling commented Apr 19, 2023 •

edited

Loading

stephentyrone Apr 26, 2023 •

edited

Loading

LiarPrincess Apr 26, 2023 •

edited

Loading

LiarPrincess commented Apr 26, 2023 •

edited

Loading

mgriebling commented Apr 26, 2023 •

edited

Loading

LiarPrincess commented Apr 27, 2023 •

edited

Loading

LiarPrincess commented Apr 27, 2023 •

edited

Loading

LiarPrincess commented Apr 27, 2023 •

edited

Loading