-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement more efficient squaring operation #133
Conversation
@jellevos can you rebase this? |
@tarcieri Sorry for the state of the git history, but it should be mergeable now. I hope it's okay given that PRs are squashed before merging. |
src/uint/mul.rs
Outdated
@@ -86,7 +86,73 @@ impl<const LIMBS: usize> UInt<LIMBS> { | |||
|
|||
/// Square self, returning a "wide" result in two parts as (lo, hi). | |||
pub const fn square_wide(&self) -> (Self, Self) { | |||
self.mul_wide(self) | |||
// This is a re-implementation from https://github.com/ucbrise/jedi-pairing/blob/c4bf151b8d2b560973cb3805f9b5f4a144597f7e/include/core/bigint.hpp#L410. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to be pretty careful with IPR, even with source code translations, with the goal of being able to confidently state the resulting code is Apache 2+MIT.
This means we generally prefer public domain sources or getting an explicit blessing from the original author to license the resulting work as Apache 2+MIT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samkumar are you OK with this translation of your code being licensed as Apache 2.0+MIT? If so, would you prefer a larger IPR statement about the original code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK with it being licensed as Apache 2.0 + MIT. Can you clarify what you mean by "a larger IPR statement about the original code"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samkumar awesome, thanks!
The statement is whatever you'd like us to put, just let us know. Is the existing comment OK?
We can probably also mention permission was given on this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the comment about the original implementation with relicensing info in this commit: a158037
This PR extends #130. The only relevant change is this method.
I re-implemented the algorithm implemented from this repo.
This algorithm can skip a large chunk of computation because multiplying a number by itself normally causes many computations to be done twice.
Some not-so-thorough results on an M1 chip (results do seem consistent over multiple runs):
512 bits: 19 vs 36 ns
1024 bits: 182 vs 197 ns
1536 bits: 357 vs 561 ns
2048 bits: 605 vs 987 ns
I benchmarked it with the following code and
cargo bench
: