Split out and clean scalar multiplication code #120

hdevalence · 2018-03-27T00:53:11Z

Creates a new scalar_mul module hierarchy. The goal is to move the existing scalar multiplication code into this submodule, then call it from the user-facing API.

This can also contain code for things we can't do now, like multiscalar multiplication with precomputation, or multiscalar multiplication targeting large problem sizes (e.g., like the ones used for Bulletproofs).

The serial (u32/u64) implementations use a multiple curve models, passing between extended and projective coordinates when performing addition and doubling (respectively). But the AVX2 backend doesn't, so in order to write a single scalar mult implementation, we have to either abstract over the curve models or have two implementations.

A generic solution is possible but extremely unreadable (see this experiment): the scalar mul implementation would be parameterized over the point types used by the serial implementations, with many where clauses describing how the types relate. The AVX2 types could then be substituted in the appropriate places.

Instead of doing that, this duplicates the scalar mul code into the avx2 backend. But this allows changing the algorithms (e.g. window sizes) as is optimal there.

The existing AVX2 code is split into the same hierarchy. The fixed-base code is removed. This was faster than the non-AVX2 code, but the serial code is already so fast that there's no reason not to use it. The vectorized point addition is removed -- only the readdition formulas are actually used by scalar multiplication, so there's no reason to implement vectorized addition. The variable-time code is slightly faster now since it uses readdition: for instance, this is a size-128 multiscalar mul.

This should contain generic implementations of scalar multiplication algorithms that can be used with multiple backends. The goal is to move the existing scalar multiplication code into this submodule, then call it from the user-facing API. This can also contain code for things we can't do now, like multiscalar multiplication with precomputation.

The serial (`u32`/`u64`) implementations use a multiple curve models, passing between extended and projective coordinates when performing addition and doubling (respectively). But the AVX2 backend doesn't, so in order to write a single scalar mult implementation, we have to either abstract over the curve models or have two implementations. A generic solution is possible but extremely unreadable: the scalar mul implementation would be parameterized over the point types used by the serial implementations, with many where clauses describing how the types relate. The AVX2 types could then be substituted in the appropriate places. Instead we just duplicate the code into the `avx2` backend.

This was faster than the non-AVX2 code, but the serial code is already so fast that there's no reason not to use it.

Only the readdition formulas are actually used by scalar multiplication, so there's no reason to implement vectorized addition.

isislovecruft

LGTM, thanks!

* Updated to new curve25519 scalar API * Removed clamping from constructors; clamping is always done during scalar-point multiplication * Updated test to reflect new functionality * Updated changelog

hdevalence added 8 commits March 26, 2018 14:34

Pull out constant-time straus implementation

2864a42

Pull out variable-time straus implementation

2d99892

Pull out vartime double-base scalar mul code

0c4e718

Remove AVX2 fixed-base code.

c73a0fd

This was faster than the non-AVX2 code, but the serial code is already so fast that there's no reason not to use it.

Remove AVX2 addition formulas

e8b053b

Only the readdition formulas are actually used by scalar multiplication, so there's no reason to implement vectorized addition.

Reorganize AVX2 point code

7ef6a1e

hdevalence requested a review from isislovecruft March 27, 2018 00:53

Feature-gate multiscalar impls on alloc

4a648df

isislovecruft reviewed Apr 2, 2018

View reviewed changes

isislovecruft merged commit 4a648df into dalek-cryptography:develop Apr 2, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Split out and clean scalar multiplication code #120

Split out and clean scalar multiplication code #120

hdevalence commented Mar 27, 2018

isislovecruft left a comment

Split out and clean scalar multiplication code #120

Split out and clean scalar multiplication code #120

Conversation

hdevalence commented Mar 27, 2018

isislovecruft left a comment

Choose a reason for hiding this comment