This repository has been archived by the owner on Jan 26, 2022. It is now read-only.

Alg Opt with GLV #4

Closed
wants to merge 99 commits into from
99 commits
a64d7fb
First draft affine batch ops & wnaf
jon-chuang Jul 31, 2020
b7024dd
changes to mutability and lifetimes
jon-chuang Jul 31, 2020
40ef5d7
delete superfluous files
jon-chuang Jul 31, 2020
0fa5eeb
crazy direction: Passing a FnMut to generate an iterator locally
jon-chuang Aug 1, 2020
eebb12b
unsuccessful further attempts
jon-chuang Aug 1, 2020
4d22acf
compile success using index approach
jon-chuang Aug 1, 2020
bbbec75
fixes for mutable borrows
jon-chuang Aug 1, 2020
3a6e45c
Successfully passed scalar mul test
jon-chuang Aug 1, 2020
5c65917
benchmarks + prefetching
jon-chuang Aug 3, 2020
3bf2bc1
stash
jon-chuang Aug 6, 2020
4bb5ad5
generic impl of batch arith for all affinecurves
jon-chuang Aug 6, 2020
67da071
batched affine formulas for TE - too expensive
jon-chuang Aug 6, 2020
2e54f67
improved TE affine
jon-chuang Aug 6, 2020
62df27d
cleanup batch inversion
jon-chuang Aug 6, 2020
e6d28b6
fmt...
jon-chuang Aug 6, 2020
74d9bb7
fix minor error
jon-chuang Aug 6, 2020
908fb73
remove debugging scaffolding
jon-chuang Aug 6, 2020
c0a5a07
fmt...
jon-chuang Aug 6, 2020
5c89660
delete batch arith bench as not suitable for criterion or bench
jon-chuang Aug 6, 2020
6359f7c
fix bench removal errors
jon-chuang Aug 6, 2020
56b8181
fmt...
jon-chuang Aug 6, 2020
ec2decd
added missing coeff_a
jon-chuang Aug 6, 2020
bad37bd
refactor BatchGroupArithmetic to be separate trait
jon-chuang Aug 12, 2020
5b9cae9
Batch verification with radix sort
jon-chuang Aug 16, 2020
cbf8e49
Cache-locality & parallelisation
jon-chuang Aug 17, 2020
200f5fa
Successfully impl batch verify
jon-chuang Aug 18, 2020
ed7c4a7
added tests and bench for batch_ver, parallel_random_gen, ^ thread util
jon-chuang Aug 18, 2020
0e612e4
fmt
jon-chuang Aug 18, 2020
8819290
enabled missing test
jon-chuang Aug 18, 2020
a8e9c18
remove voracious_radix_sort
jon-chuang Aug 18, 2020
f6a2392
commented unneeded Instant::now()
jon-chuang Aug 18, 2020
2390243
Fixed batch_ver tests for curves of small or unit cofactor
jon-chuang Aug 18, 2020
cbee6a2
split recursive and non-recursive, tidy up shared functionality
jon-chuang Aug 20, 2020
0811a0f
reduce max_logn
jon-chuang Aug 20, 2020
2cbff4d
adjust max_logn further
jon-chuang Aug 20, 2020
c138904
Batch MSM, speedup only for bw6 due to poor cache performance
jon-chuang Aug 21, 2020
5068e74
fmt...
jon-chuang Aug 21, 2020
e886a38
GLV iBiginteger
jon-chuang Aug 21, 2020
1235117
stash
jon-chuang Aug 22, 2020
a60bedc
stash
jon-chuang Aug 22, 2020
31690ce
Merge branch 'jonch/batch_ver' into jonch/glv
jon-chuang Aug 22, 2020
ae69a9f
GLV with Parameter-based specialisation
jon-chuang Aug 27, 2020
1cb7e65
GLV lattice basis script success
jon-chuang Aug 30, 2020
f68cf6e
Successfully passed tests and benched
jon-chuang Aug 31, 2020
cee0204
Improvements to MSM with and bucketed adds using lightweight index sort
jon-chuang Sep 2, 2020
0c3bde5
changed rng to be external parameter for non-parallel batch veri
jon-chuang Sep 3, 2020
a87db71
remove bench print scaffolding
jon-chuang Sep 3, 2020
1909a4b
remove old batch_bucketed_add using vectors instead of fixed offsets
jon-chuang Sep 3, 2020
9bfd683
retain parallel batch_add_split
jon-chuang Sep 3, 2020
24fcd36
Comments for batch arith
jon-chuang Sep 3, 2020
ed201c0
remove need for hashmap for no std for batch_bucketed_add
jon-chuang Sep 3, 2020
517df11
minor changes
jon-chuang Sep 3, 2020
22a48d3
cleanup
jon-chuang Sep 3, 2020
b5852b4
cleanup
jon-chuang Sep 3, 2020
af70e80
fmt + use no_std Vec
jon-chuang Sep 3, 2020
4421820
removed std::
jon-chuang Sep 3, 2020
7962c8c
add scratch space
jon-chuang Sep 3, 2020
9318e37
Add GLV for non-batched SW mul
jon-chuang Sep 4, 2020
a9c951a
fix for glv_scalar_decomposition when k == MODULUS (subgroup check)
jon-chuang Sep 4, 2020
a90dfa5
Fixed performance BUG: unnecessary table generation
jon-chuang Sep 4, 2020
3a70376
GLV -> has_glv(), bigint slice bd check, refactor batch loops, u32 index
jon-chuang Sep 7, 2020
e9027c0
clean remove of batch_verify
jon-chuang Sep 7, 2020
f65bdef
fix mistake with elems indexing, unused arg for future recursion PR
jon-chuang Sep 7, 2020
e5b1182
trivial errors
jon-chuang Sep 7, 2020
c0a53df
more minor fixes
jon-chuang Sep 7, 2020
344fbd3
fix issues with batch_ver (.is_zero(), TE affine->proj mul)
jon-chuang Sep 7, 2020
646260b
fix issue with batch_bucketed_add_split
jon-chuang Sep 7, 2020
ecdd939
misname
jon-chuang Sep 7, 2020
7ba3688
Success in test and bench \(*v*)/
jon-chuang Sep 7, 2020
9ec6727
tmp commit to cache experimental batch_add_write_shift_..
jon-chuang Sep 8, 2020
1810368
remove batch_add_write_shift..
jon-chuang Sep 8, 2020
58e46b4
optional dep, fmt...
jon-chuang Sep 8, 2020
6a6e2fd
undo accidental deletion of dlsd sort
jon-chuang Sep 8, 2020
9ec0eb7
fmt...
jon-chuang Sep 8, 2020
493626d
cleanup batch bucket add, unify impl
jon-chuang Sep 8, 2020
56bf4f9
no std...
jon-chuang Sep 8, 2020
a5640a4
fixed tests
jon-chuang Sep 8, 2020
6b39608
fixed unimplemented for TE, swapped wnaf table row/col for batchaddwrite
jon-chuang Sep 8, 2020
4cf6c5f
wnaf table generation uses fewer copies, remove timing instrumentation
jon-chuang Sep 8, 2020
1a928b0
Minor Cleanup
jon-chuang Sep 9, 2020
5964b4b
Add feature-activated timing instrumentation, reduce code bloat (wnaf)
jon-chuang Sep 9, 2020
d9de7b6
unused var, no_std
jon-chuang Sep 9, 2020
5b0872f
Make timing macros defined globally, instrument more code
jon-chuang Sep 9, 2020
abad582
instrument w/ tid, better num_rounds est. f64, timing black/whitelisting
jon-chuang Sep 9, 2020
1eacd89
Minor changes
jon-chuang Sep 9, 2020
204ffa5
refactor tests, generic MSM test
jon-chuang Sep 10, 2020
9efaae4
2D test matrix :)
jon-chuang Sep 10, 2020
bd82f31
batchaffine
jon-chuang Sep 10, 2020
e5cb574
tests
jon-chuang Sep 10, 2020
3ed5d9f
additive features
jon-chuang Sep 11, 2020
2fc20e4
big_n feature for test-benching
jon-chuang Sep 11, 2020
f21f40a
prefetch unroll
jon-chuang Sep 11, 2020
c605894
minor adjustments
jon-chuang Sep 11, 2020
6a70b67
extension(s -> "")_fields
jon-chuang Sep 14, 2020
c83b29d
remove artifacts, fix asm
jon-chuang Sep 14, 2020
3a8e853
uncomment subgroup checks, glv param sources
jon-chuang Sep 14, 2020
d8c5d08
Clean up GLV murkiness and add comments
jon-chuang Sep 22, 2020
a5f4521
Set defaults for glv_window_size
jon-chuang Sep 22, 2020
6b65eda
refactor glv to use examples
jon-chuang Sep 27, 2020
1 change: 1 addition & 0 deletions Cargo.toml
@@ -15,6 +15,7 @@ members = [
"r1cs-core",
"r1cs-std",
"algebra-core/algebra-core-derive",
"scripts/glv_lattice_basis"
]

[profile.release]
4 changes: 4 additions & 0 deletions algebra-benches/Cargo.toml
@@ -32,6 +32,7 @@ paste = "0.1"

[features]
asm = [ "algebra/asm"]
prefetch = [ "algebra/prefetch"]
n_fold = []
mnt4_298 = [ "algebra/mnt4_298"]
mnt6_298 = [ "algebra/mnt6_298"]
@@ -42,6 +43,9 @@ bls12_381 = [ "algebra/bls12_381"]
bls12_377 = [ "algebra/bls12_377"]
cp6_782 = [ "algebra/cp6_782" ]
bw6_761 = [ "algebra/bw6_761" ]
timing = [ "algebra/timing"]
timing_detailed = [ "algebra/timing_detailed" ]
timing_thread_id = [ "algebra/timing_thread_id" ]

[build-dependencies]
rustc_version = "0.2"
2 changes: 2 additions & 0 deletions algebra-benches/src/curves/bw6_761.rs
@@ -9,9 +9,11 @@ use algebra::{
fq::Fq, fq3::Fq3, fr::Fr, Fq6, G1Affine, G1Projective as G1, G2Affine, G2Projective as G2,
Parameters, BW6_761,
},
curves::BatchGroupArithmeticSlice,
BigInteger, Field, PairingEngine, PrimeField, ProjectiveCurve, SquareRootField, UniformRand,
};

batch_arith!();
ec_bench!();
f_bench!(1, Fq3, Fq3, fq3);
f_bench!(2, Fq6, Fq6, fq6);
81 changes: 81 additions & 0 deletions algebra-benches/src/macros/batch_arith.rs
@@ -0,0 +1,81 @@
macro_rules! batch_arith {
    () => {
        #[bench]
        fn bench_g1_batch_mul_affine(b: &mut ::test::Bencher) {
            const SAMPLES: usize = 5000;

            let mut rng = XorShiftRng::seed_from_u64(1231275789u64);

            let mut g: Vec<G1Affine> = (0..SAMPLES)
                .map(|_| G1::rand(&mut rng).into_affine())
                .collect();

            let s: Vec<FrRepr> = (0..SAMPLES)
                .map(|_| Fr::rand(&mut rng).into_repr())
                .collect();

            let now = std::time::Instant::now();
            println!("Start");
            b.iter(|| {
                g[..].batch_scalar_mul_in_place::<FrRepr>(&mut s.to_vec()[..], 4);
                println!("G1 scalar mul batch affine {:?}", now.elapsed().as_micros());
            });
        }

        #[bench]
        fn bench_g1_batch_mul_projective(b: &mut ::test::Bencher) {
            const SAMPLES: usize = 5000;

            let mut rng = XorShiftRng::seed_from_u64(1231275789u64);

            let mut g: Vec<G1> = (0..SAMPLES).map(|_| G1::rand(&mut rng)).collect();

            let s: Vec<Fr> = (0..SAMPLES).map(|_| Fr::rand(&mut rng)).collect();

            let now = std::time::Instant::now();
            b.iter(|| {
                g.iter_mut().zip(&s).for_each(|(p, sc)| p.mul_assign(*sc));
                println!("G1 scalar mul proj {:?}", now.elapsed().as_micros());
            });
        }

        #[bench]
        fn bench_g2_batch_mul_affine(b: &mut ::test::Bencher) {
            const SAMPLES: usize = 5000;

            let mut rng = XorShiftRng::seed_from_u64(1231275789u64);

            let mut g: Vec<G2Affine> = (0..SAMPLES)
                .map(|_| G2::rand(&mut rng).into_affine())
                .collect();

            let s: Vec<FrRepr> = (0..SAMPLES)
                .map(|_| Fr::rand(&mut rng).into_repr())
                .collect();

            let now = std::time::Instant::now();
            println!("Start");
            b.iter(|| {
                g[..].batch_scalar_mul_in_place::<FrRepr>(&mut s.to_vec()[..], 4);
                println!("G2 scalar mul batch affine {:?}", now.elapsed().as_micros());
            });
        }

        #[bench]
        fn bench_g2_batch_mul_projective(b: &mut ::test::Bencher) {
            const SAMPLES: usize = 5000;

            let mut rng = XorShiftRng::seed_from_u64(1231275789u64);

            let mut g: Vec<G2> = (0..SAMPLES).map(|_| G2::rand(&mut rng)).collect();

            let s: Vec<Fr> = (0..SAMPLES).map(|_| Fr::rand(&mut rng)).collect();

            let now = std::time::Instant::now();
            b.iter(|| {
                g.iter_mut().zip(&s).for_each(|(p, sc)| p.mul_assign(*sc));
                println!("G2 scalar mul proj {:?}", now.elapsed().as_micros());
            });
        }
    };
}
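
The affine benchmarks above exercise batched curve arithmetic. Affine point addition needs one field inversion per operation, and the commit list mentions a batch inversion step, so a plausible reading is that the batch code amortizes those inversions with Montgomery's trick. The sketch below illustrates that trick only and is not taken from the PR: the toy modulus `P`, the `u64` field representation, and the helper names `mul`, `pow`, `inv`, and `batch_inverse` are all assumptions made for illustration.

```rust
/// Toy prime modulus 2^61 - 1 (a Mersenne prime). Hypothetical stand-in
/// for a curve's base field; not a parameter used by the PR.
const P: u64 = (1u64 << 61) - 1;

/// Modular multiplication via a 128-bit intermediate product.
fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Square-and-multiply exponentiation modulo P.
fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Single inversion via Fermat's little theorem: a^(P-2) mod P.
fn inv(a: u64) -> u64 {
    pow(a, P - 2)
}

/// Inverts every element in place using exactly one call to `inv`.
/// Assumes every element is nonzero and already reduced modulo P.
fn batch_inverse(elems: &mut [u64]) {
    // prefix[i] holds the product of elems[0..i].
    let mut prefix = Vec::with_capacity(elems.len());
    let mut acc = 1u64;
    for &e in elems.iter() {
        prefix.push(acc);
        acc = mul(acc, e);
    }
    // Invert the total product once, then peel off one factor per step.
    let mut inv_acc = inv(acc);
    for i in (0..elems.len()).rev() {
        let e = elems[i];
        elems[i] = mul(inv_acc, prefix[i]); // equals 1 / e
        inv_acc = mul(inv_acc, e); // now the inverse of elems[0..i]'s product
    }
}
```

For `n` elements this costs one inversion plus roughly `3n` multiplications, which is why batching many affine additions together can compete with projective formulas.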
3 changes: 3 additions & 0 deletions algebra-benches/src/macros/mod.rs
@@ -9,3 +9,6 @@ mod pairing;

#[macro_use]
mod utils;

#[macro_use]
mod batch_arith;
19 changes: 14 additions & 5 deletions algebra-core/Cargo.toml
@@ -29,18 +29,27 @@ num-traits = { version = "0.2", default-features = false }
rand = { version = "0.7", default-features = false }
rayon = { version = "1", optional = true }
unroll = { version = "=0.1.4" }
itertools = { version = "0.9.0", default-features = false }
voracious_radix_sort = { version = "1.0.0", optional = true }
either = { version = "1.6.0", default-features = false }
thread-id = { version = "3.3.0", optional = true }
backtrace = { version = "0.3", optional = true }

[build-dependencies]
field-assembly = { path = "./field-assembly" }
field-assembly = { path = "./field-assembly", optional = true }
rustc_version = "0.2"

[dev-dependencies]
rand_xorshift = "0.2"

[features]
default = [ "std" ]
std = []
parallel = [ "std", "rayon" ]
default = [ "std", "rand/default" ]
std = [ "voracious_radix_sort" ]
parallel = [ "std", "rayon", "rand/default" ]
derive = [ "algebra-core-derive" ]
llvm_asm = []
llvm_asm = [ "field-assembly" ]
prefetch = [ "std" ]

timing = [ "std", "backtrace" ]
timing_detailed = [ "std", "backtrace" ]
timing_thread_id = [ "thread-id" ]
17 changes: 10 additions & 7 deletions algebra-core/build.rs
@@ -1,26 +1,29 @@
use std::env;
use std::fs;
use std::path::Path;

extern crate rustc_version;
use rustc_version::{version_meta, Channel};

use field_assembly::generate_macro_string;
#[cfg(feature = "llvm_asm")]
use {
field_assembly::generate_macro_string,
std::{env, fs, path::Path},
};

#[cfg(feature = "llvm_asm")]
const NUM_LIMBS: usize = 8;

fn main() {
println!("cargo:rerun-if-changed=build.rs");

let is_nightly = version_meta().expect("nightly check failed").channel == Channel::Nightly;

let should_use_asm = cfg!(all(
let _should_use_asm = cfg!(all(
Collaborator

Is this still needed?

feature = "llvm_asm",
target_feature = "bmi2",
target_feature = "adx",
target_arch = "x86_64"
)) && is_nightly;
if should_use_asm {

#[cfg(feature = "llvm_asm")]
if _should_use_asm {
let out_dir = env::var_os("OUT_DIR").unwrap();
let dest_path = Path::new(&out_dir).join("field_assembly.rs");
fs::write(&dest_path, generate_macro_string(NUM_LIMBS)).unwrap();
1 change: 1 addition & 0 deletions algebra-core/field-assembly/Cargo.toml
@@ -8,3 +8,4 @@ edition = "2018"

[dependencies]
mince = { path = "../mince" }
paste = "0.1"
62 changes: 31 additions & 31 deletions algebra-core/field-assembly/src/lib.rs
@@ -13,29 +13,6 @@ use std::cell::RefCell;

const MAX_REGS: usize = 6;

pub fn generate_macro_string(num_limbs: usize) -> std::string::String {
if num_limbs > 3 * MAX_REGS {
panic!(
"Number of limbs must be <= {} and MAX_REGS >= 6",
3 * MAX_REGS
);
}
let mut macro_string = String::from(
"
macro_rules! llvm_asm_mul {
($limbs:expr, $a:expr, $b:expr, $modulus:expr, $mod_prime:expr) => {
match $limbs {",
);
macro_string += &generate_matches(num_limbs, true);

macro_string += &"
macro_rules! llvm_asm_square {
($limbs:expr, $a:expr, $modulus:expr, $mod_prime:expr) => {
match $limbs {";
macro_string += &generate_matches(num_limbs, false);
macro_string
}

#[assemble]
fn generate_llvm_asm_mul_string(
a: &str,
@@ -45,25 +22,25 @@ fn generate_llvm_asm_mul_string(
mod_prime: &str,
limbs: usize,
) -> String {
reg!(a0, a1, a, limbs);
reg!(b0, b1, b, limbs);
reg!(m, m1, modulus, limbs);
reg!(a_reg, a, limbs);
reg!(b_reg, b, limbs);
reg!(m_reg, modulus, limbs);

xorq(RCX, RCX);
for i in 0..limbs {
if i == 0 {
mul_1!(a1[0], b1, zero, limbs);
mul_1!(a_reg[0], b_reg, zero, limbs);
} else {
mul_add_1!(a1, b1, zero, i, limbs);
mul_add_1!(a_reg, b_reg, zero, i, limbs);
}
mul_add_shift_1!(m1, mod_prime, zero, i, limbs);
mul_add_shift_1!(m_reg, mod_prime, zero, i, limbs);
}
for i in 0..limbs {
movq(R[i], a1[i]);
movq(R[i], a_reg[i]);
}
}

fn generate_matches(num_limbs: usize, is_mul: bool) -> String {
fn generate_match_arms(num_limbs: usize, is_mul: bool) -> String {
let mut ctx = Context::new();
for limbs in 2..(num_limbs + 1) {
ctx.reset();
@@ -102,3 +79,26 @@ fn generate_matches(num_limbs: usize, is_mul: bool) -> String {
ctx.end(num_limbs);
ctx.get_string()
}

pub fn generate_macro_string(num_limbs: usize) -> std::string::String {
if num_limbs > 3 * MAX_REGS {
panic!(
"Number of limbs must be <= {} and MAX_REGS >= 6",
3 * MAX_REGS
);
}
let mut macro_string = String::from(
"
macro_rules! llvm_asm_mul {
($limbs:expr, $a:expr, $b:expr, $modulus:expr, $mod_prime:expr) => {
match $limbs {",
);
macro_string += &generate_match_arms(num_limbs, true);

macro_string += &"
macro_rules! llvm_asm_square {
($limbs:expr, $a:expr, $modulus:expr, $mod_prime:expr) => {
match $limbs {";
macro_string += &generate_match_arms(num_limbs, false);
macro_string
}
18 changes: 10 additions & 8 deletions algebra-core/field-assembly/src/utils.rs
@@ -7,14 +7,16 @@ pub const RSI: &'static str = "%rsi";
pub const R: [&'static str; 8] = ["%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15"];

macro_rules! reg {
($a_0:ident, $a_1:ident, $a:ident, $range:expr) => {
let mut $a_0 = Vec::new();
let mut $a_1 = Vec::new();
for i in 0..$range {
$a_0.push(format!("{}({})", i * 8, $a));
}
for i in 0..$range {
$a_1.push(&*$a_0[i]);
($a_reg:ident, $a:ident, $range:expr) => {
paste::item! {
let mut $a_reg = Vec::new();
let mut [<$a_reg _1>] = Vec::new();
for i in 0..$range {
[<$a_reg _1>].push(format!("{}({})", i * 8, $a));
}
for i in 0..$range {
$a_reg.push(&*[<$a_reg _1>][i]);
}
}
};
}
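
Both versions of `reg!` above build one memory-operand string per limb with `format!("{}({})", i * 8, $a)`, then collect `&str` views of those strings into a second vector. The following is a minimal stand-alone sketch of that string construction, written as plain functions instead of a macro. The names `limb_operands` and `borrow_all` are hypothetical and do not appear in the PR.

```rust
/// Builds one offset-style operand per 64-bit limb: "0(base)", "8(base)", ...
/// `base` is whatever operand string the caller passes in.
fn limb_operands(base: &str, limbs: usize) -> Vec<String> {
    (0..limbs).map(|i| format!("{}({})", i * 8, base)).collect()
}

/// Borrows each owned operand as `&str`, mirroring the second vector that
/// `reg!` fills with `&*owned[i]`.
fn borrow_all(owned: &[String]) -> Vec<&str> {
    owned.iter().map(|s| s.as_str()).collect()
}
```

The owned vector has to outlive the borrowed one, which may be why the macro declares both in the caller's scope instead of returning the borrowed vector from a helper.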
42 changes: 42 additions & 0 deletions algebra-core/src/biginteger/macros.rs
@@ -198,6 +198,48 @@ macro_rules! bigint_impl {

                res
            }

            #[inline]
            fn mul_no_reduce(this: &[u64], other: &[u64]) -> Self {
                assert!(this.len() == $num_limbs / 2);
                assert!(other.len() == $num_limbs / 2);

                let mut r = [0u64; $num_limbs];
                for i in 0..$num_limbs / 2 {
                    let mut carry = 0u64;
                    for j in 0..$num_limbs / 2 {
                        r[j + i] =
                            arithmetic::mac_with_carry(r[j + i], this[i], other[j], &mut carry);
                    }
                    r[$num_limbs / 2 + i] = carry;
                }
                Self::new(r)
            }

            #[inline]
            fn mul_no_reduce_lo(this: &[u64], other: &[u64]) -> Self {
                assert!(this.len() == $num_limbs);
                assert!(other.len() == $num_limbs);

                let mut r = [0u64; $num_limbs];
                for i in 0..$num_limbs {
                    let mut carry = 0u64;
                    for j in 0..($num_limbs - i) {
                        r[j + i] =
                            arithmetic::mac_with_carry(r[j + i], this[i], other[j], &mut carry);
                    }
                }
                Self::new(r)
            }

            #[inline]
            fn from_slice(slice: &[u64]) -> Self {
                let mut repr = Self::default();
                for (limb, &value) in repr.0.iter_mut().zip(slice) {
                    *limb = value;
                }
                repr
            }
        }

        impl ToBytes for $name {
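
The `mul_no_reduce` addition in the last hunk is a schoolbook limb-by-limb product: two half-width inputs are multiplied into a full-width result with no modular reduction. The sketch below reproduces that loop outside the macro so it can be run on its own. It assumes `arithmetic::mac_with_carry(a, b, c, &mut carry)` returns the low 64 bits of `a + b * c + carry` and stores the high 64 bits back into `carry`; the local `mac_with_carry` is a stand-in written under that assumption, and the `Vec<u64>` return type replaces the macro's `Self`.

```rust
/// Stand-in for `arithmetic::mac_with_carry` (assumed semantics):
/// returns the low 64 bits of a + b * c + carry, writes the high 64 bits to `carry`.
fn mac_with_carry(a: u64, b: u64, c: u64, carry: &mut u64) -> u64 {
    // The sum is at most 2^128 - 1, so it cannot overflow a u128.
    let tmp = (a as u128) + (b as u128) * (c as u128) + (*carry as u128);
    *carry = (tmp >> 64) as u64;
    tmp as u64
}

/// Full product of two little-endian limb slices of equal length `n`,
/// returned as `2 * n` limbs with no modular reduction.
fn mul_no_reduce(this: &[u64], other: &[u64]) -> Vec<u64> {
    assert_eq!(this.len(), other.len());
    let half = this.len();
    let mut r = vec![0u64; 2 * half];
    for i in 0..half {
        let mut carry = 0u64;
        for j in 0..half {
            // Accumulate this[i] * other[j] into position i + j.
            r[i + j] = mac_with_carry(r[i + j], this[i], other[j], &mut carry);
        }
        // The final carry of row i lands one limb above the row.
        r[half + i] = carry;
    }
    r
}
```

`mul_no_reduce_lo` in the diff follows the same pattern but takes full-width inputs and stops each row at the top limb, so only the low half of the product is kept and the final carry of each row is discarded.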