
Halve Commit Size with Ed25519 and no HSM changes #7892

Closed
ValarDragon opened this issue Sep 10, 2019 · 7 comments
Labels
C:crypto Component: Crypto stale for use by stalebot T:research

Comments

@ValarDragon
Contributor

ValarDragon commented Sep 10, 2019

This is a proposal to halve the current Tendermint commit size when using Ed25519, with no changes required to hardware. I expect the verifier time to be notably improved.

The high-level idea is to use the Ed25519 batch-verification algorithm, with the verifier's randomness obtained by hashing the entire commit. Ed25519 signatures are composed of two equal-sized parts (R, s). Using batch verification, the verifier needs {R_i}_{i in validators}, but only needs a random linear combination of all the s_i's. Thus, the commit will send every R_i, but only the single aggregated s. (This still must be proven sound.)

Let G be the generator of the Ed25519 group, z_i be the i-th random coefficient, P_i be the i-th validator's pubkey, m_i be the i-th message being signed, (R_i, s_i) be the i-th Ed25519 signature, and h_i = H(R_i, P_i, m_i). (Capitals are curve points, lower-case letters are scalars.) The Ed25519 batch-verification equation that must be checked by the verifier is:

(-\sum_{i} z_i s_i)G + (\sum_{i} z_i R_i) + (\sum_{i} z_i h_i * P_i) = 0

One way to check the equation is to give the verifier (-\sum_{i} z_i s_i), and all of the R_i.
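To make the batch-verification check concrete, here is a minimal sketch in a toy Schnorr group over Z_p, written multiplicatively (so "s*G" becomes g**s mod p). The group parameters, the hash-to-scalar, and the coefficient derivation are illustrative stand-ins, not real Ed25519:

```python
import hashlib
import secrets

# Toy Schnorr group standing in for Ed25519, written multiplicatively
# (the additive "s*G" becomes g**s mod p).  Parameters are NOT secure.
q = 1019          # prime order of the subgroup
p = 2 * q + 1     # 2039, also prime
g = 4             # generator of the order-q subgroup of Z_p^*

def H(*parts):
    """Hash-to-scalar, used for h_i = H(R_i, P_i, m_i) and the z_i."""
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def keygen():
    a = secrets.randbelow(q - 1) + 1          # secret scalar
    return a, pow(g, a, p)                    # pubkey A = g^a

def sign(a, A, m):
    r = secrets.randbelow(q - 1) + 1
    R = pow(g, r, p)
    s = (r + H(R, A, m) * a) % q              # key-prefixed challenge
    return R, s

def batch_verify(sigs, pubkeys, msgs, zs):
    # Multiplicative form of the equation in the text:
    #   g^{sum z_i s_i}  ==  prod_i R_i^{z_i} * P_i^{z_i h_i}
    lhs = pow(g, sum(z * s for z, (_, s) in zip(zs, sigs)) % q, p)
    rhs = 1
    for z, (R, _), A, m in zip(zs, sigs, pubkeys, msgs):
        h = H(R, A, m)
        rhs = rhs * pow(R, z, p) * pow(A, z * h % q, p) % p
    return lhs == rhs

keys = [keygen() for _ in range(4)]
msgs = [f"vote-{i}" for i in range(4)]
sigs = [sign(a, A, m) for (a, A), m in zip(keys, msgs)]
zs = [H("z", i, *(R for R, _ in sigs)) for i in range(4)]
print(batch_verify(sigs, [A for _, A in keys], msgs, zs))  # True
```

In this form the left-hand side only needs the scalar sum z_i s_i, which is what motivates shipping the single aggregated scalar instead of every s_i.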

The proposed procedure is:

  1. The next proposer aggregates all Ed25519 signatures (R_i, s_i) as before
  2. The next proposer builds a pseudo-commit that contains all R_i, and timestamps_i for all validators. (Perhaps in a merkle tree, see optimization 1)
  3. The proposer gets random coefficients {z_i} from H(pseudo-commit), where each z_i is 128 bits. (The hash can be run in XOF mode)
  4. The proposer builds the commit as {(-\sum_{i} z_i s_i), pseudo-commit}
  5. Broadcast as before
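Steps 1-4 above can be sketched in the same toy Schnorr group (insecure parameters; indexing a SHA-256 hash of the pseudo-commit stands in for running the hash in XOF mode, and all names are illustrative):

```python
import hashlib
import secrets

# Toy sketch of proposer steps 1-4 (Schnorr group standing in for
# Ed25519; parameters are NOT secure).
q, p, g = 1019, 2039, 4    # p = 2q + 1; g generates the order-q subgroup

def H(*parts):
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def keygen():
    a = secrets.randbelow(q - 1) + 1
    return a, pow(g, a, p)

def sign(a, A, m):
    r = secrets.randbelow(q - 1) + 1
    R = pow(g, r, p)
    return R, (r + H(R, A, m) * a) % q

def coeffs(pseudo_commit, n):
    # Step 3: z_i from H(pseudo-commit); indexing the hash stands in
    # for running it in XOF mode.
    return [H("z", i, *pseudo_commit) for i in range(n)]

def build_commit(sigs, timestamps):
    # Step 2: the pseudo-commit carries every R_i and timestamp_i.
    pseudo_commit = [x for (R, _), t in zip(sigs, timestamps) for x in (R, t)]
    zs = coeffs(pseudo_commit, len(sigs))
    # Step 4: only the single aggregated scalar -sum z_i s_i is sent.
    s_agg = -sum(z * s for z, (_, s) in zip(zs, sigs)) % q
    return s_agg, pseudo_commit

keys = [keygen() for _ in range(4)]
sigs = [sign(a, A, "block#100") for a, A in keys]
s_agg, pc = build_commit(sigs, timestamps=[1568073600 + i for i in range(4)])
```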

To verify this, as either a full node or as a lite-client:

  1. parse the commit as {(-\sum_{i} z_i s_i), pseudo-commit}
  2. get random coefficients {z_i} from H(pseudo-commit)
  3. Check the batch-verification equation
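The verifier side of the three steps, continuing the same toy Schnorr-group sketch (insecure parameters, hash indexing in place of an XOF; all names are illustrative):

```python
import hashlib
import secrets

# Toy verifier for the aggregated commit (Schnorr group standing in for
# Ed25519; parameters are NOT secure).
q, p, g = 1019, 2039, 4   # p = 2q + 1; g generates the order-q subgroup

def H(*parts):
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def sign(a, A, m):
    r = secrets.randbelow(q - 1) + 1
    R = pow(g, r, p)
    return R, (r + H(R, A, m) * a) % q

def verify_commit(commit, pubkeys, msgs):
    s_agg, pseudo_commit = commit                     # 1. parse the commit
    Rs = pseudo_commit[0::2]                          # R_i's (timestamps sit at odd slots)
    zs = [H("z", i, *pseudo_commit) for i in range(len(Rs))]   # 2. coefficients
    # 3. batch equation with s_agg = -sum z_i s_i:
    #    g^{s_agg} * prod_i R_i^{z_i} * P_i^{z_i h_i}  ==  1
    acc = pow(g, s_agg, p)
    for z, R, A, m in zip(zs, Rs, pubkeys, msgs):
        h = H(R, A, m)
        acc = acc * pow(R, z, p) * pow(A, z * h % q, p) % p
    return acc == 1

# Build a commit exactly as the proposer would, then verify it.
secret_keys = [secrets.randbelow(q - 1) + 1 for _ in range(4)]
pubkeys = [pow(g, a, p) for a in secret_keys]
msgs = ["block#100"] * 4
sigs = [sign(a, A, m) for a, A, m in zip(secret_keys, pubkeys, msgs)]
pc = [x for (R, _), t in zip(sigs, [100, 101, 102, 103]) for x in (R, t)]
zs = [H("z", i, *pc) for i in range(4)]
commit = (-sum(z * s for z, (_, s) in zip(zs, sigs)) % q, pc)
ok = verify_commit(commit, pubkeys, msgs)
print(ok)  # True
```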

It remains to show that checking the equation given (-\sum_{i} z_i s_i) instead of all s_i is secure. I do not currently have a proof, but I believe this is true. (EDIT: See following comment) The reason it feels true is that we are taking a random linear combination of the relevant curve points (which already depend on the public keys and the messages), and we are essentially asking for the discrete logarithm of that combination. So if you didn't know any one constituent signature, it seems unlikely you could know the resulting discrete logarithm. Random linear combinations of this sort typically have soundness error 1/(randomness sample size), and we choose each z_i from a set of size 2^128. Since h_i is key-prefixed, and therefore distinct for every pubkey, this gives some confidence against BLS-style rogue public key attacks.

Additional optimizations:

If the verifier wants to take the optimization where they only verify a particular subset D of size (2/3 weight) of the validators who signed the prior block, then we've already assumed a single round of communication from the verifier to their prover in which they communicated D. In this model, the verifier can still achieve this same optimization. We redefine the commit to now be {(-\sum_{i} z_i s_i), MT of pseudo-commit}. The verifier receives {commit, {R_i, T_i}_{i \in D}, multi-membership proof for {R_i, T_i}_{i \in D} in pseudo-commit, (-\sum_{i \in D} z'_i s_i)}. The coefficients z'_i are all obtained by getting sufficient bytes from H({R_i, T_i}_{i \in D}).
Essentially the verifier receives a commitment to the commit (all signatures in a merkle tree), and is sent a multi-membership proof of all the signatures of interest. The random-coefficient optimization is the same, but only over the received signatures, so the size of the leaves is still halved.
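A sketch of the subset variant in the same toy group. The Merkle commitment and multi-membership proof are elided (here the verifier is simply handed the leaves for D), so only the coefficient derivation and aggregation changes are shown; all names are illustrative:

```python
import hashlib
import secrets

# Toy sketch of the subset optimization (Schnorr group standing in for
# Ed25519; parameters are NOT secure).  The Merkle root and the
# multi-membership proof for {R_i, T_i}_{i in D} are elided: the
# verifier is simply handed the leaves for D.
q, p, g = 1019, 2039, 4

def H(*parts):
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def subset_aggregate(sigs, timestamps, D):
    leaves = [x for i in D for x in (sigs[i][0], timestamps[i])]
    zs = {i: H("z'", i, *leaves) for i in D}          # z'_i from H({R_i, T_i}_{i in D})
    s_agg = -sum(zs[i] * sigs[i][1] for i in D) % q   # -sum_{i in D} z'_i s_i
    return s_agg, leaves

def subset_verify(s_agg, leaves, D, pubkeys, msgs):
    zs = {i: H("z'", i, *leaves) for i in D}
    Rs = dict(zip(D, leaves[0::2]))
    acc = pow(g, s_agg, p)
    for i in D:
        h = H(Rs[i], pubkeys[i], msgs[i])
        acc = acc * pow(Rs[i], zs[i], p) * pow(pubkeys[i], zs[i] * h % q, p) % p
    return acc == 1

# Six validators sign; the verifier only checks the subset D.
keys, sigs = [], []
for _ in range(6):
    a = secrets.randbelow(q - 1) + 1
    A = pow(g, a, p)
    r = secrets.randbelow(q - 1) + 1
    R = pow(g, r, p)
    keys.append((a, A))
    sigs.append((R, (r + H(R, A, "block#100") * a) % q))
D = [0, 1, 2, 3]
s_agg, leaves = subset_aggregate(sigs, [100 + i for i in range(6)], D)
ok = subset_verify(s_agg, leaves, D, [A for _, A in keys], ["block#100"] * 6)
print(ok)  # True
```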

@ValarDragon
Contributor Author

Soundness condition: Suppose that the prover aggregates any N purported signatures {(R_i, s_i)}_{i in N} from pubkeys {A_i} on messages {m_i} respectively. Furthermore, suppose that Ed25519Verify((R_j, s_j), A_j, m_j) = false for some j in N, and the prover has no knowledge of the secret key a_j that corresponds to A_j, nor of a valid signature from A_j on m_j. Then it would take the prover on average at least 2^{security parameter} field operations to compute a valid aggregated signature. (We implicitly assume that the size of the random coefficients chosen is >= the security parameter)

Proof sketch:
(Assuming a prime order subgroup, imo we can just force everything to be in the prime-order subgroup of Ed25519)
The prover finding an s_{agg} that passes the verification implies knowledge of

sum_{i in N} z_i r_i + z_i H(R_i, A_i, m_i) a_i

First note that r_j + H(R_j, A_j, m_j) a_j is computationally indistinguishable from r_j + h_j a_j, for a random h_j, by security of the hash function / random oracle.
By assumption, the prover does not know a valid signature from A_j on m_j, or a_j. Because of this, the value u_j = r_j + h_j a_j is indistinguishable from random under the discrete logarithm assumption. (See self-reducibility of the discrete logarithm)

To satisfy the system, the prover must provide z_j * u_j + sum_{i in N \ j} z_i n_i, for some values n_i. z_j is chosen independently of the n_i and u_j (by security of the hash function / random oracle). Since u_j is indistinguishable from random, and since z_j is chosen independently of everything else, it follows that z_j * u_j is indistinguishable from the uniform random distribution, and is independent of the distribution of sum_{i in N \ j} z_i n_i. Therefore the prover can at best guess the value u_j, which succeeds with probability at most 2^{-128}.

Thus the probability of breaking soundness is max(hash preimage, discrete log, 2^{-128}) ~= 2^{-128}, proving this system secure.

This proof is really a more detailed sketch, and isn't careful with the random distributions (128-bit vs. random in the field). It should be placed in a game-based definition, where the prover is given oracle access to A_j; however, it seems clear to me that this can be directly translated to the game-based definition.

@ValarDragon
Contributor Author

Similar to BLS aggregation within txs in a block, this optimization should apply equally to combining Ed25519 / Ristretto-subgroup EdDSA signatures across txs in a block, for a factor of 2 space savings.

@liamsi
Contributor

liamsi commented Sep 13, 2019

Pretty exciting @ValarDragon! So the only thing missing is a detailed write-up of the above proof sketch, or am I missing something else here?

@ValarDragon
Contributor Author

I think we just need to do that, and talk to other people to show we're not missing something.
We can probably simplify this into two components, though. First, check that the following procedure is a secure batch proof of knowledge of discrete logarithm (Batch PoK of DL):

Given a set of N elliptic curve points {P_i}_{i \in N} in a prime order curve, the following protocol proves that the prover knows the discrete logarithm of all P_i:
0) Prover commits to all of P_i
1) Hash all of the P_i to obtain random coefficients {z_i}_{i in N}
2) Prover provides `s_{agg}` which is the claimed discrete logarithm of `\sum z_i * P_i`
3) Verifier checks that `s_{agg} * g = \sum z_i * P_i`
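The four steps above can be sketched in the same toy Schnorr group (insecure parameters; a Fiat-Shamir hash of the points stands in for the commit-then-challenge rounds, and all names are illustrative):

```python
import hashlib
import secrets

# Toy sketch of the Batch PoK-of-DL protocol (small Schnorr group;
# parameters are NOT secure).
q, p, g = 1019, 2039, 4   # p = 2q + 1; g generates the order-q subgroup

def H(*parts):
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

# Step 0: the prover's points P_i = g^{x_i} serve as the commitment here.
xs = [secrets.randbelow(q - 1) + 1 for _ in range(5)]
Ps = [pow(g, x, p) for x in xs]

# Step 1: Fiat-Shamir coefficients from a hash of all the P_i.
zs = [H("z", i, *Ps) for i in range(len(Ps))]

# Step 2: the prover reveals s_agg, the discrete log of prod P_i^{z_i}.
s_agg = sum(z * x for z, x in zip(zs, xs)) % q

# Step 3: the verifier checks g^{s_agg} == prod P_i^{z_i}.
rhs = 1
for z, P in zip(zs, Ps):
    rhs = rhs * pow(P, z, p) % p
ok = pow(g, s_agg, p) == rhs
print(ok)  # True
```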

Then we need to prove that soundness of this aggregation does in fact reduce to the above. The proof sketch I made earlier was combining both of these, but splitting it up should hopefully make it easier to follow.

It turns out that the Batch PoK of DL is already proven in stronger forms!

  • In http://www.ccs.neu.edu/home/koods/papers/gennaro04batching.pdf, Theorem 1 proves the interactive version of this secure, so you just apply the standard Fiat-Shamir transform to get the protocol I described above. Further, they prove it using verifier de-randomization techniques, so instead of hashing n coefficients (n different z_i), you only have to hash one random element z and then take powers of z. This weakens the soundness of batching from 1/|F| to n/|F|, but this doesn't impact our overall soundness, since |F| = 2^256 and the soundness is already 1/2^128 from discrete log.

  • Something similar is going on in https://eprint.iacr.org/2018/414.pdf, Section 5.2. That paper uses a very similar technique to achieve this optimization, but their verifier should be faster. Essentially they change the Schnorr / Ed25519 equation to randomize each component separately, which removes the need for the z_i and allows faster verification / simpler aggregation. Since they changed the equation, it breaks HSM support, and we can't directly re-use their proof, though.
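The verifier de-randomization from the first bullet can be sketched in the same toy group: hash once to get a single z, then use the powers 1, z, z^2, ... as the coefficients (insecure parameters, illustrative names):

```python
import hashlib
import secrets

# Toy sketch of verifier de-randomization (small Schnorr group;
# parameters are NOT secure): instead of n independent coefficients
# z_i, hash once for z and use the powers z^0, z^1, ..., z^{n-1}.
q, p, g = 1019, 2039, 4
n = 5

def H(*parts):
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

xs = [secrets.randbelow(q - 1) + 1 for _ in range(n)]
Ps = [pow(g, x, p) for x in xs]

z = H("z", *Ps)                        # one hash instead of n
zs = [pow(z, i, q) for i in range(n)]  # coefficients 1, z, z^2, ... (mod q)

s_agg = sum(zi * x for zi, x in zip(zs, xs)) % q
rhs = 1
for zi, P in zip(zs, Ps):
    rhs = rhs * pow(P, zi, p) % p
ok = pow(g, s_agg, p) == rhs
print(ok)  # True
```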

I think the presence of the first bullet point really just means that we should sanity check this with some cryptographers, and can then proceed.

@zmanian
Contributor

zmanian commented Sep 13, 2019

This seems an awful lot like the straw-man scheme in Section 4 of Yao's Gamma Signatures:

https://eprint.iacr.org/2018/414.pdf

The rogue-key attack makes the Section 4 scheme somewhat less useful for combining transaction signatures, but it is somewhat fine for consensus signatures, where you can require a proof of knowledge of the discrete log when registering.

@ValarDragon
Contributor Author

The reason Section 4 has that attack is because they don't take a random linear combination; they just do a straightforward addition. The random linear combination here is what gives this scheme its security. This is reminiscent of the idea in https://crypto.stanford.edu/~dabo/pubs/abstracts/BLSmultisig.html.

@tac0turtle tac0turtle added the C:crypto Component: Crypto label Sep 29, 2020
@ValarDragon
Contributor Author

This scheme (with derandomization) is the same scheme as proposed in: https://eprint.iacr.org/2021/350.pdf !

So we inherit the proof of security from there =)
