[tl;dr]: In this blogpost we show how to extract the key of an ed25519 keypair if its secret is also used to generate a Verifiable Random Function proof. We create a script that performs such an extraction over libsodium.
In the pre-blockchain era, the typical advise from cryptographers was ``Don't roll your own crypto!''. The intention behind this advice is to avoid security pitfalls, which are less likely to exist in rigorously researched inventions. However, this advice is muffled by the accelerated advancement of the blockchain space, where new cryptography is invented and rolled out every now and then. In this article, we re-emphasize the need for rigorous security analysis of every new crypto, by demonstrating how a natural shortcut can lead to a catastrophic consequence.
Often, cryptographers get asked if the same secret key for different algorithms. This natural shortcut is bad practice - one should only use the secret key for its intended purpose, even if it's 'only' 32 bytes. The common answer, specifically if replied generically, is 'no'. However, the reasons usually quoted -- namely, security analysis being done only in the standalone setting whereby there are no security guarantees on usages that deviate from the prescribed method -- tend to be too abstract.
Specifically, ed25519 signatures and VRFs (verifiable random functions) are used in Cardano. Given their similarly structured public keys, one may be tempted to use the same secret key for both the primitives. However, doing so can allow an adversary to easily extract the secret key. Note that cryptosystems that are used in Cardano are proven secure -- see ed25519 and ECVRF, however in standalone settings; i.e., ed25519 is proven secure in the setting where an adversary has access to only signatures.
NOTE: this does not imply that using the same secret keys for any two distinct cryptosystems discloses the secret. However, it should serve as an example that if one is not certain of whether using the same key for two algorithms is secure or not, then the assumption should be that it is not.
We will begin by explaining Schnorr signatures, which was the basis of Ed25519 and ECVRF designs.
Schnorr signatures have been around for quite some years, and they are
deployed widely in many applications. In Cardano, following our valentine
upgrade, we introduced native support for Schnorr signatures over curve SECP256k1
in Plutus. Schnorr signatures are simply
sigma protocols that are
tied to a message. In particular, let
- [P] selects a random scalar
$k$ , computes$R = k \cdot G$ , and sends$R$ to the verifier - [V] selects a random scalar
$c$ , and sends it to the prover - [P] computes
$s = k + c * sk$ , and sends it to the verifier - [V] accepts if and only if
$s * G = R + c * vk$ .
Note that we have just described an interactive protocol, where there is no message involved. We will now describe how it is transformed into a message signing non-interactive version. A typical method used in Cryptography is the so-called Fiat-Shamir transformation that replaces the random challenges with outputs of a random oralce, where the input to the random oracle is the transcript thus far. Furthermore, in order to link a signature (as described above) to a message, the latter is inlcuded when computing the hash that defines the challenge.
For readability, in this blogpost we describe all algorithms in their interactive version and note that any of them can be made non-interactive via the Fiat-Shamir heuristic. Furthermore, for simplicity, we omit specifying the message from the Schnorr-like signature schemes descriptions.
Subtle deviations from the protocol can be catastrophic. One such example is producing two signatures
that share the same value
Fortunately, if the value
The signature scheme, ed25519, was introduced by Bernstein, Duif, Lange, Schwabe and Yang.
Essentially, this signature scheme is a Schnorr signature scheme but defined specifically
over curve Edwards25519 for efficiency and security considerations. We don't need to cover
the details of this curve or the benefits it brings. However, one important detail introduced
in ed25519 which differs with Schnorr is that signatures are deterministic, i.e. the
randomness used in the announcement (first step of the prover), is computed using a hash
function, rather than sampling the value
- [P] selects a (pseudo)random scalar
$k = \texttt{KDF}(sk || m, 1)$ , computes$R = k \cdot G$ , and sends$R$ to the verifier - [V] selects a random scalar
$c$ , and sends it to the prover - [P] computes
$s = k + c * \texttt{KDF}(sk, 0)$ , and sends it to the verifier - [V] accepts if and only if
$s * G = R + c * vk$ .
One can see how closely related both algorithms (Schnorr and ed25519) are. Note that in this secret key generation, the secret key is not used to multiply the elliptic curve base point, but instead to derive two scalars. As noted above, this algorithm can be made non-interactive via the Fiat-Shamir heuristic.
The ECVRF is yet another protocol that is highly inspired from Schnorr in general, and ed25519 in particular. The
concrete scheme that we look into is ECVRF-EDWARDS25519-SHA512-ELL2, from the VRF irtf draft.
A VRF, namely a Verifiable Random Function, allows a prover to create some pseudorandom value associated with
its private key, and prove that it did so correctly. The details of why that is useful or how it is used are not
relevant here. However, let's have a look at how it works. In this algorithm we use a different hash function,
- [P] computes
$L \gets H _ {s2c}(vk, m)$ and$\Gamma \gets \texttt{KDF}(sk, 0)\cdot L$ . Next, it selects a (pseudo)random scalar$k = \texttt{KDF}(sk || L, 1)$ , and computes$R = k \cdot G$ , and$P\gets k \cdot L$ . It sends$R, \Gamma, P$ to the verifier. - [V] selects a random scalar
$c$ , and sends it to the prover - [P] computes
$s = k + c * \texttt{KDF}(sk, 0)$ , and sends it to the verifier - [V] computes
$L \gets H _ {s2c}(vk, m)$ , and accepts if and only if$s * G = R + c * vk$ and$s * L = P + c * \Gamma$ .
Clearly, for the same secret key, both algorithms ed25519 and ECVRF have the same public key. This may seem a highly convenient opportunity to use the same secret key to reduce the burden on the user to remember two keys or to have any complex key derivation mechanisms in place. However, unfortunately, using the same secret key can turn out to be catastrophic.
If the same secret key is used for
the two algorithms, then an adversary could trick an ed25519 signer to effectively reveal her secret key. The adversary's strategy was hinted to earlier in the article. The idea is that the adversary can trick an ed25519 signer produce the same value of
To showcase the simplicity of the attack, we've implemented a simple script that, given a VRF proof for any message, requests the key owner to sign a particular message with ed25519, and this results in an extraction of the secret key. We show how the forged signature is accepted by libsodium's ed25519 verifier.
We begin by defining the message we'll use for the VRF proof, and initialising some variables.
#define MESSAGE (const unsigned char *) "yup"
#define MESSAGE_LEN 3
// The message that we need to craft in order to extract the key is a value
// publicly available. However, libsodium does not export the functions to
// compute it. Nonetheless, it is computed internally. To simplify our lives,
// we slightly modify libsodium VRF verifier to return the crafted message.
unsigned char crafted_msg[32], proof[80], sig[crypto_sign_BYTES], pk[crypto_sign_PUBLICKEYBYTES];
Next, we create the scope of the signer, over which we have no access when faking the signature.
{
unsigned char sk[crypto_sign_SECRETKEYBYTES];
crypto_sign_keypair(pk, sk);
// Now let's use these keys for vrf generation.
crypto_vrf_ietfdraft03_prove(proof, sk, MESSAGE, MESSAGE_LEN);
// Now, we have a proof that consists of 80 bytes that correspond to:
// * 32 bytes of an EC point that we can ignore
// * 16 bytes of a challenge C = H(pk, H, Gamma, U, V), where the values
// of H, Gamma, U and V are irrelevant.
// * 32 bytes of a scalar s' = k + C' * az
// where k = H(z || m), with z = H(sk)[32..], and az = H(sk)[..32].
unsigned char random_output[64];
if (crypto_vrf_ietfdraft03_verify(random_output, crafted_msg, pk, proof, MESSAGE, MESSAGE_LEN))
printf("failed VRF\n \n");
// Now we use the same key to create an ed25519 signature for the crafted message. Note
// that the only 'trick' we are doing is asking the signer to sign a particular message, after
// she has used the key to create a VRF proof. We do not access the secret key in any other way.
crypto_sign_detached(sig, NULL, crafted_msg, 32, sk);
if (crypto_sign_verify_detached(sig, crafted_msg, 32, pk))
printf("failed on ed25519 generation");
// Now we should have a 64 bytes signature that corresponds to:
// * first 32 bytes represent the point R = k * G, where k = H(z || m)
// where z = H(sk)[32..]
// * second 32 bytes represent a scalar s = k + az * HRAM
// where HRAM = H(R || pk || m), and az = H(sk)[..32]
}
As we said above, the problem is that the two challenges are different. What does this mean?
That we can extract the secret key. Mainly, we have that:
s - s' = (c - c') * az <=> az = (s - s') / (c - c')
Let's try to break it:
unsigned char c[64], cprime[32];
// First we need to compute c, as it is not given in the ed25519 signature. This is done
// using public values.
crypto_hash_sha512_state hs;
crypto_hash_sha512_init(&hs);
crypto_hash_sha512_update(&hs, sig, 32);
crypto_hash_sha512_update(&hs, pk, 32);
crypto_hash_sha512_update(&hs, crafted_msg, 32);
crypto_hash_sha512_final(&hs, c);
crypto_core_ed25519_scalar_reduce(c, c);
// Now we simply copy the challenge into a 16 byte string
memcpy(cprime, proof + 32, 16);
memset(cprime + 16, 0, 16); // Just for sanity.
// Now we have all we need, let's extract the secret.
unsigned char cminuscprimeinv[32], extracted_skey[32], extracted_pkey[32];
crypto_core_ed25519_scalar_sub(extracted_skey, sig + 32, proof + 48);
crypto_core_ed25519_scalar_sub(cminuscprimeinv, c, cprime);
crypto_core_ed25519_scalar_invert(cminuscprimeinv, cminuscprimeinv);
crypto_core_ed25519_scalar_mul(extracted_skey, extracted_skey, cminuscprimeinv);
crypto_scalarmult_ed25519_base_noclamp(extracted_pkey, extracted_skey);
So now, let's create a fake ed25519 signature for message {0}
not signed before.
We cannot use the normal API because the algorithm uses the preimage of an extension
of the key we have extracted. With the algorithm described above, we cannot access this
pre-image that the API expects. However, the 'missing' data is not necessary to forge
a signature. Goes without saying that the adversary now can create invalid VRF proofs.
unsigned char nonce_fake[32], challenge[64], sig_fake[64], reduced_c[32];
crypto_hash_sha512_state hs_f;
unsigned char msg[32] = {0};
// commitment
crypto_core_ed25519_scalar_random(nonce_fake);
crypto_scalarmult_ed25519_base_noclamp(sig_fake, nonce_fake);
// challenge
crypto_hash_sha512_init(&hs_f);
crypto_hash_sha512_update(&hs_f, sig_fake, 32);
crypto_hash_sha512_update(&hs_f, extracted_pkey, 32);
crypto_hash_sha512_update(&hs_f, msg, 32);
crypto_hash_sha512_final(&hs_f, challenge);
crypto_core_ed25519_scalar_reduce(reduced_c, challenge);
// response
crypto_core_ed25519_scalar_mul(sig_fake + 32, reduced_c, extracted_skey);
crypto_core_ed25519_scalar_add(sig_fake + 32, sig_fake + 32, nonce_fake);
While we didn't use the exposed API to generate the fake proof, we can use the usual API to verify it, i.e. there is no need to modify the verification algorithm in order to accept crafted signatures.
if (crypto_sign_verify_detached(sig_fake, msg, 32, pk))
printf("Failed to fake ed25519\n");
else
printf("Successfully faked an ed25519 sig\n");
And if one runs the script (see the details in the README.md file), we can see that we successfully faked an ed25519 signature!
While we should not design cryptographic algorithms to be able to share their secret keys, we should acknowledge that unfortunately that is something that highly attracts engineers. One existing proposal to solve the problem of deterministic generation of the nonce (that also caused issues in some libraries that had an incorrect API) was to combine determinism and secure source of randomness, so that the algorithm would have a flaw only if both sources somehow failed.
Another simple solution is to use a domain separation when computing the value of k
, i.e.
using some sort of padding in the hash function (the same way it is done with the suite_string
in
the output computation of the VRF) to ensure that there is no match between the randomness
used in VRF and that of ed25519.
Of course, the best solution, and the one suggested in this blogpost, is to not share the secret keys among different cryptosystems.
Thanks to my wonderful colleagues, Gamze Kilic, David Nevado, and Vanishree Rao for comments and review!
Footnotes
-
A non-expert reader should not be concerned of what this really means. Simply one should trust that extracting sk from pk is computationally hard and that operations over elliptic curve points are associative and cyclic. ↩
-
A key derivation function can simply be seen as a function that, given a secret seed, derives some secret value in a deterministic way. ↩