Replace hash_to_base PRG with HKDF-Expand. #141

chris-wood · 2019-07-04T15:49:15Z

Addresses #137.

kwantam

This is great.

It wasn't clear to me that it would be quite this easy to invoke HKDF, because I was thinking that the size of the PRK argument to HKDF was restricted to the output length of H, rather than the block size of H. But as far as I can tell, this mostly works for the hash functions we'd expect people to use (but see below).

~~We will have to stage this and #139 because the definition of m' changes in that PR. (EDIT: #139 is now merged, so we'll have to do some conflict resolution here.)~~

Assuming we go with the string "HASH-TO-CURVE" as in #139, the length of m' is 14 + H_output_len bytes. Just for sanity, let's see whether this works with the hash functions we might care to use:

- 224-bit hashes: len(m') = 42 bytes
  - SHA2-224 block size = 64 bytes ✔️
  - SHA3-224 block size = 144 bytes ✔️
  - BLAKE2s224 block size = 64 bytes ✔️
- 256-bit hashes: len(m') = 46 bytes
  - SHA2-256 block size = 64 bytes ✔️
  - SHA3-256 block size = 136 bytes ✔️
  - BLAKE2s256 block size = 64 bytes ✔️
  - BLAKE2b256 block size = 128 bytes ✔️
- 384-bit hashes: len(m') = 62 bytes
  - SHA2-384 block size = 128 bytes ✔️
  - SHA3-384 block size = 104 bytes ✔️
  - BLAKE2b384 block size = 128 bytes ✔️
- 512-bit hashes: len(m') = 78 bytes
  - SHA2-512 block size = 128 bytes ✔️
  - SHA3-512 block size = 72 bytes ❌ hmmmm
  - BLAKE2b512 block size = 128 bytes ✔️

So the only one that's a little weird is SHA3-512. Maybe we should consider replacing "HASH-TO-CURVE" with "H2CURVE", which is 6 bytes shorter and thus works with SHA3-512, or even just "H2C" as @samscott89 has suggested elsewhere.

(Note that len(m') > block size isn't fatal---it just requires, per RFC2104, hashing m' again to give an HMAC key that is shorter than the block length. We should probably avoid this.)
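The check above can be reproduced mechanically. The following is a minimal sketch that hard-codes the block and output sizes quoted in this comment; the names `HASHES`, `PREFIX_OVERHEAD`, and `key_fits` are made up for illustration and are not part of the draft.

```python
# (block size, output size) in bytes, as listed in the comment above.
HASHES = {
    "SHA2-224": (64, 28), "SHA3-224": (144, 28), "BLAKE2s224": (64, 28),
    "SHA2-256": (64, 32), "SHA3-256": (136, 32), "BLAKE2s256": (64, 32),
    "BLAKE2b256": (128, 32),
    "SHA2-384": (128, 48), "SHA3-384": (104, 48), "BLAKE2b384": (128, 48),
    "SHA2-512": (128, 64), "SHA3-512": (72, 64), "BLAKE2b512": (128, 64),
}

# From the comment: len(m') = 14 + H_output_len when the string is "HASH-TO-CURVE".
PREFIX_OVERHEAD = 14


def key_fits(block_len: int, out_len: int, overhead: int = PREFIX_OVERHEAD) -> bool:
    """True if m' is no longer than one block, so HMAC need not re-hash the key."""
    return overhead + out_len <= block_len


# Hash functions for which m' would exceed the block size.
too_long = [name for name, (b, o) in HASHES.items() if not key_fits(b, o)]
print(too_long)  # ['SHA3-512']

# "H2CURVE" is 6 bytes shorter than "HASH-TO-CURVE".
too_long_short_tag = [
    name for name, (b, o) in HASHES.items()
    if not key_fits(b, o, PREFIX_OVERHEAD - 6)
]
print(too_long_short_tag)  # []
```

With the shorter tag, SHA3-512 lands exactly on the block boundary (72 bytes), which is why the shorter strings are attractive.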

kwantam · 2019-07-04T23:03:15Z

An alternative suggestion:

What if we did

m' = H(msg)
for i in (1, ..., m):
  info = "H2C" || I2OSP(ctr, 1) || I2OSP(i, 1)
  t = HKDF-Expand-H(m', info, L)
  e_i = OS2IP(t) mod p
return u = (e_1, ..., e_m)

This guarantees that the HMAC key (m') is shorter than the block length. Moreover, the info argument to HKDF-Expand is 5 bytes, which guarantees the minimum possible number of H invocations in HKDF-Expand (4 per iteration, because of HMAC) for the major hash functions I'm aware of (see below).

Also, aesthetically this is slightly nicer, since it moves all of the hash-to-curve--specific domain separation pieces ("H2C", ctr, i) into one place.

Finally, this change sort of anticipates the suggestion in my next comment.
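For concreteness, here is a rough Python sketch of the pseudocode above, using only `hashlib` and `hmac` from the standard library. The function names, the default hash (SHA-256), and the default `k = 128` are assumptions made for this example; `hkdf_expand` follows the HKDF-Expand definition in RFC 5869.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """HKDF-Expand as described in RFC 5869, built on the hmac module."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def hash_to_base(msg: bytes, ctr: int, p: int, m: int = 1, k: int = 128,
                 hash_name: str = "sha256") -> tuple:
    """Sketch of the proposal: m' = H(msg), then one HKDF-Expand per coordinate."""
    # L = ceil((ceil(log2(p)) + k) / 8); (p - 1).bit_length() equals ceil(log2(p)).
    L = ((p - 1).bit_length() + k + 7) // 8
    m_prime = hashlib.new(hash_name, msg).digest()
    u = []
    for i in range(1, m + 1):
        # info = "H2C" || I2OSP(ctr, 1) || I2OSP(i, 1), 5 bytes in total
        info = b"H2C" + bytes([ctr]) + bytes([i])
        t = hkdf_expand(m_prime, info, L, hash_name)
        u.append(int.from_bytes(t, "big") % p)  # OS2IP(t) mod p
    return tuple(u)


# Example modulus chosen only for illustration.
P_EXAMPLE = 2**255 - 19
u = hash_to_base(b"hello", ctr=0, p=P_EXAMPLE, m=2)
```

Because m' is a hash output, the HMAC key is always shorter than one block, whichever hash function is plugged in.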

Let's check to make sure that HMAC uses the minimum number of H invocations in all cases.

Recall that HMAC(k, msg) = H( (k XOR OPAD) || H( (k XOR IPAD) || msg ) ). For simplicity, I'm assuming that k is one block long (in reality, it's always padded or hashed-and-padded to that length, so this is a reasonable simplification).
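To make the structure concrete, the sketch below spells out HMAC by hand and compares it against Python's `hmac` module. The helper name `hmac_by_hand` is invented for this example; the key normalization (hash if longer than a block, then zero-pad to one block) follows RFC 2104.

```python
import hashlib
import hmac


def hmac_by_hand(key: bytes, msg: bytes, hash_name: str = "sha256") -> bytes:
    """HMAC(k, msg) = H((k XOR OPAD) || H((k XOR IPAD) || msg))."""
    block_len = hashlib.new(hash_name).block_size
    # Keys longer than one block are hashed first; all keys are then zero-padded.
    if len(key) > block_len:
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_len, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.new(hash_name, ipad + msg).digest()
    return hashlib.new(hash_name, opad + inner).digest()


short_key = b"k" * 32    # shorter than the 64-byte SHA-256 block
long_key = b"k" * 100    # longer than one block, so it gets hashed first

matches_short = hmac_by_hand(short_key, b"msg") == hmac.new(short_key, b"msg", "sha256").digest()
matches_long = hmac_by_hand(long_key, b"msg") == hmac.new(long_key, b"msg", "sha256").digest()
```

The inner hash always starts with one full block (k XOR IPAD), which is why the analysis that follows only has to count how far the rest of the input spills past that first block.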

HKDF-Expand(k, info, L) in the worst case invokes

HMAC(k, H(something) || info || b)

where b is 1 byte long. So what we need to check is that H(something) || info || b is short enough for each hash function of interest.

- SHA-2 adds at least 9 bytes (rounding up) of padding to its argument, so when H is a SHA-2 function, the argument to the inner invocation of H in the HMAC invocation in HKDF-Expand is block_len + hash_len + len(info) + 1 + 9 bytes long, and we want this value to be at most 2 * block_len. Worst case is SHA2-256, which has block_len = 64, hash_len = 32. In this case, len(info) must be at most 64 - 32 - 10 = 22 bytes. ✔️
- SHA-3 adds at least 1 byte (rounding up) of padding to its argument, so when H is a SHA-3 function, the argument to the inner H invocation is block_len + hash_len + len(info) + 1 + 1 bytes long. Worst case is SHA3-512, which has block_len = 72, hash_len = 64. In this case, len(info) must be at most 72 - 64 - 2 = 6 bytes. ✔️
- BLAKE2 doesn't force padding, so when H is a BLAKE2 function, the argument to the inner H invocation is block_len + hash_len + len(info) + 1 bytes long. Worst case is BLAKE2s256, which has block_len = 64, hash_len = 32. In this case, len(info) must be at most 64 - 32 - 1 = 31 bytes. ✔️

So it looks like "H2C" is preferred if we want to avoid another compression function invocation in the absolute worst case, which is SHA3-512.
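The budget arithmetic above is easy to tabulate. This is only a sketch: `PARAMS` and `max_info_len` are names invented here, and the 17-byte minimum padding for SHA2-384/512 (a 1-byte marker plus a 16-byte length field) is filled in from the SHA-2 specification rather than from the comment, which only works through the 9-byte case.

```python
# name: (block_len, hash_len, minimum padding bytes)
PARAMS = {
    "SHA2-224": (64, 28, 9), "SHA2-256": (64, 32, 9),
    "SHA2-384": (128, 48, 17), "SHA2-512": (128, 64, 17),  # assumption: 16-byte length field
    "SHA3-224": (144, 28, 1), "SHA3-256": (136, 32, 1),
    "SHA3-384": (104, 48, 1), "SHA3-512": (72, 64, 1),
    "BLAKE2s224": (64, 28, 0), "BLAKE2s256": (64, 32, 0),
    "BLAKE2b256": (128, 32, 0), "BLAKE2b384": (128, 48, 0),
    "BLAKE2b512": (128, 64, 0),
}


def max_info_len(block_len: int, hash_len: int, pad: int) -> int:
    """Largest len(info) keeping the inner HMAC input within two blocks.

    Inner input = block_len (k XOR IPAD) + hash_len + len(info) + 1 + pad,
    and we want that to be at most 2 * block_len.
    """
    return block_len - hash_len - 1 - pad


budget = {name: max_info_len(*v) for name, v in PARAMS.items()}
tightest = min(budget, key=budget.get)

len_h2c = len(b"H2C") + 2          # tag plus ctr byte plus index byte
len_h2curve = len(b"H2CURVE") + 2

fits_h2c = all(len_h2c <= b for b in budget.values())
fits_h2curve = all(len_h2curve <= b for b in budget.values())
```

Under these numbers the 5-byte "H2C" info string fits every listed hash, while the 9-byte "H2CURVE" variant exceeds the SHA3-512 budget.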

kwantam · 2019-07-04T23:18:10Z

Another question to consider: should we use HKDF-Extract to compute m'?

(Just spitballing here, not sure whether I like it or not. Also, I'm going to assume for concreteness that we're going with the suggested change in my prior comment. This could work either way, though.)

HKDF-Extract takes two arguments, salt and msg. In the spirit of @hoeteck's suggestion in #124 (and a suggestion from Dan out-of-band), we might require higher-level protocols to set the value for salt based on their domain separation string.

hash_to_base(msg, ctr)

Parameters:
- DSS, a domain separation string chosen according to the
  guidelines given in {{domain-separation}}.
- H, a cryptographic hash function.
- F, a finite field of characteristic p and order q = p^m.
- L = ceil((ceil(log2(p)) + k) / 8), where k is the security parameter
  of the cryptosystem (e.g., k = 128).
- HKDF-Extract-H is the HKDF-Extract function of RFC5869
  instantiated with hash function H.
- HKDF-Expand-H is the HKDF-Expand function of RFC5869
  instantiated with hash function H.

Inputs:
- msg is the message to hash.
- ctr is 0, 1, or 2.
  This is used to efficiently create independent
  instances of hash_to_base (see discussion above).

Output:
- u, an element in F.

Steps:
1. m' = HKDF-Extract-H(H(DSS), msg)
2. for i in (1, ..., m):
3.   info = "H2CURVE" || I2OSP(ctr, 1) || I2OSP(i, 1)
4.   t = HKDF-Expand-H(m', info, L)
5.   e_i = OS2IP(t) mod p
6. return u = (e_1, ..., e_m)

If DSS is fixed, H(DSS) can be precomputed to save one invocation of H. Also, this lets people use domain separation strings of arbitrary length with effectively no performance penalty.
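As a sketch of how the steps above might look in code (not the draft's reference implementation), the following Python uses `hmac` and `hashlib` directly. The function names, the SHA-256 default, and the example DSS strings are assumptions for illustration; HKDF-Extract and HKDF-Expand follow RFC 5869.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes, hash_name: str = "sha256") -> bytes:
    """HKDF-Extract from RFC 5869: PRK = HMAC(salt, IKM)."""
    return hmac.new(salt, ikm, hash_name).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """HKDF-Expand from RFC 5869."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def hash_to_base(msg: bytes, ctr: int, dss: bytes, p: int, m: int = 1,
                 k: int = 128, hash_name: str = "sha256") -> tuple:
    L = ((p - 1).bit_length() + k + 7) // 8
    # Step 1: m' = HKDF-Extract(H(DSS), msg). H(DSS) could be cached for a fixed DSS.
    salt = hashlib.new(hash_name, dss).digest()
    m_prime = hkdf_extract(salt, msg, hash_name)
    u = []
    for i in range(1, m + 1):
        info = b"H2CURVE" + bytes([ctr]) + bytes([i])   # step 3
        t = hkdf_expand(m_prime, info, L, hash_name)    # step 4
        u.append(int.from_bytes(t, "big") % p)          # step 5
    return tuple(u)


P_EXAMPLE = 2**255 - 19  # example modulus only
u_a = hash_to_base(b"msg", 0, b"PROTOCOL-A", P_EXAMPLE)
u_b = hash_to_base(b"msg", 0, b"PROTOCOL-B", P_EXAMPLE)
```

The same message hashed under two different domain separation strings yields unrelated field elements, which is the point of routing DSS through the salt.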

burdges · 2019-07-04T23:29:18Z

As an aside, STROBE would handle this role fairly cleanly too.

kwantam · 2019-07-04T23:49:34Z

> As an aside, STROBE would handle this role fairly cleanly too.

Great! Since this is a very general framework, is there a specific STROBE-related hash function that you have in mind here?

(My guess is that our initial ciphersuite specs will all use hashes in the SHA2 family, but I'm certain that other people will eventually want to use, e.g., BLAKE. So probably the action item with respect to STROBE is just to make sure that we're not accidentally specifying something that's incredibly inefficient.)

chris-wood · 2019-07-05T03:51:22Z

> STROBE would handle this role fairly cleanly too.

That is good to know, though I don't think we could adopt it so easily at the moment.

chris-wood · 2019-07-05T04:04:41Z

> Another question to consider: should we use HKDF-Extract to compute m'?
>
> (Just spitballing here, not sure whether I like it or not. Also, I'm going to assume for concreteness that we're going with the suggested change in my prior comment. This could work either way, though.)
>
> HKDF-Extract takes two arguments, salt and msg. In the spirit of @hoeteck's suggestion in #124 (and a suggestion from Dan out-of-band), we might require higher-level protocols to set the value for salt based on their domain separation string.

I'm fine with this change, though I think I'd remove the initial hash computation of DSS. My reasoning is that HKDF, via HMAC, will compute this hash anyway if |DSS| exceeds H's block size. Thanks for the suggestion!
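A quick way to see when hashing DSS up front is redundant: per RFC 2104, HMAC hashes its key only when the key is longer than H's block size. The sketch below checks this with SHA-256; the `extract` helper and the sample DSS values are made up for the example.

```python
import hashlib
import hmac


def extract(salt: bytes, msg: bytes) -> bytes:
    """HKDF-Extract with SHA-256: HMAC keyed by the salt."""
    return hmac.new(salt, msg, "sha256").digest()


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


block_len = hashlib.sha256().block_size    # 64 bytes
out_len = hashlib.sha256().digest_size     # 32 bytes

long_dss = b"x" * (block_len + 1)          # longer than one block
mid_dss = b"x" * 40                        # longer than the output, shorter than a block

# Longer than a block: HMAC hashes the key itself, so pre-hashing changes nothing.
long_same = extract(long_dss, b"msg") == extract(H(long_dss), b"msg")

# Between output size and block size: the raw DSS is used as the key directly,
# so using DSS and using H(DSS) give different results.
mid_same = extract(mid_dss, b"msg") == extract(H(mid_dss), b"msg")
```

So dropping the explicit H(DSS) is harmless for interoperability only if the spec picks one form and sticks to it; the two forms agree just for DSS values longer than a block.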

chris-wood · 2019-07-05T04:07:13Z

> It wasn't clear to me that it would be quite this easy to invoke HKDF, because I was thinking that the size of the PRK argument to HKDF was restricted to the output length of H, rather than the block size of H. But as far as I can tell, this mostly works for the hash functions we'd expect people to use (but see below).

This seems to resolve itself by just using Extract() before Expand(). :-)

kwantam

Great!

These are pretty small nits, even though they look like a bunch of comments...

One other small thing: should we add a forward ref from {{domain-separation}} to here?

Maybe a standalone paragraph before the one that starts "Care is required..." that says something like

{{hashtobase}} specifies how to apply a domain separation tag.

burdges · 2019-07-05T07:33:07Z

I think only the keccak-f(1600) based STROBE variant has any implementations right now.

@chris-wood Are you saying the hash-to-field functions call extract in a tree-like way? It's true STROBE does not add so much for trees where you'd clone the state all the time. You'd need to impose an ordering on the extractions to exploit STROBE optimally. And doing so encourages constraints on the order in which developers extract field elements.

It's actually common to clone STROBE states, which may still save some stack space over HKDF, but not much, and maybe worse with hand optimizations. I suppose the most efficient scheme for extracting a tree is to simply use ChaCha20, assigning nonces in a tree-like way using "heap addressing".

All this is moot because BLS is really for consensus protocols, not "accounts", so nobody will ever run BLS on ridiculously constrained devices anyways, like say a Ledger device.

chris-wood · 2019-07-05T13:30:21Z

> @chris-wood Are you saying the hash-to-field functions call extract in a tree-like way? It's true STROBE does not add so much for trees where you'd clone the state all the time. You'd need to impose an ordering on the extractions to exploit STROBE optimally. And doing so encourages constraints on the order in which developers extract field elements.

No, sadly, my comment was more reflective about IETF than it was about anything technical. (We'd need to fully specify STROBE here or elsewhere prior to adopting it.)

burdges · 2019-07-05T17:31:56Z

Right, I'm actually not convinced STROBE is optimal anyways. I'd think the meta_ad used for domain separation could probably safely XOR into another part of the state in parallel to the main data XOR, thus reducing Keccak invocations. Also you'd want some fast input command analogous to KangarooTwelve. And some ChaCha-based variant. Anyways, sorry for the derail.

kwantam · 2019-07-05T18:27:49Z

Awesome!

I just realized there's one more bit of inconsistency that this PR should fix: the description in {{hashtobase-perf}}. I opened #143 against the PR branch because suggested edits can't yet do multiline as far as I can tell.

samscott89 · 2019-07-05T18:48:07Z

draft-irtf-cfrg-hash-to-curve.md

  parameter of the cryptosystem (e.g., k = 128).
+- HKDF-Extract-H is the HKDF-Extract function of RFC5869


I find this syntax a little confusing. Is the idea that the H in HKDF-Extract-H should be expanded to SHA2, etc, in each case?

I suggest we stick to the same notation used in the original draft, and others like the TLS 1.3 draft. So, just use "HKDF-Extract" and specify under that the hash function used is given by the ciphersuite?

kwantam

What if we just called it "Extract"?

- Extract is the HKDF-Extract function of RFC5869 instantiated with hash function H.

Is that clearer?

samscott89

Maybe:

- HKDF-Expand and HKDF-Extract are as defined in {{rfc5869}}, instantiated with the hash function H

Again, keeping it closer to notation used elsewhere.

chris-wood

I preferred the -H notation since it made clear that H determined Extract, though the expansion issue is a valid concern. I'm fine with the proposal!

kwantam

Nice! Just one tiny annoyance, sorry...

chris-wood · 2019-07-06T13:28:05Z

> Nice! Just one tiny annoyance, sorry...

Nits are always appreciated! No need to apologize.

Replace hash_to_base PRG with HKDF-Expand.

e58ee2b

chris-wood requested review from kwantam, armfazh and samscott89 and removed request for kwantam and armfazh July 4, 2019 15:49

kwantam requested changes Jul 4, 2019

chris-wood added 2 commits July 4, 2019 21:07

Updates based on PR.

85369cd

Merge github.com:chris-wood/draft-irtf-cfrg-hash-to-curve into caw/hkdf

1736233

chris-wood force-pushed the caw/hkdf branch from 8047b4f to 1736233 Compare July 5, 2019 04:13

kwantam requested changes Jul 5, 2019

chris-wood and others added 4 commits July 5, 2019 06:25

Update draft-irtf-cfrg-hash-to-curve.md

4d03b9b

Co-Authored-By: Riad S. Wahby <[email protected]>

Update draft-irtf-cfrg-hash-to-curve.md

a8497b1

Co-Authored-By: Riad S. Wahby <[email protected]>

Update draft-irtf-cfrg-hash-to-curve.md

a166f86

Co-Authored-By: Riad S. Wahby <[email protected]>

s/DSS/DST and domain separation tag clarifications.

677bc6d

chris-wood added 2 commits July 5, 2019 08:15

s/DSS/DST and domain separation tag clarifications.

ba87b00

Merge branch 'caw/hkdf' of github.com:chris-wood/draft-irtf-cfrg-hash…

2726950

…-to-curve into caw/hkdf

update description in {{hashtobase-perf}}

bb070f6

samscott89 reviewed Jul 5, 2019

chris-wood and others added 3 commits July 5, 2019 13:05

Merge pull request #143 from kwantam/kwantam/hkdf

958655a

update description in {{hashtobase-perf}}

Drop "-H" from HKDF notation.

8a76379

Merge branch 'caw/hkdf' of github.com:chris-wood/draft-irtf-cfrg-hash…

6ac003a

…-to-curve into caw/hkdf

kwantam approved these changes Jul 5, 2019

Update draft-irtf-cfrg-hash-to-curve.md

0419946

Co-Authored-By: Riad S. Wahby <[email protected]>

chris-wood merged commit 95e8aed into master Jul 6, 2019

kwantam mentioned this pull request Jul 6, 2019

HKDF in place of ad-hoc PRG currently used by hash_to_base? #137

Closed

chris-wood deleted the caw/hkdf branch February 18, 2022 16:42
