New encryption scheme may use HKDF in a sub-optimal (and cryptographically unproven) way. #7953
Comments
Thanks for looking at borg2's crypto! Just to be a bit more specific about what we are discussing: this is (besides our test suite) currently the only usage of HKDF in the master branch, in
What do we have there?
Right, and using a static key (
If I read the RFC correctly, then the session id should go into info instead of salt.
The RFC is confusing here, in the sense that it recommends that the salt be random and that the info be a context identifier.
The blog post is the most instructive. As far as I understand, Borg with actual keys doesn't need HKDF as presented, since the input key already meets the requirements.
Exactly -- HKDF was really designed for deriving keys after Diffie–Hellman exchange, where the IKM is not uniformly random. Since Borg's master key is already a PseudoRandom Key (PRK), there is no need for the "HKDF-Extract" stage. Only the "HKDF-Expand" stage is needed, with the random Therefore, if you follow RFC 5869 definition for the
One last note -- I'm pretty sure we can all agree (and the design paper for HKDF seems to second this) that there's really no reason to have the Since that's not the case here, that simplifies the equation down to:
Essentially, when you are using HKDF with an already pseudorandom IKM, and you only want to derive a single hash block (or less) from it, it basically simplifies down to a simple
I think it makes sense for Borg's session key to be derived as follows (in Python):
def _get_session_key(self, sessionid):
    # hmac_sha512 is assumed to be a helper that returns the raw 64-byte MAC
    assert len(sessionid) == 24  # 192 bits
    key_64 = hmac_sha512(
        key=self.crypt_key,
        message=sessionid + b"borg-session-key-" + self.CIPHERSUITE.__name__.encode(),
    )
    return key_64[:32]  # truncate to a 256-bit session key
Perhaps @soatok would know more than me, but since Borg is using HKDF with input key material that is already pseudorandom, I guess technically what it's doing now is sub-optimal, but likely not broken. I.e., there likely isn't an actual security problem here (although that fact is more difficult to prove than it would be if Borg were just to use the HMAC construction I suggested above). Does that sound right?
From the perspective of "I'm an academic and I want to write a formal model or security proof", only having PRF security when KDF security is desired is a minor nuisance. From the real world perspective, PRFs are pretty damn good.
@soatok but given our circumstances (crypt_key being fully random), there is no need to require or desire KDF security and we would be totally fine with PRF security, right?
NIST even describes a one-step KDF in section 4 of https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-56Cr2.pdf (that's a newer revision of the document you linked from your blog post).
For our needs, that seems to boil down to just a truncated HMAC (as in #7953 (comment)).
Or, as they also allow a hash instead of an HMAC, even simpler (if I understood correctly):
Note: we tended to use SHA-512, considering it is faster on 64-bit CPUs when done in software. But considering that more and more CPUs have hardware-accelerated SHA-2 and that this computation is only done once per session, we can just use the simpler code. Update: for
also: convert sessionid memoryview to bytes before calling _get_cipher, to avoid TypeError in (session + flavour) operation.
See #7955 where I implemented the first approach with hmac_sha256 (in the first commit, the 2nd commit is just removing unused code).
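To make the two options discussed above concrete (a keyed HMAC versus a bare hash as a one-step KDF), here is a minimal standalone sketch using only the Python standard library. It is not the code from #7955: the function names, the `DOMAIN` label, and the exact concatenation order are assumptions chosen for illustration, and `os.urandom(64)` merely stands in for Borg's random `crypt_key`.

```python
import hashlib
import hmac
import os

# Hypothetical domain-separation label; the real label is not shown in this thread.
DOMAIN = b"borg-session-key-EXAMPLE"


def session_key_hmac(crypt_key: bytes, sessionid: bytes, domain: bytes = DOMAIN) -> bytes:
    """HMAC variant: key = HMAC-SHA256(crypt_key, sessionid || domain)."""
    assert len(sessionid) == 24  # 192-bit session id, as in the thread
    return hmac.new(crypt_key, sessionid + domain, hashlib.sha256).digest()


def session_key_hash(crypt_key: bytes, sessionid: bytes, domain: bytes = DOMAIN) -> bytes:
    """Plain-hash variant: key = SHA-256(crypt_key || sessionid || domain)."""
    assert len(sessionid) == 24
    return hashlib.sha256(crypt_key + sessionid + domain).digest()


crypt_key = os.urandom(64)  # stand-in for a uniformly random master key
sid_a, sid_b = os.urandom(24), os.urandom(24)
key_a = session_key_hash(crypt_key, sid_a)
key_b = session_key_hash(crypt_key, sid_b)
```

Both variants return 32 bytes, so no truncation is needed with SHA-256. In this sketch the key and the session id have fixed lengths, which keeps the plain concatenation unambiguous.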
I haven't audited your design, I was just tagged and notified, so I can't say for sure. :) What I can say is, to most people, "KDF security" vs "PRF security" only matters in rare situations, like "I'm doing fun things with Diffie-Hellman" or "I invented a new kind of commutative algebraic group that aims to be post-quantum secure and want to use its output for a symmetric key".
Hmm... Actually, I guess that makes a lot of sense -- HMAC really is meant for just that: calculating MACs with hash functions. Specifically, it was designed to prevent forced collisions when the attacker controls the plaintext (as is the case when a message is being authenticated). Since Borg controls the entire string that would be hashed, even HMAC is likely overkill.
FYI, FIPS 180 defines SHA-512/256 for exactly the reason you discussed above -- on 64-bit machines, it's often faster to just use SHA-512 and truncate it. So your construction above, but following the minor changes from the FIPS paper to produce sha512_256, seems like a reasonable solution:
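The snippet that accompanied this comment is not preserved here. As a separate, hypothetical illustration of the distinction it touches on: SHA-512/256 as defined in FIPS 180-4 uses different initial hash values than SHA-512, so it is not the same as slicing a SHA-512 digest. The sketch below assumes nothing about Borg's code; the helper names are made up, and `sha512_256` is only available through `hashlib.new` when the underlying OpenSSL build provides it.

```python
import hashlib


def sha512_truncated(data: bytes) -> bytes:
    """First 32 bytes of an ordinary SHA-512 digest."""
    return hashlib.sha512(data).digest()[:32]


def sha512_256_or_none(data: bytes):
    """FIPS 180-4 SHA-512/256 if this Python/OpenSSL build exposes it, else None."""
    try:
        return hashlib.new("sha512_256", data).digest()
    except ValueError:  # raised by hashlib.new for unsupported algorithm names
        return None


msg = b"example input"
truncated = sha512_truncated(msg)
standard = sha512_256_or_none(msg)
```

Either function yields 32 bytes, but the outputs differ, so the two must not be mixed within one key derivation scheme.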
I had the impression that hw acceleration accelerates sha256 (but not sha512, or not as fast), so I stopped optimising for sha512 if sha256 gives enough bits.
Ah, you're right. I see they won't be accelerating SHA-512 until Arrow Lake. Either way, as you said, this isn't so critical that you're going to be performing thousands of hashes per second. Probably best to favour the simple path and just use SHA-256 (like you had suggested previously). It looks like it is both faster and easier.
WRT #7955, it seems to me also that you have permission from FIPS to just use the SHA-256 hash function without the HMAC construction. Either would be okay; obviously the naked hash function would be a bit cheaper. It seems FIPS doesn't really state a preference.
also: - convert sessionid memoryview to bytes before calling _get_cipher, to avoid TypeError in (session + flavour) operation. - add docstring and comments
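The `TypeError` mentioned in this commit message is easy to reproduce in isolation: a `memoryview` does not support `+` with `bytes`, so it has to be converted first. The variable names and the label below are placeholders for illustration, not Borg's actual values.

```python
# Placeholder 24-byte session id, held as a memoryview (e.g. a slice of a larger buffer).
sessionid_view = memoryview(bytes(24))
domain = b"borg-session-key-EXAMPLE"  # hypothetical label

try:
    sessionid_view + domain  # memoryview has no "+" operator
    concat_failed = False
except TypeError:
    concat_failed = True

# Converting to bytes first makes the concatenation well-defined.
message = bytes(sessionid_view) + domain
```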
Updated the PR to simply use sha256, also added some comments, flavour -> domain and updated the diagram.
…master crypto: use a one-step kdf for session keys, fixes #7953
Have you checked borgbackup docs, FAQ, and open GitHub issues?
Yes
Is this a BUG / ISSUE report or a QUESTION?
Issue
Describe the problem you're observing.
Looking at Borg's encryption design document and the discussion in Issue #3814, it seems that Borg may be using HKDF in a way that is unproven by cryptanalysis and makes sub-optimal use of the randomness introduced by `SessionId` by using it as the `salt` parameter of HKDF instead of feeding it into the `info` parameter (called `context` in the diagram).
In the blog post by @soatok entitled Understanding HKDF, he discusses the formal proofs provided for HKDF, and how many developers accidentally misuse the algorithm in ways that invalidate those proofs. It appears that the algorithm was designed so that the same "(salt, IKM)" would be used across all iterations of HKDF for a given key being used in a given context (in this case, the master keying material in the context of a Borg archive), and the cryptographic proofs do not extend to "same IKM, varying salt". If different keying material is desired for the same IKM within the same context, it should be produced by varying the `info` parameter, not the `salt`.
In fact, according to RFC 5869 Section 3.3, if the IKM is already strong and uniformly-random, the `HKDF-Extract` stage can just be skipped altogether (which is the only step that uses `salt`). Borg's IKM already meets that requirement, so it would be permissible (and still within the cryptographic proof for HKDF) for Borg to do away with the first stage altogether (including eliminating the `salt` parameter) and just directly use the 512-bit `crypt_key` in the `HKDF-Expand` stage.
So, if we assume that it's okay to get rid of the `HKDF-Extract` stage (which RFC 5869 supports), and since we're only extracting a single 256-bit output block from `HKDF-Expand`, the session key derivation becomes a single iteration of `HKDF-Expand` with `crypt_key` as the PRK and `session_id | flavor_string` as the `info`:
`session_key = HMAC(crypt_key, session_id | flavor_string | 0x01)`
(Technically, you need to have the `0x01` concatenated in there to comply with the proofs from the paper, although I'd wager that it doesn't add much in terms of security, since it's already being concatenated with a random number).
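The reduction described above can be checked with a short sketch: a generic RFC 5869 HKDF-Expand, compared against a single HMAC call with `0x01` appended. The choice of SHA-512 as the underlying hash, the `flavor_string` value, and the random stand-ins for `crypt_key` and `session_id` are assumptions for illustration only.

```python
import hashlib
import hmac
import os

HASH = hashlib.sha512  # assumed hash; gives 64-byte output blocks
HASH_LEN = HASH().digest_size


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 HKDF-Expand: T(i) = HMAC(PRK, T(i-1) || info || i), i = 1, 2, ..."""
    assert 0 < length <= 255 * HASH_LEN
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


crypt_key = os.urandom(64)                 # stand-in for the 512-bit master key (used as PRK)
session_id = os.urandom(24)                # stand-in for the random session id
flavor_string = b"example-cipher-flavor"   # hypothetical context string
info = session_id + flavor_string

# Expand with Extract skipped, requesting a single 256-bit key.
session_key = hkdf_expand(crypt_key, info, 32)

# The same value computed as one HMAC call with the counter byte 0x01 appended.
single_block = hmac.new(crypt_key, info + b"\x01", HASH).digest()
```

Because the requested length fits within one hash block, the loop runs exactly once and `session_key` equals the first 32 bytes of `single_block`.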