new AEAD crypto with session keys #6463
Conversation
Codecov Report
@@ Coverage Diff @@
## master #6463 +/- ##
==========================================
+ Coverage 82.71% 82.82% +0.10%
==========================================
Files 39 39
Lines 10525 10649 +124
Branches 2076 2088 +12
==========================================
+ Hits 8706 8820 +114
- Misses 1312 1320 +8
- Partials 507 509 +2
Continue to review full report at Codecov.
force-pushed from bcd8c8b to 0518dce
@enkore @textshell "It works", thus ready for some early feedback, if you have some time.
force-pushed from f4b2429 to 9b55c7b
header = b'\x12\x34\x56'
header = b'\x12\x34\x56' + iv_int.to_bytes(12, 'big')
The old AES256_OCB / CHACHA20_POLY1305 code used to append the IV to the header (and fed it all into the auth tag computation).
To keep our test vectors working, the caller now has to append the IV to the header, as the code does not do that any more.
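A minimal sketch of what a caller might do now, under the assumption described above (the helper name is hypothetical; the 3 header bytes are taken from the test vector in the diff):

```python
# Hypothetical sketch: the caller builds the complete header itself,
# appending the big-endian IV, because the crypto code no longer does it.
def build_header(iv_int: int) -> bytes:
    base = b'\x12\x34\x56'           # header bytes from the test vector
    iv = iv_int.to_bytes(12, 'big')  # 96-bit IV, big-endian
    return base + iv

header = build_header(1)
assert len(header) == 3 + 12
assert header.endswith(b'\x00' * 11 + b'\x01')
```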
The new AEAD crypto can be used with borg >= 1.3; the old crypto is used by attic and borg < 1.3.
A lot of people have concerns about AES-GCM. Considering we can use AES-OCB, I guess we will not use AES-GCM anyway, so there is no need to discuss it.
One OpenSSL call fewer due to the simpler layout! Also change the default for aad_offset to 0: by default, we want to authenticate the complete header.
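A tiny sketch of the aad_offset semantics described above (the function name is hypothetical, purely to illustrate the default):

```python
# Hypothetical sketch of the aad_offset semantics: the AAD is the header
# from aad_offset onward; the new default 0 authenticates the whole header.
def aad_for(header: bytes, aad_offset: int = 0) -> bytes:
    return header[aad_offset:]

header = b'\x12\x34\x56' + (0).to_bytes(12, 'big')
assert aad_for(header) == header         # default: authenticate everything
assert aad_for(header, 3) == header[3:]  # nonzero offset: skip a prefix
```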
force-pushed from ab75726 to 8a17f53
I adapted the type bytes, so we can keep the ciphersuite:4 keytype:4 layout and do this: ciphersuite 0 == legacy modes, "as is" (in the past, we used $00 .. $07, so this fits).
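The described type-byte layout could be sketched like this (helper names are hypothetical, not borg's actual API):

```python
# Hypothetical sketch of the layout described above:
# high nibble = ciphersuite, low nibble = keytype.
def pack_type(ciphersuite: int, keytype: int) -> int:
    assert 0 <= ciphersuite < 16 and 0 <= keytype < 16
    return (ciphersuite << 4) | keytype

def unpack_type(type_byte: int) -> tuple:
    return type_byte >> 4, type_byte & 0x0F

# ciphersuite 0 == legacy modes: old type bytes $00 .. $07 fit unchanged
assert pack_type(0, 7) == 0x07
assert unpack_type(0x07) == (0, 7)
```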
olen is assigned by OpenSSL, but the compiler can't know that and generates these warnings:

warning: src/borg/crypto/low_level.pyx:271:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:274:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:314:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:317:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:514:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:517:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:566:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:572:22: local variable 'olen' referenced before assignment
force-pushed from 9952dbb to c668265
Looks like there is no ongoing review (although I asked around), so I'll merge this after the tests finish. If you find something after the merge, just file an issue on the issue tracker.
Note: I did not change the dispatching yet, so it still works using the full type byte (not the 4-bit suite + 4-bit keytype). I guess the dispatching can be changed more easily after we completely remove the legacy crypto.
@hexagonrecursion Merging this will likely cause some merge conflicts for your ongoing PR - sorry about that. I can offer to increase the bounty for the additional work, or to help rebase what you have now (whatever you prefer).
I prefer to rebase my PR myself.
As proposed, here is my analysis, which is focused on the use of ChaCha20-Poly1305.

Within one session, are all chunks encrypted using the same invocation of the AEAD? Or is there a separate invocation of the AEAD for each chunk? < obsolete due to the answers below >

If it is the second case (a separate AEAD invocation per chunk / multiple AEAD invocations per session), I have a somewhat ambivalent view of the CTR. I assume 2^48 counters with at least 512K per chunk seems sufficient for a long time for average users, being able to encrypt 134,217,728 TB before the counter starts again at 0.

But in the second case, it might be an easy-to-achieve goal to consider every potential use case in Borg's design, so that it can easily be forked/adjusted to, e.g., enterprise use cases. Storing amounts like 134,217,728 TB is already today the dimension of some multinational corporations, and the amounts keep increasing exponentially (a related 2016 paper; and when a new repo is created, one session includes all initial data at once :). The goal should be to ensure that no architect in any use case will need to consider the nonce (and the dangerous security issues it causes when it repeats with the same key) in the foreseeable future. Here the design could be a bit more future-proof and (user-)fault-tolerant.

My suggestion would be to increase the counter to at least 64 bit for the "normal" ChaCha20 ("normal" to separate it from the XChaCha modification, see below). RFC 8439 suggests a 96 bit nonce, but 32 bit of it are sender-specific and only 64 bit of the nonce are a counter (in addition to the block counter, which is separate from the nonce); the RFC also considers other use cases that might not be relevant here. 64 bit (as a 64 bit counter) for the nonce is appropriate if the "normal" ChaCha20 has to be used. But using a 96 bit counter as nonce (unlike the RFC), if it does not make a difference in the implementation, might be the best solution for the "normal" ChaCha20.
The then-reduced 32 bit block counter is more than sufficient as long as chunk sizes stay below 256 GB (assuming 1 AEAD invocation : 1 chunk). However, it might be noted that XChaCha12/20 (as in Adiantum in the Linux kernel for disk encryption) is often preferred for use cases that have to be prepared for future-proof large data amounts. Nevertheless, I think the "normal" ChaCha20 is sufficient as well for the foreseeable future if the counter size is adjusted, because the chunk sizes are much larger than the block sizes of Linux's disk encryption with XChaCha.

As an alternative to the increased counter size, a new session could be invoked automatically when the counter reaches its end. This would have to be implemented explicitly and would automatically split a very large backup into >1 increments due to >1 sessions.

Feel free to repeat my calculations to make sure I didn't make an error. I hope this is somewhat helpful for this great project.

Supplement: Given the other implications of the backup environment (see below), scenarios in which a 48 bit counter will reach its limit are unlikely, and thus 48 bit remain sufficient within Borg's architecture.
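The figures from the analysis above can be checked quickly:

```python
# Checking the figures from the analysis above:
# 2^48 counter values, at least 512 KiB per AEAD invocation (per chunk).
counters = 2 ** 48
chunk_bytes = 512 * 1024            # 512 KiB lower bound per chunk
total = counters * chunk_bytes      # bytes encryptable before the counter wraps
tib = total // 2 ** 40              # in binary terabytes
assert tib == 134_217_728           # matches the 134,217,728 TB in the comment

# 32-bit block counter, 64-byte ChaCha20 blocks -> max message size per IV:
max_msg = (2 ** 32) * 64
assert max_msg == 256 * 2 ** 30     # 256 GiB, the chunk-size limit mentioned
```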
The HKDF is used once per session to derive the session key. The AEAD crypto is invoked once per chunk, and usually there are many chunks processed in a session. The default target size for a chunk is 2 MiB, but if an input file is much smaller than that, it will also result in a separate chunk. Due to the deduplication, for any non-first backup of a data set the usual amount of new chunks (and only these need to be encrypted) is relatively small, because most chunks are duplicates that already exist in the repository.

Can you please edit your post and remove the stuff that does not apply to this use case?
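For illustration, a self-contained HKDF-SHA256 per RFC 5869; the salt/info values in the usage line are made up for the example and are not borg's actual parameters:

```python
import hashlib
import hmac

# Illustrative HKDF-SHA256 (RFC 5869 extract-then-expand); the inputs used
# below are placeholders, not borg's real key material or labels.
def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    prk = hmac.new(salt or b'\x00' * 32, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b'', b'', 1
    while len(okm) < length:                                            # expand
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# One HKDF call per session derives the session key; the AEAD is then
# invoked once per chunk using that key.
session_key = hkdf_sha256(b'<ikm placeholder>', b'<session id>', b'<label>', 32)
assert len(session_key) == 32
```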
About counter size:
Note: we use AES-OCB, not OFB.
BTW, thanks for your review!
I have seen worse solutions, even in larger use cases :) Think about how many easy-to-fix / legacy vulnerabilities have been used in the past to successfully attack large organizations/enterprises. Unfortunately, large organizations do not imply best practices or proper design stages before implementation, especially in non-IT organizations. Expectations can be dangerous, and failure in subsequent/preceding stages of a component/system should be anticipated and balanced where possible (avoid single points of failure and such). In short, apply "secure by design" not just to code but also to the social/organizational environment in which it evolves/deploys :-). But you are right, my statement assumes a "worst practice" scenario. If an admin designs a backup system properly, my scenario is obsolete anyway.
As mentioned, my statement was only intended to address large-scale and long-term scenarios, just in case Borg wants to consider them already now. I like design stages to consider every practically implementable and foreseeable case. However, given the other limitations you mentioned, my statement is obviously obsolete in Borg's use case, and in the overall architecture of Borg there is consequently no incentive to adjust the counter at all: the RAM usage you indicated is not realistic / not feasible for 2^48 chunks (>12.5 petabytes of RAM).

However, one question that goes beyond the scope of this PR but is related to its security: what is/are the origin(s) of the secret(s) in HKDF's input (i.e., enc_key:256 & hmac_key:256)?
My bad. Thinking error while writing; I recently had to read some stuff about the other one (it seems to have impacted me more than expected :-)
The IKM (enc_key + hmac_key) comes from the borg repokey or keyfile. When creating that key for a new repo, there are 2 scenarios:
- The id is now also input into the authentication tag computation.
  This strongly associates the id with the written data (== associates the key with
  the value). When later reading the data for some id, authentication will only
  succeed if what we get was really written by us for that id.
(Curious user here who happens to be a cryptographer.)
What gains do you expect from this?
So if I read this correctly, the chunk id is an additional MAC over the uncompressed and unencrypted data. But the AEAD already authenticates the data (no matter whether it is encrypted or not), so I fail to see the purpose of adding the id to the AAD.
Does the id have a purpose in other parts of borg or is this just for the encryption?
I may be able to provide a more detailed review but I realize I'm quite late to the party... Do you think this would still be helpful?
Hi Tim,
thanks for looking at our crypto, review there is very welcome (and not too late, borg2 is still in alpha, soon beta, so we could still change things)!
If we do not consider the time needed for computing the MAC (chunkid) from the plaintext, there is no advantage. But we would have to compute the MAC to verify that it is the same as the chunkid we asked the repo for. borg 1.x does it like that. Without that, the self-made AEAD construction we used in 1.x (AES-CTR + HMAC-SHA256, not including the chunkid) would only be able to tell that the content was written by us (authentic), but we could not be sure we wrote it for that chunkid (it could have been for some other chunkid).
If we feed the chunkid into the AAD computation when storing the chunk and also when requesting the chunk, the authentication will fail if:
- the content is corrupted / tampered
- the content is ok, but not for the chunkid we wanted
Thus, we do not need to compute the MAC to verify if the chunk is really for the chunkid we wanted - saves some CPU time.
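A toy illustration of this binding idea; this is NOT borg's actual cipher, just a minimal encrypt-then-MAC sketch in which the chunk id plays the role of the AAD:

```python
import hashlib
import hmac

# Toy scheme (demo only, not borg's real crypto): the MAC also covers the
# chunk id, so reading data back under the wrong id fails authentication
# without having to recompute the chunk id MAC over the plaintext.
def seal(key: bytes, chunk_id: bytes, data: bytes) -> bytes:
    ks = hashlib.sha256(key + b'stream').digest()   # toy keystream (<=32 bytes)
    ct = bytes(d ^ k for d, k in zip(data, ks))
    tag = hmac.new(key, chunk_id + ct, hashlib.sha256).digest()  # id as "AAD"
    return ct + tag

def open_sealed(key: bytes, chunk_id: bytes, blob: bytes) -> bytes:
    ct, tag = blob[:-32], blob[-32:]
    want = hmac.new(key, chunk_id + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, want):
        raise ValueError('authentication failed')
    ks = hashlib.sha256(key + b'stream').digest()
    return bytes(c ^ k for c, k in zip(ct, ks))

key = b'k' * 32
blob = seal(key, b'id-1', b'hello')
assert open_sealed(key, b'id-1', blob) == b'hello'   # right id: ok
```
Requesting the same (untampered) blob under a different id then fails authentication, which is exactly the "content is ok, but not for the chunkid we wanted" case above.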
Ok, this makes a lot of sense now! Adding the chunkid as AAD binds the key (as in "key-value store") to the value.
Thus, we do not need to compute the MAC to verify if the chunk is really for the chunkid we wanted - saves some CPU time.
Indeed yes, I see that this would save CPU time. But then I'm confused whether it's really implemented like this? AFAIU the code still recomputes the MAC:
Line 897 in b52781a:

self.assert_id(id, data)
thanks for looking at our crypto, review there is very welcome (and not too late, borg2 is still in alpha, soon beta, so we could still change things)!
Thanks for the warm welcome. I'll keep looking at the PR and docs here. I'll probably continue to ask some very basic questions, so please bear with me. :)
Good catch, fixed by #6880 (and added a test to make sure it works as expected).
A more high-level question: Is there a reason why deterministic encryption / SIV mode wasn't used? Deterministic encryption sounds scary at first glance, but it fits borg's deduplication model: we'd never store the same chunk twice anyway. Or in other words, even if we were to create multiple different ciphertexts for the same chunk, the attacker would notice, as they're indexed by the chunk ID, if I understand correctly.

I see that this has been discussed a few times, e.g., #1044 (comment), and @ozppupbg reinvented it in #3814 (comment). @DemiMarie commented that compression could be an issue #3814 (comment), but I'm pretty sure this could be solved by including the compression parameters in the AAD. (Actually, are they currently authenticated somewhere?)

(Do you want me to move the discussion somewhere else, e.g., to one of these issues?)
Guess the main spoiler was that SIV is not in OpenSSL. Did that change?
OpenSSL 3 apparently has it, but I'm not sure if that is too recent for use in borg.

Edit: So the advantage of SIV would be that we don't need to rely on the system RNG to provide good session ids. The usual disadvantage is that it is slower, because it needs to compute a MAC of the entire message to derive a deterministic IV. In theory, that won't be a problem in our case because we compute a MAC of the entire message anyway. But I haven't checked whether the OpenSSL API is sufficient to really implement it like that (without redoing the MAC).

Independent of this, are the compression parameters currently authenticated somewhere?
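A toy sketch of the SIV principle being discussed (not RFC 5297 AES-SIV and not an OpenSSL API; the keystream construction is a made-up placeholder):

```python
import hashlib
import hmac

# Toy SIV-style construction (demo only): the IV is a MAC of the plaintext,
# so encryption is deterministic - equal chunks yield equal ciphertexts,
# which fits deduplication, and no RNG is needed for the nonce.
def siv_encrypt(mac_key: bytes, enc_key: bytes, data: bytes) -> bytes:
    iv = hmac.new(mac_key, data, hashlib.sha256).digest()[:16]  # synthetic IV
    ks = hashlib.sha256(enc_key + iv).digest()                  # toy keystream
    ct = bytes(d ^ k for d, k in zip(data, ks))
    return iv + ct

c1 = siv_encrypt(b'm' * 32, b'e' * 32, b'chunk')
c2 = siv_encrypt(b'm' * 32, b'e' * 32, b'chunk')
assert c1 == c2  # deterministic: same plaintext -> same ciphertext
```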
fixes #3814, fixes #1031, fixes #1044. also related to #45.