-
-
Notifications
You must be signed in to change notification settings - Fork 750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new crypto code, blackbox, aead internally #1034
Conversation
borg/crypto.pyx
Outdated
cdef class AES_GCM_256_GMAC: | ||
# Layout: GMAC 16 + IV 16 + CT (borg 1.2) | ||
# additionally, each chunk starts with a type byte, | ||
# which is not passed to this code or added by this code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC for AES-GCM a 12 byte nonce/IV is recommended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, it is the default. but recommended / why not more?
i mainly wanted to avoid the byte-frickling as seen in the legacy class.
found it, rfc 5084.
Do we want so split the implementation of cipher suites in two parts ("low level" part in crypto.pyx and "high level" part in key.py), or just keep it in one piece? I think it would be easier to just have it in one piece and have the KeyXXX class in key.py as it's also now. These cipher suites are kinda low level primitives anyway. btw. I added a few notes how to handle IV storage now, since with multiple keys the final IV can't be retrieved from the manifest anymore. See the link in #1031 (And for repokey the key must also be stored locally for the same replay-safety as before) |
I'ld keep encrypt/mac and auth/decrypt in one (low-level) piece - like it is now - they can also deal with reading / writing the bytes from / to the right places into the buffer. |
I read some more about AES-GCM. RFC5084 and others recommend the 12 byte nonce/IV for efficiency. Other sources (e.g. https://www.cryptopp.com/wiki/GCM_Mode) also recommend it for the same reason. The authentication tag is always calculated as 128 bits wide and only truncated at the end of the operation to the requested size, so there are no performance gains in using a smaller authentication tag. RFC5084 gives no rationale for their recommendation here. In RFC5288 (AES-GCM for TLS 1.2) they use a 128 bit tag and a 96 bit nonce/IV. In RFC4106 the nonce is constructed from a 64 bit IV and a 32 bit salt, so it's 96 bits again, however, the only required tag size is 128 bit:
In http://csrc.nist.gov/publications/nistpubs/800-38D/SP-800-38D.pdf they also recommend the 96 bit nonce/IV for "simplicity and efficiency". This paper also says that for IVs >= 96 bits only a random IV for at least 96 bits should be used; so they recommend a completely random 96 bit IV (section 8.2). Also (section 9)
Which wouldn't be a problem with a good random number generator for the IV. What is interesting here is that repeating IVs seem to not only compromise the encryption (as with AES-CTR), but also the authentication. Summary: Stick with 96 bit IV as everyone recommends, use a 128 bit tag for maximum security w/ no drawbacks? |
Yes, I'll change it. 96bit mac felt crappy anyway. |
33c6355
to
a4c1437
Compare
86c2b64: If I understood correctly there is an internal per-message counter in GCM appended to the external IV (if it is 96 bits, else to the derived 96 bit IV from however many bits where passed in), hence there is no IV to be retrieved. This is also the reason for the 64 GiB message length limit: A 4 byte CTR only allows 2**32-1 blocks = 64 GiB to be encrypted before the CTR would overflow and the IV where repeated (with the usual implosion of the universe following). Names: AES_CTR_..._legacy would serve both as the lower abstraction (formerly crypto.AES) for both the old (fixed) crypto code and (later) for a DEK cipher suite? Then it's not really legacy, remaining a valid choice. (It is strongly unforgable "to 256 bits" instead of "just" 128 bits in AES-GCM) AES_GCM_256_GMAC: _GMAC is kinda redundant, _GCM already implies authentication using GCM/GMAC. |
borg/crypto.pyx
Outdated
def iv(self): | ||
return self.ctx.iv[:16] | ||
def current_iv(self): | ||
raise NotImplemented # gcm mode does not maintain/increment the counter in self.ctx.iv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there's no interfacing mandating current_iv(): Just remove the method entirely? Since this is sensitive/crypto code it might be a good idea to explain in a comment why this doesn't make sense with AES-GCM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll solve this by providing next_iv function, based on self.iv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, seems like wikipedia was misleading concerning the counter in gcm mode.
i'll update the names.
For AES-GCM the message size (number of blocks) is not relevant to the 96 bit IV: When using "AES-GCM-CTR" (so to speak) the 96 bit IV represents the number of messages (EVP_Encrypt... EncryptFinal) encrypted, not the number of blocks. I think it would be simpler to use a random 96 bit IV for AES-GCM and keep the IV handling as-is for the existing AES-CTR code. Or, alternatively, keep the IV handling as-is for AES-CTR, and do the 12 byte iv+=1 manually (as in 70e006e) I feel like AES-GCM with a random IV would be safer in Borg considering "evil repository" scenarios where an attacker interferes with live operations (e.g. acting like a commit crashed, after roll-back the client may re-use CTRs that the attacker already saw during the rolled back transaction <-- and catching all of these cases could be challenging!) |
Yeah, I realized that after your previous comment. It would not malfunction btw, it would just be wasteful with counter values. Random IVs is not part of this PR, so I'll just make it work in the "counting way" as we currently do it. |
70e006e
to
5e7e04d
Compare
I am not fixing existing potential attacks on the IV here, I am just implementing a new internal interface and want to make it work in a similar way as it worked until now. So, to get IV stuff fixed, I suggest you file a bug about this first and we fix it outside of this PR. What I was referring to when saying "function correctly, but wasteful" was that incrementing by 1 (0 for empty) is required for correct gcm function, but I was incrementing by num_aes_blocks which is >= 1 (and also 0 for empty). But it was wasting 2^32 / 16 counter values in extreme case. |
Sorry, I didn't want to hijack the thread. I moved these thoughts to another ticket, which I should have done in the first place.
Absolutely :)
Ok. I think this is inevitable with AES-GCM and as intended by the GCM spec. In any case, not a problem - independent of CTR or random IVs one never wants to use a significant portion of the possible IV space. |
borg/crypto.pyx
Outdated
@@ -318,6 +329,7 @@ cdef class AES_GCM_256_GMAC: | |||
if not EVP_DecryptInit_ex(&self.ctx, EVP_aes_256_gcm(), NULL, NULL, NULL): | |||
raise Exception('EVP_DecryptInit_ex failed') | |||
iv = self.fetch_iv(<unsigned char *> idata.buf) | |||
self.set_iv(iv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needed so that next_iv() works.
we need that after loading / decrypting manifest to know the next IV we may use.
encrypt: has aad support now, but aad is not added to the returned data, so caller would have to prepend/append it, which needs another memcpy / creates string garbage. just always prepend aad in output data? even with that, we also need the type byte (which is not part of aad in legacy mode), add a "anad" parameter for additional non-authenticated data and just prepend it at the very beginning? decrypt: nothing special needed, just need to check whether it works with memoryviews for data and aad, so we can give 2 views for them at different offsets / with different lengths on the whole input data. |
idea for encrypt: just have a "header" and "aad" param and use it like: out = encrypt(data, header=type_byte+aad, aad=aad) it would prepend the header to the cdata (but not include that in the mac computation). |
borg/crypto.pyx
Outdated
if not HMAC_Final(&self.hmac_ctx, hmac_buf, NULL): | ||
raise Exception('HMAC_Final failed') | ||
if CRYPTO_memcmp(hmac_buf, idata.buf+hlen, 32): | ||
raise Exception('Authentication failed') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IntegrityError
f407c70
to
5c1897b
Compare
borg/crypto.pyx
Outdated
|
||
|
||
class IntegrityError(CryptoError): | ||
"""Integrity checks failed. Corrupted or tampered data.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
helpers.IntegrityError
<=> crypto.IntegrityError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wanted to avoid potentially cyclic imports and not import stuff from higher-level python modules into lower-level cython stuff, but yes, in the end 1 is enough.
I am thinking about replacing decrypt's aad param by aad_offset. For legacy format: aad_offset=1 header_len=1 (skip type byte, no other header aad) |
So
? |
Could be done like that, but we'ld need to peek into the header before even knowing which decrypt function we want to call. |
Yep. I'm thinking whether there's a case where that would be useful, but none occur to me, so let's stick with Names: I think it'd be better to use specific names.
|
OK, I am updating api / code to:
|
borg/crypto.pyx
Outdated
"""A thin OpenSSL wrapper""" | ||
"""An AEAD style OpenSSL wrapper | ||
|
||
Note: AES-GCM mode needs OpenSSL >= 1.0.1d due to bug fixes in OpenSSL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also in docs/installation.rst ("From Source" section)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this changeset is still intentionally isolated (just crypto.pyx + test), integration will happen in 1.2.
we need it to encrypt/decrypt key files / config keys.
there are some more places where it is used.
it can be used to integrate the plaintext mode with the AEAD modes, both use same api now.
position and length of iv depends on cipher
it's needed for extract_iv already, so it should be given to init, not encrypt/decrypt
6315f06
to
dc4abff
Compare
ok, rebased again, collapsed some fixup changesets. guess i am finished here. |
\o/ |
crypto.pyx now implements a AEAD style API:
Implements:
Note about iv/nonce/counter handling:
cipher.encrypt() and .decrypt() always do a full openssl context init, thus it is required for encrypt to set a IV before the call (or give iv as encrypt() param). there is no IV state kept inside the openssl ctx which is used for next encrypt() call.
set_iv sets the iv to use for next encrypt() call, it also resets the blocks counter that is used to reflect how many cipher blocks were encrypted with the last encrypt() call. using that, next_iv() computes the IV you shall use for the next encrypt() call.
this simulates the internal openssl IV state, which is in the opaque ctx since 1.1.
TODO:
Anything else?