Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new crypto code, blackbox, aead internally #1034

Merged
merged 34 commits into from
Jul 29, 2017

Conversation

ThomasWaldmann
Copy link
Member

@ThomasWaldmann ThomasWaldmann commented May 9, 2016

crypto.pyx now implements a AEAD style API:

  • encrypt-then-mac inside encrypt(), optionally mac-ing additional data
  • auth-then-decrypt inside decrypt(), optionally auth-ing additional data

Implements:

  • aes-ctr-hmac-sha256, aes-gcm
  • aes-ocb, chacha20-poly1305 (openssl 1.1)
  • simple AES, still needed by the keyfile code
  • unencrypted/unauthenticated (so we do not need to specialcase that)

Note about iv/nonce/counter handling:

cipher.encrypt() and .decrypt() always do a full openssl context init, thus it is required for encrypt to set a IV before the call (or give iv as encrypt() param). there is no IV state kept inside the openssl ctx which is used for next encrypt() call.

set_iv sets the iv to use for next encrypt() call, it also resets the blocks counter that is used to reflect how many cipher blocks were encrypted with the last encrypt() call. using that, next_iv() computes the IV you shall use for the next encrypt() call.

this simulates the internal openssl IV state, which is in the opaque ctx since 1.1.

TODO:

  • rename to cipher_key, mac_key and id_key?
  • collapse some fixups

Anything else?

borg/crypto.pyx Outdated
cdef class AES_GCM_256_GMAC:
# Layout: GMAC 16 + IV 16 + CT (borg 1.2)
# additionally, each chunk starts with a type byte,
# which is not passed to this code or added by this code.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC for AES-GCM a 12 byte nonce/IV is recommended.

Copy link
Member Author

@ThomasWaldmann ThomasWaldmann May 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, it is the default. but recommended / why not more?

i mainly wanted to avoid the byte-frickling as seen in the legacy class.

found it, rfc 5084.

@enkore
Copy link
Contributor

enkore commented May 13, 2016

Do we want so split the implementation of cipher suites in two parts ("low level" part in crypto.pyx and "high level" part in key.py), or just keep it in one piece? I think it would be easier to just have it in one piece and have the KeyXXX class in key.py as it's also now. These cipher suites are kinda low level primitives anyway.

btw. I added a few notes how to handle IV storage now, since with multiple keys the final IV can't be retrieved from the manifest anymore. See the link in #1031

(And for repokey the key must also be stored locally for the same replay-safety as before)

@ThomasWaldmann
Copy link
Member Author

I'ld keep encrypt/mac and auth/decrypt in one (low-level) piece - like it is now - they can also deal with reading / writing the bytes from / to the right places into the buffer.

@enkore
Copy link
Contributor

enkore commented May 14, 2016

I read some more about AES-GCM. RFC5084 and others recommend the 12 byte nonce/IV for efficiency. Other sources (e.g. https://www.cryptopp.com/wiki/GCM_Mode) also recommend it for the same reason.

The authentication tag is always calculated as 128 bits wide and only truncated at the end of the operation to the requested size, so there are no performance gains in using a smaller authentication tag. RFC5084 gives no rationale for their recommendation here. In RFC5288 (AES-GCM for TLS 1.2) they use a 128 bit tag and a 96 bit nonce/IV. In RFC4106 the nonce is constructed from a 64 bit IV and a 32 bit salt, so it's 96 bits again, however, the only required tag size is 128 bit:

Implementations MUST support a full-length 16-octet ICV, and MAY
support 8 or 12 octet ICVs, and MUST NOT support other ICV lengths.

In http://csrc.nist.gov/publications/nistpubs/800-38D/SP-800-38D.pdf they also recommend the 96 bit nonce/IV for "simplicity and efficiency". This paper also says that for IVs >= 96 bits only a random IV for at least 96 bits should be used; so they recommend a completely random 96 bit IV (section 8.2).

Also (section 9)

A loss of power to the module shall not cause the repetition of IVs. If the generation unit
cannot recover from a loss of power, then
the authenticated encryption function shall enter a failure state until a fresh key can be established

Which wouldn't be a problem with a good random number generator for the IV.

What is interesting here is that repeating IVs seem to not only compromise the encryption (as with AES-CTR), but also the authentication.


Summary: Stick with 96 bit IV as everyone recommends, use a 128 bit tag for maximum security w/ no drawbacks?

@ThomasWaldmann
Copy link
Member Author

Yes, I'll change it. 96bit mac felt crappy anyway.

@enkore
Copy link
Contributor

enkore commented May 14, 2016

86c2b64: If I understood correctly there is an internal per-message counter in GCM appended to the external IV (if it is 96 bits, else to the derived 96 bit IV from however many bits where passed in), hence there is no IV to be retrieved. This is also the reason for the 64 GiB message length limit: A 4 byte CTR only allows 2**32-1 blocks = 64 GiB to be encrypted before the CTR would overflow and the IV where repeated (with the usual implosion of the universe following).

Names:

AES_CTR_..._legacy would serve both as the lower abstraction (formerly crypto.AES) for both the old (fixed) crypto code and (later) for a DEK cipher suite? Then it's not really legacy, remaining a valid choice. (It is strongly unforgable "to 256 bits" instead of "just" 128 bits in AES-GCM)

AES_GCM_256_GMAC: _GMAC is kinda redundant, _GCM already implies authentication using GCM/GMAC.

borg/crypto.pyx Outdated
def iv(self):
return self.ctx.iv[:16]
def current_iv(self):
raise NotImplemented # gcm mode does not maintain/increment the counter in self.ctx.iv
Copy link
Contributor

@enkore enkore May 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there's no interfacing mandating current_iv(): Just remove the method entirely? Since this is sensitive/crypto code it might be a good idea to explain in a comment why this doesn't make sense with AES-GCM.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll solve this by providing next_iv function, based on self.iv.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, seems like wikipedia was misleading concerning the counter in gcm mode.
i'll update the names.

@enkore
Copy link
Contributor

enkore commented May 14, 2016

For AES-GCM the message size (number of blocks) is not relevant to the 96 bit IV: When using "AES-GCM-CTR" (so to speak) the 96 bit IV represents the number of messages (EVP_Encrypt... EncryptFinal) encrypted, not the number of blocks.

I think it would be simpler to use a random 96 bit IV for AES-GCM and keep the IV handling as-is for the existing AES-CTR code. Or, alternatively, keep the IV handling as-is for AES-CTR, and do the 12 byte iv+=1 manually (as in 70e006e)

I feel like AES-GCM with a random IV would be safer in Borg considering "evil repository" scenarios where an attacker interferes with live operations (e.g. acting like a commit crashed, after roll-back the client may re-use CTRs that the attacker already saw during the rolled back transaction <-- and catching all of these cases could be challenging!)

@ThomasWaldmann
Copy link
Member Author

Yeah, I realized that after your previous comment. It would not malfunction btw, it would just be wasteful with counter values.

Random IVs is not part of this PR, so I'll just make it work in the "counting way" as we currently do it.

@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented May 14, 2016

I am not fixing existing potential attacks on the IV here, I am just implementing a new internal interface and want to make it work in a similar way as it worked until now.

So, to get IV stuff fixed, I suggest you file a bug about this first and we fix it outside of this PR.

What I was referring to when saying "function correctly, but wasteful" was that incrementing by 1 (0 for empty) is required for correct gcm function, but I was incrementing by num_aes_blocks which is >= 1 (and also 0 for empty). But it was wasting 2^32 / 16 counter values in extreme case.

@enkore
Copy link
Contributor

enkore commented May 14, 2016

I am not fixing existing potential attacks on the IV here, I am just implementing a new internal interface and want to make it work in a similar way as it worked until now.

Sorry, I didn't want to hijack the thread. I moved these thoughts to another ticket, which I should have done in the first place.

So, to get IV stuff fixed, I suggest you file a bug about this first and we fix it outside of this PR.

Absolutely :)

What I was referring to when saying "function correctly, but wasteful" was that incrementing by 1 (0 for empty) is required for correct gcm function, but I was incrementing by num_aes_blocks which is >= 1 (and also 0 for empty). But it was wasting 2^32 / 16 counter values in extreme case.

Ok. I think this is inevitable with AES-GCM and as intended by the GCM spec. In any case, not a problem - independent of CTR or random IVs one never wants to use a significant portion of the possible IV space.

borg/crypto.pyx Outdated
@@ -318,6 +329,7 @@ cdef class AES_GCM_256_GMAC:
if not EVP_DecryptInit_ex(&self.ctx, EVP_aes_256_gcm(), NULL, NULL, NULL):
raise Exception('EVP_DecryptInit_ex failed')
iv = self.fetch_iv(<unsigned char *> idata.buf)
self.set_iv(iv)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed so that next_iv() works.

we need that after loading / decrypting manifest to know the next IV we may use.

@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented May 14, 2016

encrypt:

has aad support now, but aad is not added to the returned data, so caller would have to prepend/append it, which needs another memcpy / creates string garbage. just always prepend aad in output data?

even with that, we also need the type byte (which is not part of aad in legacy mode), add a "anad" parameter for additional non-authenticated data and just prepend it at the very beginning?

decrypt:

nothing special needed, just need to check whether it works with memoryviews for data and aad, so we can give 2 views for them at different offsets / with different lengths on the whole input data.

@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented May 14, 2016

idea for encrypt:

just have a "header" and "aad" param and use it like: out = encrypt(data, header=type_byte+aad, aad=aad)

it would prepend the header to the cdata (but not include that in the mac computation).
it would not prepend aad (except if included in header), but include it in mac computation.
as type_byte+aad is of negligible size, this doesn't cause lots of overhead or garbage.

borg/crypto.pyx Outdated
if not HMAC_Final(&self.hmac_ctx, hmac_buf, NULL):
raise Exception('HMAC_Final failed')
if CRYPTO_memcmp(hmac_buf, idata.buf+hlen, 32):
raise Exception('Authentication failed')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IntegrityError

borg/crypto.pyx Outdated


class IntegrityError(CryptoError):
"""Integrity checks failed. Corrupted or tampered data."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helpers.IntegrityError <=> crypto.IntegrityError?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wanted to avoid potentially cyclic imports and not import stuff from higher-level python modules into lower-level cython stuff, but yes, in the end 1 is enough.

@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented May 15, 2016

I am thinking about replacing decrypt's aad param by aad_offset.
We already have header_len, so aad_offset would just say where in the header the aad starts.
Of course that would require that the full aad is included in the header, but it always is, right?

For legacy format: aad_offset=1 header_len=1 (skip type byte, no other header aad)
For future formats, likely: aad_offset=0 header_len=X (all header is aad).

@enkore
Copy link
Contributor

enkore commented May 15, 2016

So

  • encrypt(data, header, aad_offset=0) -> envelope
  • decrypt(envelope, header_len, aad_offset=0) -> data, header (or just data)

?

@ThomasWaldmann
Copy link
Member Author

Could be done like that, but we'ld need to peek into the header before even knowing which decrypt function we want to call.

@enkore
Copy link
Contributor

enkore commented May 15, 2016

Yep. I'm thinking whether there's a case where that would be useful, but none occur to me, so let's stick with decrypt(envelope, …) -> data

Names: I think it'd be better to use specific names.

  • The encrypted stuff with headers = envelope (the old code already called it by that name in error messages etc)
  • Then data = only used for plaintext data. Synonymous to payload, but let's stick with data.

@ThomasWaldmann
Copy link
Member Author

OK, I am updating api / code to:

    encrypt(data, header, aad_offset=0) -> envelope
    decrypt(envelope, header_len, aad_offset=0) -> data

borg/crypto.pyx Outdated
"""A thin OpenSSL wrapper"""
"""An AEAD style OpenSSL wrapper

Note: AES-GCM mode needs OpenSSL >= 1.0.1d due to bug fixes in OpenSSL.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in docs/installation.rst ("From Source" section)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this changeset is still intentionally isolated (just crypto.pyx + test), integration will happen in 1.2.

@ThomasWaldmann
Copy link
Member Author

ok, rebased again, collapsed some fixup changesets. guess i am finished here.

@enkore enkore merged commit 7d02c7e into borgbackup:master Jul 29, 2017
@ThomasWaldmann ThomasWaldmann deleted the crypto-aead branch July 29, 2017 11:32
@ThomasWaldmann
Copy link
Member Author

\o/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants