Use randomized key derivation instead of relying on stateful nonces #3814

Closed
DemiMarie opened this issue May 12, 2018 · 34 comments · Fixed by #6463

@DemiMarie

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

Bug

System information. For client/server mode post info for both machines.

N/A

Describe the problem you're observing.

Borg currently uses the same AES key for all encryption. This is an anti-pattern and is the root cause of Borg’s weaknesses around nonce management.

A much better solution is to generate a fresh encryption key every time as (say) blake2b(my_personality_string, random_number, key). This has the following advantages:

  • There is no reliance on storing nonce state on the server, since keys are generated from fresh randomness each time; a randomly generated nonce can then be used instead of a stateful counter.
  • We can use AEAD ciphers, such as AES-GCM or ChaCha20-Poly1305, which have formal security reductions to the security of the underlying cipher.
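
A minimal sketch of the idea in Python (hashlib.blake2b; master_key, PERSONALITY and the 24-byte salt size are illustrative choices, not a concrete proposal for Borg's layout):

import os, hashlib

PERSONALITY = b"borg-chunk-enc01"   # shared by all Borg users; public (max 16 bytes for blake2b person=)
master_key = os.urandom(32)         # the long-term secret

def fresh_key():
    salt = os.urandom(24)           # new randomness for every encryption, stored in the clear
    key = hashlib.blake2b(salt, digest_size=32,
                          key=master_key, person=PERSONALITY).digest()
    return salt, key                # only master_key must stay secret; salt and personality are public

Because every chunk gets its own key, the AEAD nonce no longer has to be unique across the repository, so no counter state needs to survive on the server.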
@DemiMarie
Author

This is likely a CVE-worthy security vulnerability, but as it was already documented I am reporting this publicly. This must be fixed before borg can safely be used with QubesOS.

@enkore
Contributor

enkore commented May 14, 2018

It's a fairly well-known bug, though (e.g. #1044 / #1031). But it is of course correct, encryption in attic/borg is conceptually broken.

@DemiMarie
Author

DemiMarie commented May 15, 2018 via email

@ThomasWaldmann ThomasWaldmann added this to the helium milestone Oct 7, 2018
@t4777sd

t4777sd commented Nov 6, 2018

Any plan to fix this? Looks like the security of borg is broken

@ThomasWaldmann
Member

ThomasWaldmann commented Nov 6, 2018

It depends on how you use it, using 1 client per repo is fine (see the docs for more details).
Even multiple clients are no problem except for the case of an evil repo server tricking you into nonce reuse.

@kroki

kroki commented Dec 19, 2019

@DemiMarie , just to clarify, by "random key" this issue means "random Nonce", right? Otherwise it's not clear how to decrypt data later.

If I'm correct, then blake2b(my_personality_string, random_number, key) is perhaps not an optimal choice:

  • since my_personality_string and key are constants for a particular client, the expression is functionally the same as blake2b(random_number) if we assume that random_number is indeed random enough, so cross-client collision is unlikely (either it is truly random, or pseudorandom with a truly random seed).
  • for pseudorandom numbers, if seeds collide then the random sequences also collide in each value. You could use my_personality_string to resolve collisions, but to my understanding this only shifts the problem to "if MIX(my_personality_string, seed) collides...". So it's worth limiting the scope of such collisions instead.
  • for implementation reasons it will be more convenient to choose the same cipher used for data encryption, i.e. AES instead of Blake2b.
    #4889 (Migrate from CTR encryption mode to CBC) suggests ENCRYPT(RANDOM xor MAC(plain-text)) and I have the impression that this (or some variation of the idea) would be a better choice.

Or do I misunderstand your idea completely?

@DemiMarie
Author

@kroki I do in fact mean “random key”. The random number (salt) and personality string would be stored (unencrypted) along with the ciphertext, so the key can be recomputed for decryption. They need not (and must not) be kept secret. The personality string can be shared amongst all users of Borg, while the random salt must be unpredictable, but is safe to publish. Of course, the master key must be kept secret.

Random nonces are not okay because they are usually only 96 bits, which is not enough to ensure that collisions will not happen. An AES-256 key is 256 bits, which is sufficient. If the key derivation function (KDF) provides enough data (which both HKDF and Blake2b do), additional data from the KDF can be used for the nonce.

The salt and personality string must, along with their lengths, be included in the computation of the authenticator. If the KDF output is used for authenticated encryption, this is sufficient.

MAC-then-encrypt and MAC-and-encrypt are not good choices, as neither is secure in general. They can be secure, but that requires significant extra care. Encrypt-then-MAC just works.

AES is not a good choice for key derivation because it is vulnerable to related-key attacks. A true hash function is a far, far better choice. I suggest using libsodium’s Blake2b implementation, which is both portable and very heavily optimized. Alternatively, HKDF can be used.

I strongly recommend using libsodium’s crypto_secretstream API for bulk data encryption.
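
To make this concrete, a rough sketch (blake2b as the KDF and ChaCha20-Poly1305 from the cryptography package; the header layout and names are illustrative only, not a proposal for Borg's on-disk format):

import os, struct, hashlib
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

def _derive(master_key, person, salt):
    # one blake2b call yields both the encryption key and the nonce
    okm = hashlib.blake2b(person + salt, digest_size=44, key=master_key).digest()
    return okm[:32], okm[32:]

def _aad(person, salt):
    # salt and personality string, each length-prefixed, go into the authenticator
    return struct.pack("<I", len(person)) + person + struct.pack("<I", len(salt)) + salt

def seal(master_key, person, plaintext):
    salt = os.urandom(24)                                   # public, stored with the ciphertext
    key, nonce = _derive(master_key, person, salt)
    return salt + ChaCha20Poly1305(key).encrypt(nonce, plaintext, _aad(person, salt))

def open_(master_key, person, blob):
    salt, ct = blob[:24], blob[24:]
    key, nonce = _derive(master_key, person, salt)          # recomputed from the public values
    return ChaCha20Poly1305(key).decrypt(nonce, ct, _aad(person, salt))  # raises InvalidTag if tampered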

@enkore
Contributor

enkore commented Dec 22, 2019

^ on point

@ThomasWaldmann
Member

@DemiMarie I had a look at libsodium a while ago, back then they did not support accelerated AES256 (something that borg needs).

I felt that depending on openssl (for that) and on libsodium (for other stuff) is not desirable.

@ozppupbg

ozppupbg commented Jun 2, 2020

Based on the ideas from #1031 / @enkore:
crypto-draft-1.2

I would remove the id_key from the HKDF input
ikm := enc_key || enc_hmac_key
set the salt as the chunk ID
salt := chunk_id
and also generate the IV via the HKDF.
This would allow reducing the envelope to:
envelope := type_byte(1) || auth_tag(16) || body

With a SHA512 HKDF this should not increase the number of required hash calculations and at the same time reduce the storage overhead.
Edit: I just noticed that the session key would have to be computed only once, while this variant requires a derivation for every chunk.
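
Roughly what I have in mind, as a sketch (cryptography package's HKDF with SHA-512 plus AES-GCM; the info string and the type byte value are made up):

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_chunk(enc_key, enc_hmac_key, chunk_id, compressed_body):
    # ikm := enc_key || enc_hmac_key, salt := chunk_id, key and IV both taken from the HKDF output
    okm = HKDF(algorithm=hashes.SHA512(), length=32 + 12, salt=chunk_id,
               info=b"borg chunk encryption").derive(enc_key + enc_hmac_key)
    key, iv = okm[:32], okm[32:]
    ct_with_tag = AESGCM(key).encrypt(iv, compressed_body, None)   # 16-byte tag is appended by AESGCM
    body, tag = ct_with_tag[:-16], ct_with_tag[-16:]
    type_byte = b"\x04"                                            # made-up value
    return type_byte + tag + body                                  # envelope := type_byte(1) || auth_tag(16) || body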

The only case where there could be key & IV reuse would be if the same plaintext is re-encrypted with a different compression (or compression parameters), but as far as I understand, this should not happen in normal operation.

Is there any case where this might happen?
If yes, could this be solved by adding the compression parameters to the HKDF info or adding a 32 or 64bit random nonce?

Or are there other (cryptographic) problems, which I'm not seeing right now?

@DemiMarie
Author

Or are there other (cryptographic) problems, which I'm not seeing right now?

Sadly, there are, although they are easy to miss.

Consider the case where a system is snapshotted and then restored from the snapshot more than once. Both restored systems will likely perform different operations, so when they run Borg again, they will encrypt different data. However, Borg is not aware of the snapshot, so nonces will be reused. Using sufficient randomness (at least 192 bits) in the key derivation prevents this.

The session ID should be increased to 192 bits to prevent such attacks, but it does not need to be stored separately in each chunk.

@ozppupbg

ozppupbg commented Jun 3, 2020

@DemiMarie
I see your point, but I don't see (yet?) why it poses a problem:
The chunk ID is a keyed hash (MAC) of the chunk plaintext, compression parameters are usually static throughout the repository and the compression algorithms should(?) be deterministic.
So, even in case of a snapshot I don't see why it should use the same key/IV for different data - the same data maybe, but to my knowledge this should not be a problem?

And I don't see a good way to not store the session ID with every chunk. There has to be a way to know which chunk has been encrypted with which session ID. And the only way to save some space would be some level of indirection, which is difficult to handle and introduces additional points of failure in case of data corruption.

@pgerber
Contributor

pgerber commented Jun 3, 2020

The only case where there could be key & IV reuse would be if the same plaintext is re-encrypted with a different compression (or compression parameters), but as far as I understand, this should not happen in normal operation.

There is the --recompress argument for borg recreate that allows recompressing existing archives with a different algorithm and parameters.

@ozppupbg

@pgerber thank you for the input!

@DemiMarie
What is your rationale for moving to 192 bits?
Based on the table here, if a backup job runs with 10 parallel threads or sessions every minute, it would take almost 5000 years to reach a collision probability of 10^-18, and 4.9467 * 10^11 (almost 500 billion) years for a 1% collision chance.
So, even with this quite excessive backup strategy, I don't see any reason to increase the session ID size.
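
(For reference, those numbers follow the usual birthday bound p ≈ n²/2^(k+1) for n random 128-bit session IDs; a quick sanity check:)

# birthday bound: p ~= n^2 / 2^(k+1) for n random k-bit session IDs
k = 128
sessions_per_year = 10 * 60 * 24 * 365            # 10 parallel sessions, every minute, all year

def years_until(p):
    n = (p * 2 ** (k + 1)) ** 0.5                  # number of sessions to reach collision probability p
    return n / sessions_per_year

print(years_until(1e-18))   # ~5.0e3  years until a 10^-18 collision chance
print(years_until(0.01))    # ~4.9e11 years until a 1% collision chance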

To cope with the recreate issues, it makes sense to drop the use of the chunk ID in the key generation and move back to something closer to the initial approach I was referring to:
To further limit the impact of collisions, the 32bit (preferably random) nonce could be extended with 64 or more bits of the chunk ID. So, even if a session ID collided, the collision would also have to happen on the same chunk ID and with the same nonce in order to pose a problem.

As a side note: I already have two ideas for how to ensure a semi-random nonce without bookkeeping overhead.

@DemiMarie
Author

@ozppupbg Using a keyed MAC as a nonce is known as SIV (Synthetic Initialization Vector) mode, and is considered secure provided the nonce size is large enough. XChaCha20-Poly1305 with random nonces is fine.

However, for this security proof to hold, the following must be true:

  • The nonce must be sufficiently long that the probability of nonce reuse is negligible. A 192-bit nonce is considered long enough for this, while shorter nonces might not be.
  • The data passed into the MAC must be the same as the data being encrypted. Compression between these steps voids the proof, and another proof would need to be written.

The biggest reason for using long, random nonces is simplicity. Complexity is the enemy of security, especially in cryptography. Using long, random nonces is simple and correct. Yes, it adds overhead, but compared to the size of a chunk (probably >10KiB) the overhead of a 24 byte nonce is negligible.
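
For illustration, the random-nonce variant is just this (a sketch assuming PyNaCl's low-level XChaCha20-Poly1305 binding; not Borg code):

import os
from nacl import bindings   # PyNaCl

def seal(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(24)          # 192-bit random nonce: reuse probability is negligible
    ct = bindings.crypto_aead_xchacha20poly1305_ietf_encrypt(plaintext, None, nonce, key)
    return nonce + ct               # nonce is stored in the clear next to the ciphertext

def open_(key: bytes, blob: bytes) -> bytes:
    nonce, ct = blob[:24], blob[24:]
    return bindings.crypto_aead_xchacha20poly1305_ietf_decrypt(ct, None, nonce, key)

No counters, no persisted state; the 24-byte nonce plus the 16-byte tag vanish next to a typical chunk.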

@vvug

vvug commented May 19, 2021

The broken crypto seems to be the main reason why multi-client on a single repo is discouraged in the docs [1]. I guess this should get more priority?

[1] https://borgbackup.readthedocs.io/en/stable/faq.html#can-i-backup-from-multiple-servers-into-a-single-repository

@andrewchen5678

+1, this is the only reason that holds me back from using borg.

@enkore
Contributor

enkore commented Jan 1, 2022

@DemiMarie After re-reading this and #1031 I think the only question left open here is: Do we want to derive a random key for each chunk (necessitating a sufficiently long random nonce, >192 bits) or do we want to derive a random session key, which wouldn't need as long a nonce (because there are many fewer sessions than there are chunks) and is faster (no per-chunk KDF).

I don't see anything immediately wrong with #1031 (comment) (though I would remove the ID key as a KDF input) except that one might feel a random 128 bit session key is too short. On the other hand, Borg has pretty high per-session costs, so even a small session tends to take a few seconds to complete. So even when doing small transactions on a repo 24/7 it'd take a hundred years or so to do 2**32 transactions, at which point the chances of colliding in a 128 bit space are negligible.

@DemiMarie
Author

@enkore A random session key is perfectly fine. I would go with a 256 bit session key out of an abundance of caution: the performance and space overheads are negligible, and it fits the general rule that a key should be twice the security parameter.

@ThomasWaldmann
Member

https://soatok.blog/2021/11/17/understanding-hkdf/ just stumbled over that. huh?

@ThomasWaldmann
Member

ThomasWaldmann commented Mar 17, 2022

[diagram: borg-crypto]

Took @enkore's diagram and re-did it:

  • using FOSS software (libreoffice draw crashed less often than inkscape on my MBA), exportable to .svg
  • adding the RNG and CTR
  • adding color for active components, white is all sorts of data
  • adding arrows for better indication of data flow direction
  • removed id_key from hkdf ikm, visualized it being an input for the id-mac-generator, obviously
  • grew the SessionID to 192 bits of random
  • just used HKDF and AEAD instead of more specific algorithm specifications (candidates obviously being the stuff we have in master branch)
  • kept semantics as in enkore's design to have a "mostly as is" starting point

Unclear to me:

  • how does indexing (including rebuilding a lost index) on the repo level work if the chunk ID is not contained in the output format (see bottom row)? A: Repository.write_put(id, data) adds the crc32, put header, chunk id in front of the data.
  • do we need to change according to soatok's article (we have random IKM, so guess we are not required to)?
  • 32bit MessageIV (message counter): if we have big chunks, this would suffice. but what if rather small chunks are used, could we overflow that?
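
For orientation, a rough sketch of the session flow in the diagram (HKDF/AEAD kept generic, as in the picture; names and the info string are illustrative):

import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

class Session:
    def __init__(self, enc_key, enc_hmac_key):
        self.session_id = os.urandom(24)        # 192 bits of fresh randomness per session
        session_key = HKDF(algorithm=hashes.SHA512(), length=32, salt=self.session_id,
                           info=b"borg session key").derive(enc_key + enc_hmac_key)
        self.aead = AESGCM(session_key)
        self.message_iv = 0                     # per-session counter, restarted for every session

    def encrypt(self, chunk):
        iv = self.message_iv.to_bytes(4, "big") # 32-bit MessageIV; raises OverflowError past 2**32 - 1
        self.message_iv += 1
        nonce = bytes(8) + iv                   # zero-extend the counter to the 12-byte GCM nonce
        return self.session_id + iv + self.aead.encrypt(nonce, chunk, None)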

@ThomasWaldmann
Member

From @enkore in #1031:
"""
Key generation using the TAM scheme means that replication is hindered

Therefore, adjusting the design slightly from "very compatible [but conditionally blows up with old versions that don't understand new formats and maybe it's hard to tell which things are affected by that]" to "explicitly enable it, locking older versions out".

In that case we can add some dedicated IKM to the keyblob, which would allow replication. Also has the advantage that it's completely clear for the entire repository whether it's backwards-compatible. Also allows to just use this scheme for everything incl. the manifest.
"""

Yeah, guess we want to explicitly enable this, add the extra ikm key and lock out old clients. Maybe the #747 related changes will require that also when switching over to the better stuff.

@DemiMarie
Author

  • 32bit MessageIV (message counter): if we have big chunks, this would suffice. but what if rather small chunks are used, could we overflow that?

I suggest using 64 bits or more.

@ozppupbg

@ThomasWaldmann : no objections from my side.

how does indexing (including rebuilding a lost index) on the repo level work if the chunk ID is not contained in the output format (see bottom row)? A: Repository.write_put(id, data) adds the crc32, put header, chunk id in front of the data.

Why is that even necessary? In the worst case the ID could be reconstructed from the decrypted data.
It would be much more expensive, but for most AEAD schemes it would work (e.g. ChaCha20-Poly1305, AES-GCM, AES-OCB). It would not work with schemes for which the additional authenticated data changes the state of the encryption, for example Sponge-Duplex based constructions.

do we need to change according to soatok's article (we have random IKM, so guess we are not required to)?

I don't see any issues there. If, for some reason, we need more keys/randomness there is also the option (which I think was not mentioned in the article) to generate a longer HKDF output and split the result into different parts as needed.

32bit MessageIV (message counter): if we have big chunks, this would suffice. but what if rather small chunks are used, could we overflow that?

I suggest using 64 bits or more.

Why? Yes, it could overflow, but we are in total control of the counter:

  • It is restarted at zero for every new session
  • We can easily detect if it overflows
  • So it should be easy to just start a new session (at least encryption-wise) in the rare cases this happens

Key generation using the TAM scheme means that replication is hindered

Which "replication" is mentioned here?

@DemiMarie
Author

DemiMarie commented Mar 18, 2022

how does indexing (including rebuilding a lost index) on the repo level work if the chunk ID is not contained in the output format (see bottom row)? A: Repository.write_put(id, data) adds the crc32, put header, chunk id in front of the data.

Why is that even necessary? In the worst case the ID could be reconstructed from the decrypted data. It would be much more expensive, but for most AEAD schemes it would work (e.g. ChaCha20-Poly1305, AES-GCM, AES-OCB). It would not work with schemes for which the additional authenticated data changes the state of the encryption, for example Sponge-Duplex based constructions.

This is a really bad idea. Authentication should be done first, then decryption. (Yes, that means that streaming AEAD APIs are bad!). If unauthenticated plaintext is disclosed to an attacker there is a security vulnerability. While using it to compute a message ID is probably safe, on the grounds that the message ID leaks no information about the data, it is still highly ill-advised.

Just stick the message ID in front of the data and treat the AEAD algorithm as a black box.
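
Sketch of what I mean (repo, aead_decrypt and compute_id are hypothetical stand-ins, not Borg functions):

def write_put(repo, chunk_id, envelope):
    # the chunk id sits in the clear in front of the AEAD envelope; the server can
    # rebuild its chunkid -> location index without ever touching key material
    repo.append(chunk_id + envelope)

def read_chunk(key, record):
    chunk_id, envelope = record[:32], record[32:]
    plaintext = aead_decrypt(key, envelope)        # authenticates before releasing any plaintext
    assert chunk_id == compute_id(key, plaintext)  # only afterwards cross-check the stored id
    return plaintext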

@ozppupbg

@DemiMarie
I agree, it would have to be carefully designed; e.g. it must check that the authentication tag matches the newly calculated id before any further processing.

I was not recommending this, just pointing out options.

@ThomasWaldmann
Member

ThomasWaldmann commented Mar 18, 2022

@ozppupbg for better efficiency, borg check (low level repository part) runs locally on the server side (no crypto key there) and we need to be able to rebuild the repo index within that check (it maps chunkid -> (segment, offset)), so we'll be able to quickly find a chunk's location within the segment files in the repo.

Also, when compacting the repo (also done server-side), the index must be updated about the new chunks' locations.

32bit overflow: correct, we could detect and start a new session. Or use a bit longer counter and have simpler code.

About replication: guess what he meant is being able to efficiently transfer archives (or just chunks) from one repo to another.

@DemiMarie yeah, 64bit would solve it for sure, but I guess that would also be a bit overkill. While thinking about the layout, though, alignment and maybe keeping some bits reserved for later came to mind.

How about:

# note: | means 64bit alignment
suite:4 keytype:4 reserved:8 messageIV:48 | sessionID:192 | AuthTag:128 | PayLoad...

About the small chunks case: there are other limits related to this, e.g. all the chunks have an entry in the in-memory chunks index hashtable. So, while running against the 32bit boundary is conceivable, that would already use 4G * sizeof(hashtable_entry) of RAM (== at least 160GB for 40B entries). And you'd need 4G small chunks / small files for that.
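
A quick packing/parsing sketch of that layout (big-endian and the exact field placement are just one possible choice):

import struct

def pack_envelope(suite, keytype, message_iv, session_id, auth_tag, payload):
    assert suite < 16 and keytype < 16 and message_iv < 2 ** 48
    assert len(session_id) == 24 and len(auth_tag) == 16
    word = (suite << 60) | (keytype << 56) | message_iv     # reserved:8 stays zero
    return struct.pack(">Q", word) + session_id + auth_tag + payload

def unpack_envelope(blob):
    (word,) = struct.unpack(">Q", blob[:8])
    suite, keytype = word >> 60, (word >> 56) & 0xF
    message_iv = word & (2 ** 48 - 1)
    return suite, keytype, message_iv, blob[8:32], blob[32:48], blob[48:]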

@DemiMarie
Author

# note: | means 64bit alignment
suite:4 keytype:4 reserved:8 messageIV:48 | sessionID:192 | AuthTag:128 | PayLoad...

That’s fine, but be sure to check for messageIV overflowing 48 bits and return an error.
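
E.g. a minimal guard (illustrative only):

MAX_MESSAGE_IV = 2 ** 48 - 1

def next_message_iv(session):
    if session.message_iv > MAX_MESSAGE_IV:
        # never wrap the counter; force the caller to open a fresh session
        # (new random sessionID -> new derived session key) instead
        raise OverflowError("messageIV exceeds 48 bits, start a new session")
    iv = session.message_iv
    session.message_iv += 1
    return iv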

@ThomasWaldmann
Member

[diagram: borg-crypto (revised)]

Changes:

  • more bit-difference in suite (old: 0000, new: 1010)
  • 8bit reserved (00h for now)
  • 48bit MessageIV
  • rearranged for 64bit alignment
  • cosmetic changes

@ThomasWaldmann
Member

Hmm, just noticed: we don't have a single bit there which could be used by server-side code, because all of it is protected by the auth tag and that stuff can only be changed client-side.

@DemiMarie
Author

Hmm, just noticed: we don't have a single bit there which could be used by server-side code, because all of it is protected by the auth tag and that stuff can only be changed client-side.

That’s a good thing 😆. Server-side metadata should be separate.

@ozppupbg

Hmm, just noticed: we don't have a single bit there which could be used by server-side code, because all of it is protected by the auth tag and that stuff can only be changed client-side.

That’s a good thing 😆. Server-side metadata should be separate.

I agree, what would this be good for?

@ThomasWaldmann
Member

Was just a vague idea. Like maybe some flags for bookkeeping of long running operations.

OTOH, that would require modifying the segment files in-place, which is something we intentionally do not do (we only append at the end and delete unused stuff, but never modify).

@ThomasWaldmann
Member

ThomasWaldmann commented Mar 22, 2022

Would be cool if PR #6463 could get some review.
