Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add flexible encryption chunk header #1031

Closed
ThomasWaldmann opened this issue May 7, 2016 · 20 comments · Fixed by #6463
Closed

add flexible encryption chunk header #1031

ThomasWaldmann opened this issue May 7, 2016 · 20 comments · Fixed by #6463
Assignees
Milestone

Comments

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented May 7, 2016

I've looked at the experimental branch, thinking about what to get from there and how.

Compression

The flexible compression first done there was meanwhile implemented in borg - a bit differently. The idea here was adding a 2-byte header in front of the compressed data that tells the (de)compression algorithm. Plus the small hack that we can keep existing zlib chunks without header, because we can determine zlib from first 2 zlib-output bytes also. We just must not use zlib-looking 2 type bytes for all other compression algorithms.

Old: N bytes zlib,6 output (always, attic heritage)

New: 2 bytes compression type (except for zlib) + N bytes compressor output

This is nice for layering, we just give the decompressor the whole bytestring and it can make up everything from that (choosing the right algo internally). The compressor just prepends the right 2 bytes when returning its output.

Encryption

I'ld like to add flexible encryption in a similar way - see link in next post.

@enkore
Copy link
Contributor

enkore commented May 7, 2016

Mmmhh, as I mentioned in that thing it imho makes more sense to think in terms of cipher suites that provide AEAD in a black-box fashion, without depending on having separable MAC and ciphers. Which also removes possible issues with what's MACed and what is not. For example, in the example you give the IV looks un-MACed, which is very problematic for CBC.

I dunno if I wrote it correctly in the above link, but in a format like key_type(1) + dek_id(4) + payload(n) the ciphersuite needs to know both the key_type and dek_id so they can be authenticated as well. Not doing that is problematic for encrypt-then-mac (less for the key_type, possibly very much for dek_id or equivalent kinds of fields).

@enkore
Copy link
Contributor

enkore commented May 7, 2016

Yes, so why not push all that (IV, envelope MAC) out of the chunk format into blackbox-ciphersuite-do-it-yourself? ;)

Adding a field (not necessarily using it right now) to identify different keys (associated with a ciphersuite, since reusing a key for different ciphersuites is potentially dangerous[1]) makes sense also (because otherwise we wouldn't really change anything)

[1] ... using the "master" key for any other crypto than what's there right now is IMHO unsafe. So we actually really need separate keys to go with crypto flexibility. So, yeah.

@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented May 8, 2016

cipher, hmac type could be stored with the key, right, but the IV needs to be stored with the chunk as it is different starting value per chunk (either a counter or random).

we maybe could also generate a new random "chunk key" per chunk, always start with IV=0 at chunk start and store the key with the chunk (encrypted by the master key), no need to store the IV in this case. No counter/IV management problems, even with multithreading.

@enkore
Copy link
Contributor

enkore commented May 8, 2016

but the IV needs to be stored with the chunk as it is

Absolutely. I'm just saying that that should be the responsibility of the ciphersuite, not some metadata prepended to the encrypted payload (output of the ciphersuite). I.e. only have key_type + key_id + envelope; ciphersuite identified by key_id then knows key_type, key_id and the payload for encryption and handles all that itself. This doesn't mix data and responsibility of different layers ("finding keys" and "doing crypto-yt things").

@ new key per chunk: would need lots of high quality randomness though. (so I think this would wander very quickly into unsafe territory). Would only work with ciphersuites who have key length == master cipher block length; otherwise would have to keep the IV for that encrypted block as well. It would also need to be authenticated. I also don't really see a good reason to do this, even the most "demanding" ideas for crypto don't want per-chunk key management.

I don't see any problems with IV management for multithreading. If you are able to use multiple keys, then each thread would allocate one key (no problems); if you don't only one thread can encrypt (or some API gymnastics with a shared CTR and you update that before you encrypt some data and it won't work and is probably utterly unsafe).

@enkore
Copy link
Contributor

enkore commented May 8, 2016

I updated the link above with some ideas from here.

  • Use msgpack + structured data for keys instead of passwd-like stuff
  • 16 byte key IDs instead of 4 byte --> maybe just use UUIDv4 format.
  • Clarified that the key ID must be authenticated in payload blocks
  • Added an API proposal
  • Removed "user interface" section

w.r.t. to algorithms (offtopic to the proposal):

  • Keep AES-256-CTR + HMAC-SHA256 (it works™)
  • Add AES-256-GCM (much faster than any other AEAD when hardware accelerated)
    • Quick test: aes-256-gcm seems to be a little bit faster than two-pass CTR+HMAC even without hardware accel.
  • Asap: chacha20 + poly1305 for us poor people with no hw crypto ;D
    • Depends on: OpenSSL 1.1 (or alternative crypto lib)

@ThomasWaldmann
Copy link
Member Author

hmm, not sure if i like msgpack for the keys - something readable feels better there. json?

key IDs: yes, random IDs are easier than "finding next free number", esp. when dealing with multiple clients. alternatively, if could be a function of the key, like crc32 (maybe we want something short?) or md5. One could use that to find bitflips in keys.

i agree about the algos.

@enkore
Copy link
Contributor

enkore commented May 8, 2016

Since the "DKID" is in plain-text I don't feel to comfy deriving them directly from the crypto keys. Maybe encrypt a well-known phrase with the initial state when creating new keys and use the result as the ID? [I.e. in the proposed API CipherSuite.create() would make up the ID in a pseudo-random way the cipher suite sees fit].

Alternatively: ID = HMAC_of_master_key(packed_dek). We use that function all the time anyway.

msgpack vs readable: For the keyblob it doesn't matter, since it's an encrypted|authenticated blob anyway. When we (later) want to be able to import/export them JSON makes sense (most of it would still be a blob, but the ID + cipher suite would be readily readable, and it's much easier to integrate with higher-level tools).

While JSON is not injective (<-- there is a correct term for this specific property, that there is only one exact encoding for some data [=injective] or multiple encodings [=not injective]) our usual way of packing things with msgpack is, so an imported DEK can still be validated (pack JSON, HMAC_of_master_key, done).

@ThomasWaldmann
Copy link
Member Author

btw, with a "none" ciphersuite (no encryption, no mac), Plaintext mode could be integrated into this mechanism.

@enkore
Copy link
Contributor

enkore commented May 8, 2016

w.r.t. asymmetric crypto I think that the approach I took in the link is sufficient, i.e. have Borg not support it directly, but having create-and-exportable keys that can be used with any higher-level key management (GPG, walletd, whatever).

@enkore
Copy link
Contributor

enkore commented May 8, 2016

btw, with a "none" ciphersuite (no encryption, no mac), Plaintext mode could be integrated into this mechanism.

Yes, I think that makes sense. In the mean time it shouldn't be used by default, though, keeping compatibility with old versions. I'd also say that by default the archive meta chunk and the items chunk stream should be encrypted with the master key unless explicitly told to not do that, since that also keeps compatibility (for borg create,list,..., but not check,extract,..) with old versions.

For 2.0 we can then remove the "legacy" way.

@ThomasWaldmann
Copy link
Member Author

there could be some predefined DKIDs, like:
old type byte plaintext -> dkid 0 -> ciphersuite "none"
old type bytes keyfile/repokey/passphrase -> dkid 1 -> ciphersuite "aes-ctr-256-hmac-sha-256" (default key)
any other dkid -> whatever

@enkore enkore added this to the 1.2 sequential threading milestone May 8, 2016
@enkore
Copy link
Contributor

enkore commented May 9, 2016

Since the "DKID" is in plain-text I don't feel to comfy deriving them directly from the crypto keys. Maybe encrypt a well-known phrase with the initial state when creating new keys and use the result as the ID? [I.e. in the proposed API CipherSuite.create() would make up the ID in a pseudo-random way the cipher suite sees fit].

Alternatively: ID = HMAC_of_master_key(packed_dek). We use that function all the time anyway.

Hm. Having deterministic, repo-independent DKIDs can have some nice uses (e.g. fast copies between repos sharing the same set of keys). So, let's say we make them deterministic. We don't set the scheme, but have CipherSuite.create (referring to the draft spec) do it. I don't see any problems with it; when sharing keys it can be seen in the data, but that's okay I'd say.

Contract:

  1. DKID is deterministic, only depends on the DEK (and all constant fields of it)
  2. The implementation / cipher suite must be able to
    1. Verify the DKID against a DEK later
    2. Ensure that no key data is leaked through the DKID, since the DKID is considered public knowledge. This includes the algorithms used.

AES-CTR...HMAC... could then do something like mentioned above: Pack the DEK, do the normal payload encryption with IV=very_very_large (remember that the IV has 16 bytes, but only 8 are stored in normal chunks -- at least currently, but I see no reason to change that -- so using a very large, well-known IV deters IV reuse attacks if they would even be possible somehow here), then the DKID = envelope HMAC.

For other cipher suites this might look totally different (aes-gcm, chacha+poly is pretty much the same story as aes+hmac).

@enkore
Copy link
Contributor

enkore commented May 9, 2016

That would also solve the "predefined DEKs" problem. For encrypted repos the DKID of a (converted) master key would be random (no problem there), things like none/plaintext have the same DKID everywhere (since it's the same key). Bingo.

@textshell
Copy link
Member

textshell commented Jul 3, 2016

I believe storing changing data like last_iv is the keyblob is a very bad idea. Users are supposed to have backups of the keyblobs to be able to restore. While a outdated last_iv will not prevent a restore this still makes the keyblob a mutable data structure which means users can’t even reliably check if their offsite keyblob backups are ok.
In general i think we really shouldn’t have chipers that need last_iv at all. But for the legacy chiper we are going to need some workaround. But i believe it is better to design it as a deprecated workaround from the start and don’t repeat that mistake with other cipher suites.
In short: Make changing operations on the keyblob explicit (change-passphrase, add-key, remove-key) and have it immutable in all other cases. Maybe even print a message if these operations are used to remind the user to update all backups of the keyblob.

@enkore
Copy link
Contributor

enkore commented May 24, 2017

https://gist.github.com/enkore/d16849b9e2eecdab0903bcd37bd0ee27

Requirements

(2016-09-02), R1...R5

  1. Algorithm agility
  2. No IV issues, specifically, no global counters, no IV reuse and/or IV spaces smaller than 96 bits
  3. Apt for multithreaded en- and decryption (decryption should never be a problem here)
  4. The keyblob (file/repokey) has to be time-invariant unless the user specifically invokes an action that has the clear consequence of changing it. This specifically prohibits changing the default semantics of existing commands.
  5. Any new crypto has to be authenticated encryption

Thus

session-keys-v2

  • Linear extension of the current type tagging:
    • Key type bytes so far are all <0x0f, therefore we can use the upper nibble.
    • Upper nibble becomes suite_id. 0x0 is the current value and is associated with current crypto. Easy!
    • Nibbles are big enough. And when not, we can always add 0xf to mean that extra byte(s) identify the type or suite.
    • (In the pic above the two nibbles are swapped --- you get what I mean)
  • Fixes borg check --verify-data always decompresses #1573 by encoding the chunk ID in the auth tag
  • Less overhead than current crypto
  • No IV problems. Uses 32 bit message IV (counter); Chacha20-poly1305 appends it's own 32 bit block-counter to the external IV (just like AES-GCM or AES-OCB).
  • Uses existing mechanism for session key derivation (_tam_key) (avoids R4)

However:

  • Key generation using the TAM scheme means that replication is hindered

    Therefore, adjusting the design slightly from "very compatible [but conditionally blows up with old versions that don't understand new formats and maybe it's hard to tell which things are affected by that]" to "explicitly enable it, locking older versions out". In that case we can add some dedicated IKM to the keyblob, which would allow replication. Also has the advantage that it's completely clear for the entire repository whether it's backwards-compatible. Also allows to just use this scheme for everything incl. the manifest.


I think only Chacha20-Poly1305 should be added (for now):

  • We know that AES-GCM is problematic unless hardware accelerated. So we need CPU detection code there if we want to support GCM. But we don't want to exclude some CPUs from decryption, so we'd always have problematic decryption.
  • Chapoly has very good performance on all platforms (from ARM to x86). With AES-NI GCM and OCB are 20-30 % faster, but all of them reach about two GB/s or more. That's just fast enough™. Especially considering multi-threading. On ARM GCM is far slower than Chapoly (even with NEON which specifically accelerates AES).
  • Both fill the exact same parameter spot (256 bit key, 128 bit auth tag).

@ThomasWaldmann
Copy link
Member Author

It's not just about max. possible througput, but also cpu load.

I'ld guess hw accelerated AES-GCM/OCB produces much less load than (non-accelerated) cacha/poly.

@enkore
Copy link
Contributor

enkore commented May 25, 2017

It's the same metric.
If it's limited at the encryption stage [1] then Chapoly would have ~25 % less throughput than GCM which in turn would have 20 to >50 % less throughput than OCB (as we've already found the difference between GCM and OCB varies a lot between µarchs).
If it's not limited there, then the stage's CPU load is conversely lower (lessened by bloat and other overhead).

[1] Which is simply not possible for Borg with the anticipated 1.2 design with any of them. Therein lies the rub. All the other stuff, be it ID hashes, compressors or buzhash has only a fraction of the performance of any of these AEAD constructions. The overall CPU% spent on them will be very minor, not very different from the AES-CPU% we see now. And that percentage is tiny. (with AES-NI. Bonus points for chapoly for being faster without AES-NI).

@enkore
Copy link
Contributor

enkore commented May 25, 2017

https://camo.githubusercontent.com/26f3c276b3e8daf969d515113785994f7f8de233/687474703a2f2f692e696d6775722e636f6d2f7338794c75774f2e706e67

AES: 2.4%, HMAC: 7.8%, Total: 10.2%

If we'd use chapoly that would change to approx. Chapoly: 4.8%, HMAC: 0.0%, Total: 4.8%
GCM ~3.3 %
OCB ~2.5 %

In 1.2 we'll have an entire thread for any of these.

(btw. in 1.1 we already reduced the 5% crc32 block to <1 %)

@ozppupbg
Copy link

If I understand this right, the consensus here is, that the best option is to introduce a compatibility breaking change?

If yes, I would recommend also changing the ID MAC for BLAKE2 to the keyed hash mode. This would allow using the OpenSSL 3.0 EVP_MAC variants for HMAC and BLAKE2 in the future and clean up the interface and code.

Also, I would recommend to remove the id_key from the ikm input, just because of a separation of concern.

@ThomasWaldmann
Copy link
Member Author

Would be cool if PR #6463 could get some review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants