new id hash, coexistence of id hash algos #1562

ThomasWaldmann · 2016-09-02T13:52:59Z

Could we use sha512-256 or blake2b-256 as the ID hash (after we require OpenSSL 1.1 or otherwise make sure we have it - maybe via Python 3.6)?

Usage of ID:

write: id = H(data), put(id, data)
read: we already know the id (e.g. it is in some chunks list), data = get(id); verify(id, data)
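As a rough sketch of that write/read usage (not borg's actual code): the in-memory `store` dict and the `id_hash`, `put` and `get` names are assumptions made up for illustration, and sha256 stands in for whichever id hash is configured.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for a repository: id -> data.
store = {}


def id_hash(data: bytes) -> bytes:
    """H(data); sha256 here, but any 256-bit hash would fit the same slot."""
    return hashlib.sha256(data).digest()


def put(data: bytes) -> bytes:
    """write: id = H(data), put(id, data). Returns the id for the chunks list."""
    chunk_id = id_hash(data)
    store[chunk_id] = data
    return chunk_id


def get(chunk_id: bytes) -> bytes:
    """read: the id is already known; fetch the data and verify it against the id."""
    data = store[chunk_id]
    if not hmac.compare_digest(id_hash(data), chunk_id):
        raise ValueError("integrity check failed for chunk")
    return data
```

Swapping `hashlib.sha256(data)` for `hashlib.blake2b(data, digest_size=32)` changes nothing else in this sketch, which is the point made above: both paths only need to agree on which H was used.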

In both cases, it does not really matter whether we use sha256, sha512-256 or blake2b-256 (or a mix of them in the same repo, as long as we know which was used for some specific data). I'd guess sha256(a)-blake2b(b) collisions should be about as likely as sha256(a)-sha256(b) collisions.

Of course, one loses dedup between chunks stored using different id hashes. borg diff also loses some functionality, as it asserts identical file contents based on identical chunk id lists (and vice versa). But that is not much different from switching chunksize, which we also support within the same repo.

For choosing the right hash (MAC) algorithm to verify data integrity (authenticity), we of course need to know which algo was used for a specific storage object.

Old way: use the type byte of the chunk.
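A minimal sketch of how a type byte could select the id hash at verification time. The type byte values `0x01`/`0x02`, the `ID_HASHES` table and the `pack`/`unpack_and_verify` helpers are hypothetical and do not reflect borg's real on-disk format.

```python
import hashlib
import hmac

# Hypothetical type byte -> id hash mapping (values invented for this example).
ID_HASHES = {
    0x01: lambda d: hashlib.sha256(d).digest(),
    0x02: lambda d: hashlib.blake2b(d, digest_size=32).digest(),
}


def pack(type_byte: int, data: bytes) -> tuple[bytes, bytes]:
    """Compute the id with the selected hash and prefix the stored blob with the type byte."""
    chunk_id = ID_HASHES[type_byte](data)
    return chunk_id, bytes([type_byte]) + data


def unpack_and_verify(chunk_id: bytes, blob: bytes) -> bytes:
    """Read the type byte, pick the matching hash, and verify the data against the id."""
    type_byte, data = blob[0], blob[1:]
    try:
        hash_fn = ID_HASHES[type_byte]
    except KeyError:
        raise ValueError(f"unknown type byte {type_byte:#04x}") from None
    if not hmac.compare_digest(hash_fn(data), chunk_id):
        raise ValueError("integrity check failed for chunk")
    return data


# The same data stored under two different id hashes gets two different ids,
# so such chunks do not deduplicate against each other.
id_a, blob_a = pack(0x01, b"same data")
id_b, blob_b = pack(0x02, b"same data")
```

Both blobs verify fine on their own; the cost is only that `id_a != id_b`, i.e. the lost dedup mentioned above.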

New way: use DKID, DEKs, ciphersuites. We could add the id hash/MAC to the ciphersuite. When using DEKs, we store the ciphersuite name together with the key material - then we could also use blake2b as an id MAC instead of just an id HASH (even in modes that do not use encryption).
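To illustrate the idea of keeping the ciphersuite name next to the key material and deriving the id MAC from it, here is a hedged sketch. The `KeyRecord` class, the ciphersuite names and the `ID_MACS` table are invented for this example; only the `hashlib.blake2b(key=...)` and `hmac.new(...)` calls are real standard-library APIs.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyRecord:
    """Hypothetical record: ciphersuite name stored together with key material."""
    ciphersuite: str
    id_key: bytes


def _hmac_sha256(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def _blake2b_mac(key: bytes, data: bytes) -> bytes:
    # blake2b has a built-in keyed mode, so it can act as a MAC directly.
    return hashlib.blake2b(data, key=key, digest_size=32).digest()


# Hypothetical ciphersuite names; "plain" = no encryption, id MAC only.
ID_MACS = {
    "plain-hmac-sha256": _hmac_sha256,
    "plain-blake2b-256": _blake2b_mac,
}


def compute_id(record: KeyRecord, data: bytes) -> bytes:
    """Select the id MAC by ciphersuite name and apply it with the stored key."""
    return ID_MACS[record.ciphersuite](record.id_key, data)


def verify(record: KeyRecord, chunk_id: bytes, data: bytes) -> bool:
    return hmac.compare_digest(compute_id(record, data), chunk_id)


record = KeyRecord("plain-blake2b-256", os.urandom(32))
chunk_id = compute_id(record, b"some chunk")
```

With a keyed id, an attacker who does not know `id_key` cannot produce a matching id for modified data, which a plain hash cannot offer.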

enkore · 2017-05-24T11:05:14Z

I'd say that this adds a lot of complexity and makes deduplication even less predictable. Since the only difference between the id hashes is their performance, just create a new repository if it's important.

Close?

ThomasWaldmann · 2017-05-24T12:18:29Z

Well, one result of the thought experiment was that it does not add a lot of complexity - to verify data, one just has to know which hash/MAC algorithm was used for it.

But I agree just starting a new repo is simpler.

ThomasWaldmann changed the title ~~blake2b as id hash, coexistence of id hash algos~~ new id hash, coexistence of id hash algos Sep 2, 2016

enkore mentioned this issue Nov 3, 2016: Crypto roadmap #1044 (closed)

enkore closed this as completed May 24, 2017
