Missing Message Agreement #55

lumaier · 2024-09-24T15:05:58Z

Problem: The current protocol draft has no message agreement. In general, message agreement is concerned with whether the sender and receiver have a shared understanding of the messages exchanged. Non-injective agreement guarantees that if $B$ receives a message $msg$ from $A$, then $A$ has sent $msg$ to $B$.

The attack works as follows: Source $S$ wants to send a message $msg$ to journalist $J_1$ of newsroom $NR_1$. If the long-term signing key $sk_1$ of $J_1$ has leaked, an adversary can use $sk_1$ to sign an ephemeral encryption key $ek$ of a different journalist $J_2$ (enrolled at a different newsroom $NR_2$). This key $ek$, which the source believes belongs to $J_1$, is used to encrypt the message $msg$. The ciphertext is then relayed to $J_2$ by an active network adversary. Hence $J_2$ of $NR_2$ receives and decrypts the message, even though $S$ intended to send it to $J_1$ of $NR_1$.
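The attack can be replayed in a purely symbolic toy model. The sketch below is an illustration only: `pk`, `sign`, `verify`, `encrypt` and `decrypt` are hypothetical stand-ins that model keys, signatures and ciphertexts as tagged tuples (no real cryptography), and none of the names come from the protocol draft.

```python
def pk(sk: str) -> str:
    """Symbolic public key derived from a secret key."""
    return f"pk({sk})"


def sign(sk: str, payload: str) -> tuple:
    """Symbolic signature: only a holder of `sk` can build this term."""
    return ("sig", pk(sk), payload)


def verify(public_key: str, signature: tuple, payload: str) -> bool:
    """A signature verifies if it was built for this public key and payload."""
    return signature == ("sig", public_key, payload)


def encrypt(public_key: str, msg: str) -> tuple:
    """Symbolic public-key encryption."""
    return ("enc", public_key, msg)


def decrypt(sk: str, ciphertext: tuple):
    """Return the plaintext if `sk` matches the encryption key, else None."""
    tag, public_key, msg = ciphertext
    return msg if tag == "enc" and public_key == pk(sk) else None


def run_attack() -> dict:
    sk_1 = "sk_1"                  # J_1's long-term signing key (leaked)
    ek_1, ek_2 = "ek_1", "ek_2"    # ephemeral secrets of J_1 and J_2

    # Adversary: sign J_2's ephemeral public key with J_1's leaked key.
    forged = sign(sk_1, pk(ek_2))

    # Source: the signature verifies under J_1's public key, so the source
    # believes pk(ek_2) belongs to J_1 and encrypts to it.
    accepted = verify(pk(sk_1), forged, pk(ek_2))
    ciphertext = encrypt(pk(ek_2), "msg")

    # Adversary: relay the ciphertext to J_2 instead of J_1.
    return {
        "source_accepts_key": accepted,
        "j1_plaintext": decrypt(ek_1, ciphertext),
        "j2_plaintext": decrypt(ek_2, ciphertext),
    }


result = run_attack()
```

In this model the source accepts the forged key, $J_1$ cannot decrypt, and $J_2$ recovers `"msg"`, which is exactly the disagreement about the intended receiver described above.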

We propose two possible approaches. Both work under the assumption that, from the point of view of the source, a particular newsroom $NR$ is the intended receiver (not a journalist; journalists only act on behalf of the newsroom).

Here is how the protocol encrypts a message $msg$ using an ephemeral key $m$:

Variant 1: Incorporate the newsroom identity in the message and use the source's long-term key $s$ as part of the encryption key. (Our changes are in red.)

The encryption key incorporates the source's long-term secret $s$ but masks it using the ephemeral key $m$, so the identity of the source is not leaked with $\hat{m}$. The journalist checks whether $g^s$ and $m$ were used to encrypt the message, which gives origin authentication. Because no adversary can tamper with the message without knowing $k$, including the intended newsroom $NR$ lets the journalist verify the source's intention. Without including $NR$, the ciphertext can still be relayed to a different journalist.
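The role of $NR$ can be illustrated in isolation. The following is a minimal sketch, assuming the key $k$ has already been agreed (its derivation is not reproduced here) and using a toy encrypt-then-MAC construction built from the Python standard library. `seal` and `open_for` are hypothetical names, and the cipher is for illustration only, not a proposal for the real scheme.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy SHA-256 counter-mode keystream (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _subkeys(k: bytes) -> tuple:
    """Derive separate encryption and MAC keys from k."""
    return hashlib.sha256(b"enc" + k).digest(), hashlib.sha256(b"mac" + k).digest()


def seal(k: bytes, newsroom: str, msg: bytes) -> bytes:
    """Encrypt (newsroom || msg) under k and authenticate the result."""
    enc_key, mac_key = _subkeys(k)
    nr = newsroom.encode()
    plaintext = len(nr).to_bytes(2, "big") + nr + msg
    nonce = os.urandom(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ct = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def open_for(k: bytes, my_newsroom: str, blob: bytes) -> bytes:
    """Decrypt and accept only if the embedded newsroom is our own."""
    enc_key, mac_key = _subkeys(k)
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ct))
    plaintext = bytes(a ^ b for a, b in zip(ct, stream))
    nr_len = int.from_bytes(plaintext[:2], "big")
    if plaintext[2:2 + nr_len].decode() != my_newsroom:
        raise ValueError("message was intended for a different newsroom")
    return plaintext[2 + nr_len:]
```

With this check, a journalist of $NR_2$ who ends up holding the right key for a relayed ciphertext still rejects it, because the authenticated plaintext names $NR_1$.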

Variant 2: Incorporate the newsroom identity in the message and let the source sign the message using its long-term secret $s$.

The journalist first decrypts the message and then checks whether the source with knowledge of $s$ has sent the message.

Security: We were able to prove that both variants guarantee non-injective and injective agreement between sources and newsrooms on messages in the symbolic model.

lsd-cat · 2024-10-05T00:39:01Z

Thank you, this is very valuable input. As discussed, I think this attack is a demonstration of the underlying problem, the missing message agreement. I prefer variant 1 because, by avoiding signatures, we both avoid introducing another cryptographic primitive and keep the DH-related message repudiation (deniability).

While thinking about variant 1, I wondered whether the more common way to achieve the same goal is to compute partial DH shares and then use them to derive a key, as X3DH does. To understand that, I have taken a stab at demoing the protocol using PQXDH (which has the side benefit of introducing PQ resistance for message secrecy). Due to the asymmetry in the protocol there are some limitations, as we will see, and I am uncertain whether we would inherit the same properties given these changes.

pqxdh_securedrop.txt

Test 1: Source to Journalist
{"Dir: <class '__main__.Source'> -> <class '__main__.Journalist'>"}
DH1: 
DH2: 1ac1a247d2eb55ba48673b03a5f85104103920cb7b63e84aa156c8d8fc4daf26
DH3: 61fb424b5c5d2f574224b376d508b4fe6ff8eda02d95b58e48f9268b2aeb4d2d
DH4: 0cc8a6f78a3ed68d34a4c76e6a7b5391ef2cc0fa052f9fbef2f3bcad527ada66
 SS: 1c57fd0c60c6f2f4616a82a4f0d0f3091a3d797c3f2a3b704068d48f61260f97
KEY: 92117d36dc8a258cc35e804dd32a078bcb065b3731f0620ffafdad8abef198ce
DH1: 
DH2: 1ac1a247d2eb55ba48673b03a5f85104103920cb7b63e84aa156c8d8fc4daf26
DH3: 61fb424b5c5d2f574224b376d508b4fe6ff8eda02d95b58e48f9268b2aeb4d2d
DH4: 0cc8a6f78a3ed68d34a4c76e6a7b5391ef2cc0fa052f9fbef2f3bcad527ada66
 SS: 1c57fd0c60c6f2f4616a82a4f0d0f3091a3d797c3f2a3b704068d48f61260f97
KEY: 92117d36dc8a258cc35e804dd32a078bcb065b3731f0620ffafdad8abef198ce
Success!


Test 2: Journalist to Source
{"Dir: <class '__main__.Journalist'> -> <class '__main__.Source'>"}
DH1: 8b233fe20a51e15dc719d1d07e82373459189a7a135eb1870ade9065d3b8fd36
DH2: 75b60a29a82ddc1b48242788297c594dfd28c87a14ac3ff760ea1c37e8005130
DH3: ed89b89afde975e5eec0dc4638f4dee136d391f7031a84d47975936a016d8551
DH4: 
 SS: 2d1072d3258944f98ee0b7b96fe4eb9757207e642f4eb64d2a26078c3dc2e562
KEY: 905492e3c82caa0cd880e33403c90fd68f73e5c605aa5b90069243583f5080b1
DH1: 8b233fe20a51e15dc719d1d07e82373459189a7a135eb1870ade9065d3b8fd36
DH2: 75b60a29a82ddc1b48242788297c594dfd28c87a14ac3ff760ea1c37e8005130
DH3: ed89b89afde975e5eec0dc4638f4dee136d391f7031a84d47975936a016d8551
DH4: 
 SS: 2d1072d3258944f98ee0b7b96fe4eb9757207e642f4eb64d2a26078c3dc2e562
KEY: 905492e3c82caa0cd880e33403c90fd68f73e5c605aa5b90069243583f5080b1
Success!


Test 3: Journalist to Journalist
{"Dir: <class '__main__.Journalist'> -> <class '__main__.Journalist'>"}
DH1: f569399114e5a20e4fa4186c5f5f7a78dfe9c8d5cbf7bc4d3e729eeb8f3c9874
DH2: 06e6b25d848381d2b4310e1af7e710c68a1b648de05695e40c55ed2dd51f0b2d
DH3: bbd41415989395f60091140e0fa3c2e38e9e6d146bf787a1094b71892c137719
DH4: 77261e96ee8ea1193a90822c709213fe569c23cbd2943a3b9f6422cd12372923
 SS: 223b39718f0cfbf91cf3bba6d0db823b4b408daa5e74225d5490d4305ab29e69
KEY: e06a4fa0ae35ada34aeea06a11f5d9d1b32a6120d3584919208c5c6b7aa72d4b
DH1: f569399114e5a20e4fa4186c5f5f7a78dfe9c8d5cbf7bc4d3e729eeb8f3c9874
DH2: 06e6b25d848381d2b4310e1af7e710c68a1b648de05695e40c55ed2dd51f0b2d
DH3: bbd41415989395f60091140e0fa3c2e38e9e6d146bf787a1094b71892c137719
DH4: 77261e96ee8ea1193a90822c709213fe569c23cbd2943a3b9f6422cd12372923
 SS: 223b39718f0cfbf91cf3bba6d0db823b4b408daa5e74225d5490d4305ab29e69
KEY: e06a4fa0ae35ada34aeea06a11f5d9d1b32a6120d3584919208c5c6b7aa72d4b
Success!


Test 4: Source to Source
{"Dir: <class '__main__.Source'> -> <class '__main__.Source'>"}
DH1: 
DH2: 2e18c8ccd684870e13379add7ecd1b409dc4bdc5ca1e6c00fc888689c9b0ca41
DH3: cde4df9507d53d591bd5b91335b49dee9d4c9090b8487874af814b677bf78f75
DH4: 
 SS: ec3ea58924d00131c94f2c990f405b585abd838423ee3215629743a5e159ff6e
KEY: dfd543a5097d752fbeb0558580b83abf55565fda31b7f390b9218ed845210e33
DH1: 
DH2: 2e18c8ccd684870e13379add7ecd1b409dc4bdc5ca1e6c00fc888689c9b0ca41
DH3: cde4df9507d53d591bd5b91335b49dee9d4c9090b8487874af814b677bf78f75
DH4: 
 SS: ec3ea58924d00131c94f2c990f405b585abd838423ee3215629743a5e159ff6e
KEY: dfd543a5097d752fbeb0558580b83abf55565fda31b7f390b9218ed845210e33
Success!

It has been implemented manually, directly from the official specification. As we can see in Test 1, when sending from a source to a journalist we cannot do DH1, because the public key of the source cannot be advertised (and thus we cannot satisfy the spec and attach AD = EncodeEC(IKA) || EncodeEC(IKB)). If we did, a journalist could never decrypt, because they cannot learn the sender's public key before the first contact message.

Similarly, when a journalist is sending to a source, as in Test 2, we cannot do DH4, because sources do not have ephemeral (one-time) keys.

When doing journalist to journalist (Test 3), PQXDH should be complete as per spec.

Both Test 2 and Test 3 should inherit full PQXDH properties. I am unsure of the consequences of removing DH1 from Test 1.

A returning source could potentially do a full PQXDH too, since the source is then known.

Furthermore, this makes decryption more complex: I am quite sure the journalist will have to do more decryption attempts (for example, trying all the ephemeral keys for every already-known source).

lsd-cat · 2024-10-05T10:21:23Z

In a more readable format, with the requirements to run it, here.

Memo: for simplicity I am currently using a single long-term PQ key, which would not provide forward secrecy in the PQ domain. Let's think about this later :). Signal uses them interchangeably, as they are used only for encryption and not for authentication.

The consideration here is that we already have three sets of keys (plus the PQ one) for every participant, plus the one-time (or ephemeral) keys for the journalist. This actually matches PQXDH 1:1 if we use all of them, including the fetching key.

The way I applied it is by considering the following mapping:

| Signal key (short name) | Signal description | SD Journalist | SD Source | SD description |
|---|---|---|---|---|
| IK | Identity key | J | S | Long-term identity key |
| EK | Ephemeral key | ME | ME | Per-message ephemeral key |
| SPK | Signed prekey | JC | SC | Fetching key |
| (OPK₁, OPK₂, …) | Set of one-time prekeys | (JE₁, JE₂, …) | | Journalist ephemeral keys |
| PQPK | PQ signed prekey | JPQPK | SPQPK | Post-quantum public key |

This is how a run of the protocol should work with this set of keys.

With this mapping, let's picture the three different types of exchanges.

Source to Journalist

This is a first contact message between a secret party and a public party.

Source shared key computation

~~DH₁ = DH(S, JC)~~ -> The Journalist will not be able to decrypt, because they do not know S.
DH₂ = DH(ME, J)
DH₃ = DH(ME, JC)
DH₄ = DH(ME, JE_i)
(CT, SS) = PQKEM-ENC(JPQPK)
SK = KDF(~~DH₁~~ || DH₂ || DH₃ || DH₄ || SS)

Journalist shared key computation (trial decryption with the set of (JE, ...))

~~DH₁ = DH(JC, S)~~
DH₂ = DH(J, ME)
DH₃ = DH(JC, ME)
DH₄ = DH(JE_i, ME)
(SS) = PQKEM-DEC(JPQPK, CT)
SK = KDF(~~DH₁~~ || DH₂ || DH₃ || DH₄ || SS)

The journalist can compute the shared secret by knowing only the ME public key, and the PQ CT.
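The two computations above can be checked against each other with a small sketch. It assumes a toy finite-field Diffie-Hellman group instead of the curve used in the demo, a DH-based stand-in for the PQ KEM (which is of course not post-quantum), and plain SHA-256 as the KDF. All function names are hypothetical and nothing here is secure; it only shows that both sides arrive at the same SK without DH₁.

```python
import hashlib
import secrets

# Toy DH group for illustration only (NOT secure, NOT the demo's curve).
P = 2**255 - 19
G = 2


def keygen():
    """Return a (secret, public) key pair in the toy group."""
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)


def dh(sk, pk):
    """Diffie-Hellman shared value, encoded as 32 bytes."""
    return pow(pk, sk, P).to_bytes(32, "big")


def kem_encaps(pk):
    """Stand-in for PQKEM-ENC: returns (CT, SS). Not post-quantum."""
    r, ct = keygen()
    return ct, hashlib.sha256(dh(r, pk)).digest()


def kem_decaps(sk, ct):
    """Stand-in for PQKEM-DEC: recovers SS from CT."""
    return hashlib.sha256(dh(sk, ct)).digest()


def kdf(*parts):
    """Assumed KDF: SHA-256 over the concatenated inputs."""
    return hashlib.sha256(b"".join(parts)).digest()


def source_key(me_sk, j_pk, jc_pk, je_pk, jpq_pk):
    """Source side: DH1 is skipped because S cannot be advertised."""
    dh2 = dh(me_sk, j_pk)
    dh3 = dh(me_sk, jc_pk)
    dh4 = dh(me_sk, je_pk)
    ct, ss = kem_encaps(jpq_pk)
    return ct, kdf(dh2, dh3, dh4, ss)


def journalist_key(j_sk, jc_sk, je_sk, jpq_sk, me_pk, ct):
    """Journalist side: needs only the ME public key and the CT."""
    dh2 = dh(j_sk, me_pk)
    dh3 = dh(jc_sk, me_pk)
    dh4 = dh(je_sk, me_pk)
    ss = kem_decaps(jpq_sk, ct)
    return kdf(dh2, dh3, dh4, ss)


# Journalist key material: J, JC, one JE_i and the PQ key pair.
j_sk, j_pk = keygen()
jc_sk, jc_pk = keygen()
je_sk, je_pk = keygen()
jpq_sk, jpq_pk = keygen()

# Source: fresh per-message ephemeral key ME.
me_sk, me_pk = keygen()

ct, sk_source = source_key(me_sk, j_pk, jc_pk, je_pk, jpq_pk)
sk_journalist = journalist_key(j_sk, jc_sk, je_sk, jpq_sk, me_pk, ct)
```

In a real run the journalist does not know which JE_i was used, so `journalist_key` would be evaluated once per candidate one-time key until decryption succeeds.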

Questions

What do we lose when not doing DH₁? I would say sender authentication, but this is to be verified.
Are CT values unlinkable? Does sending them to the server, paired with the messages and the other values, break other guarantees?
Does using the fetching key for multiple purposes (now both encryption and fetching) weaken something else?

Test 1: Source to Journalist
{"Dir: <class '__main__.Source'> -> <class '__main__.Journalist'>"}
DH1: 
DH2: 1ac1a247d2eb55ba48673b03a5f85104103920cb7b63e84aa156c8d8fc4daf26
DH3: 61fb424b5c5d2f574224b376d508b4fe6ff8eda02d95b58e48f9268b2aeb4d2d
DH4: 0cc8a6f78a3ed68d34a4c76e6a7b5391ef2cc0fa052f9fbef2f3bcad527ada66
 SS: 1c57fd0c60c6f2f4616a82a4f0d0f3091a3d797c3f2a3b704068d48f61260f97
KEY: 92117d36dc8a258cc35e804dd32a078bcb065b3731f0620ffafdad8abef198ce

Journalist to Source

DH₁ = DH(J, SC)
DH₂ = DH(ME, S)
DH₃ = DH(ME, SC)
~~DH₄ = DH(ME, ..)~~ -> Sources do not have ephemeral (one-time) keys
(CT, SS) = PQKEM-ENC(SPQPK)
SK = KDF(DH₁ || DH₂ || DH₃ || ~~DH₄~~ || SS)

Test 2: Journalist to Source
{"Dir: <class '__main__.Journalist'> -> <class '__main__.Source'>"}
DH1: 8b233fe20a51e15dc719d1d07e82373459189a7a135eb1870ade9065d3b8fd36
DH2: 75b60a29a82ddc1b48242788297c594dfd28c87a14ac3ff760ea1c37e8005130
DH3: ed89b89afde975e5eec0dc4638f4dee136d391f7031a84d47975936a016d8551
DH4: 
 SS: 2d1072d3258944f98ee0b7b96fe4eb9757207e642f4eb64d2a26078c3dc2e562
KEY: 905492e3c82caa0cd880e33403c90fd68f73e5c605aa5b90069243583f5080b1

Journalist to Journalist

What a joy:

DH₁ = DH(J_A, JC_B)
DH₂ = DH(ME, J_B)
DH₃ = DH(ME, JC_B)
DH₄ = DH(ME, JE_iB)
(CT, SS) = PQKEM-ENC(JPQPK)
SK = KDF(DH₁ || DH₂ || DH₃ || DH₄ || SS)

There is a subtlety here: the journalist does not know from whom the message is coming, so they have to try every possibility. They first have to try as if the sender were a source (all n JE), and then as if the sender were each journalist (n JE × the number of journalists).

If we assume that replying sources can also add their DH₁, because their identity is now known, then the journalist would also have to try the total number of sources × n JE.
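To make the cost concrete, the candidate set can be enumerated directly. This is a rough sketch under the assumptions stated above; `trial_candidates` and the example identifiers are hypothetical, and `None` stands for a first-contact source whose DH₁ is omitted.

```python
from itertools import product


def trial_candidates(je_keys, journalists, known_sources=()):
    """Enumerate the (sender, JE) pairs the receiving journalist has to try.

    `None` models an unknown first-contact source (no DH1).
    """
    senders = [None, *journalists, *known_sources]
    return list(product(senders, je_keys))


je_keys = [f"JE_{i}" for i in range(1, 4)]      # n = 3 one-time keys
journalists = ["J_A", "J_B"]                    # possible journalist senders
known_sources = ["S_1", "S_2", "S_3", "S_4"]    # returning sources

first_contact_only = trial_candidates(je_keys, [])
with_journalists = trial_candidates(je_keys, journalists)
with_everyone = trial_candidates(je_keys, journalists, known_sources)
```

The number of attempts is n × (1 + journalists + known sources), so it grows linearly in each population but multiplies with the number of unused one-time keys.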

Test 3: Journalist to Journalist
{"Dir: <class '__main__.Journalist'> -> <class '__main__.Journalist'>"}
DH1: f569399114e5a20e4fa4186c5f5f7a78dfe9c8d5cbf7bc4d3e729eeb8f3c9874
DH2: 06e6b25d848381d2b4310e1af7e710c68a1b648de05695e40c55ed2dd51f0b2d
DH3: bbd41415989395f60091140e0fa3c2e38e9e6d146bf787a1094b71892c137719
DH4: 77261e96ee8ea1193a90822c709213fe569c23cbd2943a3b9f6422cd12372923
 SS: 223b39718f0cfbf91cf3bba6d0db823b4b408daa5e74225d5490d4305ab29e69
KEY: e06a4fa0ae35ada34aeea06a11f5d9d1b32a6120d3584919208c5c6b7aa72d4b

If we want to use ephemeral or semi-ephemeral PQ keys, we have to pair each of them with one classical journalist ephemeral key; otherwise we'd get another quadratic increase in trial decryption complexity.

Note: I noticed that what I called a clue in the code is not consistent with the README or even the blog post nomenclature. The message fetching part is demoed only to show that everything can work together, and it is not really the point, but we should really start fixing the docs.

Also doing this would finally close #48 (partially), #31, #30.

lsd-cat added the protocol research and security labels Oct 4, 2024

lsd-cat mentioned this issue Oct 5, 2024

Refactor to be KEM-oriented instead of DH-oriented #48

Open
