MSC1756: cross-signing devices using a master identity key #1756

Merged
merged 21 commits into from
Nov 26, 2019

Conversation

uhoreg
Member

@uhoreg uhoreg commented Dec 14, 2018

@uhoreg uhoreg added the proposal A matrix spec change proposal label Dec 14, 2018
@uhoreg uhoreg changed the title [WIP] MSC: cross-signing devices using a master identity key [WIP] MSC1756: cross-signing devices using a master identity key Dec 15, 2018
@erikjohnston
Member

erikjohnston commented Dec 16, 2018

Some observations from a security perspective:

  1. The power to attest new devices resides in a single place, the master key. In MSC 1680 all devices have this power.
  2. Having a separate key for attestations means that a user has flexibility over how to store the master key, and the trade-off they wish to make between security and convenience. For example, they can choose to have the master key on every device, or instead store the master key offline and only use it when signing new devices.
  3. If only the server has the power to distribute new attestations across the network, then an attacker needs both the master key and the ability to pass UIA (i.e. have the account's password). This reduces the consequences of the master key being compromised, as the attacker would not be able to create new devices (or rotate the master key, etc.) without also compromising the user account.
  4. Even if a user decides to distribute the master key across all devices, we should probably strongly recommend that the master key is stored in an encrypted manner. While an attacker would be able to eventually crack the password, it would give time for users to rotate the master key if they became aware their key was compromised.
  5. In this model there is a separation between a single device being compromised and the identity being compromised. For the former a compromised device's attestation can simply be revoked, while for the latter the entire master key needs to be revoked, meaning other users need to verify a new master key for the compromised user. In MSC 1680 these two cases are muddled together due to the graph nature of the attestations, making it harder for a user to figure out the correct response to a device being compromised.
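
The distinction in point 5 can be illustrated with a toy model. This is only a sketch under stated assumptions: the `UserIdentity` class, its method names, and the string "signatures" are hypothetical stand-ins, not part of the proposal or of any Matrix library.

```python
from dataclasses import dataclass, field


@dataclass
class UserIdentity:
    """Toy model of one user's cross-signing state (all names hypothetical)."""
    master_key: str
    # device_id -> attestation made by the current master key
    attested_devices: dict[str, str] = field(default_factory=dict)

    def attest(self, device_id: str) -> None:
        # Stand-in for a real signature over the device's public key.
        self.attested_devices[device_id] = f"sig({self.master_key},{device_id})"

    def revoke_device(self, device_id: str) -> None:
        # Single-device compromise: drop one attestation, identity unchanged.
        self.attested_devices.pop(device_id, None)

    def rotate_master_key(self, new_master_key: str, surviving: list[str]) -> None:
        # Identity compromise: every old attestation becomes void, and other
        # users would have to verify the new master key again.
        self.master_key = new_master_key
        self.attested_devices = {}
        for device_id in surviving:
            self.attest(device_id)
```

Revoking a device leaves `master_key` untouched, so other users' verification of it stays valid. Rotating it replaces the anchor that every attestation hangs from.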

@erikjohnston
Member

erikjohnston commented Dec 16, 2018

I wonder if the POST /_matrix/client/r0/keys/query API can be simply changed to:

{
  "failures": {},
  "device_keys": {
    "@alice:example.com": {
      "JLAFKJWSCS": {
        ...,
        "unsigned": {
          "device_display_name": "Alice's mobile phone",
          "attestations": {
            "base64+encoded+public+key": "base64+encoded+signature"
          }
        }
      }
    }
  },
  "verified_master_keys": {
    "@user:example.com": {
      "key": "base64+encoded+public+key",
      "signature": "base64+encoded+signature"
    }
  }
}

Where:

  • "attestations" is a map from master key to the signature of the device's public key signed by the master key
  • "verified_master_keys" is a map of users' master keys signed by our master key

In particular, I'm leaning towards simply signing the actual keys, rather than signed JSON of a bunch of stuff. This is because a) it's the keys we care about, and b) signed JSON is a bit of a PITA and I'd prefer it if we avoid it where possible.
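
A minimal sketch of how a client might consume this response shape, assuming the device's Ed25519 key lives under a `keys` map as in the existing `/keys/query` response (the example above elides that part). The function name and the `verify` callback are hypothetical. A real client would back `verify` with an Ed25519 library.

```python
from typing import Callable

# verify(public_key, message, signature) -> bool; supplied by the caller.
# Hypothetical signature, not taken from any real library.
Verifier = Callable[[str, str, str], bool]


def attested_device_ids(response: dict, user_id: str,
                        trusted_master_key: str, verify: Verifier) -> list[str]:
    """Return device IDs whose public key is signed by the trusted master key."""
    result = []
    devices = response.get("device_keys", {}).get(user_id, {})
    for device_id, device in devices.items():
        # Assumption: the signed value is the device's Ed25519 public key.
        device_key = device.get("keys", {}).get(f"ed25519:{device_id}")
        attestations = device.get("unsigned", {}).get("attestations", {})
        signature = attestations.get(trusted_master_key)
        if device_key and signature and verify(trusted_master_key, device_key, signature):
            result.append(device_id)
    return result
```

Because only the raw key is signed, verification here needs no canonical-JSON step, which is the simplification argued for above.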

@ara4n
Member

ara4n commented Dec 16, 2018

At a high level this is looking good to me - thanks for writing it up.

My main concerns are:

  • How does this interact with incremental key backups? It feels like we're solving a very similar problem: creating a single keypair which is used to encrypt a backup, storing it encrypted on the server behind a passphrase, and then loading it & decrypting it on clients on demand. Should it be the same key, so we don't double the complexity (which is already pretty bad for the incremental keybackup stuff) and keep a simpler security model? i.e. "This is the one true key which allows the owner to impersonate me by creating new devices... and also access all my online history. So keep it safe."
  • The proposal of using an m.master algorithm to identify the master key feels very weird & hacky. I think this would go away if the master key isn't a device but instead the same as your online backups key? The disadvantage being that we drift further from the shape of the current API.

@erikjohnston
Member

Agree with @ara4n, and thanks again for writing it up! 👍

  • How does this interact with incremental key backups? It feels like we're solving a very similar problem: creating a single keypair which is used to encrypt a backup, storing it encrypted on the server behind a passphrase, and then loading it & decrypting it on clients on demand. Should it be the same key, so we don't double the complexity (which is already pretty bad for the incremental keybackup stuff) and keep a simpler security model? i.e. "This is the one true key which allows the owner to impersonate me by creating new devices... and also access all my online history. So keep it safe."

Yup. Though we'll probably still want a separate backup that is instead simply encrypted by the master key when stored on the server, as:

  1. We want to be able to rotate the master key, and it doesn't seem feasible to re-encrypt all the backups.
  2. The backup key is stored unencrypted on devices, whereas we'd want the master key to be encrypted, etc.
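
Point 1 can be sketched as a key-wrapping step: rotating the master key re-wraps only the small backup key, and the backups themselves are never touched. The XOR keystream below is a deliberately insecure stand-in for real authenticated encryption, and every name in it is hypothetical.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream built from SHA-256. NOT secure; illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap(wrapping_key: bytes, secret: bytes) -> bytes:
    # XOR the secret with a keystream derived from the wrapping key.
    stream = _keystream(wrapping_key, len(secret))
    return bytes(a ^ b for a, b in zip(secret, stream))


unwrap = wrap  # XOR is its own inverse


def rotate_master_key(old_master: bytes, new_master: bytes,
                      wrapped_backup_key: bytes) -> bytes:
    """Re-wrap the backup key under the new master key."""
    backup_key = unwrap(old_master, wrapped_backup_key)
    return wrap(new_master, backup_key)
```

The cost of a rotation is then one unwrap and one wrap of a fixed-size key, independent of how much history has been backed up.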

@ara4n
Member

ara4n commented Dec 17, 2018

Though we'll probably still want a separate backup that is instead simply encrypted by the master key when stored on the server

I think this is how the incremental keybackup stuff works already? (as per #1703 and its predecessors)

@uhoreg
Member Author

uhoreg commented Dec 18, 2018

@ara4n, @erikjohnston: I've updated the proposal with the alternative API. I'll do some more thinking about how to integrate with the key backup stuff.

@erikjohnston
Member

Looking good! FTR, I much prefer proposal 2 where we differentiate between devices and the master key, as really the only thing devices and master keys have in common is that they share a key ID namespace.

@ara4n
Member

ara4n commented Dec 21, 2018

@richvdh is there any chance you could take a quick sanity check over this (so we can avoid another situation where we get valuable but last-minute feedback after the impl has already happened)? O:-)

Member

@richvdh richvdh left a comment


looks broadly good to me. My main concern is around how we get hold of the private master key so that we can sign other users' keys, without the user having to type in his recovery password every 30 seconds.

@caev

caev commented Jan 1, 2019

As I understand it, the user must choose, (a) "store master key on device" vs. (b) "store wrapped master key on server."

(b) is a non-starter because most users won't participate in the feature. The overall feature only benefits a user's friends, not themselves, so they will not go to any effort to learn about it, nor ever achieve full understanding of the problem you are solving even while they are annoyed by the problem.

In ~all cases of a new login, the proposed feature should be used, so it should happen as part of the regular login flow similar to Google's "did you sign in from device X?" notifications on Android. At most using the feature can require a quick "[confirmed]" on an old device to add a new one. It can't require a tier 2 emergency password to do something that will happen exactly as often as the tier 1 ordinary password gets used; if optional it won't happen, and if mandatory the tiers are meaningless.

IMHO, (b) should not be implemented, optionally or otherwise. The alternate use-cases will be too confusing on our side, and we will tend to be overly generous in evaluating ourselves if the escape hatch exists, then be surprised when things don't typically go well in the wild.

(a) doesn't handle revocations well.

1. Alice has 3 devices in the normal state (master key on all of them, in accordance with (a)).
2. Alice loses 1 device.
3. Alice notices the device is gone and uses one of the remaining two devices to revoke it.
4. A hacker recovers the lost device and extracts key material from it.

For a protocol to have "revocation" in a meaningful way, (3) must mitigate (4). Yes, there are other scenarios to worry about, for example where Alice doesn't do revocation at all, or where (4) and (3) are inverted in time. But the perfect should not be allowed to be the enemy of the good. Baseline "meaningful" revocation is valuable because the gap between 2 and 4 is likely large, ex. a recycled hard drive.

In this proposal, (3) seems to have become basically meaningless because the master key can't be revoked at all, or can't be revoked without nuking Alice's trust graph anchored to the non-lost devices. In practice that will often mean worse security than ignoring the lost device: in many cases the lost device will never be recovered by an actual hacker (it's probably something that got wiped), and nuking accounts frequently has the cost of teaching people to accept unverified keys, which becomes a more realistic attack than trawling for lost devices.

In my opinion, a good revocation step would have this basic property:

  • A revoked device can't sign new devices with the master key. A master key can't be extracted from a revoked device and then used, somehow, to authenticate a new device, whether by OOB upload, injection of messages by spoofing the server, new-device-login followed by new-device-master-signing, mischievous backup-restore, or any other nefarious means.

and these advanced properties:

  • Revocation respects a quorum. For example, in the 3 & 4 inverted case where the hacker uses Alice's lost device's credentials before Alice notices and revokes them, the hacker will only be able to revoke one of Alice's two non-lost devices. Alice can then use the surviving device to revoke the hacker's device, then add back more devices so she can retain ultimate control of the account.

    • Actions taken by a device between when it was lost and when it was revoked can be undone. For example, at revocation time, the user declares the last time she's certain she had the device, either by timestamp, or by choosing a message she remembers sending from it. Then,
    • a lost device in the hands of a hacker can't add 10 devices, establish quorum, then delete Alice's remaining two devices. This implies there has to be a vesting period before a device contributes to quorum and has revocation privileges, and a staleness period for users who destroy their own quorum by spamming their accounts with fake web devices they no longer actually have, e.g. by clearing cookies repeatedly and logging in again, or some such flail.
    • Any device signed by a lost device's master key after it was lost can be conveniently revoked along with the lost device (this is just UI sugar, suggestions of what other devices should be marked "lost" based on the last-message-I-remember-sending watermark, plus the "vesting" rules).
    • Any messages sent by a revoked device between when it was lost and when it was revoked can be retroactively marked untrusted in Alice's friends' histories.
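
A rough sketch of the vesting and staleness rules, assuming wall-clock timestamps for simplicity (the comment later suggests ratchet positions would be harder to forge). The `Device` fields, function names, and period parameters are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    device_id: str
    added_at: float      # seconds since epoch
    last_seen_at: float  # seconds since epoch


def quorum_members(devices, now, vesting_period, staleness_period):
    """Devices that count toward quorum: old enough to be vested, and not stale."""
    return [
        d.device_id for d in devices
        if now - d.added_at >= vesting_period
        and now - d.last_seen_at <= staleness_period
    ]


def may_revoke(revoker_id, devices, now, vesting_period, staleness_period):
    # Only quorum members hold revocation privileges.
    return revoker_id in quorum_members(devices, now, vesting_period, staleness_period)
```

Under these rules, ten devices added by an attacker an hour ago contribute nothing to quorum until the vesting period has elapsed, which gives Alice a window to revoke them.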

I don't know the best way to hit these revocation goals, especially the stuff around quorum.

One degenerate implementation that falls short, but is an improvement:

  • Two levels of master key like GnuPG. The true master key (L1) can be stored on a paper wallet and is required for "survivable" revocation. The regular master key (L2) is stored on all devices.
    • without the paper key: non-"survivable" revocation. Any device may revoke the entire identity. If that has not happened yet, any device may add another device. Revoking a single device without the paper key is not possible; it's all or nothing.
    • with the paper key: it's possible to create an "L2-master-key rotation blob."
      • L1 signature of a fresh L2 key
      • private half of L2 key, wrapped to the device key of every device that is NOT revoked
      • high-water mark for each device that IS revoked, saying when it was lost, "messages older than this are still ok".
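
The three parts of the rotation blob could be modelled roughly as below. This is a sketch only: the class, the field names, and the `sign_with_l1` and `wrap_to_device` callbacks are hypothetical placeholders for real signing and encryption.

```python
from dataclasses import dataclass


@dataclass
class L2RotationBlob:
    new_l2_public: bytes
    l1_signature: bytes                   # L1 signature over the fresh L2 public key
    wrapped_l2_private: dict[str, bytes]  # device_id -> L2 private key wrapped to that device
    high_water_marks: dict[str, str]      # revoked device_id -> last trusted message marker


def build_rotation_blob(new_l2_public, new_l2_private, device_keys, revoked,
                        sign_with_l1, wrap_to_device):
    """device_keys: device_id -> device public key; revoked: device_id -> marker."""
    wrapped = {
        device_id: wrap_to_device(device_key, new_l2_private)
        for device_id, device_key in device_keys.items()
        if device_id not in revoked  # revoked devices never receive the new L2 key
    }
    return L2RotationBlob(new_l2_public, sign_with_l1(new_l2_public),
                          wrapped, dict(revoked))
```

A revoked device appears only in `high_water_marks`, so key material extracted from it later cannot be used to unwrap the new L2 key.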

The workflow with the paper key is:

1. Alice has 3 devices in the normal state (master key on all of them, in accordance with (a)).
2. Alice loses 1 device.
3. Alice notices the device is gone and revokes her entire account.
4. A hacker recovers the lost device and extracts key material from it, but it's useless because Alice revoked herself from matrix entirely.
5. Alice finds her paper key and recovers her account.

One thing to watch for is whether timestamps can be forged. It's probably better to express the watermark as a position in the crypto ratchet, not a timestamp, regardless of how the UI surfaces the feature, which is why I mention using a previously-sent message as a marker. This means device cross-signing needs to be put on the same ratchet somehow so any signatures made by the hacker can be revoked without the hacker evading that by pre-dating the signature, which might not be possible. : / Timestamp attacks vs ratchets may also apply to the quorum timers. : (
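
As a sketch of the watermark idea, assume each message carries a per-device ratchet index that only ever increases, and that the watermark is the index of the last message Alice remembers sending. Both the tuple layout and the function name are hypothetical.

```python
def untrusted_message_ids(messages, revoked_device_id, watermark_index):
    """Return IDs of messages the revoked device sent after the watermark.

    messages: iterable of (message_id, sender_device_id, ratchet_index).
    The watermark message itself is treated as still trusted (an assumption).
    """
    return [
        message_id
        for message_id, device_id, ratchet_index in messages
        if device_id == revoked_device_id and ratchet_index > watermark_index
    ]
```

Comparing ratchet positions instead of timestamps means a hacker cannot escape the cut-off by back-dating a message, provided the index itself cannot be rewound.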

I think this degenerate scheme won't work well for users compared to the "quorum" rules because either users won't keep the paper key, or they will lose the paper key (which can't be itself rotated), but at least the degenerate scheme degrades gracefully enough to be strictly better than the existing proposal. I'm afraid I would make a mistake if I tried to implement the quorum rules.

@erikjohnston
Member

Thanks @caev for the thoughts, especially around how this would actually be used in the real world. It's taken me a while to digest it, but a few quick notes:

As I understand it, the user must choose, (a) "store master key on device" vs. (b) "store wrapped master key on server."

Technically there's also option c) store the master key in a safe and only take it out on special occasions like adding a new device or recovering your account, etc. Though this is more for power user folks, so it doesn't really change anything re your following points.

(b) is a non-starter because most users won't participate in the feature. The overall feature only benefits a user's friends, not themselves, so they will not go to any effort to learn about it, nor ever achieve full understanding of the problem you are solving even while they are annoyed by the problem.

This is a very interesting way of looking at it. Certainly I've been assuming that clients would suitably prompt users to Do The Right Thing, e.g. prompting users to save their master key offline and/or upload an encrypted version to the server, with suitable UX to push people into not just skipping those steps. However this raises two questions: 1) is this enough to get people to use it and 2) will client implementors get it right (assuming Riot gets it right so it can be used for reference)?

Certainly it's a bit unfortunate that there hasn't been a greater discussion around likely real world UX in this MSC, though I know it's being considered elsewhere.

In ~all cases of a new login, the proposed feature should be used, so it should happen as part of the regular login flow similar to Google's "did you sign in from device X?" notifications on Android. At most using the feature can require a quick "[confirmed]" on an old device to add a new one. It can't require a tier 2 emergency password to do something that will happen exactly as often as the tier 1 ordinary password gets used; if optional it won't happen, and if mandatory the tiers are meaningless.

Yup, I believe the idea is to do exactly this 👍. Again, it's unfortunate that the UX/UI proposals aren't linked to this MSC.

IMHO, (b) should not be implemented, optionally or otherwise. The alternate use-cases will be too confusing on our side, and we will tend to be overly generous in evaluating ourselves if the escape hatch exists, then be surprised when things don't typically go well in the wild.

Hopefully it will get used in the wild, as above.

(Revocation...)

I think the ideas in this comment and elsewhere help with a lot of those concerns, though they aren't as fully featured as your excellent suggestions. Basically, if you can pass user interactive auth (UIA) you can always revoke the master key and either a) blow away your master key and start again or b) if you have the old master key, rotate to a new one.

I'm not so sure about the quorum proposal, as it sounds easy to game by just adding enough new devices.

Member

@anoadragon453 anoadragon453 left a comment


Written in a nice, understandable format, and I don't see any red flags as a server developer, so lgtm.

@mscbot mscbot added the final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. label Nov 21, 2019
@mscbot
Collaborator

mscbot commented Nov 21, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@mscbot mscbot added finished-final-comment-period and removed proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. labels Nov 21, 2019
@mscbot
Collaborator

mscbot commented Nov 26, 2019

The final comment period, with a disposition to merge, as per the review above, is now complete.

@turt2live
Member

Feature flag for this is org.matrix.e2e_cross_signing per matrix-org/synapse#6712

@richvdh
Member

richvdh commented Jan 16, 2020

should the feature flag be in the MSC?

@turt2live
Member

probably, yes.

@Thatoo

Thatoo commented Jan 17, 2020

In which version will it be released?
Are Riot-web, Riot Android, RiotX ready for it?

@ara4n
Member

ara4n commented Feb 3, 2020

it is now on develop branches of Riot/Web, Riot/iOS & RiotX/Android (but we're still debugging it).

@turt2live turt2live added the kind:core MSC which is critical to the protocol's success label Apr 20, 2020
@turt2live turt2live added spec-pr-in-review A proposal which has been PR'd against the spec and is in review and removed finished-final-comment-period labels Jul 25, 2020
@uhoreg
Member Author

uhoreg commented Dec 15, 2020

Merged! 🎉

Labels
disposition-merge e2e feature:e2e-cross-signing kind:core MSC which is critical to the protocol's success merged A proposal whose PR has merged into the spec! proposal A matrix spec change proposal