Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3884: Sliding Sync Extension: E2EE #3884

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
101 changes: 101 additions & 0 deletions proposals/3884-sliding-sync-e2ee.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
# MSC3884: Sliding Sync Extension: E2EE

This MSC is an extension to [MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575)
which includes support for end-to-end encrypted room fields in the `/sync` response.

## Proposal

MSC3575 does not include support for end-to-end encrypted rooms. This extension will add support for
end-to-end encrypted fields, specifically one-time keys, changed devices and fallback key types.

The prosposal is to introduce a new extension called `e2ee` with the following request parameters:
```json5
{
"enabled": true, // sticky
}
```

_Note: This extension ignores the core `lists` and `rooms` parameters to extensions, because none of the
data returned by this extension is scoped to a particular room._

If `enabled` is `true`, then the sliding sync response MAY include the following response fields in
the `e2ee` extension response:
```json
{
"device_one_time_keys_count": {
"signed_curve25519": 50
},
"device_lists": {
"changed": ["@alice:example.com"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the semantics for this? When do I get a "changed" event? Only for device lists that changed during this connection? Or does this work on the delta token? Maybe this should have its own since and next batch token? Key data is actually a LOT of data, so invalidating that on every reconnect would be pretty scary. And not being told your device list is possibly missing changed events, because the delta token expired is equally scary.

Copy link
Member Author

@kegsay kegsay Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semantics are the same as sync v2 regarding getting a "changed" or "left" event.

Currently, the proxy will aggregate changed/left responses from sync v2 then deliver this to the client, only purging the arrays server-side once the client has acknowledged the response. This does mean that if you're offline for a while, these arrays can be large. This means we will always tell you of a change in a user's device lists, as this aggregation process happens outside the connection scope (it's constantly accumulated in the proxy from the moment you first start syncing until the end of time).

We don't currently have delta token support in this extension, but it might be worthwhile to add it if we can get the semantics right.

Also worth noting that in practice the size of these arrays isn't a large part of the overall sync response, even if you have been offline for a while.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is mostly, that it would be easy to get this wrong on the client side, if you don't treat all devices as changed after a reconnection, where the server doesn't send you change lists from before you reconnected. Currently clients mostly rely on getting all change events as long as they don't do a new initial sync, but with sliding sync the "initial sync" could possibly happen quite often and by accident and in that case you would need to treat ALL keys as stale. You can probably implement that correctly in a client, but some more explicit notification, that your delta token wasn't used and your whole lists should be treated as stale might be harder to get wrong.

"left": ["@bob:example.com"]
},
"device_unused_fallback_key_types": [
"signed_curve25519"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field has to always be included, otherwise there is no way for the client to tell, if the fallback key got used.

HOWEVER:

You can actually probably solve that with the sliding sync API, by having the client explicitly tell the server about what fallback keys it wants to know about, when they are used. Then the server would have to include them everytime when they are used, but could not include them in the common case of them being unused. The same could be done for the one time keys count, where you specify a minimum number you want to have and if you have those, the server doesn't update you about it.

This is however a bit racy, as a client might be told about a used key, while it is still uploading a key. But I think that is already an issue we have today.

Basically the request parameters would then look something like this:

{
  "enabled": true,
  "minimum_device_one_time_keys_count": {
    "signed_curve25519": 50 // tell the server to notify the client, if the OTKs fall below this limit
  },
  "fallback_key_types": [ "signed_curve25519" ] // tell the server to notify the client, if no such key is present
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably requires the server to tell the client, if a key type is unsupported though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related spec issue: #3298

Unsupported keys are hard to distinguish from "no change".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea overall, especially with the "min" values. I'd be interested to hear if this makes things any easier on clients, or just saves a bit of bandwidth.

]
}
```

All keys are optional and clients MUST check if they exist prior to use. If there are zero changes to
every field then the server MAY omit sending back an `e2ee` extension entirely.

The semantics of these fields is exactly the same as the current `/sync` implementation, as implemented
in v1.3 of the Client-Server Specification. `device_lists` may be omitted if there are no users who
have changed/left.

For sliding sync, an initial response will include all fields. When there are updates, only the
_changed_ fields are returned. That is to say, if `device_one_time_keys_count` has not changed between
requests, it will be omitted which means to use the previous value. This deviates from the current
`/sync` implementation which always includes this field. Likewise for `device_unused_fallback_key_types`.

Particular care must be taken when a fallback key is used, as this will cause the response to be:
```json
"device_unused_fallback_key_types": []
```
which is not the same as the field being omitted/null. The empty array means the key was used. Omitted
fields means no changes.

## Potential issues

It's unclear if `device_unused_fallback_key_types` and `device_one_time_keys_count` should always be
included or not, as this extension deviates from the logic as v1.3 of the Specification which is not
clear on this. If it is always included, this adds extra bytes and therefore consumes needless
bandwidth. In practice, Synapse _always_ includes these fields, when this is probably not needed.
Changing this behaviour may break clients which expect these fields to always exist.

### Concurrent connections

The sliding sync protocol allows multiple concurrent connections disambiguated
by a `conn_id`: see "Concurrent connections" of [MSC3575](
https://github.com/matrix-org/matrix-spec-proposals/pull/3575
). Clients using multiple concurrent connections MUST enable the E2EE
extension on at most one connection. Using the E2EE extension on multiple
concurrent connections is not supported; doing so risks data loss and E2EE
messaging failures.

## Alternatives

The alternative is to not include this extension at all, making it impossible to include reliable
E2EE support in Sliding Sync. As this extension is opt-in, as all Sliding Sync extensions are, it
feels undesirable to not provide this extension.

## Security considerations

No additional security considerations beyond what the current `/sync` implementation provides.

## Unstable prefix

No unstable prefix as Sliding Sync is still in review. To enable this extension, just add this to
your request JSON:
```json5
{
"extensions": {
"e2ee": {
"enabled": true,
}
}
}
```

## Dependencies

This MSC builds on MSC3575, which at the time of writing has not yet been accepted into the spec.
Loading