MSC2747: VoIP call transfers #2747

Open
wants to merge 12 commits into
base: old_master
206 changes: 206 additions & 0 deletions proposals/2747-voip-call-transfer.md
@@ -0,0 +1,206 @@
# MSC2747: Transferring VoIP Calls

[MSC2746](https://github.com/matrix-org/matrix-doc/pull/2746) extends the Matrix
Voice over IP functionality with more reliability, hold/resume and DTMF. The ability
to transfer a call to another destination is absent from the current Matrix VoIP spec
and is not covered by MSC2746.

Adding this will allow for scenarios such as:
* A customer service agent receiving a call using a Matrix client, then transferring
the customer to another department in the company.
* A personal assistant or switchboard operator calling another party on behalf of a
user, then connecting the user directly to their destination.

This MSC builds on [MSC2746](https://github.com/matrix-org/matrix-doc/pull/2746), making
use of the `invitee` field on `m.call.invite` in particular.

## Nomenclature

Throughout this MSC, industry standard nomenclature is used to refer to parties involved
in the call transfer:
* Transferee: The party who is being transferred.
* Transferor: The party initiating the transfer.
* Transfer target: The party that the transferee is being transferred to.

## Proposal
This proposal introduces the `m.call.replaces` event which signals the intent of a
participant in a call to replace the call with another, such that the other participant
ends up in a call with a new user. This should appear as one seamless call to the user
being transferred, with the possible exception of a permission prompt and some UI to
indicate that they are being transferred.

An `m.call.replaces` event has fields:
* `call_id`: The ID of the call that the transferor intends to replace
* `party_id`: The transferor's client's party ID for the call that it intends to replace.
* `replacement_id`: An identifier for the call replacement itself, generated by the
transferor.
* `target_room`: Optional. If specified, the transferee client waits for an invite
to this room and joins it (possibly waiting for user confirmation) and then continues
the transfer in this room. If absent, the transferee contacts the Matrix User ID
given in the `target_user` field in a room of its choosing.
Comment on lines +37 to +40
Contributor

A room may not always be joinable by ID, i.e. if the room was created on a server that doesn't exist anymore. In general, I don't think you should join a room by the server part from the ID. I suggest adding an additional via field here, so that the room will be joined via one of the servers in that list. You should probably put the server of the target user there. Alternatively, the server to join via could be extracted from the target user's MXID.

If this is not present and the users don't share a room, should a new one be created? Then you probably need to wait for them to join the room to send the call events?

Member Author

Yep - hence them having to wait for an invite to the room. There shouldn't be any need to wait for them to join the room before calling: in fact in future we'll probably need to include metadata in the invite to say the inviter is trying to call, so the invitee has some way of knowing someone's trying to call them, which right now they don't.

Contributor

Oh, right, I missed that part, my bad, sorry about the noise :D

* `target_user`: An object giving information about the transfer target:
* `id`: The matrix user ID of the transfer target
* `display_name`: (Optional) The display name of the transfer target.
* `avatar_url`: (Optional) The avatar URL of the transfer target.
* `create_call`: If specified, gives the call ID for the transferee's client to use
when placing the replacement call. Mutually exclusive with `await_call`.
* `await_call`: If specified, gives the call ID that the transferee's client should wait
for. Mutually exclusive with `create_call`.
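
As an illustration of the fields listed above, the following is a minimal sketch of how a client might assemble the content of an `m.call.replaces` event. The function name and its parameters are hypothetical and not part of this proposal; the only rule it enforces is that `create_call` and `await_call` are never both present.

```python
from typing import Any, Dict, Optional


def build_replaces_content(
    call_id: str,
    party_id: str,
    replacement_id: str,
    target_user_id: str,
    target_room: Optional[str] = None,
    display_name: Optional[str] = None,
    avatar_url: Optional[str] = None,
    create_call: Optional[str] = None,
    await_call: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the content of an m.call.replaces event (illustrative helper)."""
    if create_call is not None and await_call is not None:
        raise ValueError("create_call and await_call are mutually exclusive")

    target_user: Dict[str, Any] = {"id": target_user_id}
    # display_name / avatar_url are omitted when the target has none set.
    if display_name is not None:
        target_user["display_name"] = display_name
    if avatar_url is not None:
        target_user["avatar_url"] = avatar_url

    content: Dict[str, Any] = {
        "call_id": call_id,
        "party_id": party_id,
        "replacement_id": replacement_id,
        "target_user": target_user,
    }
    # Optional fields are only included when a value was supplied.
    if target_room is not None:
        content["target_room"] = target_room
    if create_call is not None:
        content["create_call"] = create_call
    if await_call is not None:
        content["await_call"] = await_call
    return content
```

Leaving optional keys out entirely, rather than sending them as `null`, matches the wording above that absent fields are simply not specified.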

The display name and avatar URL of the transfer target in the `target_user` field
are purely informational and given by the transferor, so should be treated as such for
trust purposes. They should be omitted if the target has no display name or avatar URL set,
respectively. It is recommended that the transferor uses the transfer target's global
display name and avatar URL, or potentially those from the target room if available,
rather than details from a direct message with the transfer target: the display name and
avatar URL in the direct message room should be treated as private.

From the transferor's point of view, a call transfer starts when they are in active calls
with both the transferee and the transfer target. One or both calls could be on hold and
the call with the transfer target may not yet have been answered (a 'blind transfer').

It also introduces an event to reject the transfer, `m.call.reject_replacement`, which has
fields:
* `call_id`: The ID of the call that was intended to be replaced
* `party_id`: The party ID of the client rejecting the replacement
* `replacement_id`: The replacement ID of the replacement that is being rejected
* `reason`: The reason a replacement is being rejected. One of:
* `declined`: Either the user has declined the transfer, or the client has done so on
their behalf (eg. due to a policy set in their client).
* `failed_room_invite`: The transferee's client timed out whilst waiting for the room
invite to arrive
* `failed_call_invite`: The transferee's client timed out whilst waiting for the invite
for the replacement call to arrive.
* `failed_call`: The replacement call itself could not be made. The `call_failure_reason`
field may be used to give the reason the replacement call failed.
* `call_failure_reason`: (Optional) May be present if `reason` is `failed_call`, in which
case it gives the `reason` field from the replacement call's hangup event.
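
The rejection event could be assembled along the following lines. This is only a sketch: the helper name is hypothetical, and treating `call_failure_reason` as an error when `reason` is anything other than `failed_call` is one reading of the field description above, not a requirement stated by this proposal.

```python
from typing import Any, Dict, Optional

# The reason values defined for m.call.reject_replacement.
REJECT_REASONS = {
    "declined",
    "failed_room_invite",
    "failed_call_invite",
    "failed_call",
}


def build_reject_replacement_content(
    call_id: str,
    party_id: str,
    replacement_id: str,
    reason: str,
    call_failure_reason: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the content of an m.call.reject_replacement event (illustrative)."""
    if reason not in REJECT_REASONS:
        raise ValueError(f"unknown reason: {reason}")
    # Assumption: call_failure_reason only makes sense alongside failed_call.
    if call_failure_reason is not None and reason != "failed_call":
        raise ValueError("call_failure_reason requires reason 'failed_call'")

    content: Dict[str, Any] = {
        "call_id": call_id,
        "party_id": party_id,
        "replacement_id": replacement_id,
        "reason": reason,
    }
    if call_failure_reason is not None:
        content["call_failure_reason"] = call_failure_reason
    return content
```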

To initiate a call transfer, the transferor's client:
* Attempts to find a suitable room. This should be a room that contains at least all three
users (and generally no others unless there is a specific reason to use a certain room).
* If a suitable room cannot be found, it should create one, but it should not yet invite
the users, otherwise the transferee will receive the room invite before they receive the
call replace event.
* Once it has created a new room or found an existing one, it then sends two `m.call.replaces`
events. One to the room for its call with the transfer target and one to the room for its
call with the transferee, each giving user information for the other and with the
`call_id` field set to the call ID of the respective call. The `target_room` field
is the newly created or chosen room in both cases. The transferor generates a new call ID and
puts this call ID in the `create_call` field in one replace event and in the `await_call`
field of the other. These can be either way around although it is suggested that the
transferee is instructed to create the new call.
* Once each event has been sent to each user, it can invite the corresponding user to the
target room (or may choose to wait for both replace events to send and invite both users
with a single API call).
* Additionally, once each replace event has been sent, it may choose to end the respective
call, although it would generally wait for the other parties to end them unless it is
explicitly intending to perform a blind transfer.
* The client may monitor the target room to observe the progress of the replacement call
being established.
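
The pairing of the two replace events can be sketched as follows. The `CallLeg` structure and the function name are hypothetical. The sketch also assumes that both events share a single `replacement_id`, which this proposal does not state either way, and it follows the suggestion that the transferee is the one instructed to create the new call.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class CallLeg:
    """One of the transferor's two existing calls (hypothetical structure)."""
    call_id: str         # ID of the existing call
    our_party_id: str    # the transferor client's party ID on that call
    remote_user_id: str  # Matrix user ID of the other party on that call


def build_transfer_events(
    transferee_leg: CallLeg, target_leg: CallLeg, target_room: str
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (content for the transferee, content for the transfer target)."""
    new_call_id = uuid.uuid4().hex     # call ID for the replacement call
    replacement_id = uuid.uuid4().hex  # assumption: shared by both events

    def content(leg: CallLeg, other: CallLeg, role_field: str) -> Dict[str, Any]:
        return {
            "call_id": leg.call_id,
            "party_id": leg.our_party_id,
            "replacement_id": replacement_id,
            "target_room": target_room,
            # Each party is told about the other one.
            "target_user": {"id": other.remote_user_id},
            role_field: new_call_id,
        }

    to_transferee = content(transferee_leg, target_leg, "create_call")
    to_target = content(target_leg, transferee_leg, "await_call")
    return to_transferee, to_target
```

Each returned dictionary would be sent as the content of an `m.call.replaces` event in the room of the corresponding existing call, before the users are invited to the target room.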

Upon receiving an `m.call.replaces` event, a client behaves as follows:
* Checks that it is currently active in a call with call ID given in the `call_id`
field, that the other party in the call matches the sender of the replaces event and
that signalling for the call is being exchanged in the same room as the replaces event.
If any of these are not the case, the client ignores the event.
* Makes a decision on whether to act on the call transfer. How the client makes this decision
is not defined in this MSC. A client may, for example, wish to trust any user on specific
homeservers or in specific rooms or communities to transfer the user, or it may wish to
prompt the user, bearing in mind the display name and avatar of the transfer target supplied
by the transferor could be falsified.
* Once it has decided to act on the call transfer, it should continue to show the original call as
active (or represented in a 'transferring state') in the UI, even if the original call is hung up.
It continues to do so until the original call has either been replaced by the new call or the
replacement has failed.
* If the replace event has a `target_room` specified and the user is not already in the specified
room, it waits for an invite to that room to arrive, then accepts the invite. Once in the room,
Contributor

What happens if the invite never arrives (e.g. federation/networking/implementation bug)? Should we specify a lifetime in m.call.replaces as well or should we just advise clients to stop waiting after a "reasonable" amount of time?

Member Author

Hmm - a lifetime field feels more like the time that specific event is valid for (eg. if an invite exceeds its lifetime after you've sent an answer but before the call's connected, you'd continue with the call). Feels OK to let clients decide how long to wait, although we might need to add a way for the transferee to say, "I tried, but it didn't work".

if the `m.call.replaces` event had `create_call`, it sends an `m.call.invite` in the target room,
setting the `call_id` to the value of the `create_call` field and the `invitee` field to the
`id` field of `target_user`. If the replaces event contained `await_call`, the client waits
for a call with ID equal to that in the `await_call` field. It is up to the transferee's client
to decide how long to wait for each invite before timing out. If it times out, it sends an
`m.call.reject_replacement` event in the original room to signal that the replacement has failed.
* If this call is successfully answered by the invitee, the client sends a hangup event in the
room for the original call, ending the call.
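
The first check and the create/await branch in the steps above might look like the sketch below. The event and call dictionaries are simplified, hypothetical shapes (a real client would use its own call objects), and the trust decision and room-invite handling are deliberately left out.

```python
from typing import Any, Dict, Optional


def matches_active_call(event: Dict[str, Any], active_call: Dict[str, str]) -> bool:
    """Return True only if the replaces event refers to our active call.

    The call ID, the sender and the signalling room must all match;
    otherwise the event is to be ignored.
    """
    content = event.get("content", {})
    return (
        content.get("call_id") == active_call["call_id"]
        and event.get("sender") == active_call["remote_user_id"]
        and event.get("room_id") == active_call["room_id"]
    )


def replacement_step(content: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Describe what the transferee does once it is ready to continue."""
    if "create_call" in content:
        # We place the replacement call: these values go into our m.call.invite.
        return {
            "action": "send_invite",
            "call_id": content["create_call"],
            "invitee": content["target_user"]["id"],
        }
    if "await_call" in content:
        # The other side places the call: wait for an invite with this call ID.
        return {"action": "await_invite", "call_id": content["await_call"]}
    return None
```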

The `m.call.reject_replacement` is sent if the client does not accept the call transfer (eg.
it decides that the transferor is not sufficiently trustworthy, or it prompted the user and the
user chose to reject the transfer). The event has `replacement_id` equal to the `replacement_id`
of the `m.call.replaces` event that initiated the transfer.

On receiving this, the transferor's client aborts the transfer process and informs its user
that the call transfer was rejected, and by which party. There is no explicit event to accept
the transfer.
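
One way a transferor's client could correlate a rejection with the transfer it started is to keep pending transfers keyed by `replacement_id`. The class below is a hypothetical sketch of that bookkeeping, not something this proposal prescribes.

```python
from typing import Any, Dict, Optional


class PendingTransfers:
    """Tracks transfers this client has initiated, keyed by replacement_id."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}

    def start(self, replacement_id: str, label: str) -> None:
        """Remember a transfer we have just initiated."""
        self._pending[replacement_id] = label

    def handle_reject(self, event: Dict[str, Any]) -> Optional[str]:
        """Abort the matching transfer and return a message for the user.

        Returns None if the event does not match a transfer we started.
        """
        content = event.get("content", {})
        label = self._pending.pop(content.get("replacement_id"), None)
        if label is None:
            return None
        sender = event.get("sender")
        reason = content.get("reason")
        return f"{label} was rejected by {sender} ({reason})"
```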

### Capability Advertisement
This proposal also introduces a field on `m.call.invite` and `m.call.answer` events at the top
level with the key `capabilities`, whose value is an object. We define the key
`m.call.transferee` which, if set to true, states that the sender of the event supports the
`m.call.replaces` event and therefore supports being transferred to another destination.
For example:

```
{
    "type": "m.call.invite",
    "room_id": "!rO0m_1d:example.org",
    "content": {
        "call_id": "123456",
        "lifetime": 60000,
        "capabilities": {
            "m.call.transferee": true
        },
        "offer": {
            "type": "offer",
            "sdp": [...]
        },
        "version": 1
    }
}
```

If this key is absent or set to anything other than the boolean `true`, or if
the `capabilities` object is missing altogether, it should be assumed that the
sender of the invite or answer does not support call transfers and clients should
reflect this in the UI accordingly.
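
The rule above is strict about the value being the boolean `true`. A minimal sketch of such a check follows; the function name is hypothetical, and `content` is the content of a received `m.call.invite` or `m.call.answer` event.

```python
from typing import Any, Dict


def has_capability(content: Dict[str, Any], name: str) -> bool:
    """Return True only if `capabilities[name]` is exactly the boolean true."""
    capabilities = content.get("capabilities")
    if not isinstance(capabilities, dict):
        # Missing or malformed capabilities object: assume no support.
        return False
    # `is True` rejects truthy non-boolean values such as 1 or "true".
    return capabilities.get(name) is True
```

For example, `has_capability(invite_content, "m.call.transferee")` could gate whether the transfer button is shown for the call.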

We also define a capability called `m.call.dtmf`. Clients should only display UI for sending
DTMF during a call if the other party advertises this capability (boolean value `true`).

## Potential issues
A call transfer involves many round-trips and a lot of state on clients, which makes it
fairly complex for clients to implement in comparison to the rest of the VoIP spec, which
is reasonably lightweight. If there were a PBX or soft switch on the path, it could
handle the logic of doing the actual transfer, meaning that the transferor would just need to
send an `m.call.replaces` event to initiate the transfer, and clients would not have to
implement the rest of the protocol for being transferred if their leg of the call remained with
the PBX / soft switch.

## Alternatives
No provision is made for a transferor to prompt a transferee to place a call to a
transfer target without there being an existing active call between the transferor
and the transferee. SIP does have this capability using the REFER method. This would
require a mechanism for the transferor to identify the transferee's individual devices,
akin to a GRUU in SIP, and be able to direct a specific one of them to place the call.

Equivalently, this could be achieved in a different way, for example, all the transferee's
devices could ring, and when they 'answer' on one of them, it places the call to the transfer
target. Similar behaviour can be achieved with the mechanisms described by this MSC, apart
from the fact that the initial incoming call to the transferee would be, and would appear as,
a normal incoming call from the transferor rather than being presented as a call to the
transfer target.

Consideration was given to using a more generic event to refer conversations in general
Member

@ara4n Oct 21, 2020

"Breakout rooms" are becoming a more and more requested feature thanks to COVID and how useful they are in Zoom and BBB. I continue to wonder if it would be better to have a generic "please take this room somewhere else" mechanism rather than having it VoIP specific. What are the specific VoIP semantics that justify making this VoIP specific?

Member Author

Mostly that we specify call IDs for the call to create or wait for, plus party IDs.

between rooms as well as calls given the overlap in functionality. With threading support,
this could also transparently move threads between rooms. However, there are a number of
semantics specific to transferring calls, and `m.call.replaces`
better captures the behaviour of replacing the current call with a new one, so this MSC
opts to use a specific event for transferring calls.

## Security considerations
The `target_user` field of the `m.call.replaces` event could be fabricated by the transferor,
as mentioned above. The transferee's client should make clear to the user that this information
comes from the transferor and is unverified.

It would be up to clients to decide when to honour an incoming transfer request. If they accepted
any instruction to transfer the call, it would be possible to cause a user to place a VoIP call
to any Matrix user just by establishing a call to them and sending an `m.call.replaces` event.