From f8f32f73d181ea4436cb643f1063ba16efeed024 Mon Sep 17 00:00:00 2001 From: Max Inden Date: Wed, 12 Apr 2023 09:30:21 +0200 Subject: [PATCH] feat(webrtc): add WebRTC (prev. browser-to-browser) spec (#497) Introduces the webrtc protocol - a libp2p transport protocol enabling two private nodes (e.g. two browsers) to establish a direct connection. --- README.md | 2 +- webrtc/README.md | 344 +++------------------------------------- webrtc/webrtc-direct.md | 312 ++++++++++++++++++++++++++++++++++++ webrtc/webrtc.md | 122 ++++++++++++++ 4 files changed, 458 insertions(+), 322 deletions(-) create mode 100644 webrtc/webrtc-direct.md create mode 100644 webrtc/webrtc.md diff --git a/README.md b/README.md index 8155b8a65..da1b0486c 100644 --- a/README.md +++ b/README.md @@ -99,7 +99,7 @@ see [#465](https://github.com/libp2p/specs/issues/465). - [secio][spec_secio] - SECIO, a transport security protocol for libp2p - [tls][spec_tls] - The libp2p TLS Handshake (TLS 1.3+) - [quic][spec_quic] - The libp2p QUIC Handshake -- [webrtc][spec_webrtc] - The libp2p WebRTC transport +- [webrtc][spec_webrtc] - The libp2p WebRTC transports - [WebTransport][spec_webtransport] - Using WebTransport in libp2p diff --git a/webrtc/README.md b/webrtc/README.md index 8aff87811..a44d46495 100644 --- a/webrtc/README.md +++ b/webrtc/README.md @@ -1,8 +1,8 @@ # WebRTC -| Lifecycle Stage | Maturity | Status | Latest Revision | -|-----------------|---------------------------|--------|-----------------| -| 2A | Candidate Recommendation | Active | r0, 2022-10-14 | +| Lifecycle Stage | Maturity | Status | Latest Revision | +|-----------------|--------------------------|--------|-----------------| +| 2A | Candidate Recommendation | Active | r1, 2023-04-12 | Authors: [@mxinden] @@ -11,171 +11,27 @@ Interest Group: [@marten-seemann] [@marten-seemann]: https://github.com/marten-seemann [@mxinden]: https://github.com/mxinden/ - -**Table of Contents** +WebRTC flavors in libp2p: -- [WebRTC](#webrtc) - - [Motivation](#motivation) - - [Addressing](#addressing) - - [Connection Establishment](#connection-establishment) - - [Browser to public Server](#browser-to-public-server) - - [Multiplexing](#multiplexing) - - [Ordering](#ordering) - - [Head-of-line blocking](#head-of-line-blocking) - - [`RTCDataChannel` negotiation](#rtcdatachannel-negotiation) - - [`RTCDataChannel` label](#rtcdatachannel-label) - - [Connection Security](#connection-security) - - [Previous, ongoing and related work](#previous-ongoing-and-related-work) - - [Test vectors](#test-vectors) - - [Noise prologue](#noise-prologue) - - [Both client and server use SHA-256](#both-client-and-server-use-sha-256) -- [FAQ](#faq) +1. [WebRTC](./webrtc.md) - + libp2p transport protocol enabling two private nodes (e.g. two browsers) to + establish a direct connection. -## Motivation +2. [WebRTC Direct](./webrtc-direct.md) -1. **No need for trusted TLS certificates.** Enable browsers to connect to - public server nodes without those server nodes providing a TLS certificate - within the browser's trustchain. Note that we can not do this today with our - Websocket transport as the browser requires the remote to have a trusted TLS - certificate. Nor can we establish a plain TCP or QUIC connection from within - a browser. We can establish a WebTransport connection from the browser (see - [WebTransport specification](../webtransport)). + libp2p transport protocol **without the need for trusted TLS certificates.** + Enable browsers to connect to public server nodes without those server nodes + providing a TLS certificate within the browser's trustchain. Note that we can + not do this today with our Websocket transport as the browser requires the + remote to have a trusted TLS certificate. Nor can we establish a plain TCP or + QUIC connection from within a browser. We can establish a WebTransport + connection from the browser (see [WebTransport + specification](../webtransport)). -## Addressing +## Shared concepts -WebRTC multiaddresses are composed of an IP and UDP address component, followed -by `/webrtc` and a multihash of the certificate that the node uses. - -Examples: - -- `/ip4/192.0.2.0/udp/1234/webrtc/certhash//p2p/` -- `/ip6/fe80::1ff:fe23:4567:890a/udp/1234/webrtc/certhash//p2p/` - -The TLS certificate fingerprint in `/certhash` is a -[multibase](https://github.com/multiformats/multibase) encoded -[multihash](https://github.com/multiformats/multihash). - -For compatibility implementations MUST support hash algorithm -[`sha-256`](https://github.com/multiformats/multihash) and base encoding -[`base64url`](https://github.com/multiformats/multibase). Implementations MAY -support other hash algorithms and base encodings, but they may not be able to -connect to all other nodes. - -## Connection Establishment - -### Browser to public Server - -Scenario: Browser _A_ wants to connect to server node _B_ where _B_ is publicly -reachable but _B_ does not have a TLS certificate trusted by _A_. - -1. Server node _B_ generates a TLS certificate, listens on a UDP port and - advertises the corresponding multiaddress (see [#addressing]) through some - external mechanism. - - Given that _B_ is publicly reachable, _B_ acts as a [ICE - Lite](https://www.rfc-editor.org/rfc/rfc5245) agent. It binds to a UDP port - waiting for incoming STUN and SCTP packets and multiplexes based on source IP - and source port. - -2. Browser _A_ discovers server node _B_'s multiaddr, containing _B_'s IP, UDP - port, TLS certificate fingerprint and optionally libp2p peer ID (e.g. - `/ip6/2001:db8::/udp/1234/webrtc/certhash//p2p/`), through some - external mechanism. - -3. _A_ instantiates a `RTCPeerConnection`. See - [`RTCPeerConnection()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/RTCPeerConnection). - - _A_ (i.e. the browser) SHOULD NOT reuse the same certificate across - `RTCPeerConnection`s. Reusing the certificate can be used to identify _A_ - across connections by on-path observers given that WebRTC uses TLS 1.2. - -4. _A_ constructs _B_'s SDP answer locally based on _B_'s multiaddr. - - _A_ generates a random string prefixed with "libp2p+webrtc+v1/". The prefix - allows us to use the ufrag as an upgrade mechanism to role out a new version - of the libp2p WebRTC protocol on a live network. While a hack, this might be - very useful in the future. _A_ sets the string as the username (_ufrag_ or _username fragment_) - and password on the SDP of the remote's answer. - - _A_ MUST set the `a=max-message-size:16384` SDP attribute. See reasoning - [multiplexing](#multiplexing) for rational. - - Finally _A_ sets the remote answer via - [`RTCPeerConnection.setRemoteDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setRemoteDescription). - -5. _A_ creates a local offer via - [`RTCPeerConnection.createOffer()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createOffer). - _A_ sets the same username and password on the local offer as done in (4) on - the remote answer. - - _A_ MUST set the `a=max-message-size:16384` SDP attribute. See reasoning - [multiplexing](#multiplexing) for rational. - - Finally _A_ sets the modified offer via - [`RTCPeerConnection.setLocalDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setLocalDescription). - - Note that this process, oftentimes referred to as "SDP munging" is disallowed - by the specification, but not enforced across the major browsers (Safari, - Firefox, Chrome) due to use-cases in the wild. See also - - -6. Once _A_ sets the SDP offer and answer, it will start sending STUN requests - to _B_. _B_ reads the _ufrag_ from the incoming STUN request's _username_ - field. _B_ then infers _A_'s SDP offer using the IP, port, and _ufrag_ of the - request as follows: - - 1. _B_ sets the the `ice-ufrag` and `ice-pwd` equal to the value read from - the `username` field. - - 2. _B_ sets an arbitrary sha-256 digest as the remote fingerprint as it does - not verify fingerprints at this point. - - 3. _B_ sets the connection field (`c`) to the IP and port of the incoming - request `c=IN `. - - 4. _B_ sets the `a=max-message-size:16384` SDP attribute. See reasoning - [multiplexing](#multiplexing) for rational. - - _B_ sets this offer as the remote description. _B_ generates an answer and - sets it as the local description. - - The _ufrag_ in combination with the IP and port of _A_ can be used by _B_ - to identify the connection, i.e. demultiplex incoming UDP datagrams per - incoming connection. - - Note that this step requires _B_ to allocate memory for each incoming STUN - message from _A_. This could be leveraged for a DOS attack where _A_ is - sending many STUN messages with different ufrags using different UDP source - ports, forcing _B_ to allocate a new peer connection for each. _B_ SHOULD - have a rate limiting mechanism in place as a defense measure. See also - . - -7. _A_ and _B_ execute the DTLS handshake as part of the standard WebRTC - connection establishment. - - At this point _B_ does not know the TLS certificate fingerprint of _A_. Thus - _B_ can not verify _A_'s TLS certificate fingerprint during the DTLS - handshake. Instead _B_ needs to _disable certificate fingerprint - verification_ (see e.g. [Pion's `disableCertificateFingerprintVerification` - option](https://github.com/pion/webrtc/blob/360b0f1745c7244850ed638f423cda716a81cedf/settingengine.go#L62)). - - On success of the DTLS handshake the connection provides confidentiality and - integrity but not authenticity. The latter is guaranteed through the - succeeding Noise handshake. See [Connection Security - section](#connection-security). - -8. Messages on each `RTCDataChannel` are framed using the message - framing mechanism described in [Multiplexing](#multiplexing). - -9. The remote is authenticated via an additional Noise handshake. See - [Connection Security section](#connection-security). - -WebRTC can run both on UDP and TCP. libp2p WebRTC implementations MUST support -UDP and MAY support TCP. - -## Multiplexing +### Multiplexing The WebRTC browser APIs do not support half-closing of streams nor resets of the sending part of streams. @@ -234,14 +90,14 @@ limits"](https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Using_data_ Implementations MAY choose to send smaller messages, e.g. to reduce delays sending _flagged_ messages. -### Ordering +#### Ordering Implementations MAY expose an unordered byte stream abstraction to the user by overriding the default value of `ordered` `true` to `false` when creating a new data channel via [`RTCPeerConnection.createDataChannel`](https://www.w3.org/TR/webrtc/#dom-peerconnection-createdatachannel). -### Head-of-line blocking +#### Head-of-line blocking WebRTC datachannels and the underlying SCTP is message-oriented and not stream-oriented (e.g. see @@ -272,7 +128,7 @@ IPv4 and 1224 bytes on IPv6. Long term we hope to be able to give better recommendations based on real-world experiments. -### `RTCDataChannel` negotiation +#### `RTCDataChannel` negotiation `RTCDataChannel`s are negotiated in-band by the WebRTC user agent (e.g. Firefox, Pion, ...). In other words libp2p WebRTC implementations MUST NOT change the @@ -294,7 +150,7 @@ containing user data without waiting for the reception of the corresponding DATA_CHANNEL_ACK message", thus using `negotiated: false` does not imply an additional round trip for each new `RTCDataChannel`. -### `RTCDataChannel` label +#### `RTCDataChannel` label `RTCPeerConnection.createDataChannel()` requires passing a `label` for the to-be-created `RTCDataChannel`. When calling `createDataChannel` implementations @@ -303,66 +159,6 @@ MUST pass an empty string. When receiving an `RTCDataChannel` via an empty string. This allows future versions of this specification to make use of the `RTCDataChannel` `label` property. -## Connection Security - -Note that the below uses the message framing described in -[multiplexing](#multiplexing). - -While WebRTC offers confidentiality and integrity via TLS, one still needs to -authenticate the remote peer by its libp2p identity. - -After [Connection Establishment](#connection-establishment): - -1. _A_ and _B_ open a WebRTC data channel with `id: 0` and `negotiated: true` - ([`pc.createDataChannel("", {negotiated: true, id: - 0});`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createDataChannel)). - -2. _B_ starts a Noise `XX` handshake on the new channel. See - [noise-libp2p](https://github.com/libp2p/specs/tree/master/noise). - - _A_ and _B_ use the [Noise - Prologue](https://noiseprotocol.org/noise.html#prologue) mechanism. More - specifically _A_ and _B_ set the Noise _Prologue_ to - `` before starting the actual Noise - handshake. `` is the UTF-8 byte representation of the string - `libp2p-webrtc-noise:`. `` is the concatenation - of the two TLS fingerprints of _A_ (Noise handshake responder) and then _B_ - (Noise handshake initiator), in their multihash byte representation. - - On Chrome _A_ can access its TLS certificate fingerprint directly via - `RTCCertificate#getFingerprints`. Firefox does not allow _A_ to do so. Browser - compatibility can be found - [here](https://developer.mozilla.org/en-US/docs/Web/API/RTCCertificate). In - practice, this is not an issue since the fingerprint is embedded in the local - SDP string. - -3. On success of the authentication handshake, the used datachannel is - closed and the plain WebRTC connection is used with its multiplexing - capabilities via datachannels. See [Multiplexing](#multiplexing). - -Note: WebRTC supports different hash functions to hash the TLS certificate (see -). The hash function used -in WebRTC and the hash function used in the multiaddr `/certhash` component MUST -be the same. On mismatch the final Noise handshake MUST fail. - -_A_ knows _B_'s fingerprint hash algorithm through _B_'s multiaddr. _A_ MUST use -the same hash algorithm to calculate the fingerprint of its (i.e. _A_'s) TLS -certificate. _B_ assumes _A_ to use the same hash algorithm it discovers through -_B_'s multiaddr. For now implementations MUST support sha-256. Future iterations -of this specification may add support for other hash algorithms. - -Implementations SHOULD setup all the necessary callbacks (e.g. -[`ondatachannel`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/datachannel_event)) -before starting the Noise handshake. This is to avoid scenarios like one where -_A_ initiates a stream before _B_ got a chance to set the `ondatachannel` -callback. This would result in _B_ ignoring all the messages coming from _A_ -targeting that stream. - -Implementations MAY open streams before completion of the Noise handshake. -Applications MUST take special care what application data they send, since at -this point the peer is not yet authenticated. Similarly, the receiving side MAY -accept streams before completion of the handshake. - ## Previous, ongoing and related work - Completed implementations of this specification: @@ -375,68 +171,7 @@ accept streams before completion of the handshake. WASM): - WebRTC using STUN and TURN: -## Test vectors - -### Noise prologue - -All of these test vectors represent hex-encoded bytes. - -#### Both client and server use SHA-256 - -Here client is _A_ and server is _B_. - -``` -client_fingerprint = "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870" -server_fingerprint = "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99" - -prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870122030fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99" -``` - -# FAQ - -- _Why exchange the TLS certificate fingerprint in the multiaddr? Why not - base it on the libp2p public key?_ - - Browsers do not allow loading a custom certificate. One can only generate a - certificate via - [rtcpeerconnection-generatecertificate](https://www.w3.org/TR/webrtc/#dom-rtcpeerconnection-generatecertificate). - -- _Why not embed the peer ID in the TLS certificate, thus rendering the - additional "peer certificate" exchange obsolete?_ - - Browsers do not allow editing the properties of the TLS certificate. - -- _How about distributing the multiaddr in a signed peer record, thus rendering - the additional "peer certificate" exchange obsolete?_ - - Signed peer records are not yet rolled out across the many libp2p protocols. - Making the libp2p WebRTC protocol dependent on the former is not deemed worth - it at this point in time. Later versions of the libp2p WebRTC protocol might - adopt this optimization. - - Note, one can role out a new version of the libp2p WebRTC protocol through a - new multiaddr protocol, e.g. `/webrtc-2`. - -- _Why exchange fingerprints in an additional authentication handshake on top of - an established WebRTC connection? Why not only exchange signatures of ones TLS - fingerprints signed with ones libp2p private key on the plain WebRTC - connection?_ - - Once _A_ and _B_ established a WebRTC connection, _A_ sends - `signature_libp2p_a(fingerprint_a)` to _B_ and vice versa. While this has the - benefit of only requring two messages, thus one round trip, it is prone to a - key compromise and replay attack. Say that _E_ is able to attain - `signature_libp2p_a(fingerprint_a)` and somehow compromise _A_'s TLS private - key, _E_ can now impersonate _A_ without knowing _A_'s libp2p private key. - - If one requires the signatures to contain both fingerprints, e.g. - `signature_libp2p_a(fingerprint_a, fingerprint_b)`, the above attack still - works, just that _E_ can only impersonate _A_ when talking to _B_. - - Adding a cryptographic identifier of the unique connection (i.e. session) to - the signature (`signature_libp2p_a(fingerprint_a, fingerprint_b, - connection_identifier)`) would protect against this attack. To the best of our - knowledge the browser does not give us access to such identifier. +## FAQ - _Why use Protobuf for WebRTC message framing. Why not use our own, potentially smaller encoding schema?_ @@ -451,44 +186,11 @@ prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83 going forward. Using Protobuf is consistent with the many other libp2p protocols. These benefits outweigh the drawback of additional overhead. -- _Can a browser know upfront its UDP port which it is listening for incoming - connections on? Does the browser reuse the UDP port across many WebRTC - connections? If that is the case one could connect to any public node, with - the remote telling the local node what port it is perceived on. Thus one could - use libp2p's identify and AutoNAT protocol instead of relying on STUN._ - - No, a browser uses a new UDP port for each `RTCPeerConnection`. - -- _Why not load a remote node's certificate into one's browser trust-store and - then connect e.g. via WebSocket._ - - This would require a mechanism to discover remote node's certificates upfront. - More importantly, this does not scale with the number of connections a typical - peer-to-peer application establishes. - - _Why not use a central TURN servers? Why rely on libp2p's Circuit Relay v2 instead?_ As a peer-to-peer networking library, libp2p should rely as little as possible on central infrastructure. -- _Can an attacker launch an amplification attack with the STUN endpoint of - the server?_ - - We follow the reasoning of the QUIC protocol, namely requiring: - - > an endpoint MUST limit the amount of data it sends to the unvalidated - > address to three times the amount of data received from that address. - - - - This is the case for STUN response messages which are only slight larger than - the request messages. See also - . - -- _Why does B start the Noise handshake and not A?_ - - Given that WebRTC uses DTLS 1.2, _B_ is the one that can send data first. - [QUIC RFC]: https://www.rfc-editor.org/rfc/rfc9000.html [uvarint-spec]: https://github.com/multiformats/unsigned-varint diff --git a/webrtc/webrtc-direct.md b/webrtc/webrtc-direct.md new file mode 100644 index 000000000..3e7ddd11c --- /dev/null +++ b/webrtc/webrtc-direct.md @@ -0,0 +1,312 @@ +# WebRTC Direct + +| Lifecycle Stage | Maturity | Status | Latest Revision | +|-----------------|---------------------------|--------|-----------------| +| 2A | Candidate Recommendation | Active | r1, 2023-04-12 | + +Authors: [@mxinden] + +Interest Group: [@marten-seemann] + +[@marten-seemann]: https://github.com/marten-seemann +[@mxinden]: https://github.com/mxinden/ + +## Motivation + +**No need for trusted TLS certificates.** Enable browsers to connect to public +server nodes without those server nodes providing a TLS certificate within the +browser's trustchain. Note that we can not do this today with our Websocket +transport as the browser requires the remote to have a trusted TLS certificate. +Nor can we establish a plain TCP or QUIC connection from within a browser. We +can establish a WebTransport connection from the browser (see [WebTransport +specification](../webtransport)). + +## Addressing + +WebRTC Direct multiaddresses are composed of an IP and UDP address component, followed +by `/webrtc-direct` and a multihash of the certificate that the node uses. + +Examples: +- `/ip4/1.2.3.4/udp/1234/webrtc-direct/certhash//p2p/` +- `/ip6/fe80::1ff:fe23:4567:890a/udp/1234/webrtc-direct/certhash//p2p/` + +The TLS certificate fingerprint in `/certhash` is a +[multibase](https://github.com/multiformats/multibase) encoded +[multihash](https://github.com/multiformats/multihash). + +For compatibility implementations MUST support hash algorithm +[`sha-256`](https://github.com/multiformats/multihash) and base encoding +[`base64url`](https://github.com/multiformats/multibase). Implementations MAY +support other hash algorithms and base encodings, but they may not be able to +connect to all other nodes. + +## Connection Establishment + +### Browser to public Server + +Scenario: Browser _A_ wants to connect to server node _B_ where _B_ is publicly +reachable but _B_ does not have a TLS certificate trusted by _A_. + +1. Server node _B_ generates a TLS certificate, listens on a UDP port and + advertises the corresponding multiaddress (see [#addressing]) through some + external mechanism. + + Given that _B_ is publicly reachable, _B_ acts as a [ICE + Lite](https://www.rfc-editor.org/rfc/rfc5245) agent. It binds to a UDP port + waiting for incoming STUN and SCTP packets and multiplexes based on source IP + and source port. + +2. Browser _A_ discovers server node _B_'s multiaddr, containing _B_'s IP, UDP + port, TLS certificate fingerprint and optionally libp2p peer ID (e.g. + `/ip6/2001:db8::/udp/1234/webrtc-direct/certhash//p2p/`), through some + external mechanism. + +3. _A_ instantiates a `RTCPeerConnection`. See + [`RTCPeerConnection()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/RTCPeerConnection). + + _A_ (i.e. the browser) SHOULD NOT reuse the same certificate across + `RTCPeerConnection`s. Reusing the certificate can be used to identify _A_ + across connections by on-path observers given that WebRTC uses TLS 1.2. + +4. _A_ constructs _B_'s SDP answer locally based on _B_'s multiaddr. + + _A_ generates a random string prefixed with "libp2p+webrtc+v1/". The prefix + allows us to use the ufrag as an upgrade mechanism to role out a new version + of the libp2p WebRTC protocol on a live network. While a hack, this might be + very useful in the future. _A_ sets the string as the username (_ufrag_ or _username fragment_) + and password on the SDP of the remote's answer. + + _A_ MUST set the `a=max-message-size:16384` SDP attribute. See reasoning + [multiplexing] for rational. + + Finally _A_ sets the remote answer via + [`RTCPeerConnection.setRemoteDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setRemoteDescription). + +5. _A_ creates a local offer via + [`RTCPeerConnection.createOffer()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createOffer). + _A_ sets the same username and password on the local offer as done in (4) on + the remote answer. + + _A_ MUST set the `a=max-message-size:16384` SDP attribute. See reasoning + [multiplexing] for rational. + + Finally _A_ sets the modified offer via + [`RTCPeerConnection.setLocalDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setLocalDescription). + + Note that this process, oftentimes referred to as "SDP munging" is disallowed + by the specification, but not enforced across the major browsers (Safari, + Firefox, Chrome) due to use-cases in the wild. See also + https://bugs.chromium.org/p/chromium/issues/detail?id=823036 + +6. Once _A_ sets the SDP offer and answer, it will start sending STUN requests + to _B_. _B_ reads the _ufrag_ from the incoming STUN request's _username_ + field. _B_ then infers _A_'s SDP offer using the IP, port, and _ufrag_ of the + request as follows: + + 1. _B_ sets the the `ice-ufrag` and `ice-pwd` equal to the value read from + the `username` field. + + 2. _B_ sets an arbitrary sha-256 digest as the remote fingerprint as it does + not verify fingerprints at this point. + + 3. _B_ sets the connection field (`c`) to the IP and port of the incoming + request `c=IN `. + + 4. _B_ sets the `a=max-message-size:16384` SDP attribute. See reasoning + [multiplexing] for rational. + + _B_ sets this offer as the remote description. _B_ generates an answer and + sets it as the local description. + + The _ufrag_ in combination with the IP and port of _A_ can be used by _B_ + to identify the connection, i.e. demultiplex incoming UDP datagrams per + incoming connection. + + Note that this step requires _B_ to allocate memory for each incoming STUN + message from _A_. This could be leveraged for a DOS attack where _A_ is + sending many STUN messages with different ufrags using different UDP source + ports, forcing _B_ to allocate a new peer connection for each. _B_ SHOULD + have a rate limiting mechanism in place as a defense measure. See also + https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2. + +7. _A_ and _B_ execute the DTLS handshake as part of the standard WebRTC + connection establishment. + + At this point _B_ does not know the TLS certificate fingerprint of _A_. Thus + _B_ can not verify _A_'s TLS certificate fingerprint during the DTLS + handshake. Instead _B_ needs to _disable certificate fingerprint + verification_ (see e.g. [Pion's `disableCertificateFingerprintVerification` + option](https://github.com/pion/webrtc/blob/360b0f1745c7244850ed638f423cda716a81cedf/settingengine.go#L62)). + + On success of the DTLS handshake the connection provides confidentiality and + integrity but not authenticity. The latter is guaranteed through the + succeeding Noise handshake. See [Connection Security + section](#connection-security). + +8. Messages on each `RTCDataChannel` are framed using the message + framing mechanism described in [Multiplexing]. + +9. The remote is authenticated via an additional Noise handshake. See + [Connection Security section](#connection-security). + +WebRTC can run both on UDP and TCP. libp2p WebRTC implementations MUST support +UDP and MAY support TCP. + + +## Connection Security + +Note that the below uses the message framing described in +[multiplexing]. + +While WebRTC offers confidentiality and integrity via TLS, one still needs to +authenticate the remote peer by its libp2p identity. + +After [Connection Establishment](#connection-establishment): + +1. _A_ and _B_ open a WebRTC data channel with `id: 0` and `negotiated: true` + ([`pc.createDataChannel("", {negotiated: true, id: + 0});`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createDataChannel)). + +2. _B_ starts a Noise `XX` handshake on the new channel. See + [noise-libp2p](https://github.com/libp2p/specs/tree/master/noise). + + _A_ and _B_ use the [Noise + Prologue](https://noiseprotocol.org/noise.html#prologue) mechanism. More + specifically _A_ and _B_ set the Noise _Prologue_ to + `` before starting the actual Noise + handshake. `` is the UTF-8 byte representation of the string + `libp2p-webrtc-noise:`. `` is the concatenation + of the two TLS fingerprints of _A_ (Noise handshake responder) and then _B_ + (Noise handshake initiator), in their multihash byte representation. + + On Chrome _A_ can access its TLS certificate fingerprint directly via + `RTCCertificate#getFingerprints`. Firefox does not allow _A_ to do so. Browser + compatibility can be found + [here](https://developer.mozilla.org/en-US/docs/Web/API/RTCCertificate). In + practice, this is not an issue since the fingerprint is embedded in the local + SDP string. + +3. On success of the authentication handshake, the used datachannel is + closed and the plain WebRTC connection is used with its multiplexing + capabilities via datachannels. See [Multiplexing]. + +Note: WebRTC supports different hash functions to hash the TLS certificate (see +https://datatracker.ietf.org/doc/html/rfc8122#section-5). The hash function used +in WebRTC and the hash function used in the multiaddr `/certhash` component MUST +be the same. On mismatch the final Noise handshake MUST fail. + +_A_ knows _B_'s fingerprint hash algorithm through _B_'s multiaddr. _A_ MUST use +the same hash algorithm to calculate the fingerprint of its (i.e. _A_'s) TLS +certificate. _B_ assumes _A_ to use the same hash algorithm it discovers through +_B_'s multiaddr. For now implementations MUST support sha-256. Future iterations +of this specification may add support for other hash algorithms. + +Implementations SHOULD setup all the necessary callbacks (e.g. +[`ondatachannel`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/datachannel_event)) +before starting the Noise handshake. This is to avoid scenarios like one where +_A_ initiates a stream before _B_ got a chance to set the `ondatachannel` +callback. This would result in _B_ ignoring all the messages coming from _A_ +targeting that stream. + +Implementations MAY open streams before completion of the Noise handshake. +Applications MUST take special care what application data they send, since at +this point the peer is not yet authenticated. Similarly, the receiving side MAY +accept streams before completion of the handshake. + +## Test vectors + +### Noise prologue + +All of these test vectors represent hex-encoded bytes. + +#### Both client and server use SHA-256 + +Here client is _A_ and server is _B_. + +``` +client_fingerprint = "3e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870" +server_fingerprint = "30fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99" + +prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83a52ce73b0c1f37a72c6043ad2969e2351bdca870122030fc9f469c207419dfdd0aab5f27a86c973c94e40548db9375cca2e915973b99" +``` + +# FAQ + +- _Why exchange the TLS certificate fingerprint in the multiaddr? Why not + base it on the libp2p public key?_ + + Browsers do not allow loading a custom certificate. One can only generate a + certificate via + [rtcpeerconnection-generatecertificate](https://www.w3.org/TR/webrtc/#dom-rtcpeerconnection-generatecertificate). + +- _Why not embed the peer ID in the TLS certificate, thus rendering the + additional "peer certificate" exchange obsolete?_ + + Browsers do not allow editing the properties of the TLS certificate. + +- _How about distributing the multiaddr in a signed peer record, thus rendering + the additional "peer certificate" exchange obsolete?_ + + Signed peer records are not yet rolled out across the many libp2p protocols. + Making the libp2p WebRTC protocol dependent on the former is not deemed worth + it at this point in time. Later versions of the libp2p WebRTC protocol might + adopt this optimization. + + Note, one can role out a new version of the libp2p WebRTC protocol through a + new multiaddr protocol, e.g. `/webrtc-direct-2`. + +- _Why exchange fingerprints in an additional authentication handshake on top of + an established WebRTC connection? Why not only exchange signatures of ones TLS + fingerprints signed with ones libp2p private key on the plain WebRTC + connection?_ + + Once _A_ and _B_ established a WebRTC connection, _A_ sends + `signature_libp2p_a(fingerprint_a)` to _B_ and vice versa. While this has the + benefit of only requring two messages, thus one round trip, it is prone to a + key compromise and replay attack. Say that _E_ is able to attain + `signature_libp2p_a(fingerprint_a)` and somehow compromise _A_'s TLS private + key, _E_ can now impersonate _A_ without knowing _A_'s libp2p private key. + + If one requires the signatures to contain both fingerprints, e.g. + `signature_libp2p_a(fingerprint_a, fingerprint_b)`, the above attack still + works, just that _E_ can only impersonate _A_ when talking to _B_. + + Adding a cryptographic identifier of the unique connection (i.e. session) to + the signature (`signature_libp2p_a(fingerprint_a, fingerprint_b, + connection_identifier)`) would protect against this attack. To the best of our + knowledge the browser does not give us access to such identifier. + +- _Can a browser know upfront its UDP port which it is listening for incoming + connections on? Does the browser reuse the UDP port across many WebRTC + connections? If that is the case one could connect to any public node, with + the remote telling the local node what port it is perceived on. Thus one could + use libp2p's identify and AutoNAT protocol instead of relying on STUN._ + + No, a browser uses a new UDP port for each `RTCPeerConnection`. + +- _Why not load a remote node's certificate into one's browser trust-store and + then connect e.g. via WebSocket._ + + This would require a mechanism to discover remote node's certificates upfront. + More importantly, this does not scale with the number of connections a typical + peer-to-peer application establishes. + +- _Can an attacker launch an amplification attack with the STUN endpoint of + the server?_ + + We follow the reasoning of the QUIC protocol, namely requiring: + + > an endpoint MUST limit the amount of data it sends to the unvalidated + > address to three times the amount of data received from that address. + + https://datatracker.ietf.org/doc/html/rfc9000#section-8 + + This is the case for STUN response messages which are only slight larger than + the request messages. See also + https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2. + +- _Why does B start the Noise handshake and not A?_ + + Given that WebRTC uses DTLS 1.2, _B_ is the one that can send data first. + +[multiplexing]: ./README.md#multiplexing diff --git a/webrtc/webrtc.md b/webrtc/webrtc.md new file mode 100644 index 000000000..e84e316c2 --- /dev/null +++ b/webrtc/webrtc.md @@ -0,0 +1,122 @@ +# WebRTC + +| Lifecycle Stage | Maturity | Status | Latest Revision | +|-----------------|--------------------------|--------|-----------------| +| 2A | Candidate Recommendation | Active | r0, 2023-04-12 | + +Authors: [@mxinden] + +## Motivation + +libp2p transport protocol enabling two private nodes (e.g. two browsers) to establish a direct connection. + +Browser _A_ wants to connect to Browser node _B_ with the help of server node _R_. +Both _A_ and _B_ cannot listen for incoming connections due to running in a constrained environment (i.e. a browser) with its only transport capability being the W3C WebRTC `RTCPeerConnection` API and being behind a NAT and/or firewall. +Note that _A_ and/or _B_ may as well be non-browser nodes behind NATs and/or firewalls. +However, for two non-browser nodes using TCP or QUIC hole punching with [DCUtR] will be the more efficient way to establish a direct connection. + +On a historical note, this specification replaces the existing [libp2p WebRTC star](https://github.com/libp2p/js-libp2p-webrtc-star) and [libp2p WebRTC direct](https://github.com/libp2p/js-libp2p-webrtc-direct) protocols. + +## Connection Establishment + +1. _B_ advertises support for the WebRTC browser-to-browser protocol by appending `/webrtc` to its relayed multiaddr, meaning it takes the form of `/webrtc/p2p/`. + +2. Upon discovery of _B_'s multiaddress, _A_ learns that _B_ supports the WebRTC transport and knows how to establish a relayed connection to _B_ to run the `/webrtc-signaling` protocol on top. + +3. _A_ establishes a relayed connection to _B_. + Note that further steps depend on the relayed connection to be authenticated, i.e. that data sent on the relayed connection can be trusted. + +4. _A_ (outbound side of relayed connection) creates an `RTCPeerConnection` provided by a W3C compliant WebRTC implementation (e.g. a browser). + See [STUN](#stun) section on what STUN servers to configure at creation time. + _A_ creates an SDP offer via `RTCPeerConnection.createOffer()`. + _A_ initiates the signaling protocol to _B_ via the relayed connection from (1), see [Signaling Protocol](#signaling-protocol) and sends the offer to _B_. + Note that _A_ being the initiator of the stream is merely a convention preventing both nodes to simultaneously initiate a new connection thus potentially resulting in two WebRTC connections. + _A_ MUST as well be able to handle an incoming signaling protocol stream to support the case where _B_ initiates the signaling process. + +5. On reception of the incoming stream, _B_ (inbound side of relayed connection) creates an `RTCPeerConnection`. + Again see [STUN](#stun) section on what STUN servers to configure at creation time. + _B_ receives _A_'s offer sent in (2) via the signaling protocol stream and provides the offer to its `RTCPeerConnection` via `RTCPeerConnection.setRemoteDescription`. + _B_ then creates an answer via `RTCPeerConnection.createAnswer` and sends it to _A_ via the existing signaling protocol stream (see [Signaling Protocol](#signaling-protocol)). + +6. _A_ receives _B_'s answer via the signaling protocol stream and sets it locally via `RTCPeerConnection.setRemoteDescription`. + +7. _A_ and _B_ send their local ICE candidates via the existing signaling protocol stream to enable trickle ICE. + Both nodes continuously read from the stream, adding incoming remote candidates via `RTCPeerConnection.addIceCandidate()`. + +8. On successful establishment of the direct connection, _B_ and _A_ close the signaling protocol stream. + On failure _B_ and _A_ reset the signaling protocol stream. + + Behavior for transferring data on a relayed connection, in the case where the direct connection failed, is out of scope for this specification and dependent on the application. + +9. Messages on `RTCDataChannel`s on the established `RTCPeerConnection` are framed using the message framing mechanism described in [multiplexing]. + +## STUN + +A node needs to discover its public IP and port, which is forwarded to the remote node in order to connect to the local node. +On non-browser libp2p nodes doing a hole punch with TCP or QUIC, the libp2p node discovers its public address via the [identify] protocol. +One cannot use the [identify] protocol on browser nodes to discover ones public IP and port given that the browser uses a new port for each connection. +For example say that the local browser node establishes a WebRTC connection C1 via browser-to-server to a server node and runs the [identify] protocol. +The returned observed public port P1 will most likely (depending on the NAT) be a different port than the port observed on another connection C2. +The only browser supported mechanism to discover ones public IP and port for a given WebRTC connection is the non-libp2p protocol STUN. +This is why this specification depends on STUN, and thus the availability of one or more STUN servers for _A_ and _B_ to discovery their public addresses. + +Implementations MAY use one of the publicly available STUN servers, or deploy a dedicated server for a given libp2p network. +Further specification of the usage of STUN is out of scope for this specifitcation. + +It is not necessary for _A_ and _B_ to use the same STUN server when establishing a WebRTC connection. + +## Signaling Protocol + +The protocol id is `/webrtc-signaling`. +Messages are sent prefixed with the message length in bytes, encoded as an unsigned variable length integer as defined by the [multiformats unsigned-varint spec][uvarint-spec]. + +``` protobuf +syntax = "proto3"; + +message Message { + // Specifies type in `data` field. + enum Type { + // String of `RTCSessionDescription.sdp` + SDP_OFFER = 0; + // String of `RTCSessionDescription.sdp` + SDP_ANSWER = 1; + // String of `RTCIceCandidate.toJSON()` + ICE_CANDIDATE = 2; + } + + optional Type type = 1; + optional string data = 2; +} +``` + +## FAQ + +- Why is there no additional Noise handshake needed? + + This specification (browser-to-browser) requires _A_ and _B_ to exchange their SDP offer and answer over an authenticated channel. + Offer and answer contain the TLS certificate fingerprint. + The browser validates the TLS certificate fingerprint through the DTLS handshake during the WebRTC connection establishment. + + In contrast, the browser-to-server specification allows exchange of the server's multiaddr, containing the server's TLS certificate fingerprint, over unauthenticated channels. + In other words, the browser-to-server specification does not consider the TLS certificate fingerprint in the server's multiaddr to be trusted. + +- Why use a custom signaling protocol? Why not use [DCUtR]? + + DCUtR offers time synchronization through a two-step protocol (first `Connect`, then `Sync`). + This is not needed for WebRTC. + + DCUtR does not provide a mechanism to trickle local address candidates to the remote as they are discovered. + Trickling candidates just-in-time allows for faster WebRTC connection establishment. + +- Why does _A_ and not _B_ initiate the signaling protocol? + + In [DCUtR] _B_ (inbound side of the relayed connection) initiates the [DCUtR] protocol by opening the [DCUtR] protocol stream. + The reason is that in case _A_ is publicly reachable, _B_ might be able to use connection reversal to connect to _A_ directly. + This reason does not apply to the WebRTC browser-to-browser protocol. + Given that _A_ and _B_ at this point already have a relayed connection established, they might as well use it to exchange SDP, instead of using connection reversal and WebRTC browser-to-server. + Thus, for the WebRTC browser-to-browser protocol, _A_ initiates the signaling protocol by opening the signaling protocol stream. + +[DCUtR]: ./../relay/DCUtR.md +[identify]: ./../identify/README.md +[multiplexing]: ./README.md#multiplexing +[uvarint-spec]: https://github.com/multiformats/unsigned-varint