From bc1aa592185fb2ee5554152df07cab45320f25ee Mon Sep 17 00:00:00 2001
From: Marten Seemann
Date: Sun, 22 Jan 2023 14:19:29 +1300
Subject: [PATCH 01/35] add HTTP spec

---
 http/README.md | 161 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 161 insertions(+)
 create mode 100644 http/README.md

diff --git a/http/README.md b/http/README.md
new file mode 100644
index 000000000..c9b2d7dca
--- /dev/null
+++ b/http/README.md
@@ -0,0 +1,161 @@
+# libp2p + HTTP: the spec
+
+| Lifecycle Stage | Maturity | Status | Latest Revision |
+|-----------------|--------------------------|--------|-----------------|
+| 1A | Working Draft | Active | r0, 2023-01-23 |
+
+Authors: [@marten-seemann]
+
+Interest Group: [@MarcoPolo]
+
+[@marten-seemann]: https://github.com/marten-seemann
+[@MarcoPolo]: https://github.com/MarcoPolo
+
+## Introduction
+
+This document defines how libp2p nodes can offer a HTTP endpoint next to (or instead of) their full libp2p node. Services can be offered both via traditional libp2p protocols and via HTTP, allowing a wide variety of nodes to access these services. Crucially, this for the first time, allows browsers to access libp2p services without spinning up a Web{Socket, Transport, RTC} connection first. It also allows interacting with libp2p services from environments where plain HTTP is the only option, e.g. curl from the command line, and certain cloud edge workers and lambdas.
+
+At the same time, nodes that are already connected via a libp2p connection, will be able to (re)use this connection to issue the same kind of requests, without dialing a dedicated HTTP connection.
+
+Any protocol that follows request-response semantics can easily be mapped onto HTTP (mapping protocols that don’t follow a request-response flow can be more challenging). Protocols are encouraged to follow best practices for building REST APIs. Once a mapping has been defined, a single implementation can be used to serve both traditional libp2p as well as libp2p-HTTP clients.
+
+## Addressing
+
+Nodes may advertise HTTP multiaddresses to signal support for libp2p over HTTP. An address might look like this: `/ip4/1.2.3.4/tcp/443/tls/sni/example.com/http/p2p/` (for HTTP/1.1 and HTTP/2), or `/ip4/1.2.3.4/udp/443/quic/sni/example.com/http/p2p/` (for HTTP/3).
+
+Nodes MUST use HTTPS (i.e. they MUST NOT use unencrypted HTTP). It is RECOMMENDED to use HTTP/2 and HTTP/3, but the protocols also work over HTTP/1.1.
+
+Note that the peer ID in this address is 1. optional and 2. advisory and not (necessarily) verified during the HTTP handshake (depending on the HTTP client). If and when desired, clients can cryptographically verify the peer ID once the HTTP connection has been established, see [Authentication] for details on peer authentication.
+
+Nodes can also link to a specific resource directly, similar to how a URL includes a path. This will require us to resolve [https://github.com/multiformats/multiaddr/issues/63](https://github.com/multiformats/multiaddr/issues/63) first. For example, the URL of a specific CID might be: `/ip4/1.2.3.4/tcp/443/tls/sni/example.com/http//{/path/to/}`.
+
+## Namespace
+
+libp2p does not squat the global namespace. By convention, all libp2p services are located at a well-known URL: `http://example.com/.well-known/libp2p//`.
+
+Putting the service name into the URL allows for future extensibility. It is easy to define new protocols, and the replace existing protocols by newer versions.
+
+Applications MAY expose services under different URIs. For example, an application might decide to generate nicer-looking (and probably more SEO-friendly) URLs, and map paths under `[https://example.com/dht/](https://example.com/dht/)` to `https://example.com/.well-known/libp2p/kad-dht-v1/`.
+
+### Service Names
+
+Traditionally, libp2p protocols have used path-like protocol identifiers, e.g. `/libp2p/autonat/1.0.0`. Due to the use of `/`s, this doesn’t work well with the naming convention defined above.
+
+Protocols that wish to use the libp2p request-response mechanism MUST define a service name that is a valid URI component (according to RFC 8820).
+
+In practice, this isn’t expected cause too much friction, since current libp2p protocols were not designed to use the request-reponse mechanism, and will need to make arrangements to support it anyway (e.g. define how requests and responses are serialized).
+
+### Privacy Properties
+
+This leads to some very desirable properties:
+
+1. It is possible to run libp2p alongside a normal HTTP web service, i.e. on the same domain and port, without having to worry about collisions.
+   1. As an on-path observer only sees SNI and ALPN, this effectively hides the fact that a client is establishing a connection in order to speak libp2p.
+2. Since authentication is flexible (see below), this enables servers to
+   1. require authentication to (some) paths below `.well-known/libp2p`, and to enforce ACLs
+   2. stealth mode: return 404 for paths below `.well-known/libp2p`, *unless* the client has already authenticated itself, thereby hiding the fact that it runs a libp2p server, even if probed explicitly
+
+## Certificates
+
+libp2p doesn’t prescribe how nodes obtain the TLS certificate to secure the HTTPS connection. Since browsers are expected to connect to the node, the certificate’s trust chain must end in the browser’s trust store.
+
+This is somewhat tricky in a p2p context, as nodes might not have a (sub)domain, which for many CAs is a requirement to obtain a certificate. Specifically, Let’s Encrypt doesn’t support IP certificates at the moment. ZeroSSL does, however, this requires setting up a (free) account.
+
+To speed of server authentication, a node MAY include the libp2p TLS extension in its certificate. Note that this is currently not possible when using Let’s Encrypt, since the libp2p TLS extension is not whitelisted by LE. Not every HTTP client will have access to the TLS certificate (for example, browsers usually don’t expose an API for that), but if an HTTP client does, it SHOULD use that information.
+
+## Authentication
+
+Traditionally, libp2p was built on the assumption that both peers authenticate each other during the libp2p handshake. libp2p+HTTP acknowledges that this isn’t always possible, or even desirable, and that different use cases call for different authentication modes. For example, a server might offer a certain set of services to any client, like a HTTP webserver does.
+
+### Server Authentication
+
+This document defines a simple request-response protocol to authenticate a server. This protocol is run on an already established connection. The service name of the protocol is `server-auth` (and the URI therefore is `.well-known/libp2p/server-auth`).
+
+The client send a POST request containing a random payload of at least 8 and up to 1024 bytes:
+
+```json
+{
+  "random":
+}
+```
+
+The server signs the concatenation of `libp2p-server-auth:` (in ASCII) and that payload using its host key. It then send the following response:
+
+```json
+{
+  "peer-id": ,
+  "signature":
+}
+```
+
+TODO: is this the best way to encode the information? We could also put `peer-id` and / or `signature` into an HTTP header.
+
+### Client Authentication
+
+When an unauthenticated client tries to access a resource that requires authentication, the server SHOULD use a [401 HTTP status code](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/401). The client MAY then authenticate itself using the protocol described below, and then retry the request.
+
+Support for client authentication is an optional feature. It is expected that only a subsection of clients will implement it. For example, browser simply retrieving a few (elements of) web pages from IPFS probably won’t have any need to even generate a libp2p identity in the first place.
+
+The protocol defined here takes 2 RTTs to authenticate the client. It is designed to be stateless on the server side. In the first round-trip, the client obtains a (pseudo-) random value from the server, which it then signs with its host key and sends back to the server, which then issues an authentication token (acting somewhat like a cookie) which can be included on future requests.
+
+The service name is `client-auth`. For the first step, the client sends a GET request to this HTTP endpoint, and the server responds with at least 8 and up to 1024 bytes of pseudorandom data:
+
+```json
+{
+  "random":
+}
+```
+
+In order to keep this exchange stateless, the server SHOULD 1. include the current timestamp or an expiry data and 2. a signature in that data. This allows it to check in step 2 that it actually generated that data.
+
+The client signs the data received in step 1, and sends a POST request with the following JSON object to the server:
+
+```json
+{
+  "data": ,
+  "peer-id": ,
+  "signature":
+}
+```
+
+The server verifies the signature and issues an authentication token. In order to allow stateless operation, at the very minimum, the authentication token SHOULD contain the peer ID. It SHOULD also contain an expiry date and it MAY be bound to the client’s IP address. The token is sent in the response body.
+
+The client uses the auth token on requests that require client authentication, by setting the `libp2p-auth-token` HTTP header.
+
+## Mapping to libp2p Streams
+
+libp2p services whose service is specified as request-response protocols can use a single protocol implementation to make the service available over HTTP as well as on top of libp2p streams.
+
+The libp2p protocol identifier is `/http1.1`. After negotiating this protocol using multistream-select, nodes treat the stream as a HTTP/1.1 stream for a single HTTP request (i.e. nodes MUST NOT use request pipelining).
+
+## Outlook: Interaction with Intermediaries
+
+One of the advantages of running HTTP is that there’s widely deployed caching infrastructure (CDNs). Content-addressed data is infinitely cacheable. Assuming a properly design data transfer protocol, retrieval for CIDs could be cached by the CDN and made available via a POP (geographically) close to the user, dramatically reducing retrieval latencies.
+
+Services SHOULD specify the caching properties (if any), and set the appropriate cache headers (according to RFC 9111).
+
+CDNs can also be used to increase censorship resistance, since the CDN effectively hides the IP address of the origin server. With the upcoming introduction of ECHO (Encrypted ClientHello) in TLS, all that an on-path observer will be able to see is that a client is establishing a connection to a certain CDN, but not to which domain name.
+
+The level of delegation between the origin node and the CDN can be adjusted. In the simplest configuration, the origin node is the only node that holds the libp2p private key, thus requests to the `server-auth` protocol would be forwarded from the CDN to the origin server. In a more advanced configuration, it would be possible to move the private key to a worker on the edge of the CDN, and perform the signing operation there (thereby reducing the request latency for `server-auth` requests).
+
+## FAQ
+
+### Why not gRPC?
+
+This would be the perfect fit, allowing both request-response schemes as well as variations with multiple requests and multiple responses. However, it’s not possible to use gRPC from the browser.
+
+### Why tie ourselves to HTTP when mapping onto libp2p? Can’t we have a more general serialization format?
+
+We could, but rolling our own serialization comes with some costs. First of all, we’d have define how HTTP request and response header, bodies, trailers are serialized onto the wire. Most likely, we’d define a Protobuf for that. Second, once we add more features to that format, they would need to be back-ported to HTTP, so that nodes that only speak HTTP can make use of them as well.
+
+It’s just simpler to commit to HTTP.
+
+### Why not use HTTP/3 for the libp2p mapping?
+
+I’d love to! This would allow us to use HTTP header compression using QPACK, and a binary format instead of a text-based one. However, HTTP/3 requires the peers to exchange HTTP/3 SETTINGS frames first, and it’s not immediately obvious when / how this would be done in libp2p. It’s also not clear how easy it would be to use HTTP/3 in JavaScript.
+
+The good news is that once we’ve come up with a solution for these two problems, it will be rather easy to add support for HTTP/3: nodes will just offer `/http3` in addition (and one day, instead of) `/http1.1`, and nodes that support can hit that endpoint. Nothing in the implementation of the protocols will need to change, since protocols only deal with (deserialized) HTTP requests and responses.
+
+### Can I run QUIC, WebTransport and an HTTP/3 server on the same IP and port?
+
+Yes, once [https://github.com/libp2p/specs/issues/507](https://github.com/libp2p/specs/issues/507) is resolved.

From 1f075f60305616f42a2e323a133c4875865b18df Mon Sep 17 00:00:00 2001
From: Marten Seemann
Date: Sun, 29 Jan 2023 22:18:24 +1300
Subject: [PATCH 02/35] 2nd attempt for server auth

---
 http/README.md | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/http/README.md b/http/README.md
index c9b2d7dca..319c4d18f 100644
--- a/http/README.md
+++ b/http/README.md
@@ -69,26 +69,22 @@ Traditionally, libp2p was built on the assumption that both peers authenticate e

 ### Server Authentication

-This document defines a simple request-response protocol to authenticate a server. This protocol is run on an already established connection. The service name of the protocol is `server-auth` (and the URI therefore is `.well-known/libp2p/server-auth`).
+Since HTTP requests are independent from each other (they are not bound to a single connection, and when using HTTP/1.1, will actually use different connections), the server needs to authenticate itself on every single request.

-The client send a POST request containing a random payload of at least 8 and up to 1024 bytes:
+As browsers don’t expose an API to access details of the TLS certificate used, nor allow any access to the (an exporter to) the TLS master secret, server authentication is a bit more contrived than one might initially expect.
+
+To request the server to authenticate, the client sets the `libp2p-server-auth` HTTP header to a randomly generated ASCII string of at least 10 (and a maximum of 100) characters. The server signs the following string using its host key:

-```json
-{
-  "random":
-}
+```
+"libp2p-server-auth:" || the value of the libp2p-server-auth header || "libp2p-server-domain:" || the domain (including subdomains)
 ```

-The server signs the concatenation of `libp2p-server-auth:` (in ASCII) and that payload using its host key. It then send the following response:
+It then sets the following two HTTP headers on the response:

-```json
-{
-  "peer-id": ,
-  "signature":
-}
-```
+1. `libp2p-server-pubkey`: its public key (from the libp2p key pair)
+2. `libp2p-server-auth-signature`: the signature derived as described above

-TODO: is this the best way to encode the information? We could also put `peer-id` and / or `signature` into an HTTP header.
+When requesting server authentication, the client MUST check that these two header fields are present, and MUST check the signature. It MUST NOT process the response if either one of these checks fails

 ### Client Authentication


From 12f86b88a1249269ff9c93bcd78ea45363febd52 Mon Sep 17 00:00:00 2001
From: Marten Seemann
Date: Sun, 29 Jan 2023 22:20:31 +1300
Subject: [PATCH 03/35] require client to authenticate the server when doing client auth

---
 http/README.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/http/README.md b/http/README.md
index 319c4d18f..b76c97374 100644
--- a/http/README.md
+++ b/http/README.md
@@ -94,15 +94,17 @@ Support for client authentication is an optional feature. It is expected that on

 The protocol defined here takes 2 RTTs to authenticate the client. It is designed to be stateless on the server side. In the first round-trip, the client obtains a (pseudo-) random value from the server, which it then signs with its host key and sends back to the server, which then issues an authentication token (acting somewhat like a cookie) which can be included on future requests.

-The service name is `client-auth`. For the first step, the client sends a GET request to this HTTP endpoint, and the server responds with at least 8 and up to 1024 bytes of pseudorandom data:
+The service name is `client-auth`. For the first step, the client sends a GET request to this HTTP endpoint. As described in [server-authentication], the client MUST authenticate the server in this step. The server responds with at least 8 and up to 1024 bytes of pseudorandom data:

 ```json
 {
-  "random":
+  "random": ,
+  "signature":
 }
 ```

 In order to keep this exchange stateless, the server SHOULD 1. include the current timestamp or an expiry data and 2. a signature in that data. This allows it to check in step 2 that it actually generated that data.
+The client MUST check that the signature obtained in the JSON response is correct and was generated using the same key that the server used to authenticate itself.

 The client signs the data received in step 1, and sends a POST request with the following JSON object to the server:


From 146c09a6ebaf4e93bcc65ccf2bf7c5775815b1f0 Mon Sep 17 00:00:00 2001
From: Marten Seemann
Date: Tue, 14 Feb 2023 17:52:34 +1300
Subject: [PATCH 04/35] better motivation for libp2p+HTTP (#515)

* better motivation for libp2p+HTTP

* incorporate review feedback
---
 http/README.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/http/README.md b/http/README.md
index b76c97374..982b25c75 100644
--- a/http/README.md
+++ b/http/README.md
@@ -19,6 +19,16 @@ At the same time, nodes that are already connected via a libp2p connection, will

 Any protocol that follows request-response semantics can easily be mapped onto HTTP (mapping protocols that don’t follow a request-response flow can be more challenging). Protocols are encouraged to follow best practices for building REST APIs. Once a mapping has been defined, a single implementation can be used to serve both traditional libp2p as well as libp2p-HTTP clients.

+Specifically, using libp2p+HTTP will allow:
+
+1. Defining services / protocols once, and run them both via HTTP and via libp2p
+1. Leverage libp2p's connectivity story (incl. hole punching) to run these services on both public nodes and on nodes behind NATs / firewall
+1. Use existing peer and content discovery mechanisms to advertise HTTP-enabled multiaddresses, which can then be accessed either via plain HTTP(S) or via HTTP on top of libp2p
+   1. Support existing HTTP protocols like the S3 protocol. This would allow peers to fetch content seamlessly from an S3-compatible provider (S3, backblaze's B2, Cloudflare's R2)
+   1. Support edge compute directly. Many edge compute environments build on top of HTTP since it’s a stateless request/response protocol. This includes services such as Cloudflare workers, AWS Lambda, Netflify Edge functions, and many more.
+1. Use peer authentication (both client and server auth) for a subset of HTTP endpoints
+
+
 ## Addressing

 Nodes may advertise HTTP multiaddresses to signal support for libp2p over HTTP. An address might look like this: `/ip4/1.2.3.4/tcp/443/tls/sni/example.com/http/p2p/` (for HTTP/1.1 and HTTP/2), or `/ip4/1.2.3.4/udp/443/quic/sni/example.com/http/p2p/` (for HTTP/3).

From 5398f5dfd88ead02b7012b8faa7399bbde167fbe Mon Sep 17 00:00:00 2001
From: Marten Seemann
Date: Mon, 13 Feb 2023 20:58:08 -0800
Subject: [PATCH 05/35] fix a few typos

Co-authored-by: Marcin Rataj
---
 http/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/http/README.md b/http/README.md
index 982b25c75..f9f2411fa 100644
--- a/http/README.md
+++ b/http/README.md
@@ -41,11 +41,11 @@ Nodes can also link to a specific resource directly, similar to how a URL includ

 ## Namespace

-libp2p does not squat the global namespace. By convention, all libp2p services are located at a well-known URL: `http://example.com/.well-known/libp2p//`.
+libp2p does not squat the global namespace. By convention, all libp2p services are located at a well-known URL: `https://example.com/.well-known/libp2p//`.

 Putting the service name into the URL allows for future extensibility. It is easy to define new protocols, and the replace existing protocols by newer versions.

-Applications MAY expose services under different URIs. For example, an application might decide to generate nicer-looking (and probably more SEO-friendly) URLs, and map paths under `[https://example.com/dht/](https://example.com/dht/)` to `https://example.com/.well-known/libp2p/kad-dht-v1/`.
+Applications MAY expose services under different URIs. For example, an application might decide to generate nicer-looking (and probably more SEO-friendly) URLs, and map paths under [`https://example.com/dht/`](https://example.com/dht/) to `https://example.com/.well-known/libp2p/kad-dht-v1/`.

 ### Service Names

@@ -121,7 +121,7 @@ The client signs the data received in step 1, and sends a POST request with the
 ```json
 {
   "data": ,
-  "peer-id": ,
+  "peer-id": ,
   "signature":
 }
 ```

From b6c1bc207254f902a6f49db74412c1f209054671 Mon Sep 17 00:00:00 2001
From: Marten Seemann
Date: Thu, 2 Mar 2023 15:06:05 -0800
Subject: [PATCH 06/35] http: use .well-known/libp2p.json for configuration

---
 http/README.md | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/http/README.md b/http/README.md
index f9f2411fa..910db213e 100644
--- a/http/README.md
+++ b/http/README.md
@@ -41,19 +41,22 @@ Nodes can also link to a specific resource directly, similar to how a URL includ

 ## Namespace

-libp2p does not squat the global namespace. By convention, all libp2p services are located at a well-known URL: `https://example.com/.well-known/libp2p//`.
+libp2p does not squat the global namespace. By convention libp2p services can be discovered by accessing the configuration file at a well-known URL (RFC 8615): `.well-known/libp2p.json` (e.g. https://example.com/.well-known/libp2p.json). This allows server operators to dynamically change the URLs of the services offered, and to not hard-code any assumptions how a certain resource is meant to be interpreted.

-Putting the service name into the URL allows for future extensibility. It is easy to define new protocols, and the replace existing protocols by newer versions.
+The document contains a mapping of protocols to their respective URL. For example, this configuration file would tell the client

-Applications MAY expose services under different URIs. For example, an application might decide to generate nicer-looking (and probably more SEO-friendly) URLs, and map paths under [`https://example.com/dht/`](https://example.com/dht/) to `https://example.com/.well-known/libp2p/kad-dht-v1/`.
-
-### Service Names
+```json
+{
+  "/kad/1.0.0": "/kademlia/",
+  "server-auth": "/libp2p/server-auth"
+}
+```

-Traditionally, libp2p protocols have used path-like protocol identifiers, e.g. `/libp2p/autonat/1.0.0`. Due to the use of `/`s, this doesn’t work well with the naming convention defined above.
+1. That the Kademlia protocol is available at https://example.com/kademlia and
+2. The libp2p server auth protocol (see below) is available at https://example.com/libp2p/server-auth.

-Protocols that wish to use the libp2p request-response mechanism MUST define a service name that is a valid URI component (according to RFC 8820).
+It is valid expose a service at /. Applications then need to take special care to avoid collisions with other protocols.

-In practice, this isn’t expected cause too much friction, since current libp2p protocols were not designed to use the request-reponse mechanism, and will need to make arrangements to support it anyway (e.g. define how requests and responses are serialized).

 ### Privacy Properties

@@ -61,9 +64,7 @@ This leads to some very desirable properties:

 1. It is possible to run libp2p alongside a normal HTTP web service, i.e. on the same domain and port, without having to worry about collisions.
    1. As an on-path observer only sees SNI and ALPN, this effectively hides the fact that a client is establishing a connection in order to speak libp2p.
-2. Since authentication is flexible (see below), this enables servers to
-   1. require authentication to (some) paths below `.well-known/libp2p`, and to enforce ACLs
-   2. stealth mode: return 404 for paths below `.well-known/libp2p`, *unless* the client has already authenticated itself, thereby hiding the fact that it runs a libp2p server, even if probed explicitly
+2. Since authentication is flexible (see below), this enables servers to enforce ACLs to `.well-known/libp2p.json`, thereby hiding the fact that a server is running libp2p

 ## Certificates

@@ -108,7 +109,7 @@ The service name is `client-auth`. For the first step, the client sends a GET re

 ```json
 {
-  "random": ,
+  "random": ,
   "signature":
 }
 ```

From 8a57943a151d891dfd47e06abddd770987ebd033 Mon Sep 17 00:00:00 2001
From: Marten Seemann
Date: Fri, 3 Mar 2023 12:08:11 +1300
Subject: [PATCH 07/35] http: nest libp2p.json config to allow for future configuration

---
 http/README.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/http/README.md b/http/README.md
index 910db213e..7a1bcfc45 100644
--- a/http/README.md
+++ b/http/README.md
@@ -47,8 +47,10 @@ The document contains a mapping of protocols to their respective URL. For exampl

 ```json
 {
-  "/kad/1.0.0": "/kademlia/",
-  "server-auth": "/libp2p/server-auth"
+  "services": {
+    "/kad/1.0.0": "/kademlia/",
+    "server-auth": "/libp2p/server-auth"
+  }
 }
 ```


From 946f51601f72b4a7ff0af3446dce973b349890fb Mon Sep 17 00:00:00 2001
From: Marco Munizaga
Date: Fri, 7 Jul 2023 15:10:56 -0700
Subject: [PATCH 08/35] Reformat the spec from the Point of View of an implementer

---
 http/README.md | 184 ++++++++++++++++---------------------------------
 1 file changed, 59 insertions(+), 125 deletions(-)

diff --git a/http/README.md b/http/README.md
index 7a1bcfc45..e99064a61 100644
--- a/http/README.md
+++ b/http/README.md
@@ -1,172 +1,106 @@
-# libp2p + HTTP: the spec
+# HTTP

-| Lifecycle Stage | Maturity | Status | Latest Revision |
-|-----------------|--------------------------|--------|-----------------|
-| 1A | Working Draft | Active | r0, 2023-01-23 |
+| Lifecycle Stage | Maturity | Status | Latest Revision |
+| --------------- | ------------- | ------ | --------------- |
+| 1A | Working Draft | Active | r0, 2023-01-23 |

-Authors: [@marten-seemann]
+Authors: [@marten-seemann, @MarcoPolo]

-Interest Group: [@MarcoPolo]
+Interest Group: [@MarcoPolo, @marten-seemann]

 [@marten-seemann]: https://github.com/marten-seemann
 [@MarcoPolo]: https://github.com/MarcoPolo

 ## Introduction

-This document defines how libp2p nodes can offer a HTTP endpoint next to (or instead of) their full libp2p node. Services can be offered both via traditional libp2p protocols and via HTTP, allowing a wide variety of nodes to access these services. Crucially, this for the first time, allows browsers to access libp2p services without spinning up a Web{Socket, Transport, RTC} connection first. It also allows interacting with libp2p services from environments where plain HTTP is the only option, e.g. curl from the command line, and certain cloud edge workers and lambdas.
+This document defines how libp2p nodes can offer and use an HTTP transport alongside their other transports to support application protocols with HTTP semantics. This allows a wider variety of nodes to participate in the libp2p network, for example:

-At the same time, nodes that are already connected via a libp2p connection, will be able to (re)use this connection to issue the same kind of requests, without dialing a dedicated HTTP connection.
+- Browsers communicating with other libp2p nodes without needing a WebSocket, WebTransport, or WebRTC connection.
+- HTTP only edge workers can run application protocols and respond to peers on the network.
+- `curl` from the command line can make requests to other libp2p nodes.

-Any protocol that follows request-response semantics can easily be mapped onto HTTP (mapping protocols that don’t follow a request-response flow can be more challenging). Protocols are encouraged to follow best practices for building REST APIs. Once a mapping has been defined, a single implementation can be used to serve both traditional libp2p as well as libp2p-HTTP clients.
+As well as allowing application protocols to make use of HTTP middle boxes such as HTTP caching and layer 7 proxying and load balancing. This is all in addition to the existing features that libp2p provides such as:

-Specifically, using libp2p+HTTP will allow:
+- Connectivity – Work on top of WebRTC, WebTransport, QUIC, TCP, or an HTTP transport.
+- Hole punching – Work with peers behind NATs.
+- Peer ID Authentication – Authenticate your peer by their libp2p peer id.
+- Peer discovery – Learn about a peer given their peer id.

-1. Defining services / protocols once, and run them both via HTTP and via libp2p
-1. Leverage libp2p's connectivity story (incl. hole punching) to run these services on both public nodes and on nodes behind NATs / firewall
-1. Use existing peer and content discovery mechanisms to advertise HTTP-enabled multiaddresses, which can then be accessed either via plain HTTP(S) or via HTTP on top of libp2p
-   1. Support existing HTTP protocols like the S3 protocol. This would allow peers to fetch content seamlessly from an S3-compatible provider (S3, backblaze's B2, Cloudflare's R2)
-   1. Support edge compute directly. Many edge compute environments build on top of HTTP since it’s a stateless request/response protocol. This includes services such as Cloudflare workers, AWS Lambda, Netflify Edge functions, and many more.
-1. Use peer authentication (both client and server auth) for a subset of HTTP endpoints
+## HTTP Transport vs HTTP Semantics

+HTTP is a bit of an overloaded term. This section aims to clarify what we’re talking about when we say “HTTP”.

-## Addressing
+*HTTP semantics* ([RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html)) is the stateless application-level protocol that you work with when writing HTTP apis (for example).

-Nodes may advertise HTTP multiaddresses to signal support for libp2p over HTTP. An address might look like this: `/ip4/1.2.3.4/tcp/443/tls/sni/example.com/http/p2p/` (for HTTP/1.1 and HTTP/2), or `/ip4/1.2.3.4/udp/443/quic/sni/example.com/http/p2p/` (for HTTP/3).
+*HTTP transport* is the thing that takes your high level request/response defined in terms of HTTP semantics and encodes it and sends it over the wire.

-Nodes MUST use HTTPS (i.e. they MUST NOT use unencrypted HTTP). It is RECOMMENDED to use HTTP/2 and HTTP/3, but the protocols also work over HTTP/1.1.
+When this document says *HTTP* it is generally referring to *HTTP semantics*.

-Note that the peer ID in this address is 1. optional and 2. advisory and not (necessarily) verified during the HTTP handshake (depending on the HTTP client). If and when desired, clients can cryptographically verify the peer ID once the HTTP connection has been established, see [Authentication] for details on peer authentication.
+## Interoperability with existing HTTP systems

-Nodes can also link to a specific resource directly, similar to how a URL includes a path. This will require us to resolve [https://github.com/multiformats/multiaddr/issues/63](https://github.com/multiformats/multiaddr/issues/63) first. For example, the URL of a specific CID might be: `/ip4/1.2.3.4/tcp/443/tls/sni/example.com/http//{/path/to/}`.
+A goal of this spec is to allow libp2p to be able to interoperate with existing HTTP servers and clients. Care is taken in this document to not introduce anything that would break interoperability with existing systems.

-## Namespace
-
-libp2p does not squat the global namespace. By convention libp2p services can be discovered by accessing the configuration file at a well-known URL (RFC 8615): `.well-known/libp2p.json` (e.g. https://example.com/.well-known/libp2p.json). This allows server operators to dynamically change the URLs of the services offered, and to not hard-code any assumptions how a certain resource is meant to be interpreted.
-
-The document contains a mapping of protocols to their respective URL. For example, this configuration file would tell the client
-
-```json
-{
-  "services": {
-    "/kad/1.0.0": "/kademlia/",
-    "server-auth": "/libp2p/server-auth"
-  }
-}
-```
-
-1. That the Kademlia protocol is available at https://example.com/kademlia and
-2. The libp2p server auth protocol (see below) is available at https://example.com/libp2p/server-auth.
-
-It is valid expose a service at /. Applications then need to take special care to avoid collisions with other protocols.
-
-
-### Privacy Properties
-
-This leads to some very desirable properties:
-
-1. It is possible to run libp2p alongside a normal HTTP web service, i.e. on the same domain and port, without having to worry about collisions.
-   1. As an on-path observer only sees SNI and ALPN, this effectively hides the fact that a client is establishing a connection in order to speak libp2p.
-2. Since authentication is flexible (see below), this enables servers to enforce ACLs to `.well-known/libp2p.json`, thereby hiding the fact that a server is running libp2p
-
-## Certificates
-
-libp2p doesn’t prescribe how nodes obtain the TLS certificate to secure the HTTPS connection. Since browsers are expected to connect to the node, the certificate’s trust chain must end in the browser’s trust store.
-
-This is somewhat tricky in a p2p context, as nodes might not have a (sub)domain, which for many CAs is a requirement to obtain a certificate. Specifically, Let’s Encrypt doesn’t support IP certificates at the moment. ZeroSSL does, however, this requires setting up a (free) account.
-
-To speed of server authentication, a node MAY include the libp2p TLS extension in its certificate. Note that this is currently not possible when using Let’s Encrypt, since the libp2p TLS extension is not whitelisted by LE. Not every HTTP client will have access to the TLS certificate (for example, browsers usually don’t expose an API for that), but if an HTTP client does, it SHOULD use that information.
- -## Authentication - -Traditionally, libp2p was built on the assumption that both peers authenticate each other during the libp2p handshake. libp2p+HTTP acknowledges that this isn’t always possible, or even desirable, and that different use cases call for different authentication modes. For example, a server might offer a certain set of services to any client, like a HTTP webserver does. - -### Server Authentication +## HTTP Transport -Since HTTP requests are independent from each other (they are not bound to a single connection, and when using HTTP/1.1, will actually use different connections), the server needs to authenticate itself on every single request. +Nodes MUST use HTTPS (i.e. they MUST NOT use plaintext HTTP). It is RECOMMENDED to use HTTP/2 and HTTP/3. -As browsers don’t expose an API to access details of the TLS certificate used, nor allow any access to the (an exporter to) the TLS master secret, server authentication is a bit more contrived than one might initially expect. +Nodes signal support for their HTTP transport using the `/http` component in their multiaddr. e.g. `/dns4/example.com/tls/http` . See the [HTTP multiaddr component spec](https://github.com/libp2p/specs/pull/550) for more details. -To request the server to authenticate, the client sets the `libp2p-server-auth` HTTP header to a randomly generated ASCII string of at least 10 (and a maximum of 100) characters. The server signs the following string using its host key: - -``` -"libp2p-server-auth:" || the value of the libp2p-server-auth header || "libp2p-server-domain:" || the domain (including subdomains) -``` - -It then sets the following two HTTP headers on the response: - -1. `libp2p-server-pubkey`: its public key (from the libp2p key pair) -2. `libp2p-server-auth-signature`: the signature derived as described above - -When requesting server authentication, the client MUST check that these two header fields are present, and MUST check the signature. 
It MUST NOT process the response if either one of these checks fails - -### Client Authentication - -When an unauthenticated client tries to access a resource that requires authentication, the server SHOULD use a [401 HTTP status code](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/401). The client MAY then authenticate itself using the protocol described below, and then retry the request. - -Support for client authentication is an optional feature. It is expected that only a subsection of clients will implement it. For example, browser simply retrieving a few (elements of) web pages from IPFS probably won’t have any need to even generate a libp2p identity in the first place. - -The protocol defined here takes 2 RTTs to authenticate the client. It is designed to be stateless on the server side. In the first round-trip, the client obtains a (pseudo-) random value from the server, which it then signs with its host key and sends back to the server, which then issues an authentication token (acting somewhat like a cookie) which can be included on future requests. +## Namespace -The service name is `client-auth`. For the first step, the client sends a GET request to this HTTP endpoint. As described in [server-authentication], the client MUST authenticate the server in this step. The server responds with at least 8 and up to 1024 bytes of pseudorandom data: +libp2p does not squat the global namespace. libp2p application protocols can be discovered by the [well-known resource](https://www.rfc-editor.org/rfc/rfc8615) `.well-known/libp2p`. This allows server operators to dynamically change the URLs of the application protocols offered, and not hard-code any assumptions how a certain resource is meant to be interpreted. ```json -{ - "random": , - "signature": -} -``` - -In order to keep this exchange stateless, the server SHOULD 1. include the current timestamp or an expiry data and 2. a signature in that data. 
This allows it to check in step 2 that it actually generated that data. -The client MUST check that the signature obtained in the JSON response is correct and was generated using the same key that the server used to authenticate itself. - -The client signs the data received in step 1, and sends a POST request with the following JSON object to the server: -```json { - "data": , - "peer-id": , - "signature": + "services": { + "/kad/1.0.0": "/kademlia/", + "/ipfs-http/1.0.0": "/", + } } ``` -The server verifies the signature and issues an authentication token. In order to allow stateless operation, at the very minimum, the authentication token SHOULD contain the peer ID. It SHOULD also contain an expiry date and it MAY be bound to the client’s IP address. The token is sent in the response body. - -The client uses the auth token on requests that require client authentication, by setting the `libp2p-auth-token` HTTP header. - -## Mapping to libp2p Streams - -libp2p services whose service is specified as request-response protocols can use a single protocol implementation to make the service available over HTTP as well as on top of libp2p streams. - -The libp2p protocol identifier is `/http1.1`. After negotiating this protocol using multistream-select, nodes treat the stream as a HTTP/1.1 stream for a single HTTP request (i.e. nodes MUST NOT use request pipelining). +The resource contains a mapping of application protocols to their respective URL. For example, this configuration file would tell a client -## Outlook: Interaction with Intermediaries +1. That the Kademlia protocol is available at `/kademlia` and +2. The [IPFS Path Gateway API](https://specs.ipfs.tech/http-gateways/path-gateway/) is mounted at `/`. -One of the advantages of running HTTP is that there’s widely deployed caching infrastructure (CDNs). Content-addressed data is infinitely cacheable. 
Assuming a properly design data transfer protocol, retrieval for CIDs could be cached by the CDN and made available via a POP (geographically) close to the user, dramatically reducing retrieval latencies. +It is valid expose a service at `/`. It is RECOMMENDED that the server resolve more specific URLs before less specific ones. e.g. a path of `/kademlia/foo` should be routed to the Kademlia protocol rather than the IPFS HTTP API. -Services SHOULD specify the caching properties (if any), and set the appropriate cache headers (according to RFC 9111). +## Peer ID Authentication -CDNs can also be used to increase censorship resistance, since the CDN effectively hides the IP address of the origin server. With the upcoming introduction of ECHO (Encrypted ClientHello) in TLS, all that an on-path observer will be able to see is that a client is establishing a connection to a certain CDN, but not to which domain name. +When using the HTTP Transport, peer id authentication is optional. You only pay for it if you need it. This is benefits use cases that don’t need peer authentication (e.g. fetching content addressed data) or authenticate some other way (not tied to libp2p peer ids). -The level of delegation between the origin node and the CDN can be adjusted. In the simplest configuration, the origin node is the only node that holds the libp2p private key, thus requests to the `server-auth` protocol would be forwarded from the CDN to the origin server. In a more advanced configuration, it would be possible to move the private key to a worker on the edge of the CDN, and perform the signing operation there (thereby reducing the request latency for `server-auth` requests). +Peer ID authentication in the HTTP Transport follows a similar to pattern to how libp2p adds Peer ID authentication in WebTransport and WebRTC. We run the standard libp2p Noise handshake, but using `IX` for client and server authentication or `NX` for just server authentication. 
-## FAQ +### Authentication flow -### Why not gRPC? +1. The client initiates a request that it knows must be authenticated OR the client responds to a `401` with the header `www-authenticate: libp2p-noise` (The server MAY also include `libp2p-token` as an authentication scheme). +2. The client sets the `Authorization` [header](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.6.2) to `libp2p-noise ` . This initiates the `IX` or `NX` handshake. + 1. The protobuf is multibase encoded, but clients MUST only use encodings that are HTTP header safe (refer to to the [token68 definition](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.2)). To set the minimum bar for interoperability, clients and servers MUST support base32 encoding (”b” in the multibase table). + 2. When the server receives this request and `IX` was used, it can authenticate the client. +3. The server responds with `Authentication-Info` field set to `libp2p-noise `. + 1. The server MUST include the SNI used for the connection in the Noise extension (TODO link). + 2. The server MAY include a token that the client can use to avoid doing another Noise handshake in the future. The client would use this token by setting the `Authorization` header to `libp2p-token `. + 3. When the client receives this response, it can authenticate the server’s peer ID. +4. The client verifies the SNI in the Noise extension matches the one used to initiate the connection. The client MUST close the connection if they differ. + 1. The client SHOULD remember this connection is authenticated. + 2. The client SHOULD use the `libp2p-token` if provided for future authorized requests. -This would be the perfect fit, allowing both request-response schemes as well as variations with multiple requests and multiple responses. However, it’s not possible to use gRPC from the browser. +This costs one round trip, but can piggy back on an appropriate request. -### Why tie ourselves to HTTP when mapping onto libp2p? 
Can’t we have a more general serialization format? +### Authentication Endpoint -We could, but rolling our own serialization comes with some costs. First of all, we’d have define how HTTP request and response header, bodies, trailers are serialized onto the wire. Most likely, we’d define a Protobuf for that. Second, once we add more features to that format, they would need to be back-ported to HTTP, so that nodes that only speak HTTP can make use of them as well. +Because the client needs to make a request to authenticate the server, and the client may not want to make the real request before authenticating the server, the server MAY provide an authentication endpoint. This authentication endpoint is like any other application protocol, and it shows up in `.well-known/libp2p`, but it only does the authentication flow. It doesn’t send any other data besides what is defined in the above Authentication flow. The protocol id for the authentication endpoint is `/http-noise-auth/1.0.0`. -It’s just simpler to commit to HTTP. +## Using HTTP semantics over stream transports -### Why not use HTTP/3 for the libp2p mapping? +Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifer. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. -I’d love to! This would allow us to use HTTP header compression using QPACK, and a binary format instead of a text-based one. However, HTTP/3 requires the peers to exchange HTTP/3 SETTINGS frames first, and it’s not immediately obvious when / how this would be done in libp2p. It’s also not clear how easy it would be to use HTTP/3 in JavaScript. +HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings of HTTP semantics are possible as well and may be specified in a future update. 
-The good news is that once we’ve come up with a solution for these two problems, it will be rather easy to add support for HTTP/3: nodes will just offer `/http3` in addition (and one day, instead of) `/http1.1`, and nodes that support can hit that endpoint. Nothing in the implementation of the protocols will need to change, since protocols only deal with (deserialized) HTTP requests and responses. +## Using other request-response semantics (not HTTP) -### Can I run QUIC, WebTransport and an HTTP/3 server on the same IP and port? +This document has focused on using HTTP semantics, but HTTP may not be the common divisor amongst all transports (current and future). It may be desirable to use some other request-response semantics for your application-level protocol, perhaps something like rust-libp2p’s [request-response](https://docs.rs/libp2p/0.52.1/libp2p/request_response/index.html) abstraction. Nothing specified in this document prohibits mapping other semantics onto HTTP semantics to keep the benefits of using an HTTP transport. -Yes, once [https://github.com/libp2p/specs/issues/507](https://github.com/libp2p/specs/issues/507) is resolved. +To support the simple request-response semantics, for example, the request MUST be encoded within a `POST` request to the proper URL (as defined in the Namespace section). The response is read from the body of the HTTP response. The client MUST authenticate the server and itself **before** making the request. From 3681472fc53a6c020239257a60df874380a679b7 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Fri, 7 Jul 2023 15:23:03 -0700 Subject: [PATCH 09/35] Add link --- http/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index e99064a61..c0155341f 100644 --- a/http/README.md +++ b/http/README.md @@ -80,7 +80,7 @@ Peer ID authentication in the HTTP Transport follows a similar to pattern to how 1. 
The protobuf is multibase encoded, but clients MUST only use encodings that are HTTP header safe (refer to to the [token68 definition](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.2)). To set the minimum bar for interoperability, clients and servers MUST support base32 encoding (”b” in the multibase table). 2. When the server receives this request and `IX` was used, it can authenticate the client. 3. The server responds with `Authentication-Info` field set to `libp2p-noise `. - 1. The server MUST include the SNI used for the connection in the Noise extension (TODO link). + 1. The server MUST include the SNI used for the connection in the [Noise extensions](https://github.com/libp2p/specs/blob/master/noise/README.md#noise-extensions). 2. The server MAY include a token that the client can use to avoid doing another Noise handshake in the future. The client would use this token by setting the `Authorization` header to `libp2p-token `. 3. When the client receives this response, it can authenticate the server’s peer ID. 4. The client verifies the SNI in the Noise extension matches the one used to initiate the connection. The client MUST close the connection if they differ. From dd5d07c7b3e85589bd987ec3be774dcde179eec5 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 10 Jul 2023 11:01:50 -0700 Subject: [PATCH 10/35] Merge comments --- http/README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/http/README.md b/http/README.md index c0155341f..b58e9e05b 100644 --- a/http/README.md +++ b/http/README.md @@ -6,7 +6,7 @@ Authors: [@marten-seemann, @MarcoPolo] -Interest Group: [@MarcoPolo, @marten-seemann] +Interest Group: [todo] [@marten-seemann]: https://github.com/marten-seemann [@MarcoPolo]: https://github.com/MarcoPolo @@ -19,7 +19,7 @@ This document defines how libp2p nodes can offer and use an HTTP transport along - HTTP only edge workers can run application protocols and respond to peers on the network. 
- `curl` from the command line can make requests to other libp2p nodes. -As well as allowing application protocols to make use of HTTP middle boxes such as HTTP caching and layer 7 proxying and load balancing. This is all in addition to the existing features that libp2p provides such as: +As well as allowing application protocols to make use of HTTP intermediaries such as HTTP caching and layer 7 proxying and load balancing. This is all in addition to the existing features that libp2p provides such as: - Connectivity – Work on top of WebRTC, WebTransport, QUIC, TCP, or an HTTP transport. - Hole punching – Work with peers behind NATs. @@ -65,11 +65,11 @@ The resource contains a mapping of application protocols to their respective URL 1. That the Kademlia protocol is available at `/kademlia` and 2. The [IPFS Path Gateway API](https://specs.ipfs.tech/http-gateways/path-gateway/) is mounted at `/`. -It is valid expose a service at `/`. It is RECOMMENDED that the server resolve more specific URLs before less specific ones. e.g. a path of `/kademlia/foo` should be routed to the Kademlia protocol rather than the IPFS HTTP API. +It is valid to expose a service at `/`. It is RECOMMENDED that the server resolve more specific URLs before less specific ones. e.g. a path of `/kademlia/foo` should be routed to the Kademlia protocol rather than the IPFS HTTP API. ## Peer ID Authentication -When using the HTTP Transport, peer id authentication is optional. You only pay for it if you need it. This is benefits use cases that don’t need peer authentication (e.g. fetching content addressed data) or authenticate some other way (not tied to libp2p peer ids). +When using the HTTP Transport, peer id authentication is optional. You only pay for it if you need it. This benefits use cases that don’t need peer authentication (e.g. fetching content addressed data) or authenticate some other way (not tied to libp2p peer ids). 
Peer ID authentication in the HTTP Transport follows a similar to pattern to how libp2p adds Peer ID authentication in WebTransport and WebRTC. We run the standard libp2p Noise handshake, but using `IX` for client and server authentication or `NX` for just server authentication. From ebe612ca81d4c918207baa02684eef210115cb07 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 10 Jul 2023 11:18:41 -0700 Subject: [PATCH 11/35] Add note about how this is just one possible auth mechanism --- http/README.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index b58e9e05b..9ff225cbb 100644 --- a/http/README.md +++ b/http/README.md @@ -71,7 +71,14 @@ It is valid to expose a service at `/`. It is RECOMMENDED that the server resolv When using the HTTP Transport, peer id authentication is optional. You only pay for it if you need it. This benefits use cases that don’t need peer authentication (e.g. fetching content addressed data) or authenticate some other way (not tied to libp2p peer ids). -Peer ID authentication in the HTTP Transport follows a similar to pattern to how libp2p adds Peer ID authentication in WebTransport and WebRTC. We run the standard libp2p Noise handshake, but using `IX` for client and server authentication or `NX` for just server authentication. +Peer ID authentication in the HTTP Transport follows a similar to pattern to how +libp2p adds Peer ID authentication in WebTransport and WebRTC. We run the +standard libp2p Noise handshake, but using `IX` for client and server +authentication or `NX` for just server authentication. + +Note: This is just one form of Peer ID authentication. Other forms may be added +in the future (with a different `www-authenticate` value) or be added to the +application protocols themselves. 
 ### Authentication flow

From 7e5a077a509c1db2b08042ce68f8d4adcbb4a122 Mon Sep 17 00:00:00 2001
From: Marco Munizaga
Date: Fri, 14 Jul 2023 15:26:39 -0700
Subject: [PATCH 12/35] Add lidel to interest group

---
 http/README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/http/README.md b/http/README.md
index 9ff225cbb..7b1e89aab 100644
--- a/http/README.md
+++ b/http/README.md
@@ -6,10 +6,11 @@
 Authors: [@marten-seemann, @MarcoPolo]
-Interest Group: [todo]
+Interest Group: [@lidel]
 [@marten-seemann]: https://github.com/marten-seemann
 [@MarcoPolo]: https://github.com/MarcoPolo
+[@lidel]: https://github.com/lidel
 ## Introduction

From db2b3b5492d263a881bbc248009c434c0150e746 Mon Sep 17 00:00:00 2001
From: Marco Munizaga
Date: Mon, 17 Jul 2023 14:05:23 -0700
Subject: [PATCH 13/35] Update http/README.md

Co-authored-by: Thomas Eizinger
---
 http/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/http/README.md b/http/README.md
index 7b1e89aab..bd005df39 100644
--- a/http/README.md
+++ b/http/README.md
@@ -4,7 +4,7 @@
 | --------------- | ------------- | ------ | --------------- |
 | 1A | Working Draft | Active | r0, 2023-01-23 |
-Authors: [@marten-seemann, @MarcoPolo]
+Authors: [@marten-seemann], [@MarcoPolo]
 Interest Group: [@lidel]

From 6319458d7463bad25031cffa58614a406eed2a7d Mon Sep 17 00:00:00 2001
From: Marco Munizaga
Date: Mon, 17 Jul 2023 14:46:04 -0700
Subject: [PATCH 14/35] Formatting

---
 http/README.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/http/README.md b/http/README.md
index bd005df39..d579a9927 100644
--- a/http/README.md
+++ b/http/README.md
@@ -27,13 +27,16 @@ As well as allowing application protocols to make use of HTTP intermediaries suc
 - Peer ID Authentication – Authenticate your peer by their libp2p peer id.
 - Peer discovery – Learn about a peer given their peer id.
-## HTTP Transport vs HTTP Semantics
+## HTTP Semantics vs HTTP Transport
 HTTP is a bit of an overloaded term. This section aims to clarify what we’re talking about when we say “HTTP”.
-*HTTP semantics* ([RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html)) is the stateless application-level protocol that you work with when writing HTTP apis (for example).
+- *HTTP semantics* ([RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html)) is
+  the stateless application-level protocol that you work with when writing HTTP
+  apis (for example).
-*HTTP transport* is the thing that takes your high level request/response defined in terms of HTTP semantics and encodes it and sends it over the wire.
+- *HTTP transport* is the thing that takes your high level request/response
+  defined in terms of HTTP semantics and encodes it and sends it over the wire.
 When this document says *HTTP* it is generally referring to *HTTP semantics*.

From c7c9c432d48faa981c1dad581cf2e855116d4ccf Mon Sep 17 00:00:00 2001
From: Marco Munizaga
Date: Mon, 17 Jul 2023 14:56:21 -0700
Subject: [PATCH 15/35] Add thomas

---
 http/README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/http/README.md b/http/README.md
index d579a9927..e809c037c 100644
--- a/http/README.md
+++ b/http/README.md
@@ -6,11 +6,12 @@
 Authors: [@marten-seemann], [@MarcoPolo]
-Interest Group: [@lidel]
+Interest Group: [@lidel], [@thomaseizinger]
 [@marten-seemann]: https://github.com/marten-seemann
 [@MarcoPolo]: https://github.com/MarcoPolo
 [@lidel]: https://github.com/lidel
+[@thomaseizinger]: https://github.com/thomaseizinger
 ## Introduction

From 454e25c6c2e5736e436e8806108c37a33862031e Mon Sep 17 00:00:00 2001
From: Marco Munizaga
Date: Mon, 17 Jul 2023 14:56:35 -0700
Subject: [PATCH 16/35] Use metadata map and call it protocols

---
 http/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/http/README.md b/http/README.md
index e809c037c..07fafa9ea 100644
--- a/http/README.md
+++ b/http/README.md
@@ -58,9 +58,9 @@ libp2p does not squat the global namespace. libp2p application protocols can be
 ```json
 {
-  "services": {
-    "/kad/1.0.0": "/kademlia/",
-    "/ipfs-http/1.0.0": "/",
+  "protocols": {
+    "/kad/1.0.0": {"path": "/kademlia/"},
+    "/ipfs-http/1.0.0": {"path": "/"},
   }
 }
 ```

From a25267bea60d19dced2bb756df7795e9e3a50a1f Mon Sep 17 00:00:00 2001
From: Marco Munizaga
Date: Mon, 17 Jul 2023 15:41:01 -0700
Subject: [PATCH 17/35] Add mermaid diagrom for HTTP semantics vs transport

---
 http/README.md | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/http/README.md b/http/README.md
index 07fafa9ea..705be4797 100644
--- a/http/README.md
+++ b/http/README.md
@@ -28,16 +28,46 @@ As well as allowing application protocols to make use of HTTP intermediaries suc
 - Peer ID Authentication – Authenticate your peer by their libp2p peer id.
 - Peer discovery – Learn about a peer given their peer id.
-## HTTP Semantics vs HTTP Transport
+## HTTP Semantics vs Encodings vs Transport
 HTTP is a bit of an overloaded term. This section aims to clarify what we’re talking about when we say “HTTP”.
+
+```mermaid
+graph TB
+  subgraph "HTTP Semantics"
+    HTTP
+  end
+  subgraph "Encoding"
+    HTTP1.1[HTTP/1.1]
+    HTTP2[HTTP/2]
+    HTTP3[HTTP/3]
+  end
+  subgraph "Transports"
+    Libp2p[libp2p streams]
+    HTTPTransport[HTTP transport]
+  end
+  HTTP --- HTTP1.1
+  HTTP --- HTTP1.1
+  HTTP1.1 --- Libp2p
+  HTTP --- HTTP2
+  HTTP --- HTTP3
+  HTTP1.1 --- HTTPTransport
+  HTTP2 --- HTTPTransport
+  HTTP3 --- HTTPTransport
+```
+
 - *HTTP semantics* ([RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html)) is
   the stateless application-level protocol that you work with when writing HTTP
   apis (for example).
-- *HTTP transport* is the thing that takes your high level request/response
-  defined in terms of HTTP semantics and encodes it and sends it over the wire.
+- *HTTP encoding* is the thing that takes your high level request/response + defined in terms of HTTP semantics and encodes it into a form that can be sent + over the wire. + +- *HTTP transport* is the thing that takes your encoded reqeust/response and + sends it over the wire. For HTTP/1.1 and HTTP/2, this is a TCP+TLS connection. + For HTTP/3, this is a QUIC connection. When this document says *HTTP* it is generally referring to *HTTP semantics*. From 3014b2252792ddf6729ab1fe4c183cd82784ac84 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 17 Jul 2023 15:47:39 -0700 Subject: [PATCH 18/35] Grammar fixes --- http/README.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/http/README.md b/http/README.md index 705be4797..f163a0e1b 100644 --- a/http/README.md +++ b/http/README.md @@ -77,9 +77,11 @@ A goal of this spec is to allow libp2p to be able to interoperate with existing ## HTTP Transport -Nodes MUST use HTTPS (i.e. they MUST NOT use plaintext HTTP). It is RECOMMENDED to use HTTP/2 and HTTP/3. +Nodes MUST use HTTPS (i.e., they MUST NOT use plaintext HTTP). It is RECOMMENDED to use HTTP/2 and HTTP/3. -Nodes signal support for their HTTP transport using the `/http` component in their multiaddr. e.g. `/dns4/example.com/tls/http` . See the [HTTP multiaddr component spec](https://github.com/libp2p/specs/pull/550) for more details. +Nodes signal support for their HTTP transport using the `/http` component in +their multiaddr. E.g., `/dns4/example.com/tls/http`. See the [HTTP multiaddr +component spec](https://github.com/libp2p/specs/pull/550) for more details. ## Namespace @@ -100,11 +102,11 @@ The resource contains a mapping of application protocols to their respective URL 1. That the Kademlia protocol is available at `/kademlia` and 2. The [IPFS Path Gateway API](https://specs.ipfs.tech/http-gateways/path-gateway/) is mounted at `/`. -It is valid to expose a service at `/`. 
It is RECOMMENDED that the server resolve more specific URLs before less specific ones. e.g. a path of `/kademlia/foo` should be routed to the Kademlia protocol rather than the IPFS HTTP API. +It is valid to expose a service at `/`. It is RECOMMENDED that the server resolve more specific URLs before less specific ones. e.g., a path of `/kademlia/foo` should be routed to the Kademlia protocol rather than the IPFS HTTP API. ## Peer ID Authentication -When using the HTTP Transport, peer id authentication is optional. You only pay for it if you need it. This benefits use cases that don’t need peer authentication (e.g. fetching content addressed data) or authenticate some other way (not tied to libp2p peer ids). +When using the HTTP Transport, peer id authentication is optional. You only pay for it if you need it. This benefits use cases that don’t need peer authentication (e.g., fetching content addressed data) or authenticate some other way (not tied to libp2p peer ids). Peer ID authentication in the HTTP Transport follows a similar to pattern to how libp2p adds Peer ID authentication in WebTransport and WebRTC. We run the From f96359b63db807550ba8ca1d784bf1d8c40ee841 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 17 Jul 2023 15:50:04 -0700 Subject: [PATCH 19/35] Lidel suggestions --- http/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/http/README.md b/http/README.md index f163a0e1b..9fc14415e 100644 --- a/http/README.md +++ b/http/README.md @@ -100,9 +100,9 @@ libp2p does not squat the global namespace. libp2p application protocols can be The resource contains a mapping of application protocols to their respective URL. For example, this configuration file would tell a client 1. That the Kademlia protocol is available at `/kademlia` and -2. The [IPFS Path Gateway API](https://specs.ipfs.tech/http-gateways/path-gateway/) is mounted at `/`. +2. 
The [IPFS Trustless Gateway API](https://specs.ipfs.tech/http-gateways/trustless-gateway/) is mounted at `/`. -It is valid to expose a service at `/`. It is RECOMMENDED that the server resolve more specific URLs before less specific ones. e.g., a path of `/kademlia/foo` should be routed to the Kademlia protocol rather than the IPFS HTTP API. +It is valid to expose a service at `/`. It is RECOMMENDED that implementations facilitate the coexistence of different service endpoints by ensuring that more specific URLs are resolved before less specific ones. For example, when registering handlers, more specific paths like `/kademlia/foo` should take precedence over less specific handler, such as `/`. ## Peer ID Authentication From 1e8796035ea6517940902322c6f0dd86a707777b Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 17 Jul 2023 15:53:30 -0700 Subject: [PATCH 20/35] Define where the libp2p-token will be --- http/README.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/http/README.md b/http/README.md index 9fc14415e..05adbda20 100644 --- a/http/README.md +++ b/http/README.md @@ -114,22 +114,25 @@ standard libp2p Noise handshake, but using `IX` for client and server authentication or `NX` for just server authentication. Note: This is just one form of Peer ID authentication. Other forms may be added -in the future (with a different `www-authenticate` value) or be added to the +in the future (with a different `WWW-Authenticate` value) or be added to the application protocols themselves. ### Authentication flow -1. The client initiates a request that it knows must be authenticated OR the client responds to a `401` with the header `www-authenticate: libp2p-noise` (The server MAY also include `libp2p-token` as an authentication scheme). -2. The client sets the `Authorization` [header](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.6.2) to `libp2p-noise ` . This initiates the `IX` or `NX` handshake. +1. 
The client initiates a request that it knows must be authenticated OR the client responds to a `401` with the header `WWW-Authenticate: Libp2p-Noise` (The server MAY also include `Libp2p-Token` as an authentication scheme). +2. The client sets the `Authorization` [header](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.6.2) to `Libp2p-Noise ` . This initiates the `IX` or `NX` handshake. 1. The protobuf is multibase encoded, but clients MUST only use encodings that are HTTP header safe (refer to to the [token68 definition](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.2)). To set the minimum bar for interoperability, clients and servers MUST support base32 encoding (”b” in the multibase table). 2. When the server receives this request and `IX` was used, it can authenticate the client. -3. The server responds with `Authentication-Info` field set to `libp2p-noise `. +3. The server responds with `Authentication-Info` field set to `Libp2p-Noise `. 1. The server MUST include the SNI used for the connection in the [Noise extensions](https://github.com/libp2p/specs/blob/master/noise/README.md#noise-extensions). - 2. The server MAY include a token that the client can use to avoid doing another Noise handshake in the future. The client would use this token by setting the `Authorization` header to `libp2p-token `. + 2. The server MAY include a token in the Noise extensions that the client + can use to avoid doing another Noise handshake in the future. The client + would use this token by setting the `Authorization` header to `Libp2p-Token + `. 3. When the client receives this response, it can authenticate the server’s peer ID. 4. The client verifies the SNI in the Noise extension matches the one used to initiate the connection. The client MUST close the connection if they differ. 1. The client SHOULD remember this connection is authenticated. - 2. The client SHOULD use the `libp2p-token` if provided for future authorized requests. + 2. 
The client SHOULD use the `Libp2p-Token` if provided for future authorized requests. This costs one round trip, but can piggy back on an appropriate request. From d0f0d93b488ffcca1712056923b8f4291c78721a Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 17 Jul 2023 15:56:06 -0700 Subject: [PATCH 21/35] Grammar fix --- http/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index 05adbda20..b34fa71e2 100644 --- a/http/README.md +++ b/http/README.md @@ -21,7 +21,7 @@ This document defines how libp2p nodes can offer and use an HTTP transport along - HTTP only edge workers can run application protocols and respond to peers on the network. - `curl` from the command line can make requests to other libp2p nodes. -As well as allowing application protocols to make use of HTTP intermediaries such as HTTP caching and layer 7 proxying and load balancing. This is all in addition to the existing features that libp2p provides such as: +The HTTP transport will also allow application protocols to make use of HTTP intermediaries such as HTTP caching, and layer 7 proxying and load balancing. This is all in addition to the existing features that libp2p provides such as: - Connectivity – Work on top of WebRTC, WebTransport, QUIC, TCP, or an HTTP transport. - Hole punching – Work with peers behind NATs. From 8fbd64a362489eded41607c6fe300311533351fb Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Wed, 19 Jul 2023 13:56:46 -0700 Subject: [PATCH 22/35] Specify IX vs NX in auth scheme --- http/README.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/http/README.md b/http/README.md index b34fa71e2..1eb0fc10b 100644 --- a/http/README.md +++ b/http/README.md @@ -119,11 +119,17 @@ application protocols themselves. ### Authentication flow -1. 
The client initiates a request that it knows must be authenticated OR the client responds to a `401` with the header `WWW-Authenticate: Libp2p-Noise` (The server MAY also include `Libp2p-Token` as an authentication scheme). -2. The client sets the `Authorization` [header](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.6.2) to `Libp2p-Noise ` . This initiates the `IX` or `NX` handshake. +1. The client initiates a request that it knows must be authenticated OR the client responds to a `401` with the header `WWW-Authenticate: Libp2p-Noise-IX` (The server MAY also include `Libp2p-Token` as an authentication scheme). +2. The client sets the `Authorization` + [header](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.6.2) to + `Libp2p-Noise-IX ` (or `Libp2p-Noise-NX` + if not doing client authentication). This initiates the + `IX` or `NX` handshake. 1. The protobuf is multibase encoded, but clients MUST only use encodings that are HTTP header safe (refer to to the [token68 definition](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.2)). To set the minimum bar for interoperability, clients and servers MUST support base32 encoding (”b” in the multibase table). 2. When the server receives this request and `IX` was used, it can authenticate the client. -3. The server responds with `Authentication-Info` field set to `Libp2p-Noise `. +3. The server responds with `Authentication-Info` field set to + `Libp2p-Noise- `. Where + `` is either `IX` or `NX`. 1. The server MUST include the SNI used for the connection in the [Noise extensions](https://github.com/libp2p/specs/blob/master/noise/README.md#noise-extensions). 2. The server MAY include a token in the Noise extensions that the client can use to avoid doing another Noise handshake in the future. 
The client From 71415b01208a7555fc541331f5c72924e9df22db Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Wed, 19 Jul 2023 14:01:04 -0700 Subject: [PATCH 23/35] Add SNI and HTTP_libp2p_token to Noise extensions --- noise/README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/noise/README.md b/noise/README.md index 6277f97a0..5a93bac21 100644 --- a/noise/README.md +++ b/noise/README.md @@ -221,6 +221,8 @@ syntax = "proto2"; message NoiseExtensions { repeated bytes webtransport_certhashes = 1; repeated string stream_muxers = 2; + optional string SNI = 3; + optional string HTTP_libp2p_token = 4; } message NoiseHandshakePayload { From 4a03bb0f888120214b9aa951e476c6083affff25 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Wed, 2 Aug 2023 10:36:13 -0700 Subject: [PATCH 24/35] Reword Namespace section a bit --- http/README.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/http/README.md b/http/README.md index 1eb0fc10b..0c50d2741 100644 --- a/http/README.md +++ b/http/README.md @@ -85,7 +85,11 @@ component spec](https://github.com/libp2p/specs/pull/550) for more details. ## Namespace -libp2p does not squat the global namespace. libp2p application protocols can be discovered by the [well-known resource](https://www.rfc-editor.org/rfc/rfc8615) `.well-known/libp2p`. This allows server operators to dynamically change the URLs of the application protocols offered, and not hard-code any assumptions how a certain resource is meant to be interpreted. +libp2p does not squat the global namespace. libp2p application protocols can be +discovered by the [well-known resource](https://www.rfc-editor.org/rfc/rfc8615) +`.well-known/libp2p`. This allows server operators to dynamically change the +URLs of the application protocols offered, and not hard-code any assumptions how +a certain resource is meant to be interpreted. ```json @@ -97,9 +101,11 @@ libp2p does not squat the global namespace. 
libp2p application protocols can be } ``` -The resource contains a mapping of application protocols to their respective URL. For example, this configuration file would tell a client +The resource contains a mapping of application protocols to a URL namespace. For +example, this configuration file would tell a client -1. That the Kademlia protocol is available at `/kademlia` and +1. That the Kademlia application protocol is available with prefix `/kademlia` +and, 2. The [IPFS Trustless Gateway API](https://specs.ipfs.tech/http-gateways/trustless-gateway/) is mounted at `/`. It is valid to expose a service at `/`. It is RECOMMENDED that implementations facilitate the coexistence of different service endpoints by ensuring that more specific URLs are resolved before less specific ones. For example, when registering handlers, more specific paths like `/kademlia/foo` should take precedence over less specific handler, such as `/`. From 877899db82cddc5392a1da698fbe6a46a541f33e Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Wed, 2 Aug 2023 14:18:51 -0700 Subject: [PATCH 25/35] Remove SNI and token from extensions --- http/README.md | 46 +++++++--------------------------------------- noise/README.md | 2 -- 2 files changed, 7 insertions(+), 41 deletions(-) diff --git a/http/README.md b/http/README.md index 0c50d2741..6407d49ce 100644 --- a/http/README.md +++ b/http/README.md @@ -112,45 +112,13 @@ It is valid to expose a service at `/`. It is RECOMMENDED that implementations f ## Peer ID Authentication -When using the HTTP Transport, peer id authentication is optional. You only pay for it if you need it. This benefits use cases that don’t need peer authentication (e.g., fetching content addressed data) or authenticate some other way (not tied to libp2p peer ids). - -Peer ID authentication in the HTTP Transport follows a similar to pattern to how -libp2p adds Peer ID authentication in WebTransport and WebRTC. 
We run the -standard libp2p Noise handshake, but using `IX` for client and server -authentication or `NX` for just server authentication. - -Note: This is just one form of Peer ID authentication. Other forms may be added -in the future (with a different `WWW-Authenticate` value) or be added to the -application protocols themselves. - -### Authentication flow - -1. The client initiates a request that it knows must be authenticated OR the client responds to a `401` with the header `WWW-Authenticate: Libp2p-Noise-IX` (The server MAY also include `Libp2p-Token` as an authentication scheme). -2. The client sets the `Authorization` - [header](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.6.2) to - `Libp2p-Noise-IX ` (or `Libp2p-Noise-NX` - if not doing client authentication). This initiates the - `IX` or `NX` handshake. - 1. The protobuf is multibase encoded, but clients MUST only use encodings that are HTTP header safe (refer to to the [token68 definition](https://www.rfc-editor.org/rfc/rfc9110.html#section-11.2)). To set the minimum bar for interoperability, clients and servers MUST support base32 encoding (”b” in the multibase table). - 2. When the server receives this request and `IX` was used, it can authenticate the client. -3. The server responds with `Authentication-Info` field set to - `Libp2p-Noise- `. Where - `` is either `IX` or `NX`. - 1. The server MUST include the SNI used for the connection in the [Noise extensions](https://github.com/libp2p/specs/blob/master/noise/README.md#noise-extensions). - 2. The server MAY include a token in the Noise extensions that the client - can use to avoid doing another Noise handshake in the future. The client - would use this token by setting the `Authorization` header to `Libp2p-Token - `. - 3. When the client receives this response, it can authenticate the server’s peer ID. -4. The client verifies the SNI in the Noise extension matches the one used to initiate the connection. 
The client MUST close the connection if they differ. - 1. The client SHOULD remember this connection is authenticated. - 2. The client SHOULD use the `Libp2p-Token` if provided for future authorized requests. - -This costs one round trip, but can piggy back on an appropriate request. - -### Authentication Endpoint - -Because the client needs to make a request to authenticate the server, and the client may not want to make the real request before authenticating the server, the server MAY provide an authentication endpoint. This authentication endpoint is like any other application protocol, and it shows up in `.well-known/libp2p`, but it only does the authentication flow. It doesn’t send any other data besides what is defined in the above Authentication flow. The protocol id for the authentication endpoint is `/http-noise-auth/1.0.0`. +When using the HTTP Transport, peer id authentication is optional. You only pay +for it if you need it. This benefits use cases that don’t need peer +authentication (e.g., fetching content addressed data) or authenticate some +other way (not tied to libp2p peer ids). + +Specific authentications schemes for authenticating Peer IDs will be defined in +a future spec. 
## Using HTTP semantics over stream transports diff --git a/noise/README.md b/noise/README.md index 5a93bac21..6277f97a0 100644 --- a/noise/README.md +++ b/noise/README.md @@ -221,8 +221,6 @@ syntax = "proto2"; message NoiseExtensions { repeated bytes webtransport_certhashes = 1; repeated string stream_muxers = 2; - optional string SNI = 3; - optional string HTTP_libp2p_token = 4; } message NoiseHandshakePayload { From dc71f2cf4d0636ab73963b27fda9d64004863fd8 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Thu, 24 Aug 2023 16:20:03 -0700 Subject: [PATCH 26/35] Define the multiaddr URI --- http/README.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/http/README.md b/http/README.md index 6407d49ce..d5bc09f7d 100644 --- a/http/README.md +++ b/http/README.md @@ -126,6 +126,22 @@ Application protocols using HTTP semantics can run over any libp2p stream transp HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings of HTTP semantics are possible as well and may be specified in a future update. +## Multiaddr URI scheme + +In places where a URI is expected, implementations SHOULD accept a multiaddr URI +in addition to a standard http or https URI. A multiaddr URI is a +[URI](https://datatracker.ietf.org/doc/html/rfc3986) with the `multiaddr` +scheme. It is constructed by taking the "multiaddr:" string and appending the +string encoded representation of the multiaddr. E.g. the multiaddr +`/ip4/1.2.3.4/udp/54321/quic-v1` would be represented as +`multiaddr:/ip4/1.2.3.4/udp/54321/quic-v1`. + +This URI can be extended to include HTTP paths with the `/httppath` component. +This allows a user to make an HTTP request to a specific HTTP resource using a +multiaddr. For example, a user could make a GET request to +`multiaddr:/ip4/1.2.3.4/udp/54321/quic-v1/p2p/12D.../httppath/.well-known%2Flibp2p`. This also allows +an HTTP redirect to another host and another HTTP resource. 
+ ## Using other request-response semantics (not HTTP) This document has focused on using HTTP semantics, but HTTP may not be the common divisor amongst all transports (current and future). It may be desirable to use some other request-response semantics for your application-level protocol, perhaps something like rust-libp2p’s [request-response](https://docs.rs/libp2p/0.52.1/libp2p/request_response/index.html) abstraction. Nothing specified in this document prohibits mapping other semantics onto HTTP semantics to keep the benefits of using an HTTP transport. From d8850aa6d0581558ac94fd1fe908ff4a72e80eda Mon Sep 17 00:00:00 2001 From: Marten Seemann Date: Wed, 4 Oct 2023 01:27:52 -0700 Subject: [PATCH 27/35] update protocol name for IPFS gateway Co-authored-by: Marcin Rataj --- http/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index 6407d49ce..e69a6c10e 100644 --- a/http/README.md +++ b/http/README.md @@ -96,7 +96,7 @@ a certain resource is meant to be interpreted. { "protocols": { "/kad/1.0.0": {"path": "/kademlia/"}, - "/ipfs-http/1.0.0": {"path": "/"}, + "/ipfs/gateway": {"path": "/"}, } } ``` From 78e8ca143c144009ddb49b536eb025225169fbac Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Thu, 14 Mar 2024 16:45:25 -0700 Subject: [PATCH 28/35] Be clear about no pipelining --- http/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index e69a6c10e..cca70c307 100644 --- a/http/README.md +++ b/http/README.md @@ -122,7 +122,7 @@ a future spec. ## Using HTTP semantics over stream transports -Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifer. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. 
+Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifer. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. Clients MUST NOT pipeline requests over a single stream. Servers SHOULD set the [`Connection: close` header](https://datatracker.ietf.org/doc/html/rfc2616#section-14.10) to signal to clients that this is not a persistent connection. HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings of HTTP semantics are possible as well and may be specified in a future update. From d30efdad0faafd1009555b7a2bff4a4addb59c10 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 18 Mar 2024 16:14:47 -0700 Subject: [PATCH 29/35] Use SHOULD instead of MUST --- http/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index cca70c307..1ea446f2c 100644 --- a/http/README.md +++ b/http/README.md @@ -122,7 +122,7 @@ a future spec. ## Using HTTP semantics over stream transports -Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifer. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. Clients MUST NOT pipeline requests over a single stream. Servers SHOULD set the [`Connection: close` header](https://datatracker.ietf.org/doc/html/rfc2616#section-14.10) to signal to clients that this is not a persistent connection. +Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifer. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. 
Clients parse the response as an HTTP/1.1 message and then close the stream. Clients SHOULD NOT pipeline requests over a single stream. Servers SHOULD set the [`Connection: close` header](https://datatracker.ietf.org/doc/html/rfc2616#section-14.10) to signal to clients that this is not a persistent connection. HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings of HTTP semantics are possible as well and may be specified in a future update. From 8628b5a185917226155784516eaf92f0e24f3a2e Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Wed, 3 Apr 2024 11:04:54 -0700 Subject: [PATCH 30/35] Update RFC for connection: close --- http/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index 1ea446f2c..894ee769d 100644 --- a/http/README.md +++ b/http/README.md @@ -122,7 +122,7 @@ a future spec. ## Using HTTP semantics over stream transports -Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifer. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. Clients SHOULD NOT pipeline requests over a single stream. Servers SHOULD set the [`Connection: close` header](https://datatracker.ietf.org/doc/html/rfc2616#section-14.10) to signal to clients that this is not a persistent connection. +Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifer. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. Clients SHOULD NOT pipeline requests over a single stream. 
Servers SHOULD set the [`Connection: close` header](https://datatracker.ietf.org/doc/html/rfc9112#section-9.6) to signal to clients that this is not a persistent connection. HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings of HTTP semantics are possible as well and may be specified in a future update. From 3c0ac4033dd640425e047ee448771dbca244ba68 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Wed, 3 Apr 2024 11:05:41 -0700 Subject: [PATCH 31/35] Rename well-known --- http/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index 894ee769d..a5d9a8fb9 100644 --- a/http/README.md +++ b/http/README.md @@ -87,7 +87,7 @@ component spec](https://github.com/libp2p/specs/pull/550) for more details. libp2p does not squat the global namespace. libp2p application protocols can be discovered by the [well-known resource](https://www.rfc-editor.org/rfc/rfc8615) -`.well-known/libp2p`. This allows server operators to dynamically change the +`.well-known/libp2p/protocols`. This allows server operators to dynamically change the URLs of the application protocols offered, and not hard-code any assumptions how a certain resource is meant to be interpreted. From 75bc63510917158688134a008b424064b6d6aa77 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Wed, 3 Apr 2024 11:10:51 -0700 Subject: [PATCH 32/35] Add sentence on why POST and other mappings --- http/README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/http/README.md b/http/README.md index a5d9a8fb9..06765f810 100644 --- a/http/README.md +++ b/http/README.md @@ -130,4 +130,6 @@ HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings This document has focused on using HTTP semantics, but HTTP may not be the common divisor amongst all transports (current and future). 
It may be desirable to use some other request-response semantics for your application-level protocol, perhaps something like rust-libp2p’s [request-response](https://docs.rs/libp2p/0.52.1/libp2p/request_response/index.html) abstraction. Nothing specified in this document prohibits mapping other semantics onto HTTP semantics to keep the benefits of using an HTTP transport. -To support the simple request-response semantics, for example, the request MUST be encoded within a `POST` request to the proper URL (as defined in the Namespace section). The response is read from the body of the HTTP response. The client MUST authenticate the server and itself **before** making the request. +As a simple example, to support the simple request-response semantics, the request MUST be encoded within a `POST` request to the proper URL (as defined in the Namespace section). The response is read from the body of the HTTP response. The client MUST authenticate the server and itself **before** making the request. The reason to chose `POST` is because this mapping makes no assumptions on whether the request is cacheable. If HTTP caching is desired users should either build on HTTP semantics or chose another mapping with different assumptions. + +Other mappings may also be valid and as long as nodes agree. From f95e4dbc6651770666772f7f9635ed9c5bf54c77 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 15 Apr 2024 11:14:02 -0700 Subject: [PATCH 33/35] Sukun's review comments --- http/README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/http/README.md b/http/README.md index 06765f810..feecf4247 100644 --- a/http/README.md +++ b/http/README.md @@ -104,7 +104,7 @@ a certain resource is meant to be interpreted. The resource contains a mapping of application protocols to a URL namespace. For example, this configuration file would tell a client -1. That the Kademlia application protocol is available with prefix `/kademlia` +1. 
The Kademlia application protocol is available with prefix `/kademlia` and, 2. The [IPFS Trustless Gateway API](https://specs.ipfs.tech/http-gateways/trustless-gateway/) is mounted at `/`. @@ -112,17 +112,17 @@ It is valid to expose a service at `/`. It is RECOMMENDED that implementations f ## Peer ID Authentication -When using the HTTP Transport, peer id authentication is optional. You only pay +When using the HTTP Transport, Peer ID authentication is optional. You only pay for it if you need it. This benefits use cases that don’t need peer authentication (e.g., fetching content addressed data) or authenticate some other way (not tied to libp2p peer ids). -Specific authentications schemes for authenticating Peer IDs will be defined in +Specific authentication schemes for authenticating Peer IDs will be defined in a future spec. ## Using HTTP semantics over stream transports -Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifer. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. Clients SHOULD NOT pipeline requests over a single stream. Servers SHOULD set the [`Connection: close` header](https://datatracker.ietf.org/doc/html/rfc9112#section-9.6) to signal to clients that this is not a persistent connection. +Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifier. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. Clients SHOULD NOT pipeline requests over a single stream.
Clients and servers SHOULD set the [`Connection: close` header](https://datatracker.ietf.org/doc/html/rfc9112#section-9.6) to signal that this is not a persistent connection. HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings of HTTP semantics are possible as well and may be specified in a future update. From e3eb9dc6bfadbb952888ef16cd482563873e6483 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 15 Apr 2024 11:18:40 -0700 Subject: [PATCH 34/35] Small typo fixes --- http/README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/http/README.md b/http/README.md index feecf4247..e5c158a1f 100644 --- a/http/README.md +++ b/http/README.md @@ -65,7 +65,7 @@ graph TB defined in terms of HTTP semantics and encodes it into a form that can be sent over the wire. -- *HTTP transport* is the thing that takes your encoded reqeust/response and +- *HTTP transport* is the thing that takes your encoded request/response and sends it over the wire. For HTTP/1.1 and HTTP/2, this is a TCP+TLS connection. For HTTP/3, this is a QUIC connection. @@ -81,7 +81,7 @@ Nodes MUST use HTTPS (i.e., they MUST NOT use plaintext HTTP). It is RECOMMENDE Nodes signal support for their HTTP transport using the `/http` component in their multiaddr. E.g., `/dns4/example.com/tls/http`. See the [HTTP multiaddr -component spec](https://github.com/libp2p/specs/pull/550) for more details. +component spec](https://github.com/libp2p/specs/blob/master/http/transport-component.md) for more details. ## Namespace @@ -130,6 +130,6 @@ HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings This document has focused on using HTTP semantics, but HTTP may not be the common divisor amongst all transports (current and future).
It may be desirable to use some other request-response semantics for your application-level protocol, perhaps something like rust-libp2p’s [request-response](https://docs.rs/libp2p/0.52.1/libp2p/request_response/index.html) abstraction. Nothing specified in this document prohibits mapping other semantics onto HTTP semantics to keep the benefits of using an HTTP transport. -As a simple example, to support the simple request-response semantics, the request MUST be encoded within a `POST` request to the proper URL (as defined in the Namespace section). The response is read from the body of the HTTP response. The client MUST authenticate the server and itself **before** making the request. The reason to chose `POST` is because this mapping makes no assumptions on whether the request is cacheable. If HTTP caching is desired users should either build on HTTP semantics or chose another mapping with different assumptions. +As a simple example, to support the simple request-response semantics, the request MUST be encoded within a `POST` request to the proper URL (as defined in the [Namespace](#namespace) section). The response is read from the body of the HTTP response. The client MUST authenticate the server and itself **before** making the request. The reason to choose `POST` is that this mapping makes no assumptions about whether the request is cacheable. If HTTP caching is desired, users should either build on HTTP semantics or choose another mapping with different assumptions. Other mappings may also be valid and as long as nodes agree. From 95ffe6d4da0c03c0a833d4bff225915b12ba1243 Mon Sep 17 00:00:00 2001 From: Marco Munizaga Date: Mon, 3 Jun 2024 14:37:05 -0700 Subject: [PATCH 35/35] Update to http-path --- http/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/http/README.md b/http/README.md index d5bc09f7d..1038db280 100644 --- a/http/README.md +++ b/http/README.md @@ -136,10 +136,10 @@ string encoded representation of the multiaddr. E.g.
the multiaddr `/ip4/1.2.3.4/udp/54321/quic-v1` would be represented as `multiaddr:/ip4/1.2.3.4/udp/54321/quic-v1`. -This URI can be extended to include HTTP paths with the `/httppath` component. +This URI can be extended to include HTTP paths with the `/http-path` component. This allows a user to make an HTTP request to a specific HTTP resource using a multiaddr. For example, a user could make a GET request to -`multiaddr:/ip4/1.2.3.4/udp/54321/quic-v1/p2p/12D.../httppath/.well-known%2Flibp2p`. This also allows +`multiaddr:/ip4/1.2.3.4/udp/54321/quic-v1/p2p/12D.../http-path/.well-known%2Flibp2p`. This also allows an HTTP redirect to another host and another HTTP resource. ## Using other request-response semantics (not HTTP)
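
Note on the `multiaddr:` URI form changed by the last patch: the construction it describes (prefix the string-encoded multiaddr with `multiaddr:`, then optionally append `/http-path/` and the HTTP path) can be sketched as below. This is a minimal illustration, not part of the spec. The function names `to_multiaddr_uri` and `split_http_path` are hypothetical, and the choice to drop the leading `/` of the HTTP path and percent-encode the remaining slashes is an assumption inferred from the `.well-known%2Flibp2p` example in the patch.

```python
from urllib.parse import quote, unquote

MULTIADDR_SCHEME = "multiaddr:"
HTTP_PATH_COMPONENT = "/http-path/"


def to_multiaddr_uri(multiaddr: str, http_path: str = "") -> str:
    """Build a multiaddr URI, optionally carrying an HTTP path (hypothetical helper)."""
    if not multiaddr.startswith("/"):
        raise ValueError("a string-encoded multiaddr starts with '/'")
    uri = MULTIADDR_SCHEME + multiaddr
    if http_path:
        # Assumption: drop the leading "/" and percent-encode every remaining
        # "/" so the path stays a single multiaddr component.
        uri += HTTP_PATH_COMPONENT + quote(http_path.lstrip("/"), safe="")
    return uri


def split_http_path(uri: str) -> tuple[str, str]:
    """Split a multiaddr URI into (multiaddr, decoded HTTP path or "")."""
    if not uri.startswith(MULTIADDR_SCHEME):
        raise ValueError("not a multiaddr URI")
    rest = uri[len(MULTIADDR_SCHEME):]
    addr, sep, encoded = rest.partition(HTTP_PATH_COMPONENT)
    return addr, ("/" + unquote(encoded)) if sep else ""
```

Calling `to_multiaddr_uri("/ip4/1.2.3.4/udp/54321/quic-v1", "/.well-known/libp2p")` under these assumptions yields a URI of the same shape as the example in the patch, without the peer ID component.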