Skip to content
This repository has been archived by the owner on Sep 9, 2022. It is now read-only.

p2p-circuit v2 #125

Merged
merged 66 commits into from
Jul 6, 2021
Merged

p2p-circuit v2 #125

merged 66 commits into from
Jul 6, 2021

Conversation

vyzo
Copy link
Contributor

@vyzo vyzo commented Feb 4, 2021

v2 of the protocol and Limited Relays.

Depends on:

TBD:

  • spec (duh!)
  • reservation vouchers
  • godocs
  • unit tests
  • Update and propagate released versions of deps.
  • test in the wild

An Overview of the p2p-circuit/v2 Protocol

The relay protocol has been split into two protcols: the Hop protocol and the Stop protocol.

  • the Hop protocol governs communication with the relay for reservations and circuit connection initiation.
  • the Stop protocol governs termination of circuit connections.

Establishing a relayed connection

Let's suppose that a peer A wants to connect to peer B through relay peer R.

Reservations

Peer B must first establish a reservation by connecting (and maintaining a connection) to R.

This is done using the Hop protocol and sending a Reserve message. The relay responds with OK if the reservation is accepted and has a specific TTL, after which the reservation will dissipate.

Peer B is responsible for refreshing its reservation when it reaches the end of the TTL.
In the relay logic, reservations are governed by strict resource limits and ACLs.

Connection Establishment

Peer A connects to Relay R, opens a Hop protocol stream and sends a CONNECT message.

Relay R must have a reservation and connection to peer B. Active relay has been deprecated in the code, but the protocol allows for implementation without any change in the wire protocol.

Once R receives the CONNECT message, it opens a Stop protocol stream to B and sends a CONNECT message.
Peer B responds with OK; the relay R responds with OK to peer A and bridges the two streams in order to establish the relayed connection.

Connection Limits

Relays are allowed (and encouraged! the relay implementation sets strict limits by default) to limit the relayed connection, by time and bytes relayed in each direction.
The limits are conveyed to both sides of a relayed connection; relayed connections are marked as transient by the transport.

Implementation Details

Implementation lives under v2.

The Stop protocol implementation lives in v2/client and provides a Transport suitable for attachment to a host. The implementation mounts the v1 protocol and provides backwards compatibility in dialing v1 peers and accepting relayed connections from v1 peers through v1 relays.

The Hop protocol implementation lives in v2/relay and provides a Relay object that can be instantiated on deman. When instantiated, the Relay registers a stream handler from the v2 hop protocol.

The two are completely independent. The client package implements the transport and is the point of integration with go-libp2p. The relay package is the actual relay and can be used in dedicated relays and (as planned for Project Flare) public DHT servers. The default resource limits are very much constrained which makes it safe for public nodes.

@vyzo
Copy link
Contributor Author

vyzo commented Feb 5, 2021

Summoning @Stebalien @raulk @yusefnapora @aarshkshah1992 -- this is ready for (initial) review!

@vyzo
Copy link
Contributor Author

vyzo commented Feb 12, 2021

Added tests and godocs.

Copy link
Contributor

@aarshkshah1992 aarshkshah1992 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vyzo

Did a first pass on the Relay code. Will move to the client now.

Awesome stuff man !

v2/pb/circuit.proto Show resolved Hide resolved
v2/util/io.go Show resolved Hide resolved
@@ -0,0 +1,67 @@
package util

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to write unit tests for this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meh.... if I have nothing better to do.

v2/relay/acl.go Show resolved Hide resolved
v2/relay/acl.go Outdated Show resolved Hide resolved
v2/relay/relay.go Outdated Show resolved Hide resolved
v2/relay/relay.go Show resolved Hide resolved
v2/relay/relay.go Show resolved Hide resolved
v2/relay/relay.go Show resolved Hide resolved
v2/client/client.go Outdated Show resolved Hide resolved
Copy link
Contributor

@aarshkshah1992 aarshkshah1992 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vyzo Finished first pass of client too.

Will give all this one more in-depth look tomorrow. But looking amazing so far.

v2/client/conn.go Show resolved Hide resolved
v2/relay/relay.go Show resolved Hide resolved
}

func (c *Client) Listen(addr ma.Multiaddr) (transport.Listener, error) {
// TODO connect to the relay and reserve slot if specified
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should happen in AutoRelay, right ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a different topic -- it is for explicit relay address listening, when/if we ever support it.

v2/client/reservation.go Show resolved Hide resolved
}

// TODO: is it okay to cast c.Conn().RemotePeer() into a multiaddr? might be "user input"
func (c *Conn) RemoteMultiaddr() ma.Multiaddr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can we add a unit test for this one function ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh.... maybe, I am not inclined to do it right now.

v2/client/dial.go Show resolved Hide resolved
v2/client/dial.go Show resolved Hide resolved
v2/client/dial.go Show resolved Hide resolved
v2/client/handlers.go Show resolved Hide resolved
v2/client/dial.go Show resolved Hide resolved
Copy link
Contributor

@aarshkshah1992 aarshkshah1992 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more changes.

Need to do one last round of an intense review including all tests and I'll be done.

v2/relay/acl.go Outdated Show resolved Hide resolved
v2/relay/resources.go Outdated Show resolved Hide resolved
v2/util/io.go Show resolved Hide resolved
v2/util/pbconv.go Show resolved Hide resolved
v2/util/pbconv.go Show resolved Hide resolved
v2/util/pbconv.go Show resolved Hide resolved
v2/util/pbconv.go Show resolved Hide resolved
@libp2p libp2p deleted a comment from vyzo Feb 16, 2021
@libp2p libp2p deleted a comment from vyzo Feb 16, 2021
Copy link
Contributor

@aarshkshah1992 aarshkshah1992 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished last round of review.

Will take a look at the tests tomorrow but looks great otherwise.

}
}

if r.rc.Limit != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely add reference counting for all go-routines we are spawning here and ensure they all finish before we return from Close to ensure proper cleanup for the caller.

Copy link
Contributor Author

@vyzo vyzo Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's just not worth it and also the goroutines may never finish because it's an unlimited relay.
So, no.

Copy link
Contributor

@aarshkshah1992 aarshkshah1992 Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is the right approach to take. We can easily stop relaying on unlimited relays by closing connections to all peers we have opened circuits for.

When Close returns, the caller expects that the subsystem has cleaned-up after it itself. Not doing so can leak go-routines if the caller dosen't know this implementation detail.

cc @Stebalien Isn't this in-line with what Close does for other sub-systems in libp2p.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you insist on closing the connections it is much easier to tag the Hop/Stop streams and go and reset them; there is no tracking needed and it is much easier to implement (and doesn't suffer from the problem of unlimited relay conns which will never exit).

Waiting for the goroutines is irrelevant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we don't even have to tag streams, which could be a little problematic with the current APIs; we can just look for incoming Hop streams and outgoing Stop streams and reset them.

v2/relay/relay.go Show resolved Hide resolved
v2/relay/relay.go Show resolved Hide resolved
v2/relay/relay.go Show resolved Hide resolved
v2/client/dial.go Show resolved Hide resolved
var (
ErrNoIP = errors.New("no IP address associated with peer")
ErrTooManyPeersInIP = errors.New("too many peers in IP address")
ErrTooManyPeersInASN = errors.New("too many peers in ASN")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably don't need to export any of these

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the only reason I exported them is so that the test could could live together with the other tests in the test directory.
It felt odd to have just one test in here.


enum Status {
OK = 100;
RESERVATION_REFUSED = 200;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about splitting Status into a HopMessage specific Status and a StopMessage specific Status like done with Type? The split would enforce protocol semantics at the datastructure level. E.g. it encodes the semantic that one could not receive a RESERVATION_REFUSED when sending a StopMessage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could I guess, but there would be a lot of duplication; I don't feel strong about it nonetheless.

@Stebalien
Copy link
Member

Relay R must have a reservation and connection to peer B. Active relay has been deprecated in the code, but the protocol allows for implementation without any change in the wire protocol.

Can we include a flag in the reservation response that specifies whether or not the relay supports active relays? That way the peer can disconnect if it sees that active relay is supported.

We can also just punt this and do it later as it should be backwards compatible.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Attacker connects.
  2. Attacker reserves.
  3. Attacker uses entire budget.
  4. Attacker disconnects.
  5. Attacker re-rolls their peer ID and starts at 1.

I think we need:

  1. A bit of local history (i.e., a per-peer/ip range cooldown period).
  2. A global rate-limit.
  3. Ideally, some form of "uses less than X bits/s" that can prevent large spikes.

v2/proto/voucher.go Outdated Show resolved Hide resolved
v2/proto/voucher.go Outdated Show resolved Hide resolved
v2/proto/voucher.go Outdated Show resolved Hide resolved
v2/proto/voucher.go Outdated Show resolved Hide resolved
v2/proto/voucher.go Outdated Show resolved Hide resolved
asn, _ := asnutil.Store.AsnForIPv6(ip)
peersInAsn = ipcs.asns[asn]
if len(peersInAsn) >= ipcs.asnlimit {
return ErrTooManyPeersInASN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 (but I can still just repeatedly reconnect)

buf := pool.Get(r.rc.BufferSize)
defer pool.Put(buf)

limitedSrc := io.LimitReader(src, limit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we had some form of rate-limit as well instead of just an absolute limit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, although not exactly clear how to do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline. The current limit is 128KiB (or something like that). That should be low enough as it is.

Comment on lines +353 to +371
dest.CloseWrite()
if count == limit {
// we've reached the limit, discard further input
src.CloseRead()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should reset both streams if there's an error here, not close them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

limit reached is not an error, but yes, we should check if there is one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not an error, but we should still reset. If we send a close, it'll look like a clean close. We should only send an EOF if we receive an EOF.

result := &Reservation{}
result.Expiration = time.Unix(rsvp.GetExpire(), 0)

for _, ab := range rsvp.GetAddrs() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about requiring at least one address? As far as I can tell a reservation without a single address of the relay node is useless as one could not announce its relayed address to anyone else, correct?

v2/relay/relay.go Outdated Show resolved Hide resolved
@marten-seemann
Copy link
Contributor

Changed the merge target from master to circuit-v2. Further development will happen in this feature branch.

@marten-seemann marten-seemann changed the title [WIP] p2p-circuit v2 p2p-circuit v2 Jul 6, 2021
@marten-seemann marten-seemann merged commit 9da0b93 into circuit-v2 Jul 6, 2021
@vyzo vyzo deleted the v2 branch August 5, 2021 19:36
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants