Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection Gating #872

Closed
8 tasks done
aarshkshah1992 opened this issue Apr 6, 2020 · 14 comments
Closed
8 tasks done

Connection Gating #872

aarshkshah1992 opened this issue Apr 6, 2020 · 14 comments
Assignees

Comments

@aarshkshah1992
Copy link
Contributor

aarshkshah1992 commented Apr 6, 2020

See libp2p/go-libp2p-core#99 for the motivation for this issue.

TODO (PRs to be merged)

TODO (Work to be done)

@aarshkshah1992 aarshkshah1992 self-assigned this Apr 6, 2020
@aarshkshah1992 aarshkshah1992 changed the title Connection Gating and the go-away protocol [WIP Draft] Connection Gating and the go-away protocol Apr 6, 2020
@willscott
Copy link
Contributor

A timestamp in the GoAway message might not be a bad idea. The UUID is useful to prevent replays, but if the message ends up delayed on a queue / otherwise shows up much in the future, the node receiving it may want to be able to ignore it as out-of-date.

@aarshkshah1992
Copy link
Contributor Author

aarshkshah1992 commented Apr 6, 2020

As discussed with @raulk offline, we are going to call it the Disconnect protocol instead of the GoAway protocol and designing it will need a separate focused async discussion.

For now, the first win is to get in the connection gating work.

@aarshkshah1992 aarshkshah1992 changed the title [WIP Draft] Connection Gating and the go-away protocol Connection Gating Apr 6, 2020
@raulk
Copy link
Member

raulk commented Apr 6, 2020

@willscott I don't think I heard about the UUID idea yet -- where can I find more info?


Some notes from @aarshkshah1992 and my discussion.

  • Is Disconnect an ordinary protocol? Do we mount it as any other protocols? Or is Disconnect a message in a more elemental libp2p wire protocol?

    • We lack a Hello message that advertises capabilities/protocols/metadata.
    • Identify has kind of taken that role in our stack. But it's paradoxical. We pretend it's optional, but it has become rather central to the extent that other protocols now depend on it (e.g. DHT). This is a smell. IMO, it's a sign that we're missing an elemental wire/control protocol.
    • If we decide to go the control protocol path, would we reserve stream 0 for it?
  • Disconnect protocol: let's keep it simple. Fields:

    • Reason:
      • over capacity (source: conn manager, or swarm on when connection gater denies inbound connection)
      • temporarily blacklisted (source: adding a peer to a host-wide blacklist)
      • permanently blacklisted (source: adding a peer to a host-wide blacklist)
      • shutting down (source: upon swarm.Close())
      • unused: no streams open (source: conn manager, when a conn has 0 streams)
      • use other connection: we have multiple connections with the same peer, and we're closing one)
      • connection corrupted, e.g. multiplexer corrupted (TODO: how would we even deliver this?)
      • unresponsive: if the peer appears unresponsive (e.g. to a ping message).
      • some places to find inspiration: HTTP status codes, ETH1 rlpx disconnect reasons: https://github.com/ethereum/devp2p/blob/master/rlpx.md#disconnect-0x01.
    • Backoff period: discard; this leaks information, and facilitates attacks.
  • Disconnect user API design: TBD.

  • Concerning the connection gater, we need to unify it with the Filters, currently an IP-based accept/deny firewall. The connection gater abstraction is a lot more powerful because it can drive functionality like:

    • "do not accept inbound connections until X condition is satisfied", e.g. when a node is starting up and bootsrapping its routing table, blocking inbound connections until the system is warmed up would reduce the exposure to eclipse/poisoning attacks.
    • "outright reject incoming connections if at capacity". Right now we accept all inbound connections and wait for the connmgr to tick and regulate. Let the users choose.
  • The Filters somehow found their way into the Upgrader. This is not only conceptually questionable, but it also creates blind spots, e.g. transports that don't use the Upgrader, like QUIC and Bluetooth. Filters won't get applied to them.

  • The Upgrader calls the Filters first, before securing the connection and layering the multiplexer. On the one hand, this means that it won't have access to the peer's identity, nor would we be able to send a DISCONNECT message under a stream. On the other hand, it allows us to intercept the connection early (before crypto and muxer negotiation complete), if all we wanted to do is blacklist by IP.

@willscott
Copy link
Contributor

willscott commented Apr 6, 2020

@willscott I don't think I heard about the UUID idea yet -- where can I find more info?

in the previous revision of the OP of this issue

@aarshkshah1992
Copy link
Contributor Author

aarshkshah1992 commented Apr 7, 2020

@raulk @willscott Apologies, it was a WIP spec that I wanted to refine after the discussion with @raulk and so got rid of it for now to not confuse readers.

@aarshkshah1992
Copy link
Contributor Author

aarshkshah1992 commented Apr 7, 2020

@raulk

The Filters somehow found their way into the Upgrader. This is not only conceptually questionable, but it also creates blind spots, e.g. transports that don't use the Upgrader, like QUIC and Bluetooth. Filters won't get applied to them.

While the filters wont get applied for QUIC etc. in the upgrader, they will get applied before we incorporate the connection into the Swarm.

func (s *Swarm) addConn(tc transport.CapableConn, dir network.Direction) (*Conn, error) {
	// The underlying transport (or the dialer) *should* filter it's own
	// connections but we should double check anyways.
	raddr := tc.RemoteMultiaddr()
	if s.Filters.AddrBlocked(raddr) {
		tc.Close()
		return nil, ErrAddrFiltered
	}
....

The reason we seem to have injected these filters into the upgrader is to save the cost of upgrading a raw inbound connection as the Swarm can't help us there until connection has been upgraded.

@wenchaopeng
Copy link

sorry, may I ask when this feature can be completed?

@raulk
Copy link
Member

raulk commented May 13, 2020

@wenchaopeng likely this week.

@wenchaopeng
Copy link

@raulk received, you are doing a great thing, thank you!

@raulk
Copy link
Member

raulk commented May 15, 2020

Bulk of the work is completed here, and everything has landed onto corresponding master branches. I'm going to open follow-up issues for each TODO item, so we can close this epic.

@raulk
Copy link
Member

raulk commented May 15, 2020

This is done, but cannot be released until we implement #931, or we'll cause a regression in the QUIC transport.

@raulk raulk closed this as completed May 15, 2020
@prestonvanloon
Copy link
Contributor

Can we keep this issue open until the feature is released? Otherwise, I am not sure what to track for the availability of connection gating. Thanks!

@aarshkshah1992
Copy link
Contributor Author

@prestonvanloon That makes sense. The main thing left now before we release this is #931 which should land soon.

@raulk
Copy link
Member

raulk commented May 19, 2020

@prestonvanloon Released! Read all about it: https://github.com/libp2p/go-libp2p/releases/tag/v0.9.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants