-
Notifications
You must be signed in to change notification settings - Fork 37
Conversation
@diasdavid @pgte @vasco-santos I've still got some cleanup to do per my notes in the PR overview, but I'd love to get some feedback on the overall implementation of the FSM. There may be some considerations I've missed that we could roll into this update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, with some questions / comments.
if (circuitTried) { | ||
this.emit('error', new Error(`No available transports to dial peer ${this.theirB58Id}!`)) | ||
return this._state('disconnect') | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These last two conditions can dictate the end of the dialing attempt. As a client of the switch API, I'd like to now the reason why a dial failed. "No available transport" or "Circuit not enabled" doesn't tell me much of why the other transports failed.
As a remedy, I suggest that we keep track of the failing reasons and make them somehow available to the switch user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an alternative, emit a warning-like event like "connection attempt failed", containing an error object as payload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the warning option as that would provide the user with more information about the various attempts, even if connections succeed on a later transport.
With the current structure the switch would need to emit the events. Perhaps the payload should also include the peerInfo so users understand where it came from. Something like:
_switch.emit('error:connection_attempt_failed', {
peerInfo: theirPeerInfo,
error: ...
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. This dialing is triggered by a switch.dial()
, which is callback-based..
Having the switch emit these events independently of the dial() call means that, for the API user, it will be hard to correlate a call with these warnings.
Usually I solve this by either returning an even stream or an event emitter. In this case, I think that something like a dialFSM would be something useful to consider...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. Currently Libp2p
doesn't return anything from its dial methods so returning an event emitter there from a Switch
dialFSM would enable us to keep backwards compatibility more easily while we transition the system.
For js-ipfs we'd also need to make sure to return that emitter from swarm.connect()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pgte I've added a dialsFSM
method to the switch, that can be used instead of dial
. Instead of a connection, it will return the fsm. I've added some tests for it to show the events and functionality it has https://github.com/libp2p/js-libp2p-switch/pull/278/files#diff-080e21482eea9ffdd7ee511eb71d3522.
The main things to note:
The raw connection is still returned in the callback passed to dialFSM
- Users can listen on
error
events to get fatal messages - Users can listen on
error:connection_attempt_failed
events to get a list of errors for failed transport dials, these are not fatal to the connection. - Users can listen on
close
events to know when the specific connection is closed. - Users can call
fsm.close()
to close that specific connection.
Are there other events or methods that would be helpful to have? We can always wait to add other things if there aren't higher priority nice to haves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jacobheun That's great! Other than successful connection attempt (which you can infer from having the "connected" callback called), I don't see any other so far.
src/dialer.js
Outdated
connection.on('connected', () => connection.protect()) | ||
connection.on('private', () => connection.encrypt()) | ||
connection.on('encrypted', () => connection.upgrade()) | ||
connection.on('muxed', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can all these events occur more than once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, they should only occur once, I'll update these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good to me! Left just a few minor comments
src/dialer.js
Outdated
const peerInfo = getPeerInfo(peer, _switch._peerBook) | ||
const b58Id = peerInfo.id.toB58String() | ||
|
||
log(`${_switch._peerInfo.id.toB58String().slice(0, 8)} dial request to ${b58Id.slice(0, 8)} with protocol ${protocol}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verify if protocol
exists and if not add unknown
instead of allowing undefined
src/index.js
Outdated
log('The switch has stopped') | ||
this.emit('stopped') | ||
}) | ||
this.state.on('error', (err) => this.emit('error', err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aiming to ease debuggability, I think we should log the error here as well.
src/index.js
Outdated
@@ -167,7 +167,7 @@ class Switch extends EE { | |||
|
|||
/** | |||
* If a muxed Connection exists for the given peer, it will be closed | |||
* and its reference on the Switch will be removed. | |||
* and its reference on the Switch and will be removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo? I think it was correct before 🤔
src/index.js
Outdated
|
||
conn.muxer.end((err) => { | ||
// If OK things are fine, and someone just shut down | ||
if (err && err.message !== 'Fatal error: OK') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log the error?
src/limit-dialer/queue.js
Outdated
this._dialWithTimeout(transport, addr, (err, conn) => { | ||
if (err) { | ||
log('work:error') | ||
log(`${transport.constructor.name}:work:error`, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use log.error
here
fix: better support custom handlers
7a5d1b0
to
56ea400
Compare
After reviewing the crypto logic in switch as it currently is, we're not actually supporting multiple crypto modules on dial, only on listen. This needs to be corrected, but after doing some small changes in this branch I think it should wait until this is done as it's a significant change. That will enable us to review those changes more easily to ensure they're done properly. Also, since secio is currently our only crypto library in libp2p this isn't a critical need. |
- [`switch.stop(callback)`](#swarmclosecallback) | ||
- [`switch.connection`](#connection) | ||
- [`switch.dial(peer, protocol, callback)`](#switchdialpeer-protocol-callback) | ||
- [`switch.dialFSM(peer, protocol, callback)`](#switchdialfsmpeer-protocol-callback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better just mimic js-libp2p and do dial and dialProtocol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@diasdavid are you suggesting to add dialProtocol
in addition to the new dialFSM
or as a naming replacement? I think having dial
handle calls with and without a protocol is reasonable overloading. Replacing it wouldn't be good as it has a different callback footprint to give users access to the ConnectionFSM for finer control/handling.
I've exposed the same dialFSM
method in the libp2p branch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to block this PR for this. What I mean is that given that libp2p (the module) is just an extension to the libp2p-switch (the dialing machine) that adds it other features, it would be cool if they both expose the same dialing/hangup APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Ideally I'd like to deprecate dialProtocol
in libp2p and just use dial
. Having both feel unnecessary. I'll propose the change in an issue there and if dialProtocol
ends up staying we can add that in here.
@vasco-santos this should be ready for final review. I have this branch integrated in libp2p/js-libp2p#257, which should also be ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jacobheun all in all looks good to me 🙂
Left just minor notes
README.md
Outdated
- [`switch.unhandle(protocol)`](#switchunhandleprotocol) | ||
- [`switch.start(callback)`](#switchstartcallback) | ||
- [`switch.stop(callback)`](#switchstopcallback) | ||
- [`switch.connection`](#switchconnection) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please alphabetize the API? I believe this helps when looking for documentation.
src/index.js
Outdated
* Issues a start on the Switch state. | ||
* | ||
* @param {function} callback deprecated: Listening for the `error` and `start` events are recommended | ||
* @fires Switch#started |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fires Switch#started
?
src/index.js
Outdated
* A listener that will turn off all running services and listeners | ||
* | ||
* @private | ||
* @fires Switch#error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
@vasco-santos I updated the order in the readme and fixed the improper usage of jsdocs |
Getting this pushed up for visibility while I work on cleanup and add some additional tests.
Goal
This PR updates libp2p-switch to be a state machine, including its connections #24.
Updates
start
,stop
, anderror
events. Callback legacy support is included.err-codes
moduleTODOs
class-is
module and remove instanceof logic, as it's unreliable across package versions.Thoughts on Future Additions
References
Required by libp2p/js-libp2p#257