-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle monitor update failures in two more places + new fuzz test #290
Handle monitor update failures in two more places + new fuzz test #290
Conversation
691bf82
to
72dd991
Compare
@@ -2601,6 +2601,25 @@ impl ChannelMessageHandler for ChannelManager { | |||
true | |||
}) | |||
} | |||
pending_msg_events.retain(|msg| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm wandering, if client send stale message before channel_reestablish ones, isn't this mean that a code path is wrong somewhere ? Or it's not something we can be sure of because of the library architecture ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not strictly. In theory a user could get a disconnect from a peer, reconnect to that peer, and then process events, seeing the message send events for a given node_id (and not connection). I'm not sure if there aren't other related races, but at least this improves things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Yes sure better to be careful and it cost nothing there. Moreover if user doesn't take our peer_handler, it may break some of our assumptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More learnt than reviewed, monitor failures interaction with channel state machine are really interesting.
As you said, there is only left pre-ChannelFunded unimplemented! cases (and fee). Had a look to be sure, monitor update failure doesn't interfere with closing phase, at least after there is no HLTCs left on commiment_transaction. We can add a little test with a failure after shutdown have been exchanged on both side, but should be alright.
commitment_signed: commitment_msg, | ||
}, | ||
}); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build without unreachable branch (1.22.0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it will build without it, but having it in there means if there is a refactor in the future that breaks things we'll see panics instead of losing funds :p.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah astute! Maybe add a comment there with your answer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually gonna leave it. We use this pattern in a few places, and hopefully "unreachable!()" is rather self-documenting. In general, across the library, we'd prefer to panic than have a potentially money-losing bug, so this isn't particularly out of the ordinary.
//TODO: Do something with e? | ||
return | ||
}, | ||
} else { unreachable!(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build without unreachable branch (1.22.0)
4a2c20d
to
bfe9f1d
Compare
Best reviewed with -b
This shouldn't be required, but it may help prevent some downstream race conditions due to clients not sending message events quickly enough and trying to send stale messages before new channel_reestablish messages.
This is an oversight as the MessageSendEvent is otherwise entirely useless.
Sadly this requires reducing the honggfuzz iterations to fit within Travis' runtime limits.
bfe9f1d
to
49d6330
Compare
Based on #285, #286, and #288, this adds handling of monitor update failures in two more places, with some trivial tests to check sanity thereof. It also finally adds the fuzz test which found the issues in #286 and #288, though it really wants rust-bitcoin/rust-secp256k1#89 to fully test things. This leaves only three non-fee-handling unimplemented!()s in ChannelManager (and they're all during channel setup, so are much easier to handling).