Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

htlcswitch+lnwallet+channeldb: invalid sig fix #3872

Merged
merged 7 commits into from
Jan 23, 2020

Conversation

joostjager
Copy link
Contributor

@joostjager joostjager commented Dec 23, 2019

This PR persistently stores updates received from the remote party until we've produced a signature that includes these updates. Previously those updates were only stored in memory and dropped across restarts. This lead to the production of an invalid signature and channel force closure.

Fixes #3822
Fixes #3126

@joostjager joostjager force-pushed the invalid-sig-fix branch 6 times, most recently from a422d10 to 88163f4 Compare December 23, 2019 13:03
@joostjager joostjager force-pushed the invalid-sig-fix branch 14 times, most recently from fb66fd1 to 012a488 Compare January 6, 2020 13:55
When restoring an htlc fulfill update from disk, the payment hash wasn't
restored previously.
@joostjager joostjager force-pushed the invalid-sig-fix branch 2 times, most recently from c5d8cbb to 0e8a372 Compare January 6, 2020 14:15
@joostjager joostjager self-assigned this Jan 6, 2020
@joostjager joostjager added channels commitments Commitment transactions containing the state of the channel labels Jan 6, 2020
@joostjager joostjager added the database Related to the database/storage of LND label Jan 6, 2020
@joostjager joostjager marked this pull request as ready for review January 6, 2020 14:34
@Roasbeef Roasbeef added this to the 0.9.1 milestone Jan 7, 2020
Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR! I definitely think this is a nice way of solving this :)

My main concern is that we now restore the update logs from several sources: the local/remote commit, the pending remote commit and the dangling remote updates. Ideally we would reduce this, but that seems hard while still being backwards compatible, so this seems like the way to do it. We should just be careful that updates won't be restored from more that one of the sources, as that could lead to some subtle bugs.

lnwallet/channel.go Show resolved Hide resolved
lnwallet/channel_test.go Outdated Show resolved Hide resolved
channeldb/channel.go Outdated Show resolved Hide resolved
channeldb/channel.go Outdated Show resolved Hide resolved
localCommitmentHeight uint64) error {

if len(danglingRemoteUpdates) == 0 {
lc.log.Debugf("No dangling remote updates to restore")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug felt appropriate to me, given that this functionality is newly introduced. The log volume is still acceptable for users out there to collect. Do you think this is too spammy? We are already quite verbose on debug and this seemed to fit into that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could conditionally call this method instead (lift the if statement out to the caller).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed special casing completely. The for loop below is skipped and the sanity checks may be useful also in case there are no updates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change uncovered that the sanity checks actually had false positives. A fee update may have already disappeared from the log, causing the last present update in the log not to equal the last committed update. Relaxed the check and brought it into the switch statement below.

lnwallet/channel.go Show resolved Hide resolved
lnwallet/channel.go Outdated Show resolved Hide resolved
lnwallet/channel.go Outdated Show resolved Hide resolved

switch payDesc.EntryType {
case Add:
lc.remoteUpdateLog.restoreHtlc(payDesc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some adds will be restored twice now, both when checking incoming HTLCs on our local commit, and here. But maybe that's okay?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... maybe it isn't necessary then to save and restore add msgs?

Copy link
Contributor Author

@joostjager joostjager Jan 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about this a bit more. The operation is idempotent, so it doesn't seem to hurt. But maybe we do need the save/restore for add msgs in case an htlc is added and later settled on the local commit. If we aren't able to sign during that time, we still need to process those messages later.

Although this is probably not even possible. We can only settle if we've also added the add to our remote commit tx. But it seems safer to do it like it is now.

channeldb/channel.go Outdated Show resolved Hide resolved
@joostjager
Copy link
Contributor Author

My main concern is that we now restore the update logs from several sources: the local/remote commit, the pending remote commit and the dangling remote updates.

Yes, fully agree to this. Maybe it would have been better to persist the complete local and remote update logs, to not have to think so hard about reconstruction.

@Roasbeef
Copy link
Member

Maybe it would have been better to persist the complete local and remote update logs, to not have to think so hard about reconstruction.

Any updates that aren't committed are to be forgotten upon reconnection. If we persisted all updates, we'd have issues w.r.t preemptively attempting to evaluate a commitment view using abandoned updates.

@Roasbeef
Copy link
Member

party until we've produced a signature that includes these updates

Haven't dug into the PR yet, but this seems incorrect. Instead, we should only store the updates when we receive a signature from the remote party.

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited to finally, finally (for real this time) squash this annoying bug. Completed an initial pass through the PR and may have found a bug that would cause us to store more than necessary (updates of theirs sent after their signature ).

lnwallet/channel.go Show resolved Hide resolved
lnwallet/channel_test.go Show resolved Hide resolved
channeldb/channel.go Outdated Show resolved Hide resolved
lnwallet/channel.go Show resolved Hide resolved
lnwallet/channel.go Outdated Show resolved Hide resolved
localCommitmentHeight uint64) error {

if len(danglingRemoteUpdates) == 0 {
lc.log.Debugf("No dangling remote updates to restore")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could conditionally call this method instead (lift the if statement out to the caller).

lnwallet/channel.go Outdated Show resolved Hide resolved
lnwallet/channel.go Show resolved Hide resolved
lnwallet/channel.go Show resolved Hide resolved
lnwallet/channel.go Outdated Show resolved Hide resolved
Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one is getting close!

I really like the way the commits are structured, with a simple buildup of commits before the last, important commit, where the real meat of the change is found.

lnwallet/channel_test.go Outdated Show resolved Hide resolved
channeldb/channel.go Show resolved Hide resolved
channeldb/channel.go Outdated Show resolved Hide resolved
lnwallet/channel.go Show resolved Hide resolved
lnwallet/channel.go Outdated Show resolved Hide resolved
lnwallet/channel.go Show resolved Hide resolved
lnwallet/channel.go Outdated Show resolved Hide resolved

// Fetch the last remote updates that have been committed on both sides.
lastRemoteCommitted := lc.remoteCommitChain.tip().theirMessageIndex
lastLocalCommitted := lc.localCommitChain.tip().theirMessageIndex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically this should be tail since that is the oldest unrevoked commitment == the last index we acked. In this case I don't think it matters since we will always revoke after receiving commitment, but I think it'll be more correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

This test asserts that remote updates that are locked-in on the local
commitment, but haven't been signed for on the remote commitment, are
properly restored after a restart.
This method is only used to update the local commitment transaction.
Updated comment accordingly.
Extract functionality to methods as a preparation for serializing remote
log updates.
Extract method in preparation for restoring dangling remote updates. We
need to get rid of the early return.
This commit updates the channel state machine to
persistently store remote updates that we have received a
signature for, but that we haven't yet included in a commit
signature of our own.

Previously those updates were only stored in memory and
dropped across restarts. This lead to the production of
an invalid signature and channel force closure. The remote
party expects us to include those updates.
Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💯


lc.localUpdateLog.markHtlcModified(payDesc.ParentIndex)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: also add sanity check that the last update index matches the log index after all pending updates are restored? (doesn't have to be in this method, just after both state logs are fully restored)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so I discovered that this does not always need to be the case. #3872 (comment)

@@ -3092,6 +3277,96 @@ func (lc *LightningChannel) createCommitDiff(
}, nil
}

// getUnsignedAckedUpdates returns all remote log updates that we haven't
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "that we have acked, but haven't..."

@Roasbeef
Copy link
Member

Running this patch on testnet now.

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💴

chanID := lnwire.NewChanIDFromOutPoint(&lc.channelState.FundingOutpoint)

// Fetch the last remote update that we have signed for.
lastRemoteCommitted := lc.remoteCommitChain.tip().theirMessageIndex
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 w.r.t the slight reshuffling of logic here, I find it much easier to follow now.

@joostjager joostjager merged commit 1413995 into lightningnetwork:master Jan 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
channels commitments Commitment transactions containing the state of the channel database Related to the database/storage of LND
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LND sends an invalid sig on reconnection Concurrency issue with update_fee (round 2)
3 participants