-
Notifications
You must be signed in to change notification settings - Fork 364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow overshooting final htlc amount and expiry #2062
Allow overshooting final htlc amount and expiry #2062
Conversation
e82a6b1
to
e433c3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some notes for anyone reviewing :)
Codecov ReportPatch coverage:
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more Additional details and impacted files@@ Coverage Diff @@
## main #2062 +/- ##
==========================================
- Coverage 91.40% 91.36% -0.05%
==========================================
Files 102 101 -1
Lines 49563 49075 -488
Branches 49563 49075 -488
==========================================
- Hits 45303 44837 -466
+ Misses 4260 4238 -22
... and 31 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why specifically only allow overshooting by the smallest HTLC amount? The spec change doesn't say that there should be a limit on overpayment.
Seems like it would be a much simpler change to just accept if > total_msat
? Also prevents fingerprinting of LDK based nodes (if everyone else accepts overpayment of any amount and LDK doesn't, that identifies it).
I was mainly going off the part in the rationale where it says "However, no individual HTLC may be for less than the difference between the total paid and
Yea that's a good point. In hindsight it might've been better to start with the simpler change and ask if the extra restriction should be added. I haven't checked what the other implementations do so I can look into that soon. If other people agree I can remove the extra restriction :) |
Just thinking more about this, why would nodes want to overshoot by an entire HTLC? If nodes overshoot by that much, I figure they should just make another payment? Since relaxing this requirement is mainly to serve nodes retrying payments using a route with a constraining |
I agree that this is pretty unlikely, and they should just send another payment if they want to over pay. I just don't think it's valuable to enforce that, since they do no harm by giving us more money and the simpler change is (iiuc) still spec-compliant :) |
Yea, I think you're right. Also I realized my added restriction would actually end up failing some payments that should go through (in cases where a node retries a partial payment and to route it needs to overshoot by the value of a previous HTLC, now the whole payment fails - probably very uncommon, but nevertheless). I'll try to make this change sometime today or tomorrow, thanks for the good points @freddiekrugerrand! Also I explored the other implementations and it looks like CLN and LND don't support overshooting yet, while eclair allows overshooting by any amount. |
e433c3f
to
f741d7c
Compare
Made the aforementioned changes, removed some tests that no longer seem necessary, and updated the PR description to reflect these changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Sorry for the review delay here.
Does it make sense to also work on lightning/bolts#1040 here or would you rather do that separately/have someone else do that?
While I agree in theory, I don't really think we can practically consider this part of our trust model. There's so many things in lightning that allow identification (eg just the feature flags themselves) that I think we're kinda hosed :/. |
Sure! I noticed #1872 includes lightning/bolts#1032 alongside these and it overlaps with lightning/bolts#1040 so it probably won't be hard to include that as well.
|
f741d7c
to
5b314cd
Compare
Added in changes for failing HTLCs for already claimable payments, as well as for lightning/bolts#1032 and lightning/bolts#1040. Might've missed something, but several of the changes from the spec PRs seemed to already be present in the code. In case it helps save anyone some time looking, here's where I found some of the checks: Checking For 1040 the HTLC values that are being used to calculate the MPP set total were already coming from the |
5b314cd
to
02d1867
Compare
total_msat
for an MPPThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this, sorry about the review delay.
lightning/src/ln/channelmanager.rs
Outdated
@@ -3935,7 +3939,7 @@ where | |||
log_info!(self.logger, "Attempted to claim an incomplete payment which no longer had any available HTLCs!"); | |||
return; | |||
} | |||
if claimable_amt_msat != expected_amt_msat.unwrap() { | |||
if claimable_amt_msat < expected_amt_msat.unwrap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, this is confusing, so we've told the user they have (real MPP part total) in PaymentClaimable
, but now we're sanity-checking on the amount the sender told us they intended to send (which should have matched our invoice).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, it does look redundant. I tried taking it out to see and it seems it's relevant in the functional test test_double_partial_claim
, in the case where HTLCs timeout before the user claims their funds, then the sender resends only part of the payment, and then the user tries to process the original PaymentClaimable
event and claim the payment. I guess were you talking about if the check was necessary or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think we should keep this check, my point was that the check should switch to comparing against the amount from the PaymentClaimable
- the total received - not the amount from the onions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh gotcha thanks for the clarification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think this was fixed? expected_amt_msat
is still set to whatever the onions had in them (via htlc.total_msat
, who's definition changed in this PR to be the onion data, not the htlc total - can we rename it or at least add a comment?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC I don't think total_msat was changed in this PR, just the validation against it, so I believe htlc.total_msat/expected_amt_msat was always the value in the onion?
total_msat
is still calculated as htlc.total_msat
for MPP and outgoing_amt_msat
for non-MPP, the problem is outgoing_amt_msat
has changed here - its now the onion data rather the incoming amount. But, more importantly, the event has changed - the event now tells the user about the actual amount, not the onion amount, and we're still checking against...not the actual amount.
Now that I get what you're saying however, I agree this check would make more sense if expected_amt_msat matched the amount reported in PaymentClaimable. Happy to add this here, or in a followup.
Yep, that was my point! Let's just do it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
total_msat
is still calculated ashtlc.total_msat
for MPP andoutgoing_amt_msat
for non-MPP, the problem isoutgoing_amt_msat
has changed here - its now the onion data rather the incoming amount. But, more importantly, the event has changed - the event now tells the user about the actual amount,
Okay, noww I see. Thank you so much for explaining!
Let's just do it here?
On it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this again, should I actually just change the amount reported in PaymentClaimable
back to be the amount from the onion (i.e. htlc.total_msat
for MPP and outgoing_amt_msat
for non-MPP)? I'm thinking that will accomplish we want here: when claiming funds, expected_amt_msat
will represent what was reported to the user in PaymentClaimable
. The downside with this is that the event reports the amount from the invoice, while the actual amount may overshoot that.
I'm asking this because as I was trying to change expected_amt_msat
to match PaymentClaimable
now that PaymentClaimable
reports the actual amount received, I'm realizing it introduces this case: a node overpays the original amount, PaymentClaimable
is generated with the actual received amount, one/some of the HTLCs timeout before the receiver handles the event, now the sender resends HTLCs however now these HTLCs pay enough for the invoice but below expected_amt_msat
, i.e. the amount previously reported in PaymentClaimable
, so it refuses to claim. I'm thinking it makes more sense to be able to claim as long as we've received at least the amount originally requested in the invoice, and PaymentClaimable
should just report that and sometimes the user will end up claiming a little extra?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this again, should I actually just change the amount reported in PaymentClaimable back to be the amount from the onion (i.e. htlc.total_msat for MPP and outgoing_amt_msat for non-MPP)? I'm thinking that will accomplish we want here: when claiming funds, expected_amt_msat will represent what was reported to the user in PaymentClaimable. The downside with this is that the event reports the amount from the invoice, while the actual amount may overshoot that.
Right, I don't think so - I think we should report accurate numbers to the user and then we should act on those numbers. We could also report the sender-reported amount to the user if we want, but I think that's confusing (its neither the amount they received nor the amount they specified in the invoice...but something else).
a node overpays the original amount, PaymentClaimable is generated with the actual received amount, one/some of the HTLCs timeout before the receiver handles the event, now the sender resends HTLCs however now these HTLCs pay enough for the invoice but below expected_amt_msat
That's a good point, but I don't think we need to worry too much about this. If it causes payment failure, so be it - its generally not something that should ever happen HTLC timeouts after receiving them should be relatively rare, only when they were super tight to begin with and we get a block quickly thereafter. We could also consider making it arbitrarily rare by adding an extra block or two of headroom when accepting the HTLC vs when we go to time it out (I think I'll probably do that in a followup as a part of #2017 anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also report the sender-reported amount to the user if we want, but I think that's confusing (its neither the amount they received nor the amount they specified in the invoice...but something else).
Ah, true.
That's a good point, but I don't think we need to worry too much about this. If it causes payment failure, so be it - its generally not something that should ever happen HTLC timeouts after receiving them should be relatively rare, only when they were super tight to begin with and we get a block quickly thereafter. We could also consider making it arbitrarily rare by adding an extra block or two of headroom when accepting the HTLC vs when we go to time it out (I think I'll probably do that in a followup as a part of #2017 anyway).
Okay gotcha, sounds good :)
I appreciate the feedback! Will try to update this soon |
02d1867
to
ea8c945
Compare
Added a test for overshooting final cltv expiry, and went back to having an To make |
827a49e
to
84685b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Just nits
84685b9
to
14c0d88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great to me! Can you squash down the fixup commits and I think we can land this.
14c0d88
to
f2bbcb3
Compare
Awesome, thanks everyone for the review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff!!
@@ -7925,6 +7925,101 @@ fn test_can_not_accept_unknown_inbound_channel() { | |||
} | |||
} | |||
|
|||
#[test] | |||
fn test_onion_value_mpp_set_calculation() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test fails as expected on main :)
Just one comment - #2062 (comment) |
f2bbcb3
to
2e684ee
Compare
Just added a commit before everything to be able to validate on the amount reported in |
2e684ee
to
17565c3
Compare
lightning/src/ln/channelmanager.rs
Outdated
onion_payload: OnionPayload, | ||
timer_ticks: u8, | ||
/// The sum total of all MPP parts | ||
// The total value received for a payment (sum of all MPP parts if the payment is a MPP). | ||
// Gets set to the amount reported when pushing [`Event::PaymentClaimable`]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: still make it a doc comment (/// instead of //) because cargo doc --document-private-items
will still show docs for it then.
log_trace!(self.logger, "Failing HTLC with payment_hash {} as payment is already claimable", | ||
log_bytes!(payment_hash.0)); | ||
fail_htlc!(claimable_htlc, payment_hash); | ||
} else if total_value >= $payment_data.total_msat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, we also need to change the timeout check, that code has to match this condition exactly, look for this elsewhere:
if htlcs[0].total_msat == htlcs.iter().fold(0, |total, htlc| total + htlc.value) {
return true;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you fix this, probably best to leave a comment both there and here noting that they need to match exactly, so that the next person who touches either line knows to update the other one :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Think we're just missing the comment here that references back to the timeout (which you've commented already) :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it above the first if statement because I thought it'd be weird with the indenting to include it in inside the branch above - in hindsight I think that actually would've been better, but since it's there at least I think I'll leave it as is :)
17565c3
to
82f1ae7
Compare
Feel free to squash the fixup commits, we're good this time, I'm pretty sure :) |
82f1ae7
to
e82e058
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have limited understanding of channelmanager, but lgtm + nicely tested/structured change
This is pre-work for allowing nodes to overshoot onion values and changing validation for MPP completion. This adds a field to `ClaimableHTLC` that is separate from the onion values, which represents the actual received amount reported in `PaymentClaimable` which is what we want to validate against when a user goes to claim.
While retrying a failed path of an MPP, a node may want to overshoot the `total_msat` in order to use a path with an `htlc_minimum_msat` greater than the remaining value being sent. This commit no longer fails MPPs that overshoot the `total_msat`, however it does fail HTLCs with the same payment hash that are received *after* a payment has become claimable.
Final nodes previously had stricter requirements on HTLC contents matching onion value compared to intermediate nodes. This allowed for probing, i.e. the last intermediate node could overshoot the value by a small amount and conclude from the acceptance or rejection of the HTLC whether the next node was the destination. This also applies to the msat amount, however this change was already present.
If routing nodes take less fees and pay the final node more than `amt_to_forward`, the receiver may see that `total_msat` has been met before all of the sender's intended HTLCs have arrived. The receiver may then prematurely claim the payment and release the payment hash, allowing routing nodes to claim the remaining HTLCs. Using the onion value `amt_to_forward` to determine when `total_msat` has been met allows the sender to control the set total.
e82e058
to
1d31b0e
Compare
Rebased because of conflicts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Feel free to just address that nit in a follow-up if you want. Don't want to hold this back longer just for that :)
log_trace!(self.logger, "Failing HTLC with payment_hash {} as payment is already claimable", | ||
log_bytes!(payment_hash.0)); | ||
fail_htlc!(claimable_htlc, payment_hash); | ||
} else if total_value >= $payment_data.total_msat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Think we're just missing the comment here that references back to the timeout (which you've commented already) :)
Merged since I want to pull these changes into some of my PRs, but feel free to address @dunxen's comment in a follow-up (or not, since it's a nit :)) |
0.0.115 - Apr 24, 2023 - "Rebroadcast the Bugfixes" API Updates =========== * The MSRV of the main LDK crates has been increased to 1.48 (lightningdevkit#2107). * Attempting to claim an un-expired payment on a channel which has closed no longer fails. The expiry time of payments is exposed via `PaymentClaimable::claim_deadline` (lightningdevkit#2148). * `payment_metadata` is now supported in `Invoice` deserialization, sending, and receiving (via a new `RecipientOnionFields` struct) (lightningdevkit#2139, lightningdevkit#2127). * `Event::PaymentFailed` now exposes a failure reason (lightningdevkit#2142). * BOLT12 messages now support stateless generation and validation (lightningdevkit#1989). * The `NetworkGraph` is now pruned of stale data after RGS processing (lightningdevkit#2161). * Max inbound HTLCs in-flight can be changed in the handshake config (lightningdevkit#2138). * `lightning-transaction-sync` feature `esplora-async-https` was added (lightningdevkit#2085). * A `ChannelPending` event is now emitted after the initial handshake (lightningdevkit#2098). * `PaymentForwarded::outbound_amount_forwarded_msat` was added (lightningdevkit#2136). * `ChannelManager::list_channels_by_counterparty` was added (lightningdevkit#2079). * `ChannelDetails::feerate_sat_per_1000_weight` was added (lightningdevkit#2094). * `Invoice::fallback_addresses` was added to fetch `bitcoin` types (lightningdevkit#2023). * The offer/refund description is now exposed in `Invoice{,Request}` (lightningdevkit#2206). Backwards Compatibility ======================= * Payments sent with the legacy `*_with_route` methods on LDK 0.0.115+ will no longer be retryable via the LDK 0.0.114- `retry_payment` method (lightningdevkit#2139). * `Event::PaymentPathFailed::retry` was removed and will always be `None` for payments initiated on 0.0.115 which fail on an earlier version (lightningdevkit#2063). * `Route`s and `PaymentParameters` with blinded path information will not be readable on prior versions of LDK. Such objects are not currently constructed by LDK, but may be when processing BOLT12 data in a coming release (lightningdevkit#2146). * Providing `ChannelMonitorUpdate`s generated by LDK 0.0.115 to a `ChannelMonitor` on 0.0.114 or before may panic (lightningdevkit#2059). Note that this is in general unsupported, and included here only for completeness. Bug Fixes ========= * Fixed a case where `process_events_async` may `poll` a `Future` which has already completed (lightningdevkit#2081). * Fixed deserialization of `u16` arrays. This bug may have previously corrupted the historical buckets in a `ProbabilisticScorer`. Users relying on the historical buckets may wish to wipe their scorer on upgrade to remove corrupt data rather than waiting on it to decay (lightningdevkit#2191). * The `process_events_async` task is now `Send` and can thus be polled on a multi-threaded runtime (lightningdevkit#2199). * Fixed a missing macro export causing `impl_writeable_tlv_based_enum{,_upgradable}` calls to not compile (lightningdevkit#2091). * Fixed compilation of `lightning-invoice` with both `no-std` and serde (lightningdevkit#2187) * Fix an issue where the `background-processor` would not wake when a `ChannelMonitorUpdate` completed asynchronously, causing delays (lightningdevkit#2090). * Fix an issue where `process_events_async` would exit immediately (lightningdevkit#2145). * `Router` calls from the `ChannelManager` now call `find_route_with_id` rather than `find_route`, as was intended and described in the API (lightningdevkit#2092). * Ensure `process_events_async` always exits if any sleep future returns true, not just if all sleep futures repeatedly return true (lightningdevkit#2145). * `channel_update` messages no longer set the disable bit unless the peer has been disconnected for some time. This should resolve cases where channels are disabled for extended periods of time (lightningdevkit#2198). * We no longer remove CLN nodes from the network graph for violating the BOLT spec in some cases after failing to pay through them (lightningdevkit#2220). * Fixed a debug assertion which may panic under heavy load (lightningdevkit#2172). * `CounterpartyForceClosed::peer_msg` is now wrapped in UntrustedString (lightningdevkit#2114) * Fixed a potential deadlock in `funding_transaction_generated` (lightningdevkit#2158). Security ======== * Transaction re-broadcasting is now substantially more aggressive, including a new regular rebroadcast feature called on a timer from the `background-processor` or from `ChainMonitor::rebroadcast_pending_claims`. This should substantially increase transaction confirmation reliability without relying on downstream `TransactionBroadcaster` implementations for rebroadcasting (lightningdevkit#2203, lightningdevkit#2205, lightningdevkit#2208). * Implemented the changes from BOLT PRs lightningdevkit#1031, lightningdevkit#1032, and lightningdevkit#1040 which resolve a privacy vulnerability which allows an intermediate node on the path to discover the final destination for a payment (lightningdevkit#2062). In total, this release features 110 files changed, 11928 insertions, 6368 deletions in 215 commits from 21 authors, in alphabetical order: * Advait * Alan Cohen * Alec Chen * Allan Douglas R. de Oliveira * Arik Sosman * Elias Rohrer * Evan Feenstra * Jeffrey Czyz * John Cantrell * Lucas Soriano del Pino * Marc Tyndel * Matt Corallo * Paul Miller * Steven * Steven Williamson * Steven Zhao * Tony Giorgio * Valentine Wallace * Wilmer Paulino * benthecarman * munjesi
Closes #1872, #2012 (implements lightning/bolts#1031, lightning/bolts#1032, lightning/bolts#1040)
The rationale/changes for the spec PRs, respectively:
While retrying a failed path of an MPP, a node may want to overshoot the
total_msat
in order to use a path with anhtlc_minimum_msat
greater than the remaining value being sent. This PR no longer fails MPPs that overshoot thetotal_msat
, however it does fail HTLCs with the same payment hash that are received after a payment has become claimable.Final nodes previously had stricter requirements on HTLC contents matching onion values compared to intermediate nodes. This allowed for probing, i.e. the last intermediate node could overshoot the value by a small amount and conclude from the acceptance or rejection of the HTLC whether the next node was the destination. This PR now allows overshooting the
outgoing_cltv_value
. This also applies to theamount_msat
andamt_to_forward
, however this change was already present.If routing nodes take less fees and pay the final node more than
amt_to_forward
, the receiver may see thattotal_msat
has been met before all of the sender's intended HTLCs have arrived. The receiver may then prematurely claim the payment and release the payment hash, allowing routing nodes to claim the remaining HTLCs. This PR now uses the onion valueamt_to_forward
to determine whentotal_msat
has been met which the sender to control the set total.