-
Notifications
You must be signed in to change notification settings - Fork 364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1916 Followups Part 1 #1994
1916 Followups Part 1 #1994
Conversation
Codecov ReportBase: 90.91% // Head: 91.32% // Increases project coverage by
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more Additional details and impacted files@@ Coverage Diff @@
## main #1994 +/- ##
==========================================
+ Coverage 90.91% 91.32% +0.40%
==========================================
Files 99 99
Lines 52505 55081 +2576
Branches 52505 55081 +2576
==========================================
+ Hits 47735 50302 +2567
- Misses 4770 4779 +9
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
deef28a
to
b2517eb
Compare
lightning/src/ln/outbound_payment.rs
Outdated
pmt.increment_attempts(); | ||
let mut params = params.clone(); | ||
if params.max_path_count as usize > retries_remaining { | ||
params.max_path_count = retries_remaining as u8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to do this? I read the max_path_count
docs as the max paths per payment attempt, not total including retries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, in this case there is one path remaining that we're willing to take for this payment (attempt), so we should definitely limit the path finder to ensure we only get that many paths.
This ultimately stems from the (admittedly somewhat strange) decision to tie retries to the number of payment paths in retry attempts, rather than the number of failed paths in the being-retried HTLCs. It left the code a good chunk cleaner (as it lets us increment the attempts in one place, in retry_payment_with_route
, rather than keeping track of whether an attempt is for a retry or original path, and then incrementing in PendingOutboundPayment::remove
)
&self, payment_hash: PaymentHash, payment_secret: &Option<PaymentSecret>, payment_id: PaymentId, | ||
retry_strategy: Retry, route_params: RouteParameters, router: &R, | ||
first_hops: Vec<ChannelDetails>, inflight_htlcs: InFlightHtlcs, entropy_source: &ES, | ||
node_signer: &NS, best_block_height: u32, send_payment_along_path: F | ||
node_signer: &NS, best_block_height: u32, logger: &L, send_payment_along_path: F, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we do have to, or, when I tried to remove it, I got complaints about the deref being unsized. When you do &*<Deref<Target=Logger>>
it first has to deref into something Sized
, which fails, and then we take a ref again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was confused. I think you should be able to have the function still take a Deref
but not use a reference. We do that when passing Deref
parameters from ChainMonitor
to ChannelMonitor
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would mean we have to add a Clone
bound and do a clone
, which for an Arc
isn't free, sadly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... was thinking some thing like this but could be missing something:
monitor_state.monitor.transaction_unconfirmed(txid, &*self.broadcaster, &*self.fee_estimator, &*self.logger); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, okay, I honestly have no idea. I don't see why that works but this doesn't, but I also can't get it to work 🤷
final_cltv_expiry_delta: max_unsent_cltv_delta, | ||
final_cltv_expiry_delta: | ||
if let Some(delta) = payment_params.final_cltv_expiry_delta { delta } | ||
else { max_unsent_cltv_delta }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to be unreachable like the other place we set final_cltv_expiry_delta
? I think we might want the min_unsent_cltv_delta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because these are the paymentpatams that the user gave us, which could have None
, though they're supposed to set it. The debug assert on the retry pipeline we can do because we update the paymentparams' cltv-delta if the user didn't set one when we fail.
fn is_retryable_now(&self, pending_paths: usize) -> bool { | ||
match self { | ||
PendingOutboundPayment::Retryable { retry_strategy: None, .. } => { | ||
// We're handling retries manually, we can always retry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we were planning to drop support for manual retries. Is there a use case for them? It's a bit confusing that fail_htlc
checks payment.get().auto_retries_remaining_now() > 0
instead of is_retryable_now()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good question, I'm okay with removing it, I think, I was just trying to keep backwards compat here. Do you want me to remove it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline, it seems like we should remove support for manual retries pre-release. Since IIUC there's no reason to keep it in this PR, could we remove it, since it adds a bit of review burden?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the biggest hurdle is that it's pretty hard to remove in a sensible way while the InvoicePayer
is still in-tree. I could remove it in this PR as well, but I understood you wanted to do that later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I thought we could remove the bits from this PR without touching InvoicePayer
, let's leave it then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well to remove it we'd have to break InvoicePayer
's ability to retry, I think :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we didn't care about that in between releases since we're removing InvoicePayer
anyway, but maybe that'd be a bridge too far. In any case, seems like too small of a complexity diff to be worth too much debate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it'd break a bunch of tests and other unpleasantness, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it'd break ~all the tests.
cb45f1e
to
41f288a
Compare
Reverted the change of the |
Github said no more comments for me, but I think the last commit could be its own PR. Otherwise this pretty much LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High-level ACK. Deleted most of my comments as they were addressed in the recent fixups.
lightning/src/ln/outbound_payment.rs
Outdated
/// MPP path/HTLC, it will only count as a single retry. | ||
/// This is the number of retried sends done. Each send may be multiple HTLCs along multiple | ||
/// paths if the router decides to split up a send, and may retry multiple failed HTLCs if they | ||
/// failed around the same time and were retried with a single [`Route`]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand why "and were retried with a single [Route
]" is included here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The act of finding the route is the one single thing here - we're retrying multiple HTLCs via multiple HTLCs, the only common thing is we called get_route once. I rephrased it to mention find_route, but let me know if you think I should just drop it. I only like it because otherwise it seems like maybe we're just counting retries based on some arbitrary "did they happen near each other in time" criteria, instead of discrete actions.
41f288a
to
172241a
Compare
Meant to say, ACK after squash |
Yeah, feel free to squash |
`PaymentParams` is all about the parameters for a payment, i.e. the parameters which are static across all the paths of a paymet. `RouteParameters` is about the information specific to a given `Route` (i.e. a set of paths, among multiple potential sets of paths for a payment). The CLTV delta thus doesn't belong in `RouterParameters` but instead in `PaymentParameters`. Worse, because `RouteParameters` is built from the information in the last hops of a `Route`, when we deliberately inflate the CLTV delta in path-finding, retries of the payment will have the final CLTV delta double-inflated as it inflates starting from the final CLTV delta used in the last attempt. By moving the CLTV delta to `PaymentParameters` we avoid this issue, leaving only the sought amount in the `RouteParameters`.
172241a
to
face191
Compare
|
`PaymentParams` is all about the parameters for a payment, i.e. the parameters which are static across all the paths of a paymet. `RouteParameters` is about the information specific to a given `Route` (i.e. a set of paths, among multiple potential sets of paths for a payment). The CLTV delta thus doesn't belong in `RouterParameters` but instead in `PaymentParameters`. Worse, because `RouteParameters` is built from the information in the last hops of a `Route`, when we deliberately inflate the CLTV delta in path-finding, retries of the payment will have the final CLTV delta double-inflated as it inflates starting from the final CLTV delta used in the last attempt. When we calculate the `final_cltv_expiry_delta` to put in the `RouteParameters` returned via events after a payment failure, we should re-use the new one in the `PaymentParameters`, rather than the one that was in the route itself.
`TestRouter` allows us to simply select the `Route` that will be returned in the next `find_route` call, but it does so without any checking of what was *requested* for the call. This makes it a somewhat dubious test utility as it very helpfully ensures we ignore errors in the routes we're looking for. Instead, we require users of `TestRouter` pass a `RouteParameters` to `expect_find_route`, which we compare against the requested parameters passed to `find_route`.
The documentation for `Retry` is very clear that it counts the number of failed paths, not discrete retries. When we added retries internally in `ChannelManager`, we switched to counting the number if discrete retries, even if multiple paths failed and were replace with multiple MPP HTLCs. Because we are now rewriting retries, we take this opportunity to reduce the places where retries are analyzed, specifically a good chunk of code is removed from `pay_internal`. Because we now retry multiple failed paths with one single retry, we keep the new behavior, updating the docs on `Retry` to describe the new behavior.
When we landed the initial in-`ChannelManager` payment retries, we stored the `RouteParameters` in the payment info, and then re-use it as-is for new payments. `RouteParameters` is intended to store the information specific to the *route*, `PaymentParameters` exists to store information specific to a payment. Worse, because we don't recalculate the amount stored in the `RouteParameters` before fetching a new route with it, we end up attempting to retry the full payment amount, rather than only the failed part. This issue brought to you by having redundant data in datastructures, part 5,001.
This should marginally reduce compile times for the tests by reducing the total volume of code across the tests in the lightning crate.
face191
to
0765600
Compare
Oops, sorry, bad doc link, $ git diff-tree -U1 face1912 07656006
diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index baae29d1c..e9b7fb949 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -249,4 +249,4 @@ pub(crate) enum HTLCSource {
/// backwards) compatibility reasons, but prefer to use the data in the
- /// [`outbound_payment`] module, which stores per-payment data once instead of in each
- /// HTLC.
+ /// [`super::outbound_payment`] module, which stores per-payment data once instead of in
+ /// each HTLC.
payment_params: Option<PaymentParameters>, |
While digging into followups for #1916 I noticed the amount issue, and then also noticed the CLTV issue which is existing in our retry pipeline.