Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abandon payments on behalf of the user and remove manual retries #2008

Conversation

valentinewallace
Copy link
Contributor

@valentinewallace valentinewallace commented Feb 3, 2023

Since ChannelManager now parameterizes payment sends with a retry strategy, we're removing support for manual retries.

As a side effect of this, it makes sense to always call abandon_payment on behalf of the user.

Partially addresses #1932

Based on #2007

@jkczyz
Copy link
Contributor

jkczyz commented Feb 3, 2023

Concept ACK

@TheBlueMatt
Copy link
Collaborator

Can be rebased now. Some further doc updates will need to happen based on the changes from #1873.

@TheBlueMatt TheBlueMatt added this to the 0.0.114 milestone Feb 3, 2023
@valentinewallace valentinewallace force-pushed the 2023-02-remove-manual-retries branch 2 times, most recently from aa81d37 to 07c3ed9 Compare February 3, 2023 19:57
@valentinewallace valentinewallace force-pushed the 2023-02-remove-manual-retries branch from 6a1f42f to 2d4c713 Compare February 6, 2023 02:07
@valentinewallace valentinewallace marked this pull request as ready for review February 6, 2023 02:12
@codecov-commenter
Copy link

codecov-commenter commented Feb 6, 2023

Codecov Report

Base: 91.07% // Head: 91.07% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (485809d) compared to base (ce2ed44).
Patch coverage: 91.08% of modified lines in pull request are covered.

❗ Current head 485809d differs from pull request most recent head d6f805c. Consider uploading reports for the commit d6f805c to get more accurate results

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2008   +/-   ##
=======================================
  Coverage   91.07%   91.07%           
=======================================
  Files          99       99           
  Lines       51711    51679   -32     
  Branches    51711    51679   -32     
=======================================
- Hits        47097    47068   -29     
+ Misses       4614     4611    -3     
Impacted Files Coverage Δ
lightning/src/routing/router.rs 94.91% <ø> (ø)
lightning/src/util/events.rs 31.63% <ø> (+2.07%) ⬆️
lightning/src/ln/chanmon_update_fail_tests.rs 97.54% <80.00%> (-0.05%) ⬇️
lightning/src/ln/functional_tests.rs 96.93% <82.60%> (-0.06%) ⬇️
lightning/src/ln/payment_tests.rs 97.00% <92.50%> (-0.25%) ⬇️
lightning/src/ln/functional_test_utils.rs 91.38% <93.75%> (-0.09%) ⬇️
lightning/src/ln/outbound_payment.rs 90.58% <98.46%> (+1.20%) ⬆️
lightning/src/ln/channelmanager.rs 87.40% <100.00%> (-0.04%) ⬇️
lightning/src/ln/monitor_tests.rs 99.56% <100.00%> (-0.01%) ⬇️
lightning/src/ln/onion_route_tests.rs 97.40% <100.00%> (-0.01%) ⬇️
... and 6 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@valentinewallace
Copy link
Contributor Author

valentinewallace commented Feb 6, 2023

I can't repro the CI fuzz failure locally. I'm running:

export TARGET="chanmon_consistency" # adjust for your output
export HEX="32fffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd0\
3641404140757453743563747970654963456253743563747970"

mkdir -p ./test_cases/$TARGET
echo $HEX | xxd -r -p > ./test_cases/$TARGET/any_filename_works

export RUST_BACKTRACE=1
export RUSTFLAGS="--cfg=fuzzing"
cargo test

from the fuzz directory. Lmk if someone sees something wrong with this script 👀

Edit: I added +1.58 to the cargo test line and was able to repro

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didnd't review the test changes yet.

if let Err(e) = self.pay_internal(payment_id, None, route_params, router, first_hops(), inflight_htlcs(), entropy_source, node_signer, best_block_height, logger, &send_payment_along_path) {
log_info!(logger, "Errored retrying payment: {:?}", e);
self.abandon_payment(payment_id, pending_events);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd appreciate a nice, long comment here and doc comment on pay_internal talking about what cases pay_internal is guaranteed to return an Err on and why we okay using that as an indicator here - the current Err-or-not logic is a bit haphazard so would like it in writing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a look at the new docs pls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to change a lot in #2014 actually, so I'm going to save this feedback item for that PR

lightning/src/ln/outbound_payment.rs Show resolved Hide resolved
lightning/src/ln/outbound_payment.rs Show resolved Hide resolved
lightning/src/ln/outbound_payment.rs Show resolved Hide resolved
@valentinewallace valentinewallace force-pushed the 2023-02-remove-manual-retries branch from 485809d to d6f805c Compare February 9, 2023 01:23
Copy link
Contributor

@alecchendev alecchendev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly took a look at channelmanager.rs and outbound_payment.rs to learn and ask questions. Looks good as far as I can tell!

I saw in related PRs that moving payment retrying into ChannelManager is to enable support for trampoline payments. I was wondering how it does that. Are all these PRs to support automated retrying of payments for when a node built with LDK is acting as a trampoline node?

Comment on lines 498 to 501
outbounds.retain(|pmt_id, pmt| {
let mut retain = true;
let auto_retryable = pmt.is_auto_retryable_now();
if let PendingOutboundPayment::Retryable { pending_amt_msat, total_msat, payment_params: Some(params), .. } = pmt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused what's going on in this loop (outbounds.retain(...)). For each outbound payment, it looks to be checking if it's valid to retry, and either marking it abandoned and pushing a PaymentFailed event or setting retry_id_route_params to retry the payment. I'm mainly confused because it looks like if there's multiple retryable payments, this will just set retry_id_route_params multiple times and afterwards only retry whatever single payment it was last set to. I'd imagine it'd make more sense want to retry each retryable payment that it finds from this iteration (instead of looping through all payments to just retry a single one) right? Not sure if I'm understanding this correctly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was something I meant to revisit but forgot, thanks. I'm still not 100% sure of the new solution since it adds an extra lock of pending_outbound_payments, but it seems overall better to me for the reasons you specified.

We can't retry each payment as we go because we need to drop the pending_outbound_payments lock each time we call pay_internal. It is a bit weird though. See this discussion for more info: #1916 (comment)

Copy link
Contributor

@alecchendev alecchendev Feb 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't retry each payment as we go because we need to drop the pending_outbound_payments lock each time we call pay_internal. It is a bit weird though. See this discussion for more info: #1916 (comment)

Ah gotcha.

I think the new solution is an improvement at least in readability with the iteration for retrying/lock dropping and abandoning of payments being separate, but yea, this seems tough to do in a clean way :)

Comment on lines 1345 to 1348
{
let mut pending_outbounds = outbound_payments.pending_outbound_payments.lock().unwrap();
if let Some(PendingOutboundPayment::Retryable { ref mut total_msat, .. }) = pending_outbounds.get_mut(&our_payment_id) {
*total_msat += 1000;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confused what this block is for. I'm thinking this whole test to make sure that check_retry_payments abandons + fails the payment after exhausting retry attempts...is this part to simulate that there are more sats that still need to be sent than are pending..?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this part to simulate that there are more sats that still need to be sent than are pending..?

Yup, exactly. Added a comment to clarify

Comment on lines 1143 to 1142
pub(super) fn abandon_payment(&self, payment_id: PaymentId,
pending_events: &Mutex<Vec<events::Event>>
) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious what the reason is for changing abandon_payment, check_retry_payments, etc. to take in and modify pending_events as opposed to how it was before. Is it mainly just because it’s a bit cleaner or something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit cleaner because we want to keep as much code out of channelmanager.rs as possible, since it's already huge :)

@valentinewallace valentinewallace force-pushed the 2023-02-remove-manual-retries branch 2 times, most recently from ae2f238 to 2fab2b8 Compare February 9, 2023 18:48
@valentinewallace
Copy link
Contributor Author

Are all these PRs to support automated retrying of payments for when a node built with LDK is acting as a trampoline node?

Yes, that's exactly right!

Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. Github wan't very functional on airline wifi. 😞

lightning/src/ln/outbound_payment.rs Outdated Show resolved Hide resolved
lightning/src/ln/outbound_payment.rs Outdated Show resolved Hide resolved
lightning/src/ln/outbound_payment.rs Outdated Show resolved Hide resolved
lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved
lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved
lightning/src/ln/outbound_payment.rs Outdated Show resolved Hide resolved
Comment on lines +679 to 682
/// Parameters used by LDK to compute a new [`Route`] when retrying the failed payment path.
///
/// [`Route`]: crate::routing::router::Route
/// [`find_route`]: crate::routing::router::find_route
retry: Option<RouteParameters>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need these anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, though from offline discussions this doesn't need to block the release. Documented in follow-up issue

lightning/src/util/events.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely LGTM modulo a question about the test changes.

lightning/src/ln/outbound_payment.rs Outdated Show resolved Hide resolved
Comment on lines +2239 to +2249
if i == expected_paths.len() - 1 {
match events[1] {
Event::PaymentFailed { ref payment_hash, ref payment_id } => {
assert_eq!(*payment_hash, our_payment_hash, "unexpected second payment_hash");
assert_eq!(*payment_id, expected_payment_id);
}
_ => panic!("Unexpected second event"),
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I completely understand the changes to the testing behavior. Not necessarily here -- which may be justified -- but elsewhere where the number of events has increased because of the additional Event::PaymentFailed generated. Wouldn't it be less invasive if the Event::PaymentFailed case was only checked in a dedicated test? Then when testing other behavior, we could have retries remaining such that no new event are generated so those tests are more narrowly focused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A portion of the changed tests won't have retryable HTLCs, so the PaymentFailed will be generated no matter what. Most of the others that I'm seeing tend to use the test utils such as route_payment, but when I update that util to use _with_retry I get 55 test failures.. I think it might be tricky since having retryable failed HTLCs will often lead to the generation of retry msg events to be handled. That said, it would probably be good in the long term to move the test utils to use _with_retry

lightning/src/ln/outbound_payment.rs Outdated Show resolved Hide resolved
@@ -1048,6 +1061,10 @@ impl OutboundPayments {
});
}

if !payment_is_probe && (!is_retryable_now || !payment_retryable || retry.is_none()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, its not a new issue, but where do we remove pending payments for probes? AFAICT we add one normally in send_probe but then don't remove it anywhere here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, filed #2036

lightning/src/ln/payment_tests.rs Show resolved Hide resolved
lightning/src/ln/payment_tests.rs Outdated Show resolved Hide resolved
lightning/src/ln/payment_tests.rs Show resolved Hide resolved
lightning/src/ln/payment_tests.rs Show resolved Hide resolved
lightning/src/ln/outbound_payment.rs Outdated Show resolved Hide resolved
lightning/src/ln/functional_test_utils.rs Outdated Show resolved Hide resolved
This makes it uniform with the outbound payment methods that generate events
and set us up for abandoning payments on behalf of the user.
@valentinewallace valentinewallace force-pushed the 2023-02-remove-manual-retries branch from a97e1d2 to 03b5e8a Compare February 15, 2023 17:42
@TheBlueMatt
Copy link
Collaborator

Can you add this to Abandon payment if retry fails in process_pending_htlcs? I think it describes correctly the return conditions which the new code is ultimately relying on.

diff --git a/lightning/src/ln/outbound_payment.rs b/lightning/src/ln/outbound_payment.rs
index 52208bcc2..d1749bf5f 100644
--- a/lightning/src/ln/outbound_payment.rs
+++ b/lightning/src/ln/outbound_payment.rs
@@ -546,6 +546,7 @@ impl OutboundPayments {
                });
        }
 
+       /// Will return `Ok(())` iff at least one HTLC is sent for the payment.
        fn pay_internal<R: Deref, NS: Deref, ES: Deref, IH, SP, L: Deref>(
                &self, payment_id: PaymentId,
                initial_send_info: Option<(PaymentHash, &Option<PaymentSecret>, Option<PaymentPreimage>, Retry)>,

lightning/src/ln/payment_tests.rs Outdated Show resolved Hide resolved
lightning/src/ln/payment_tests.rs Outdated Show resolved Hide resolved
lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved
@@ -670,20 +664,6 @@ pub enum Event {
/// For both single-path and multi-path payments, this is set if all paths of the payment have
/// failed. This will be set to false if (1) this is an MPP payment and (2) other parts of the
/// larger MPP payment were still in flight when this event was generated.
///
/// Note that if you are retrying individual MPP parts, using this value to determine if a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just remove the field entirely rather than just this hunk of text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's removed in #2014, let me know if you'd prefer we move that commit over here

lightning/src/ln/outbound_payment.rs Outdated Show resolved Hide resolved
Removed retry_single_path_payment, it's replaced by automatic_retries with
AutoRetry::Success
We're no longer supporting manual retries since
ChannelManager::send_payment_with_retry can be parameterized by a retry
strategy

This commit also updates all docs related to retry_payment and abandon_payment.
Since these docs frequently overlap with changes in preceding commits where we
start abandoning payments on behalf of the user, all the docs are updated in
one go.
@valentinewallace valentinewallace force-pushed the 2023-02-remove-manual-retries branch from 42c2350 to da34ada Compare February 15, 2023 22:59
@TheBlueMatt TheBlueMatt merged commit 30b9d9f into lightningdevkit:main Feb 16, 2023
@wpaulino wpaulino mentioned this pull request Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants