Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle network updates from failed payments in BackgroundProcessor #1043

Merged

Conversation

jkczyz
Copy link
Contributor

@jkczyz jkczyz commented Aug 14, 2021

MessageSendEvent::PaymentFailureNetworkUpdate served as a hack to pass an HTLCFailChannelUpdate from ChannelManager to NetGraphMsgHandler via PeerManager. Instead, remove the event entirely and move the contained
data (renamed NetworkUpdate) into Event::PaymentFailed to be processed by an event handler.

This requires making NetGraphMsgHandler an EventHandler, which is used in BackgroundProcessor to decorate the user-provided EventHandler with the ability to update the NetworkGraph accordingly.

The general concept of intercepting events via a decorator will be used in a follow-up PR for scoring channels and retrying failed payments.

@codecov
Copy link

codecov bot commented Aug 14, 2021

Codecov Report

Merging #1043 (992df51) into main (423f1b1) will increase coverage by 0.21%.
The diff coverage is 91.04%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1043      +/-   ##
==========================================
+ Coverage   90.83%   91.04%   +0.21%     
==========================================
  Files          65       65              
  Lines       33281    33762     +481     
==========================================
+ Hits        30232    30740     +508     
+ Misses       3049     3022      -27     
Impacted Files Coverage Δ
lightning-net-tokio/src/lib.rs 77.27% <ø> (+0.29%) ⬆️
lightning/src/ln/msgs.rs 89.50% <ø> (+0.39%) ⬆️
lightning/src/ln/peer_handler.rs 45.94% <ø> (+0.16%) ⬆️
lightning/src/util/test_utils.rs 82.55% <ø> (+0.21%) ⬆️
lightning/src/util/events.rs 27.54% <40.00%> (+13.60%) ⬆️
lightning/src/ln/onion_utils.rs 94.91% <75.00%> (ø)
lightning/src/ln/channelmanager.rs 85.90% <84.61%> (+0.05%) ⬆️
lightning-background-processor/src/lib.rs 94.20% <85.71%> (+0.26%) ⬆️
lightning/src/routing/network_graph.rs 92.12% <87.13%> (+0.31%) ⬆️
lightning/src/routing/router.rs 96.25% <92.00%> (-0.10%) ⬇️
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 423f1b1...992df51. Read the comment docs.

@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch 2 times, most recently from f5e9985 to bec75d7 Compare August 14, 2021 03:11
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK. Only did a high-level skim but I like it.

lightning/src/routing/network_graph.rs Show resolved Hide resolved
lightning/src/routing/network_graph.rs Outdated Show resolved Hide resolved
}
}

self.inner.handle_event(event)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmmmm, have we considered maybe just allowing events to be handled by multiple handlers? Passing them off to an inner handler works, I suppose, but its kinda awkward. I guess I'm not a huge fan of passing the events by reference instead of value, but most of them have no heap allocation, and eg SpendableOutputs should be pretty rare (requires on-chain txn).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decorator approach here is nice because it allows us to short-circuit handling. This is needed for payment retries where the inner handler is only invoked after the final failure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yea, good point. This isn't a trait change, though, so we could do different things for different handlers, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure I follow what you mean by do "different things for different handlers". Do you have a concrete example that may clarify?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in, there's nothing here that requires that we handle ignoring/passing down events the same for NetworkGraph as we do for PaymentRetryer, or whatever, is there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you also want to update to just implement EventHandler on the NetworkGraph? Given the background processor already explicitly takes a NetworkGraph, it wouldn't complicate the background processor much at all (and maybe simplify it if we make the network graph there optional).

(see also other comment about removing NetworkGraph from BackgroundProcessor)

We'd have to parameterize NetworkGraph by a Logger if we did that. 😕 How about implementing EventHandler for NetGraphMsgHandler? I've been toying around with renaming that NetworkGossip, making the naming less specific about how and that it updates NetworkGraph.

If we're gonna go down the path of decorators eating and passing events onto inner handlers, I dunno if we need to bother passing events by reference.

Using references allows a bit more flexibility like post-handling event as mentioned above.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd have to parameterize NetworkGraph by a Logger if we did that.

Eh, we basically already do that for everything, not a big deal IMO.

How about implementing EventHandler for NetGraphMsgHandler? I've been toying around with renaming that NetworkGossip, making the naming less specific about how and that it updates NetworkGraph.

No strong feelings, do what you want for both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't pushed it yet, but I went with implementing it on NetGraphMsgHandler. It's also nice to not have to know about a decorated EventHandler when deserializing a NetworkGraph.

That said, just want to point out a drawback with either is that the guide for setting up a node will need to change. Now event handling (step 16) can no longer be discussed independently of initializing NetGraphMsgHandler (optional step 12).

cc: @valentinewallace Any opinions on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out. That is a slight drawback but not a deal breaker IMO

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's ok. Indeed, ideally we'd have everything be nice and separate, but if we're gonna have something related to the network graph handle events, that's the outcome we're gonna have. It is nice that all the bindings have moved towards using the lightning-background-processor crate, though, so it'll all be common infrastructure in any case.

@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch 2 times, most recently from 99c2337 to de7cb1d Compare August 17, 2021 16:40
@TheBlueMatt TheBlueMatt added this to the 0.0.101 milestone Aug 17, 2021
@valentinewallace valentinewallace self-requested a review August 17, 2021 21:36
Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General call for more docs on the usage of this new API, but this is looking good to me!

lightning-background-processor/src/lib.rs Outdated Show resolved Hide resolved
lightning/src/util/events.rs Outdated Show resolved Hide resolved
@valentinewallace
Copy link
Contributor

Implemented in the sample: valentinewallace/ldk-sample@ddc9414
I think it looks good, though it does add a clone to each Event and a few more clones of NetworkGraph.

@TheBlueMatt
Copy link
Collaborator

Implemented in the sample: valentinewallace/ldk-sample@ddc9414 I think it looks good, though it does add a clone to each Event and a few more clones of NetworkGraph.

Note that you don't actually need the clone - https://git.bitcoin.ninja/index.cgi?p=ldk-sample;a=commit;h=bc119d8fb23d8e55c91df002ca490062f9cc72a6

@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch from de7cb1d to 4e2f2a4 Compare August 19, 2021 18:00
@jkczyz
Copy link
Contributor Author

jkczyz commented Aug 19, 2021

Updated the docs!

There's a couple open issues with testing:

  • test_invalid_channel_announcement seems non-sensical, so maybe just drop it as discussed on Slack
  • htlc_fail_async_shutdown fails when performing checks in expect_payment_failed_with_update

For the latter, I commented out the assertion in expect_payment_failed_with_update, which I won't want to merge. So need to figure a workaround. Perhaps just not use the macro. 😕

lightning/src/routing/network_graph.rs Outdated Show resolved Hide resolved
lightning/src/routing/network_graph.rs Show resolved Hide resolved
lightning/src/ln/functional_tests.rs Outdated Show resolved Hide resolved
lightning/src/ln/onion_route_tests.rs Outdated Show resolved Hide resolved
@@ -1007,23 +1007,17 @@ fn htlc_fail_async_shutdown() {
nodes[0].node.handle_update_fail_htlc(&nodes[1].node.get_our_node_id(), &updates_2.update_fail_htlcs[0]);
commitment_signed_dance!(nodes[0], nodes[1], updates_2.commitment_signed, false, true);

expect_payment_failed!(nodes[0], our_payment_hash, false);
expect_payment_failed_with_update!(nodes[0], our_payment_hash, false, chan_1.0.contents.short_channel_id, false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue the issue here is on the channel end - we fail with chan_disabled instead of permanent_channel_failure. I think we should swap the shutdown checks in Channel::update_add_htlc to instead use 0x4000|8 and then fix channelmanager to handle it (I haven't reviewed this, just hacked it up, so may need more thought):

@@ -3461,33 +3464,32 @@ impl<Signer: Sign, M: Deref, T: Deref, K: Deref, F: Deref, L: Deref> ChannelMana
                                }
 
                                let create_pending_htlc_status = |chan: &Channel<Signer>, pending_forward_info: PendingHTLCStatus, error_code: u16| {
-                                       // Ensure error_code has the UPDATE flag set, since by default we send a
-                                       // channel update along as part of failing the HTLC.
-                                       assert!((error_code & 0x1000) != 0);
                                        // If the update_add is completely bogus, the call will Err and we will close,
                                        // but if we've sent a shutdown and they haven't acknowledged it yet, we just
                                        // want to reject the new HTLC and fail it backwards instead of forwarding.
                                        match pending_forward_info {
                                                PendingHTLCStatus::Forward(PendingHTLCInfo { ref incoming_shared_secret, .. }) => {
-                                                       let reason = if let Ok(upd) = self.get_channel_update_for_unicast(chan) {
-                                                               onion_utils::build_first_hop_failure_packet(incoming_shared_secret, error_code, &{
-                                                                       let mut res = Vec::with_capacity(8 + 128);
-                                                                       // TODO: underspecified, follow https://github.com/lightningnetwork/lightning-rfc/issues/791
-                                                                       res.extend_from_slice(&byte_utils::be16_to_array(0));
-                                                                       res.extend_from_slice(&upd.encode_with_len()[..]);
-                                                                       res
-                                                               }[..])
+                                                       let reason = if (error_code & 0x1000) != 0 {
+                                                               if let Ok(upd) = self.get_channel_update_for_unicast(chan) {
+                                                                       onion_utils::build_first_hop_failure_packet(incoming_shared_secret, error_code, &{
+                                                                               let mut res = Vec::with_capacity(8 + 128);
+                                                                               // TODO: underspecified, follow https://github.com/lightningnetwork/lightning-rfc/issues/791
+                                                                               res.extend_from_slice(&byte_utils::be16_to_array(0));
+                                                                               res.extend_from_slice(&upd.encode_with_len()[..]);
+                                                                               res
+                                                                       }[..])
+                                                               } else {
+                                                                       // The only case where we'd be unable to successfully get a
+                                                                       // channel update is if the channel isn't in the fully-funded
+                                                                       // state yet, implying our counterparty is trying to route
+                                                                       // payments over the channel back to themselves (cause no one
+                                                                       // else should know the short_id is a lightning channel yet).
+                                                                       // We should have no problem just calling this
+                                                                       // unknown_next_peer (0x4000|10).
+                                                                       onion_utils::build_first_hop_failure_packet(incoming_shared_secret, 0x4000|10, &[])
+                                                               }
                                                        } else {
-                                                               // The only case where we'd be unable to
-                                                               // successfully get a channel update is if the
-                                                               // channel isn't in the fully-funded state yet,
-                                                               // implying our counterparty is trying to route
-                                                               // payments over the channel back to themselves
-                                                               // (cause no one else should know the short_id
-                                                               // is a lightning channel yet). We should have
-                                                               // no problem just calling this
-                                                               // unknown_next_peer (0x4000|10).
-                                                               onion_utils::build_first_hop_failure_packet(incoming_shared_secret, 0x4000|10, &[])
+                                                               onion_utils::build_first_hop_failure_packet(incoming_shared_secret, error_code, &[])
                                                        };
                                                        let msg = msgs::UpdateFailHTLC {
                                                                channel_id: msg.channel_id,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to work though I had to change which channel is used in the failing test otherwise the short channel id wouldn't match. Take a look at d5ea8e8. I'll properly backport the fixup if all looks well.

@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch from 4e2f2a4 to d5ea8e8 Compare August 25, 2021 01:32
@TheBlueMatt
Copy link
Collaborator

Needs rebase.

@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch from d5ea8e8 to 9828dcc Compare August 25, 2021 22:12
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One material comment, but once we get resolution on that I'm ACK.

lightning/src/routing/network_graph.rs Outdated Show resolved Hide resolved
/// # Example
///
/// ```
/// # extern crate bitcoin;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does # do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hides it from showing up in the documentation. Useful if you have an elaborate setup that may distract from the example. Could leave some of these lines in if you'd like. Need to fix the indentation anyhow.

}
}

self.inner.handle_event(event)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you also want to update to just implement EventHandler on the NetworkGraph? Given the background processor already explicitly takes a NetworkGraph, it wouldn't complicate the background processor much at all (and maybe simplify it if we make the network graph there optional).

If we're gonna go down the path of decorators eating and passing events onto inner handlers, I dunno if we need to bother passing events by reference.

lightning-background-processor/src/lib.rs Outdated Show resolved Hide resolved
>
(persister: CMP, event_handler: EH, chain_monitor: M, channel_manager: CM, peer_manager: PM, logger: L) -> Self
(persister: CMP, event_handler: EH, chain_monitor: M, channel_manager: CM, peer_manager: PM, network_graph: G, logger: L) -> Self
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

network_graph needs to be Optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may actually want to remove this entirely from BackgroundProcessor. With #1059, users will want to decorate InvoicePayer with NetworkUpdateHandler so that updates are applied prior to retrying. Given they'll want to use the InvoicePayer to make payments, we can't construct that here without returning it back to them somehow.

So instead we may want to leave it to the user to set up the event handler to pass to BackgroundProcessor. But we could perhaps provide a utility for decorating their event handler with standard decorators.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I guess that makes sense - you only get those errors if you send a payment and it fails, so requiring a payment retryer in order to handle those errors isn't crazy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the changes to BackgroundProcessor.

@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch 2 times, most recently from e0aff9c to f9bee36 Compare September 8, 2021 23:55
@jkczyz
Copy link
Contributor Author

jkczyz commented Sep 8, 2021

I removed the commit that made NetworkGraph have shared ownership since it is no longer necessary now that NetGraphMsgHandler is responsible for event handling.

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the docs could be a little more explicit that you're expected to nest EventHandlers for BackgroundProcessor::start now but this is looking slick overall!

lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved
let is_enabled = msg.contents.flags & (1 << 1) != (1 << 1);
let status = if is_enabled { "enabled" } else { "disabled" };
log_debug!(self.logger, "Updating channel with channel_update from a payment failure. Channel {} is {}.", short_channel_id, status);
let _ = self.network_graph.update_channel(msg, &self.secp_ctx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there be a TODO to handle the error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. The code that produces them is reused in other places but likely isn't actionable here. Might be able to log one marked IgnoreAndLog.

lightning/src/util/test_utils.rs Outdated Show resolved Hide resolved
@TheBlueMatt
Copy link
Collaborator

Needs rebase.

@jkczyz
Copy link
Contributor Author

jkczyz commented Sep 10, 2021

I think the docs could be a little more explicit that you're expected to nest EventHandlers for BackgroundProcessor::start now but this is looking slick overall!

Added a short note about decorators in start. Otherwise, we'll likely want some better documentation which could go on a utility used to wrap an event handler with common ones.

@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch from f9bee36 to 1997e75 Compare September 10, 2021 03:56
@TheBlueMatt TheBlueMatt modified the milestones: 0.0.101, 0.0.102 Sep 10, 2021
@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch from 1997e75 to 69fc1f9 Compare September 10, 2021 03:57
Passing an Event by reference rather and by move gives more flexibility
for composing event handlers without needing to clone events.
@TheBlueMatt
Copy link
Collaborator

Moving this (and related stuff) to 0.0.102 because it looks like we may want to ship an 0.0.101 to get some fixes out.

In preparation for giving NetworkGraph shared ownership, wrap individual
fields in RwLock. This allows removing the outer RwLock used in
NetGraphMsgHandler.
Now that NetworkGraph uses interior mutability, the RwLock used around
it in NetGraphMsgHandler is no longer needed. This allows for shared
ownership without a lock.
Hide the internal locking of NetworkGraph by providing a read-only
view. This way the locking order is handled internally.
It doesn't seem to be testing anything useful that isn't covered
elsewhere.
This affects the htlc_fail_async_shutdown test.
@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch from 69fc1f9 to 4455d5f Compare September 10, 2021 04:18
@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch 2 times, most recently from 1bda8c4 to 8362ead Compare September 15, 2021 02:39
@jkczyz
Copy link
Contributor Author

jkczyz commented Sep 15, 2021

Based on an offline conversation with @TheBlueMatt, I removed the decorated EventHandler from NetGraphMsgHandler. Instead, BackgroundProcessor now takes an optional NetGraphMsgHandler which it delegates to before letting the user's event handler run.

This approach is a bit cleaner since it simplifies NetGraphMsgHandler and makes it such that BackgroundProcessor does the right thing for the user.

@TheBlueMatt
Copy link
Collaborator

ACK aside from the two comments above.

@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch from 8362ead to 0b452b7 Compare September 15, 2021 15:27
Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, LGTM

Comment on lines +40 to +41
/// It will also call [`PeerManager::process_events`] periodically though this shouldn't be relied
/// upon as doing so may result in high latency.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so I checked lightningdevkit.org guides, the sample, process_events docs and I don't think we indicate anywhere that the user should be calling this function perioically (i.e., users do currently rely on BackgroundProcessor to call process_events atm).

Should we update some other docs/the sample, or open an issue to do so?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its primarily called by lightning-net-tokio or by the NioPeerHandler in Java, so I don't think users need to care, at least not normal users.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did open an issue at #1075.

MessageSendEvent::PaymentFailureNetworkUpdate served as a hack to pass
an HTLCFailChannelUpdate from ChannelManager to NetGraphMsgHandler via
PeerManager. Instead, remove the event entirely and move the contained
data (renamed NetworkUpdate) to Event::PaymentFailed to be processed by
an event handler.
PaymentFailed events contain an optional NetworkUpdate describing
changes to the NetworkGraph as conveyed by a node along a failed payment
path according to BOLT 4. An EventHandler should apply the update to the
graph so that future routing decisions can account for it.

Implement EventHandler for NetGraphMsgHandler to update NetworkGraph.
Previously, NetGraphMsgHandler::handle_htlc_fail_channel_update
implemented this behavior.
Decorate the user-supplied EventHandler with NetGraphMsgHandler in
the BackgroundProcessor. The resulting handler will intercept
PaymentFailed events in order to update the NetworkGraph in the
background before delegating to the user's event handler.
@jkczyz jkczyz force-pushed the 2021-07-network-update-handler branch from 0b452b7 to 992df51 Compare September 15, 2021 16:53
@TheBlueMatt TheBlueMatt merged commit bb9df69 into lightningdevkit:main Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants