-
Notifications
You must be signed in to change notification settings - Fork 376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store channels
per-peer
#1507
Store channels
per-peer
#1507
Conversation
Codecov ReportBase: 90.74% // Head: 90.72% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #1507 +/- ##
==========================================
- Coverage 90.74% 90.72% -0.03%
==========================================
Files 96 96
Lines 50133 50523 +390
Branches 50133 50523 +390
==========================================
+ Hits 45493 45835 +342
- Misses 4640 4688 +48
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Snap, will address the failing check_commits check tomorrow. Sorry about that! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments with some examples of a few high-level patterns here - you've added a lot of unreachable
branches, some of which I think look quite reachable, at a minimum there should probably be a comment in a few of the cases explaining why you're confident those are unreachable. Similarly, there are quite a few places you take the per_peer_state
write lock where I think you only need the read
lock. I'd really like to see some kind of writeup in the documentation for the various internal fields explaining the consistency guarantees between the channel_state
fields and the per_peer_state
stuff, especially given we'd really like to remove stuff from channel_state
over time and take individual locks for a shorter period to get better parallelism.
lightning/src/ln/channelmanager.rs
Outdated
/// added to `ChannelMonitor`s in the future, this map should be removed, as only the | ||
/// `ChannelManager::per_peer_state` is required to access the channel if the | ||
/// `counterparty_node_id` is passed with `MonitorEvent`s. | ||
pub(super) id_to_peer: HashMap<[u8; 32], PublicKey>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe lets move this out of the ChannelHolder
, not sure what reason we'd have to put it in there and easy to just track it separately? Once you do that, I think we can drop the $id_to_peer
argument in most of the macros and just take the lock for a very short period by accessing $self
.
@@ -511,31 +511,32 @@ macro_rules! get_htlc_update_msgs { | |||
|
|||
#[cfg(test)] | |||
macro_rules! get_channel_ref { | |||
($node: expr, $lock: ident, $channel_id: expr) => { | |||
($node: expr, $peer_state_lock: ident, $channel_id: expr) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets keep the old style here where we do the lock inside of the macro to DRY up the callsites, which are now a bit more verbose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I opted for this version as I simply had too many problems in getting your suggested version to work. But I'll give it another go and try to fix that :)
lightning/src/ln/channelmanager.rs
Outdated
hash_map::Entry::Vacant(_) => return Err(APIError::ChannelUnavailable{err: "No such channel".to_owned()}) | ||
break Ok(()); | ||
}, | ||
hash_map::Entry::Vacant(_) => return Err(APIError::ChannelUnavailable{err: format!("No such channel for the passed counterparty_node_id {}", counterparty_node_id) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe update the error messages in a separate commit? This single commit is...kinda huge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Yeah, huge is a bit of an understatement hah 😄
lightning/src/ln/channelmanager.rs
Outdated
/// the handling of the events. | ||
/// | ||
/// TODO: | ||
/// The `counterparty_node_id` can't be passed with `MonitorEvent`s currently, as adding it to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requiring it in MonitorEvent
s breaks backwards compat, adding it does not :) Maybe phrase this as a "we should add this to the monitor events and then eventually rely on it".
lightning/src/ln/channelmanager.rs
Outdated
/// Because adding or removing an entry is rare, we usually take an outer read lock and then | ||
/// operate on the inner value freely. Sadly, this prevents parallel operation when opening a | ||
/// new channel. | ||
/// Currently, we usually take an outer read lock and then operate on the inner value freely. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh? Multiple threads can take the read lock, so that still allows for parallel operation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Snap thanks! That ment to say write
lock, but given that I'll update that, I'll change this as well :)
lightning/src/ln/channelmanager.rs
Outdated
}, | ||
hash_map::Entry::Vacant(entry) => { entry.insert(channel); } | ||
} | ||
} else { unreachable!() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this unreachable? I don't think it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given your feedback on "taking the write lock, then dropping it, then taking it again", I'd agree :). I'll make sure to be more careful about this and other unreachable
clauses.
lightning/src/ln/channelmanager.rs
Outdated
if is_permanent { | ||
remove_channel!(self, channel_state, chan_entry); | ||
break result; | ||
let per_peer_state = self.per_peer_state.write().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this taking a write
lock? In general it seems you take a write
lock in quite a few places where its not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the reasoning behind taking the write
lock here as well as in other places where it's not really needed is because of what i said in the PR message above. But I'll make sure to change that given that it shouldn't be necessary then :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading through this again I realize that I should probably specify a scenario I'm worried about without digging into the message/request thread execution further:
If we grab the read
lock only, in the places where we only need operate on the inner lock, to I'm a bit worried for a scenario where for example large routing nodes would have threads that grab the outer read
lock rapidly, so that there's almost always at least one read
lock acquired. This would then cause issues for the threads that want to access the outer lock with write
access, as they would need to wait an extensive amount of time to access the write
lock, locking the thread meanwhile.
Though, like I stated in my PR message, I'm haven't digged into that deep enough to understand if this is a cause for concern, or if that's nothing we need to worry about :)!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yea, writer starvation could be a concern. We have a FairRwLock
that we use in PeerManager
for this, which would let you take read locks without worrying too much about it :).
lightning/src/ln/channelmanager.rs
Outdated
}, | ||
None => { return Err(APIError::ChannelUnavailable { err: "No such channel".to_owned() }) }, | ||
let (res, chan) = { | ||
let per_peer_state = self.per_peer_state.write().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets avoid taking the write lock, then dropping it, then taking it again, it makes it really hard to reason about the consistency across the two parts of a single function, and I think it likely makes the unreachable
down in the else
branch of the second per_peer_state
lock actually reachable.
Thanks a lot for the feedback @TheBlueMatt! Will address it as soon as I can. I will therefore push the failing check commits fix together with the feedback addressing as well :) |
Would you like me to update the name of the |
lightning/src/ln/channelmanager.rs
Outdated
/// SCIDs (and outbound SCID aliases) to the real channel id. Outbound SCID aliases are added | ||
/// here once the channel is available for normal use, with SCIDs being added once the funding | ||
/// transaction is confirmed at the channel's required confirmation depth. | ||
pub(super) short_to_id: HashMap<u64, [u8; 32]>, | ||
pub(super) short_to_id: HashMap<u64, (PublicKey, [u8; 32])>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe update the variable name and consequence now you're storing the counterparty_node_id
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sure :)
Could be something like |
I guess you could, no strong opinion. Ideally we'll just remove it sooner or later anyway, so dunno if its worth touching every line with See-also #1403 (comment) - dunno if you want to take that part of this PR out to help move #1403 forward. |
Thanks a lot for the review @ariard! Ok, given that there's no immediate need to update the name, I'll leave it for now as this PR is getting quite huge, and it might therefore be better to leave the name change for a follow-up later because of that. Sorry for the delay on this one, I'm gonna try to get time to address the feedback later today! |
Looks like this needs rebase after #1527. I do wonder if we can split this PR up a bit more - either more commits or ideally even more PRs - to do various smaller, more reviewable new-parameters and such changes before the big honking "move things around" change. |
Sure I'll try to split it up as much as possible! I'll push the new maps in a separate PR and such. Unfortunately though, I think it's quite unavoidable to not make the big "move things around" commit huge, as long as we want every separate commit to build. I'll try to focus it as much as possible on only moving things around, and not changing logic, error messages and such in the same commit though :). |
Yea, there's no question that final commit is going to be big no matter what we do. My only point here is if there's anything we can split at all, we absolutely should, and if possible even into a new PR. |
1b262ca
to
ae665ce
Compare
Apologies for taking so long time to address the feedback! This PR has now been split up and is now based on #1542. So this can be marked as blocked by dependent PR.
@TheBlueMatt, could you please specify a bit what consistency guarantees you would like to have documented? Is it mainly the consistency guarantees between the different maps you're interested in :)? I still need to update some of the commit messages, but will get to that when i squash the commits :). Happy to address any feedback :) |
Yea, basically that - in what order do locks need to be taken, what lock ensures what is consistent with what, etc. |
Needs rebase 🚀 |
ae665ce
to
cdc0f13
Compare
Rebased on main now that the blocking PR has been merged :)! |
Can you go ahead and squash fixups down and clean up the git history? Then I'll do another round of review. |
1e60496
to
d536b69
Compare
Sure squashed :)! Though I've realized I still haven't written the docs for the consistency guarantees. A question before I write the documentation though. Do we want this PR to not change the lock order for the |
After `channels` are now stored in the `per_peer_state`, some logic can be simplified and extra accessing of the `per_peer_state` can be removed.
c06abcd
to
23fdcca
Compare
Thanks, squashed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooookkayyy this looks good to me. There's some notes here that I'd like to see addressed in a followup, but they're not bugs, and in the interest of delaying this, I'm happy to see it land.
@@ -1651,24 +1686,18 @@ where | |||
|
|||
let mut failed_htlcs: Vec<(HTLCSource, PaymentHash)>; | |||
let result: Result<(), _> = loop { | |||
let mut channel_state_lock = self.channel_state.lock().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, when we remove a channel we need to check and see if the peer the channel was connected to has since disconnected and remove the per_peer
entry for it. ie if we force-close a channel for a peer we aren't connected to we have to clean the per_peer entry. I'm not 100% sure how best to do this without always taking the per_peer_state write lock, which would kill paralleism, so maybe we can do it in the timer. In any case, we should ensure it doesn't leave us vulnerable to a DoS like in #1889.
break; | ||
} | ||
|
||
let peer_state_mutex = per_peer_state.as_ref().unwrap().get(&counterparty_node_id).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't do redundant lookups back-to-back :)
@@ -3684,73 +3771,95 @@ where | |||
} | |||
|
|||
fn claim_funds_from_hop<ComplFunc: FnOnce(Option<u64>) -> Option<MonitorUpdateCompletionAction>>(&self, | |||
mut channel_state_lock: MutexGuard<ChannelHolder<<K::Target as SignerProvider>::Signer>>, | |||
per_peer_state_lock: RwLockReadGuard<HashMap<PublicKey, Mutex<PeerState<<K::Target as SignerProvider>::Signer>>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to claim_funds_internal
, I think we should just drop the lock argument here.
}; | ||
|
||
let (found_channel, mut peer_state_opt) = if counterparty_node_id_opt.is_some() && per_peer_state_lock.get(&counterparty_node_id_opt.unwrap()).is_some() { | ||
let peer_mutex = per_peer_state_lock.get(&counterparty_node_id_opt.unwrap()).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, you're doing two lookups in a row.
|
||
if found_channel { | ||
let peer_state = &mut *peer_state_opt.as_mut().unwrap(); | ||
if let hash_map::Entry::Occupied(mut chan) = peer_state.channel_by_id.entry(chan_id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we look up twice.
let per_peer_state = self.per_peer_state.read().unwrap(); | ||
let peer_state_mutex_opt = per_peer_state.get(counterparty_node_id); | ||
if let None = peer_state_mutex_opt { | ||
return Err(MsgHandleErrInternal::send_err_msg_no_close(format!("Can't find a peer matching the passed counterparty node_id {}", counterparty_node_id), msg.temporary_channel_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the message handling functions, we should debug_assert!(false)
here - the peer handler should never pass us messages for peers we don't think are connected.
}}); | ||
if ev_index.is_some() { | ||
let mut updated_msg_events = msg_events.to_vec(); | ||
let ev = updated_msg_events.remove(ev_index.unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're cloning the entire event vec just to remove one and return it? Let's instead pass in a mut reference to msg_events
and remove that way.
let mut peer_state_lock = peer_state_mutex.lock().unwrap(); | ||
let peer_state = &mut *peer_state_lock; | ||
if peer_state.pending_msg_events.len() > 0 { | ||
let mut peer_pending_events = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can drop this entirely, no? append
"leaves the other vec empty"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass minus tests, nothing major. I think we're going to land this and I'll do more post-merge review this week, so we can get this out of the way :)
channel_state.pending_msg_events.push(events::MessageSendEvent::SendOpenChannel { | ||
node_id: their_network_key, | ||
msg: res, | ||
}); | ||
Ok(temporary_channel_id) | ||
} | ||
|
||
fn list_channels_with_filter<Fn: FnMut(&(&[u8; 32], &Channel<<K::Target as SignerProvider>::Signer>)) -> bool>(&self, f: Fn) -> Vec<ChannelDetails> { | ||
fn list_channels_with_filter<Fn: FnMut(&(&[u8; 32], &Channel<<K::Target as SignerProvider>::Signer>)) -> bool + Copy>(&self, f: Fn) -> Vec<ChannelDetails> { | ||
let mut res = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use with_capacity
for conciseness
/// Because adding or removing an entry is rare, we usually take an outer read lock and then | ||
/// operate on the inner value freely. Sadly, this prevents parallel operation when opening a | ||
/// new channel. | ||
/// operate on the inner value freely. This opens up for parallel per-peer operation for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// operate on the inner value freely. This opens up for parallel per-peer operation for | |
/// operate on the inner value freely. This opens up parallel per-peer operation for |
/// The bulk of our storage will eventually be here (message queues and the like). Currently | ||
/// the `per_peer_state` stores our channels on a per-peer basis, as well as the peer's latest | ||
/// features. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't message queues move in this PR?
/// The bulk of our storage will eventually be here (message queues and the like). Currently | |
/// the `per_peer_state` stores our channels on a per-peer basis, as well as the peer's latest | |
/// features. | |
/// The bulk of our storage. Stores our channels on a per-peer basis, as well as the peer's latest | |
/// features. |
let per_peer_state = self.per_peer_state.read().unwrap(); | ||
|
||
let peer_state_mutex_opt = per_peer_state.get(&their_network_key); | ||
if let None = peer_state_mutex_opt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use is_none()
and elsewhere
} | ||
); | ||
let per_peer_state = self.per_peer_state.read().unwrap(); | ||
if let Some(peer_state_mutex) = per_peer_state.get(&counterparty_node_id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should error if the peer is not found in this method, fine to save for followup though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I'm not 100% sure what the correct result is here. We successfully managed to force_close
the channel, but we're unable send this message to the peer. The message would just be dropped anyways if we're no longer connected to the peer, so it maybe should be an OK(())
res?
I'll return Err
here in the follow-ups though, and then we can take decide there what the correct approach is :)!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second glance, I think you're right and we shouldn't bother Err
ing :)
@@ -2187,7 +2225,7 @@ where | |||
/// public, and thus should be called whenever the result is going to be passed out in a | |||
/// [`MessageSendEvent::BroadcastChannelUpdate`] event. | |||
/// | |||
/// May be called with channel_state already locked! | |||
/// May be called with peer_state already locked! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment confuses me because since a Channel
is being passed in, peer_state
must be locked, but not a big deal
let peer_state_mutex_opt = per_peer_state.get(counterparty_node_id); | ||
if let None = peer_state_mutex_opt { | ||
return Err(APIError::APIMisuseError{ err: format!("Can't find a peer matching the passed counterparty node_id {}", counterparty_node_id) }); | ||
} | ||
let mut peer_state_lock = peer_state_mutex_opt.unwrap().lock().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this would be more idiomatic/concise:
let peer_state_mutex_opt = per_peer_state.get(counterparty_node_id); | |
if let None = peer_state_mutex_opt { | |
return Err(APIError::APIMisuseError{ err: format!("Can't find a peer matching the passed counterparty node_id {}", counterparty_node_id) }); | |
} | |
let mut peer_state_lock = peer_state_mutex_opt.unwrap().lock().unwrap(); | |
let peer_state_mutex = per_peer_state.get(counterparty_node_id) | |
.ok_or_else(|| APIError::APIMisuseError{ err: format!("Can't find a peer matching the passed counterparty node_id {}", counterparty_node_id) })?; | |
let mut peer_state_lock = peer_state_mutex.lock().unwrap(); |
and elsewhere
Thanks a lot @TheBlueMatt and @valentinewallace for reviewing this and the preceding PRs! Super happy to see this finally getting in ready state as it's been quite a while since this was initially opened 🚀! I won't be able to address the feedback today as it's already night here. I'll start working on it tomorrow though :). |
#1944 should fix the windows issues. Once that gets a review or two will land this. |
Thanks a lot for digging into the issue @TheBlueMatt! |
I think don't bother touching this PR, I tested an earlier variant of #1944 on my own branch based on this and it seemed to work so I'm happy with it, and it should be easy enough to fix if I'm wrong. |
Ok great! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, will take another look after the follow-up
#[cfg(debug_assertions)] | ||
{ | ||
if let None = per_peer_state.get(&$counterparty_node_id) { | ||
// This shouldn't occour in tests unless an unkown counterparty_node_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/occour/occur, s/unkown/unknown
@@ -3481,11 +3557,12 @@ where | |||
fn fail_htlc_backwards_internal(&self, source: &HTLCSource, payment_hash: &PaymentHash, onion_error: &HTLCFailReason, destination: HTLCDestination) { | |||
#[cfg(debug_assertions)] | |||
{ | |||
// Ensure that the `channel_state` lock is not held when calling this function. | |||
// Ensure that no peer state channel storage lock is not held when calling this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/no peer state/the peer state
if let Some(channel) = by_id.get_mut(&previous_channel_id) { | ||
channel.claim_htlc_while_disconnected_dropping_mon_update(claimable_htlc.prev_hop.htlc_id, payment_preimage, &args.logger); | ||
if let Some(peer_node_id) = id_to_peer.get(&previous_channel_id){ | ||
let peer_state_mutex = per_peer_state.get(peer_node_id).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know this unwrap is safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The id_to_peer
map isn't written/read from disk, its built as we deserialize channels, so at this point it should be guaranteed to be consistent with our channel set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah agree that this is a bit confusing :). Even the channel existence check below is actually redundant, as we're guaranteed to have the channel here if it's exists in the id_to_peer
map. I can of course add an extra if
clause here though if you think it makes the code less confusing :).
Another general problem here going forward though: I'm a little worried here what we're gonna do here once we remove the id_to_peer
map. Ideally we should be using the short_to_chan_info
map here instead. Gonna look more into if we're guaranteed to have all channels we can have serialized
claims for in the short_to_chan_info
map here, especially channels that's not fully closed but about to (ie. is_usable
is false).
Aiming to have the follow-ups ready by tomorrow! |
Sorry for the delay. I've now opened the followup PR #1966 were the feedback has been addressed :)! |
0.0.114 - Mar 3, 2023 - "Faster Async BOLT12 Retries" API Updates =========== * `InvoicePayer` has been removed and its features moved directly into `ChannelManager`. As such it now requires a simplified `Router` and supports `send_payment_with_retry` (and friends). `ChannelManager::retry_payment` was removed in favor of the automated retries. Invoice payment utilities in `lightning-invoice` now call the new code (lightningdevkit#1812, lightningdevkit#1916, lightningdevkit#1929, lightningdevkit#2007, etc). * `Sign`/`BaseSign` has been renamed `ChannelSigner`, with `EcdsaChannelSigner` split out in anticipation of future schnorr/taproot support (lightningdevkit#1967). * The catch-all `KeysInterface` was split into `EntropySource`, `NodeSigner`, and `SignerProvider`. `KeysManager` implements all three (lightningdevkit#1910, lightningdevkit#1930). * `KeysInterface::get_node_secret` is now `KeysManager::get_node_secret_key` and is no longer required for external signers (lightningdevkit#1951, lightningdevkit#2070). * A `lightning-transaction-sync` crate has been added which implements keeping LDK in sync with the chain via an esplora server (lightningdevkit#1870). Note that it can only be used on nodes that *never* ran a previous version of LDK. * `Score` is updated in `BackgroundProcessor` instead of via `Router` (lightningdevkit#1996). * `ChainAccess::get_utxo` (now `UtxoAccess`) can now be resolved async (lightningdevkit#1980). * BOLT12 `Offer`, `InvoiceRequest`, `Invoice` and `Refund` structs as well as associated builders have been added. Such invoices cannot yet be paid due to missing support for blinded path payments (lightningdevkit#1927, lightningdevkit#1908, lightningdevkit#1926). * A `lightning-custom-message` crate has been added to make combining multiple custom messages into one enum/handler easier (lightningdevkit#1832). * `Event::PaymentPathFailure` is now generated for failure to send an HTLC over the first hop on our local channel (lightningdevkit#2014, lightningdevkit#2043). * `lightning-net-tokio` no longer requires an `Arc` on `PeerManager` (lightningdevkit#1968). * `ChannelManager::list_recent_payments` was added (lightningdevkit#1873). * `lightning-background-processor` `std` is now optional in async mode (lightningdevkit#1962). * `create_phantom_invoice` can now be used in `no-std` (lightningdevkit#1985). * The required final CLTV delta on inbound payments is now configurable (lightningdevkit#1878) * bitcoind RPC error code and message are now surfaced in `block-sync` (lightningdevkit#2057). * Get `historical_estimated_channel_liquidity_probabilities` was added (lightningdevkit#1961). * `ChannelManager::fail_htlc_backwards_with_reason` was added (lightningdevkit#1948). * Macros which implement serialization using TLVs or straight writing of struct fields are now public (lightningdevkit#1823, lightningdevkit#1976, lightningdevkit#1977). Backwards Compatibility ======================= * Any inbound payments with a custom final CLTV delta will be rejected by LDK if you downgrade prior to receipt (lightningdevkit#1878). * `Event::PaymentPathFailed::network_update` will always be `None` if an 0.0.114-generated event is read by a prior version of LDK (lightningdevkit#2043). * `Event::PaymentPathFailed::all_paths_removed` will always be false if an 0.0.114-generated event is read by a prior version of LDK. Users who rely on it to determine payment retries should migrate to `Event::PaymentFailed`, in a separate release prior to upgrading to LDK 0.0.114 if downgrading is supported (lightningdevkit#2043). Performance Improvements ======================== * Channel data is now stored per-peer and channel updates across multiple peers can be operated on simultaneously (lightningdevkit#1507). * Routefinding is roughly 1.5x faster (lightningdevkit#1799). * Deserializing a `NetworkGraph` is roughly 6x faster (lightningdevkit#2016). * Memory usage for a `NetworkGraph` has been reduced substantially (lightningdevkit#2040). * `KeysInterface::get_secure_random_bytes` is roughly 200x faster (lightningdevkit#1974). Bug Fixes ========= * Fixed a bug where a delay in processing a `PaymentSent` event longer than the time taken to persist a `ChannelMonitor` update, when occurring immediately prior to a crash, may result in the `PaymentSent` event being lost (lightningdevkit#2048). * Fixed spurious rejections of rapid gossip sync data when the graph has been updated by other means between gossip syncs (lightningdevkit#2046). * Fixed a panic in `KeysManager` when the high bit of `starting_time_nanos` is set (lightningdevkit#1935). * Resolved an issue where the `ChannelManager::get_persistable_update_future` future would fail to wake until a second notification occurs (lightningdevkit#2064). * Resolved a memory leak when using `ChannelManager::send_probe` (lightningdevkit#2037). * Fixed a deadlock on some platforms at least when using async `ChannelMonitor` updating (lightningdevkit#2006). * Removed debug-only assertions which were reachable in threaded code (lightningdevkit#1964). * In some cases when payment sending fails on our local channel retries no longer take the same path and thus never succeed (lightningdevkit#2014). * Retries for spontaneous payments have been fixed (lightningdevkit#2002). * Return an `Err` if `lightning-persister` fails to read the directory listing rather than panicing (lightningdevkit#1943). * `peer_disconnected` will now never be called without `peer_connected` (lightningdevkit#2035) Security ======== 0.0.114 fixes several denial-of-service vulnerabilities which are reachable from untrusted input from channel counterparties or in deployments accepting inbound connections or channels. It also fixes a denial-of-service vulnerability in rare cases in the route finding logic. * The number of pending un-funded channels as well as peers without funded channels is now limited to avoid denial of service (lightningdevkit#1988). * A second `channel_ready` message received immediately after the first could lead to a spurious panic (lightningdevkit#2071). This issue was introduced with 0conf support in LDK 0.0.107. * A division-by-zero issue was fixed in the `ProbabilisticScorer` if the amount being sent (including previous-hop fees) is equal to a channel's capacity while walking the graph (lightningdevkit#2072). The division-by-zero was introduced with historical data tracking in LDK 0.0.112. In total, this release features 130 files changed, 21457 insertions, 10113 deletions in 343 commits from 18 authors, in alphabetical order: * Alec Chen * Allan Douglas R. de Oliveira * Andrei * Arik Sosman * Daniel Granhão * Duncan Dean * Elias Rohrer * Jeffrey Czyz * John Cantrell * Kurtsley * Matt Corallo * Max Fang * Omer Yacine * Valentine Wallace * Viktor Tigerström * Wilmer Paulino * benthecarman * jurvis
Closes #105 and enables #422 and #1278.
Moves the storage of our channels from the
ChannelManager::ChannelHolder
to theChannelHolder::PeerState
to enable that we store the channels per-peer instead of a single map.This allows us to have channels with the same
temporary_channel_id
for different peers, as specified by BOLT2:https://github.com/lightning/bolts/blob/4b62d26af93a07166d6a9de2cb5eff80849d9c19/02-peer-protocol.md?plain=1#L172