Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need to relax temporary_channel_id check #105

Closed
yuntai opened this issue Jul 31, 2018 · 11 comments · Fixed by #1507
Closed

Need to relax temporary_channel_id check #105

yuntai opened this issue Jul 31, 2018 · 11 comments · Fixed by #1507
Labels
good first issue Good for newcomers
Milestone

Comments

@yuntai
Copy link
Contributor

yuntai commented Jul 31, 2018

Per
https://github.com/lightningnetwork/lightning-rfc/blame/4b62d26af93a07166d6a9de2cb5eff80849d9c19/02-peer-protocol.md#L172,

temporary_channel_id is required to be unique only within the same peer.

Then, this seems too restrictive:
https://github.com/rust-bitcoin/rust-lightning/blob/f975639d036b08706fc007aa327bffbc0a5f217b/src/ln/channelmanager.rs#L1265-L1267

@TheBlueMatt
Copy link
Collaborator

TheBlueMatt commented Jul 31, 2018

Hmm, indeed, though honestly I'd kinda like to get the spec changed for that, it seems kinda shitty that we have to make the map use a (node_id, temporary_channel_id) instead of just channel_id key and I'd be highly surprised if any clients currently violate that.

@yuntai yuntai closed this as completed Aug 1, 2018
@TheBlueMatt
Copy link
Collaborator

Technically we violate the spec, so this should stay open, even if I'd prefer the solution be a change to the spec.

@TheBlueMatt
Copy link
Collaborator

Note that this should come for free with #422.

TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this issue Jan 28, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.
TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this issue Feb 1, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.
TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this issue Feb 1, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.
TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this issue Feb 3, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.

Found during review of lightningdevkit#777.
TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this issue Feb 3, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.
@ariard ariard added the good first issue Good for newcomers label Feb 7, 2021
TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this issue Feb 7, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.

Found during review of lightningdevkit#777.
TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this issue Feb 10, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.

Found during review of lightningdevkit#777.
valentinewallace pushed a commit to valentinewallace/rust-lightning that referenced this issue Feb 18, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.

Found during review of lightningdevkit#777.
valentinewallace pushed a commit to valentinewallace/rust-lightning that referenced this issue Feb 19, 2021
When we receive an error message from a peer, it can indicate a
channel which we should close. However, we previously did not
check that the counterparty who sends us such a message is the
counterparty with whom we have the channel, allowing any
connected peer to make us force-close any channel we have as long
as they know the channel id.

This commit simply changes the force-close logic to check that the
sender matches the channel's counterparty node_id, though as noted
in lightningdevkit#105, we eventually need to change the indexing anyway to allow
absurdly terrible peers to open channels with us.

Found during review of lightningdevkit#777.
@TheBlueMatt TheBlueMatt modified the milestones: 0.1, 0.2 Nov 10, 2021
@ConorOkus ConorOkus moved this to Done in LDK Roadmap Dec 7, 2021
@ViktorTigerstrom
Copy link
Contributor

Hmmm how would we like to design this?

Would a good first step be to update the ChannelManager::per_peer_state to something like:

per_peer_state: RwLock<HashMap<PublicKey, PeerHolder<Signer>>>,

pub(super) struct PeerHolder<Signer: Sign> {
	pub(super) by_id: Mutex<HashMap<[u8; 32], Channel<Signer>>>,
	pub(super) peer_state: Mutex<PeerState>,
}

Naturally the issue then becomes that there's no efficient way to retrieve a Channel with just the channel_id. Would we therefore like to add something like a channel_peer_by_id: RwLock<HashMap<[u8; 32], PublicKey>> to the ChannelManager, where we add values only after the actual channel_id has been created (to not run into the same issue again with potential duplicate temporary_channel_ids)?

But, that proposed design is also problematic as we would then need to change the ChannelManager API to include the counterparty_node_id for the public functions that can be called before the actual channel_id has been set (accept_inbound_channel, force_close_channel etc.)... Though I'm not sure if there's really a way around that since we would now open up for multiple channels with the same temporary_channel_id, and making it unclear to the user which Channel would be effected by a call to those functions when multiple with the same temporary_channel_id exists would be really bad.

I'd be very thankful for any feedback :)!

@TheBlueMatt
Copy link
Collaborator

Would a good first step be to update the ChannelManager::per_peer_state to something like:

Yes, I think that's roughly the right design. I'm a bit worried about getting the locking right, but ultimately we want to get a bit of per-peer parallelism anyway, re #422.

Naturally the issue then becomes that there's no efficient way to retrieve a Channel with just the channel_id.

Good! We should avoid doing that, which is most of the point of this issue :) You're right, we'll need to change a few public functions to take the (counterparty_node_id, channel_id) pair instead of just the channel_id, but by my count that's only five methods.

@ViktorTigerstrom
Copy link
Contributor

Yes, I think that's roughly the right design. I'm a bit worried about getting the locking right, but ultimately we want to get a bit of per-peer parallelism anyway, re #422.

Perfect I'll start with that then :).

Good! We should avoid doing that, which is most of the point of this issue :) You're right, we'll need to change a few public functions to take the (counterparty_node_id, channel_id) pair instead of just the channel_id, but by my count that's only five methods.

Ok great thanks, I'll scrap the idea of the channel_peer_by_id: RwLock<HashMap<[u8; 32], PublicKey>> then!

@ViktorTigerstrom
Copy link
Contributor

ViktorTigerstrom commented May 15, 2022

Hmmm given that the OnionHopData/hop_data payload, only contains the short_channel_id (https://github.com/lightning/bolts/blob/master/04-onion-routing.md#packet-structure), we'd still need some way to retrieve a Channel in an efficient way for the cases where we don't have the counterparty_node_id.

My idea above was to make it possible to fetch the counterparty_node_id PublicKey with a channel_id by adding a channel_peer_by_id: RwLock<HashMap<[u8; 32], PublicKey>> to the ChannelManager (or I guess we could have the key be a short_channel_id if you think that's better), and then retrieve the Channel from the ChannelManager::per_peer_state.

My suggestion would be to only add channel_ids to the channel_peer_by_id: RwLock<HashMap<[u8; 32], PublicKey>>, and not temporary_channel_ids to not run into the issue with duplicate temporary_channel_ids again.

Do you have a better suggestion of how to solve the problem, or do you think this would suffice :)?

@TheBlueMatt
Copy link
Collaborator

I don't think we need a new map here, we should just make the short_to_id have a (PublicKey, [u8; 32]) instead of just a [u8; 32].

@ViktorTigerstrom
Copy link
Contributor

Thanks for the very helpful feedback @TheBlueMatt, and apologies for the delay on this one! I've been traveling most of last week, but I hope to be able to push a PR for this later this week.

While working on this one a few more problems and questions have arisen, for which it would be very helpful to get some clarifications of:

Question 1)
One problem I haven't really been able to find a solution to is how we should retrieve a channel when a MonitorEvent is triggered, given that the new channel storage design that requires either the short_channel_id, or the counterparty_node_id + the channel_id to retrieve the channel. Here's an example of this problem:

MonitorEvent::UpdateFailed(funding_outpoint) => {

From my understanding, we lack the data required to always be able to generate the short_channel_id at that stage, which otherwise would make it possible to find the channel through the short_to_id map.
Therefore I guess the other approach would need to be to pass the counterparty_node_id with the affected MonitorEvents (UpdateFailed and UpdateCompleted). However, that is also problematic as that would require that we add the counterparty_node_id to some field in the MonitorHolder/ChannelMonitorImpl, so that it can be passed when the Events are generated. Adding the counterparty_node_id to the ChannelMonitorImpl would ruin the backwards compatibility though, as it would need to be a required field, and any ChannelMonitorImpl objects serialized prior to adding the field would lack it...

Am I missing something, or do you have any suggestion of how to solve this problem?

Question 2a)
I found this comment in the docs for per_peer_state:
If also holding channel_state lock, must lock channel_state prior to per_peer_state.

Is the comment above only applicable for ensuring that we lock the channel storage before the per_peer_state(the current ChannelHolder::by_id map, which is now moved out from ChannelHolder which makes the comment irrelevant now)? Or is it applicable for all other maps of the ChannelHolder as well?

Question 2b)
Given the comment in 2a should I modify my proposed design above, and move the Mutex for each field in the per_peer_state out to instead lock the entire PeerState struct?

Old design proposal (with slightly updated naming to be more correct):

per_peer_state: RwLock<HashMap<PublicKey, PeerState<Signer>>>,

pub(super) struct PeerState<Signer: Sign> {
	pub(super) channel_by_id: Mutex<HashMap<[u8; 32], Channel<Signer>>>,
	pub(super) latest_features: Mutex<InitFeatures>,
}

New design proposal:

per_peer_state: RwLock<HashMap<PublicKey, Mutex<PeerState<Signer>>>>,

pub(super) struct PeerState<Signer: Sign> {
	pub(super) channel_by_id: HashMap<[u8; 32], Channel<Signer>>,
	pub(super) latest_features: InitFeatures,
}

That would still open up for some per-peer parallelism, though perhaps to some lesser degree. But given the comment in 2a, the extra potential parallelism by the old design proposal seems be something we don't want anyways.

Question 3)
An unfortunate side effect with the proposed design, is that looping over all channels becomes much less efficient.
To loop over all channels, we now need to loop over all Peers (even those we have no channels to), and then lock each channel_by_id Mutex for each Peer, and then loop over any potential channels in the map.

Looping over each channel is sadly done in quite a few places, and unfortunately I think theres no way around this inefficient looping in a few instances. Is that an ok efficiency loss?

In a few instances where we are looping over the channels though, we are only interested in channels which have the funding_tx / channel_id, so that might leave room to make the looping more efficient.

Lastly:
For question 1, and partly for question 3, my suggestion above about the channel_peer_by_id: RwLock<HashMap<[u8; 32], PublicKey>> where we only store channels with a set channel_id (and not channels with only temporary_channel_ids) could be helpful, so let me know if you think it would be worth implementing that.

@TheBlueMatt
Copy link
Collaborator

Question 1)

Grrrrrrrrr, yea, that's a good point, and somewhat frustrating indeed. Long-term we can consider changing the monitor updates to include the counterparty node id (which I think may be useful in ChannelMonitors for other reasons), but for now I suppose we'll need to reintroduce the channel_id/funding_txo -> counterparty id lookup table (though only using it in monitor update handling).

Question 2a)
This is just a general lock-order comment - we have to be consistent in the order we take any locks, and in this case the order is channel_state before per_peer-state. We should be able to flip that order depending on the various callsites, but tests will blow up on you if the lockrder isn't consistent, so you should be able to see it pretty quick.

Question 2b)
I think we should take the "new design proposal". I don't think we care very much about parallelism within a single peer - we can only handle one message from each peer at a time, so it'd just be a difference of sending + handling message at the same time, which isn't gonna be better by the "old design proposal" anyway.

Question 3)
I'm not particularly worried about this - we generally do the looping on a timer or serialization, which isn't all that time-critical, and it shouldn't be a huge change anyway - its only the introduction of an extra lock, not a different datastructure.

@ViktorTigerstrom
Copy link
Contributor

Thanks a lot for the detailed answers @TheBlueMatt!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants