Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify and relax requirements on channel_reestablish #932

Closed
wants to merge 1 commit into from

Conversation

pm47
Copy link
Collaborator

@pm47 pm47 commented Oct 27, 2021

The peer that is using an outdated commitment must not fail the channel itself, rather it should request the other peer to fail the channel.

Also, I think we could relax the requirements on the "up-to-date" side of the channel: if we detect that remote is using an outdated state, we could also stay idle and give remote a chance to fix its data? Maybe the node operator just accidentally started their node using a wrong directory. Current behavior is a bit trigger happy. Of course if remote sends an error, then we force-close.

The peer that is using an outdated commitment must not fail the channel itself, rather it should request the other peer to fail the channel.
@t-bast
Copy link
Collaborator

t-bast commented Oct 27, 2021

Also, I think we could relax the requirements on the "up-to-date" side of the channel: if we detect that remote is using an outdated state, we could also stay idle and give remote a chance to fix its data?

I totally agree, the up-to-date node shouldn't force-close immediately when they detect that their peer is behind, they should wait for the peer to send an error.

@rustyrussell
Copy link
Collaborator

There's nothing wrong with this clarification. A node is supposed to fail the channel on either send or receive of an error; we don't define "fail the channel" but logically "send errors from now on and broadcast the latest commit tx": the spec already says when you must not broadcast (because it's not the latest).

If you didn't want to fail the channel, don't send an error, send a warning.

(C-lightning doesn't fail on error receipt, we'd have to fix that if we wanted to start sending a warning in these cases!).

@@ -1381,7 +1381,7 @@ A node:
`your_last_per_commitment_secret` is correct for that
`next_revocation_number` minus 1:
- MUST NOT broadcast its commitment transaction.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as @rustyrussell mentioned and according to BOLT 1 upon sending an error the node SHOULD fail the channel and failing usually means broadcasting the commitment transaction.

This situation seems a little bit paradox because obviously in this case the node does not want to / MUST NOT broadcast its commitment transaction and thus can't really fail the channel.

I wonder if this could also leave the node in a problematic situation indicating the peer that they may attempt to cheat? But I guess it can't be worse than the already existing situation.

Copy link
Contributor

@lightning-developer lightning-developer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can either make the recommendation who fails the channel when sending or receiving an error more precise or relax them in BOLT1.

One idea that comes to my mind is that we could define that failing a channel always starts with sending an error depending on the state / situation we have to define how the channel is to be failed / how the error is being handled.

Also @rustyrussell wrote:

If you didn't want to fail the channel, don't send an error, send a warning.

edit:

Do we have warnings? I could not find a definition for warning messages in BOLT 1.

Never mind! I discovered #834 which is a delight because after my review yesterday I was about to open an issue suggesting that we could properly define what we mean with fail the channel.

Copy link
Collaborator

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK ad3ed9d

Copy link
Contributor

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack ad3ed9d

t-bast added a commit to t-bast/bolts that referenced this pull request Jan 4, 2022
It was decided in the spec meeting to combine lightning#932 and lightning#942 as they are
closely related. It was also decided to add a rationale for lightning#932 to clearly
state why nodes must wait for an error before force-closing instead of
eagerly force-closing when detecting that their peer is behind.
rustyrussell added a commit to rustyrussell/lightning that referenced this pull request Jan 8, 2022
This was put in late 2019, and @t-bast says Eclair doesn't ignore their
errors and has had no issues.

It also conflicts with lightning/bolts#932
which suggests you *should* fail when you receive an error.

Signed-off-by: Rusty Russell <[email protected]>
rustyrussell added a commit to rustyrussell/lightning that referenced this pull request Jan 9, 2022
This was put in late 2019, and @t-bast says Eclair doesn't ignore their
errors and has had no issues.

It also conflicts with lightning/bolts#932
which suggests you *should* fail when you receive an error.

Signed-off-by: Rusty Russell <[email protected]>
rustyrussell added a commit to rustyrussell/lightning that referenced this pull request Jan 12, 2022
This was put in late 2019, and @t-bast says Eclair doesn't ignore their
errors and has had no issues.

It also conflicts with lightning/bolts#932
which suggests you *should* fail when you receive an error.

Signed-off-by: Rusty Russell <[email protected]>
lightning-developer pushed a commit to lightning-developer/lightning-rfc that referenced this pull request Jan 17, 2022
It was decided in the spec meeting to combine lightning#932 and lightning#942 as they are
closely related. It was also decided to add a rationale for lightning#932 to clearly
state why nodes must wait for an error before force-closing instead of
eagerly force-closing when detecting that their peer is behind.
@lightning-developer
Copy link
Contributor

As requested by @t-bast I have cherry-picked from his branch and updated #942. Thus I believe this PR can be closed and #942 is ready to be merged.

@t-bast
Copy link
Collaborator

t-bast commented Jan 17, 2022

Closing in favor of #942 (which has already been merged)

@t-bast t-bast closed this Jan 17, 2022
rustyrussell added a commit to rustyrussell/lightning that referenced this pull request Jan 20, 2022
This was put in late 2019, and @t-bast says Eclair doesn't ignore their
errors and has had no issues.

It also conflicts with lightning/bolts#932
which suggests you *should* fail when you receive an error.

Signed-off-by: Rusty Russell <[email protected]>
rustyrussell added a commit to ElementsProject/lightning that referenced this pull request Jan 24, 2022
This was put in late 2019, and @t-bast says Eclair doesn't ignore their
errors and has had no issues.

It also conflicts with lightning/bolts#932
which suggests you *should* fail when you receive an error.

Signed-off-by: Rusty Russell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants