restored channel has mismatched state on the two nodes #228

jl777 · 2017-08-23T10:41:11Z

from sendpay node: { "unique_id" : 0, "state" : "CHANNELD_NORMAL", "netaddr" : "Unknown protocol 0", "peerid" : "02779b57b66706778aa1c7308a817dc080295f3c2a6af349bb1114b8be328c28dc", "connected" : false, "channel" : "70332:1:0", "msatoshi_to_us" : 10125000000, "msatoshi_total" : 10125000000 }

from receiving node: { "unique_id" : 0, "state" : "CHANNELD_NORMAL", "netaddr" : "::ffff:5.9.253.196:55268", "peerid" : "03b03efcf647e6dd48b949e5b1f7e9e064257a5c48c2d56b1334b283b48338f821", "connected" : false, "channel" : "70332:1:0", "msatoshi_to_us" : 9900000000, "msatoshi_total" : 10125000000 }

I am also seeing a strange "netaddr" for reconnected peers. The restore seems close, but it does not quite bring the state back to what a fresh connect gives. For now, I need to make the node forget about the other node before sendpay works again. The only way I know to make it forget is removing the SQL DB file. Maybe there is a less drastic way to make a peer forgotten?

cdecker · 2017-08-23T11:03:17Z

The netaddr issue is that we don't store the address in the DB. It is likely no longer the same anyway: e.g., when the connection was initiated by the other end, the node would see a random high port. We will rely on gossip and the DNS seeds to get the current address for the node.

The second thing is that they disagree on their current msatoshis, that's a bit scarier. Can you tell us more about how to reproduce this? Which node connects to which other node and which node gets restarted?
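For reference, the disagreement can be checked mechanically. The sketch below uses only the fields pasted in the report and assumes that, with no HTLCs in flight, the two `msatoshi_to_us` values should add up to `msatoshi_total`; that invariant is an assumption for illustration, and `balances_consistent` is a hypothetical helper, not part of lightningd.

```python
import json


def balances_consistent(ours: dict, theirs: dict) -> bool:
    """Return True when both views of one channel add up to the channel total.

    Assumption: no HTLCs are in flight, so the two msatoshi_to_us values
    should sum exactly to msatoshi_total.
    """
    if ours["channel"] != theirs["channel"]:
        raise ValueError("entries describe different channels")
    if ours["msatoshi_total"] != theirs["msatoshi_total"]:
        return False
    return ours["msatoshi_to_us"] + theirs["msatoshi_to_us"] == ours["msatoshi_total"]


# Values copied from the two peer entries in the report (other fields omitted).
sender = json.loads(
    '{"channel": "70332:1:0", "msatoshi_to_us": 10125000000, "msatoshi_total": 10125000000}'
)
receiver = json.loads(
    '{"channel": "70332:1:0", "msatoshi_to_us": 9900000000, "msatoshi_total": 10125000000}'
)

print(balances_consistent(sender, receiver))  # False: 20025000000 != 10125000000
```

Running the same check on every restart would flag this kind of mismatch before the next sendpay is attempted.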

jl777 · 2017-08-23T11:34:11Z

Just get any error during the sendpay loop. Since it never gets past 100 sendpays, it always ends up in this state. I restart both nodes, but it doesn't seem to matter.

Then restart the node, and odds are very good that it won't restore state properly and you end up with things like:

lightningd(16350): peer 02779b57b66706778aa1c7308a817dc080295f3c2a6af349bb1114b8be328c28dc: Attempt to send HTLC but unowned (CHANNELD_NORMAL)
sendpay rhash.(2bcb9b3beaf32425d3bf2c449c0ff58115e7621c289971a970ee2e9f57b211a3) 0.00100000 to [{"id":"02779b57b66706778aa1c7308a817dc080295f3c2a6af349bb1114b8be328c28dc","channel":"70798:1:1","msatoshi":100000000,"delay":10}] -> "first peer not ready: WIRE_TEMPORARY_CHANNEL_FAILURE" preimage.0000000000000000000000000000000000000000000000000000000000000000

I am still trying to isolate the causal factor, but before the update I was always able to go into the sendpay loop. Now I need to wipe the DB file and do addfunds, fundchannel, etc. all over again.

Oh, one thing: is the paid invoices list critically needed for the payments? I am assuming that once an invoice is paid it can be deleted, but I wanted to make sure that isn't causing any problems.

The way to reproduce this is:

establish a channel
iterate sendpay as fast as possible until it fails

Now it is in a state where I have to wipe the DB file in order to get the nodes to be able to sendpay again.
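The reproduction steps above can be sketched as a small driver loop. The exact invoice and sendpay command lines are not shown in this thread, so the sketch takes the payment step as a caller-supplied function; `sendpay_until_failure` and `send_once` are hypothetical names, and the stand-in sender below only simulates a failure.

```python
from typing import Callable


def sendpay_until_failure(send_once: Callable[[int], bool], max_attempts: int = 1000) -> int:
    """Call send_once(i) back to back and return how many payments succeeded
    before the first failure (or max_attempts if none failed)."""
    for attempt in range(max_attempts):
        if not send_once(attempt):
            return attempt
    return max_attempts


def fake_send(attempt: int) -> bool:
    """Stand-in for one invoice + sendpay round trip; fails on the 4th call."""
    return attempt < 3


print(sendpay_until_failure(fake_send))  # 3
```

In a real run, `send_once` would wrap whatever invoice and sendpay calls are being looped, and the node would be restarted once the function returns.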

jl777 · 2017-08-23T11:47:52Z

After getting the EXPIRY error, both nodes have the correct values in the channel:

sending node: { "unique_id" : 0, "state" : "CHANNELD_NORMAL", "netaddr" : "5.9.253.195:9735", "peerid" : "02779b57b66706778aa1c7308a817dc080295f3c2a6af349bb1114b8be328c28dc", "connected" : true, "owner" : "lightningd_channel", "channel" : "70880:1:1", "msatoshi_to_us" : 90950000000, "msatoshi_total" : 101250000000 }

receiving node: { "unique_id" : 0, "state" : "CHANNELD_NORMAL", "netaddr" : "::ffff:5.9.253.196:48872", "peerid" : "03b03efcf647e6dd48b949e5b1f7e9e064257a5c48c2d56b1334b283b48338f821", "connected" : true, "owner" : "lightningd_channel", "channel" : "70880:1:1", "msatoshi_to_us" : 10300000000, "msatoshi_total" : 101250000000 }

I will now stop each lightningd and restart it:

sending node: { "unique_id" : 0, "state" : "CHANNELD_NORMAL", "netaddr" : "Unknown protocol 0", "peerid" : "02779b57b66706778aa1c7308a817dc080295f3c2a6af349bb1114b8be328c28dc", "connected" : false, "channel" : "70880:1:1", "msatoshi_to_us" : 101250000000, "msatoshi_total" : 101250000000 }

receiving node: { "unique_id" : 0, "state" : "CHANNELD_NORMAL", "netaddr" : "Unknown protocol 0", "peerid" : "03b03efcf647e6dd48b949e5b1f7e9e064257a5c48c2d56b1334b283b48338f821", "connected" : false, "channel" : "70880:1:1", "msatoshi_to_us" : 0, "msatoshi_total" : 101250000000 } ] }

Both nodes appear to have forgotten all of the payments. Only the receiving node is deleting the paid invoice, so something else is amiss here, as that can't affect the sending node's sent amount.

Restore on startup appears to be the issue.
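A minimal sketch of how the before/after comparison could be automated, using the values pasted above. `restart_drift` is a hypothetical helper; it only diffs two saved peer entries and makes no assumption about how they were obtained.

```python
def restart_drift(before: dict, after: dict,
                  keys=("state", "msatoshi_to_us", "msatoshi_total")) -> dict:
    """Return {field: (before, after)} for each tracked field that changed
    across a restart."""
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


# Sending node, channel 70880:1:1, values copied from the output above.
sender_before = {"state": "CHANNELD_NORMAL", "msatoshi_to_us": 90950000000,
                 "msatoshi_total": 101250000000}
sender_after = {"state": "CHANNELD_NORMAL", "msatoshi_to_us": 101250000000,
                "msatoshi_total": 101250000000}

drift = restart_drift(sender_before, sender_after)
print(drift)  # {'msatoshi_to_us': (90950000000, 101250000000)}

# Amount the sender "got back" after the restart.
old, new = drift["msatoshi_to_us"]
print(new - old)  # 10300000000
```

The 10300000000 msatoshi that reappears on the sending side is exactly the balance the receiving node held before the restart, which matches both nodes reverting to the initial funding state.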

cdecker self-assigned this Aug 23, 2017

cdecker added the bug label Aug 23, 2017

cdecker mentioned this issue Aug 24, 2017

lightningd: Fix channel-persistence for channels with commits #231

Merged

rustyrussell closed this as completed in #231 Aug 26, 2017
