-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Connectd does demux part 1 #4984
Connectd does demux part 1 #4984
Commits on Jan 8, 2022
-
Configuration menu - View commit details
-
Copy full SHA for 635e72b - Browse repository at this point
Copy the full SHA 635e72bView commit details -
connectd: keep timeout timer around so we can disable it.
connectd will be keeping the conn open, so it needs to free this "conn_timeout" timer. Pass it through, so we can do that. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for f99d086 - Browse repository at this point
Copy the full SHA f99d086View commit details -
pytest: make test_channel_state_changed_unilateral more robust.
This test started mostly failing (in non-DEVELOPER mode) after the next patch, due to timing issues. Handle both cases for now, and we'll add more enhancements later to things we should be handling more consistently. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 9d6466a - Browse repository at this point
Copy the full SHA 9d6466aView commit details -
connectd: maintain connection with peer, shuffle data.
Instead of passing the incoming socket to lightningd for the subdaemon, create a new one and simply shuffle data between them, keeping connectd in the loop. For the moment, we don't decrypt at all, just shuffle. This means our buffer code is kind of a hack, but that goes away once we start actually decrypting and understanding message boundaries. This implementation is naive: it closes the socket to the local daemon as soon as the peer closes the socket to us. This is fixed in a successive patch series (along with many other similar issues). Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for f961e4a - Browse repository at this point
Copy the full SHA f961e4aView commit details -
gossipwith: create our own internal sync_crypto functions.
This avoids changes to crypto_sync which are coming in next patch. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 8c05f73 - Browse repository at this point
Copy the full SHA 8c05f73View commit details -
lightningd: make sure gossipd knows before we update blockheight.
Without this, we can get spurious failures from lnprototest, which is waiting for lightningd to acknowledge the blockheight in getinfo: ``` 2021-12-21T00:56:21.460Z DEBUG lightningd: Adding block 109: 57a7bd3ade3a88a899e5b442075e9722ccec07e0205f9a913b304a3e2e038e26 2021-12-21T00:56:21.470Z DEBUG lightningd: Adding block 110: 11a280eb76f588e92e20c39999be9d2baff016c3c6bac1837b649a270570b7dd 2021-12-21T00:56:21.479Z DEBUG lightningd: Adding block 111: 02977fc9529b2ab4e0a805c4bc1bcfbff5a4e6577a8b31266341d22e204a1d27 2021-12-21T00:56:21.487Z DEBUG lightningd: Adding block 112: 2402f31c5ddfc9e847e8bbfb7df084d29a5d5d936a4358c109f2f4cf9ea8d828 2021-12-21T00:56:21.496Z DEBUG lightningd: Adding block 113: 5a561fe9423b4df33f004fc09985ee3ef38364d692a56a8b27ecbc6098a16d39 2021-12-21T00:56:21.505Z DEBUG lightningd: Adding block 114: 4502f5ec23c89177872846848848322e8fa6c3fb6f5eb361194e4cd47596dfe9 2021-12-21T00:56:21.511Z DEBUG 02f9308a019258c31049344f85f89d5229b531c845836f99b08601f113bce036f9-gossipd: Ignoring future channel_announcment for 109x1x0 (current block 108) ``` Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for d322f9c - Browse repository at this point
Copy the full SHA d322f9cView commit details -
connectd: do decryption for peers.
We temporarily hack to sync_crypto_write/sync_crypto_read functions to not do any crypto, and do it all in connectd. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for da36066 - Browse repository at this point
Copy the full SHA da36066View commit details -
peer_io: replace crypto_sync in daemons, use normal wire messages.
Now connectd is doing the crypto, we can use normal wire io. We create helper functions to clearly differentiate between "peer" comms and intra-daemon comms though. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 1e55691 - Browse repository at this point
Copy the full SHA 1e55691View commit details -
per_peer_state: remove struct crypto_state
Now that connectd does the crypto, no need to hand around crypto_state. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 36f4a75 - Browse repository at this point
Copy the full SHA 36f4a75View commit details -
connectd: do dev_disconnect logic.
As connectd handles more packets itself, or diverts them to/from gossipd, it's the only place we can implement the dev_disconnect logic. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for d215f9e - Browse repository at this point
Copy the full SHA d215f9eView commit details -
connectd: do nagle by packet type.
channeld can't do it any more: it's using local sockets. Connectd can do it, and simply does it by type. Amazingly, on my machine the timing change *always* caused test_channel_receivable() to fail, due to a latent race. Includes feedback from @cdecker. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 1303733 - Browse repository at this point
Copy the full SHA 1303733View commit details -
common: add routine for absolute timeouts (vs. relative).
Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for cc6f91a - Browse repository at this point
Copy the full SHA cc6f91aView commit details -
connectd: serve gossip_store file for the peer.
We actually intercept the gossip_timestamp_filter, so the gossip_store mechanism inside the per-peer daemon never kicks off for normal connections. The gossipwith tool doesn't set OPT_GOSSIP_QUERIES, so it gets both, but that only effects one place. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for d9f9d39 - Browse repository at this point
Copy the full SHA d9f9d39View commit details -
subdaemons: don't stream gossip_store at all.
We now let gossipd do it. This also means there's nothing left in 'struct per_peer_state' to send across the wire (the fds are sent separately), so that gets removed from wire messages too. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for e31bccf - Browse repository at this point
Copy the full SHA e31bccfView commit details
Commits on Jan 11, 2022
-
Configuration menu - View commit details
-
Copy full SHA for 7555826 - Browse repository at this point
Copy the full SHA 7555826View commit details -
connectd: remove per_peer_state in favor of keeping gossip_fd directly.
Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for cd679cc - Browse repository at this point
Copy the full SHA cd679ccView commit details -
connectd: put more stuff into struct gossip_state.
We're the only ones who use it now, so put our fields inside it and make it local. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 12a0a05 - Browse repository at this point
Copy the full SHA 12a0a05View commit details -
connectd: get addresses from lightningd, not gossipd.
It's weird to have connectd ask gossipd, when lightningd can just do it and hand all the addresses together. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for a2a8893 - Browse repository at this point
Copy the full SHA a2a8893View commit details -
various: minor cleanups from Christian's review.
More significant things have been folded. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 65db41b - Browse repository at this point
Copy the full SHA 65db41bView commit details -
connectd: drop support (unused) for @ during handshake.
We could implement it, but we don't have to. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for eeff573 - Browse repository at this point
Copy the full SHA eeff573View commit details -
connectd: implement @ correctly.
dev_blackhole_fd was a hack, and doesn't work well now we are async (it worked for sync comms in per-peer daemons, but now we could sneak through a read before we get to the next write). So, make explicit flags and use them. This is much easier now we have all peer comms in one place. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 5bc92fc - Browse repository at this point
Copy the full SHA 5bc92fcView commit details -
msg_queue: don't allow magic MSG_PASS_FD message for peers.
msg_queue was originally designed for inter-daemon comms, and so it has a special mechanism to mark that we're trying to send an fd. Unfortunately, a peer could also send such a message, confusing us! Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 029fdcf - Browse repository at this point
Copy the full SHA 029fdcfView commit details -
ccan: update to get io_sock_shutdown
Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 22a5676 - Browse repository at this point
Copy the full SHA 22a5676View commit details -
connectd: don't just close to peer, but use shutdown().
We would lose packets sometimes due to this previously, but it doesn't happen over localhost so our tests didn't notice. However, now we have connectd being sole thing talking to peers, we can do a more elegant shutdown, which should fix closing. Signed-off-by: Rusty Russell <[email protected]> Changelog-Fixed: Protocol: Always flush sockets to increase chance that final message get to peer (esp. error packets).
Configuration menu - View commit details
-
Copy full SHA for 3df6eba - Browse repository at this point
Copy the full SHA 3df6ebaView commit details -
connectd: also do the shutdown()-close for final_msg sends.
Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for c35dde8 - Browse repository at this point
Copy the full SHA c35dde8View commit details -
connectd: flush queues before hanging up.
This is critical in the common case where peer sends an error and hangs up: we almost never get to relay the error to the subd in time. This also applies in the other direction: we need to flush the queue to the peer when the subd closes. Note we only free the actual peer struct when lightningd reaps us with connectd_peer_disconnected(). Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 2ff7746 - Browse repository at this point
Copy the full SHA 2ff7746View commit details -
Configuration menu - View commit details
-
Copy full SHA for e8bd20f - Browse repository at this point
Copy the full SHA e8bd20fView commit details -
pytest: fix flake in test_reconnect_sender_add1
l1 might split in a commitment_signed before it notices the disconnect, and this test fails: ``` for i in range(0, len(disconnects)): with pytest.raises(RpcError): l1.rpc.sendpay(route, rhash, payment_secret=inv['payment_secret']) > l1.rpc.waitsendpay(rhash) E Failed: DID NOT RAISE <class 'pyln.client.lightning.RpcError'> ``` Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 9199de5 - Browse repository at this point
Copy the full SHA 9199de5View commit details -
peer subds: ignore failed writes.
In the case where the peer sends an error (and hangs up) immediately after init, connectd *doesn't actually read the error* (even after all the previous fixes so it actually receives the error!). This is because to tried to first write WIRE_CHANNEL_REESTABLISH, and that fails, so it never tries to read. Generally, we should ignore write failures; we'll find out if the socket is closed when we read nothing. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 7e85667 - Browse repository at this point
Copy the full SHA 7e85667View commit details -
pytest: disable tests/test_closing.py::test_onchain_all_dust
Here's the "Normal Tet Config clang-fuzzing" setup where it fails: ``` CC=clang CONFIGURATOR_CC=clang CWARNFLAGS=-Wall -Wundef -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wold-style-definition -Werror CDEBUGFLAGS=-std=gnu11 -g -fstack-protector-strong COPTFLAGS= SQLITE3_CFLAGS= SQLITE3_LDLIBS=-lsqlite3 POSTGRES_INCLUDE=-I/usr/include/postgresql POSTGRES_LDLIBS=-L/usr/lib/x86_64-linux-gnu -lpq VALGRIND=0 DEVELOPER=1 EXPERIMENTAL_FEATURES=0 COMPAT=1 ``` Here's the truncated test output: ``` if anchor_expected(): expected_1['B'].append(('external', ['anchor'], None, None)) expected_2['B'].append(('external', ['anchor'], None, None)) expected_1['B'].append(('wallet', ['anchor'], None, None)) expected_2['B'].append(('wallet', ['anchor'], None, None)) tags = check_utxos_channel(l1, [channel_id], expected_1) > check_utxos_channel(l2, [channel_id], expected_2, tags) tests/test_closing.py:2662: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tests/utils.py:321: in check_utxos_channel txid = matchup_events(u_set, evs, chans, tag_list) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ u_set = [[{'account_id': 'external', 'blockheight': 104, 'coin_type': 'bcrt', 'credit': '0msat', ...}, {'account_id': 'external', 'blockheight': 110, 'coin_type': 'bcrt', 'credit': '0msat', ...}]] evs = [('external', ['to_them'], None, None), ('external', ['htlc_timeout'], None, None)] chans = ['8ede62cea34c5196467c68175f70b8915f0edda421c5087a99584d4197cfb6c4'] tag_list = {'A': 'c5b6cf97414d58997a08c521a4dd0e5f91b8705f17687c4696514ca3ce62de8e', 'B': 'bb09b25d6653aeeec188961347ff80e90dca6f4a29cc017856f6585adb8cb468'} def matchup_events(u_set, evs, chans, tag_list): assert len(u_set) == len(evs) and len(u_set) > 0 txid = u_set[0][0]['utxo_txid'] for ev in evs: found = False for u in u_set: # We use 'cid' as a placeholder for the channel id, since it's # dyanmic, but we need to sub it in. 'chans' is a list of cids, # which are mapped to `cid` tags' suffixes. eg. 'cid1' is the # first cid in the chans list if ev[0][:3] == 'cid': idx = int(ev[0][3:]) acct = chans[idx - 1] else: acct = ev[0] if u[0]['account_id'] != acct or u[0]['tags'] != ev[1]: continue if ev[2] is None: > assert u[1] is None E AssertionError ```
Configuration menu - View commit details
-
Copy full SHA for 3f532bf - Browse repository at this point
Copy the full SHA 3f532bfView commit details -
pytest: disable automatic reconnection.
We seem to hit a race between manual reconnect (with address hint) and an automatic reconnection attempt which fails: ``` > l4.rpc.connect(l3.info['id'], 'localhost', l3.port) ... E pyln.client.lightning.RpcError: RPC call failed: method: connect, payload: {'id': '035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d', 'host': 'localhost', 'port': 41285}, error: {'code': 401, 'message': 'All addresses failed: 127.0.0.1:36678: Connection establishment: Connection refused. '} ``` See how it didn't even try the given address? Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for f096e8e - Browse repository at this point
Copy the full SHA f096e8eView commit details -
connectd: make sure we io_log msgs doing to gossipd.
test_gossip_no_empty_announcements relies on this! Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for f63b944 - Browse repository at this point
Copy the full SHA f63b944View commit details -
connectd: don't ignore requests to connect if we're shutting down.
We used to shut down peers atomically, but now we flush the connections there's a delay. If we are asked to connect in that time, we ignore it, as we are already connected, but that's wrong: we need to remember that we were told to connect and reconnect. This should solve a few weird test failures where "connect" would hang indefinitely. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 49c8d95 - Browse repository at this point
Copy the full SHA 49c8d95View commit details
Commits on Jan 12, 2022
-
pytest: disable test_closing_different_fees for elements temporarily.
There's actually a bug in our closing tx size estimation; I'll do a separate patch for this, though. Seems this used to be flaky, now we always flush queues, so it's more reliably caught. Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 013d895 - Browse repository at this point
Copy the full SHA 013d895View commit details -
pytest: fix race when we mine blocks after pay().
This seems to trigger now, especially on PostgresQL (maybe it's faster to process blocks?). e.g. test_closing_simple() hangs in close(), because the close is unilateral because the HTLC timed out, so it's waiting for a block (other lines removed): ``` lightningd-1: 2022-01-12T00:33:46.258Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: peer_out WIRE_COMMITMENT_SIGNED lightningd-1: 2022-01-12T00:33:46.278Z DEBUG lightningd: close_command: timeout = 172800 2022-01-12T01:03:36.9757201Z lightningd-2: 2022-01-12T00:33:46.384Z DEBUG lightningd: Adding block 104: 73ffa19d27d048613b2731e1682b4efff0dc226807d8cc99d724523c2ea58204 2022-01-12T01:03:36.9759053Z lightningd-2: 2022-01-12T00:33:46.396Z DEBUG lightningd: Adding block 105: 44fd06ed053a0d0594abcfefcfa69089351fc89080826799fb4b278a68fe5c20 2022-01-12T01:03:36.9760865Z lightningd-2: 2022-01-12T00:33:46.406Z DEBUG lightningd: Adding block 106: 0fee2dcbd1376249832642079131275e195bba4fb49cc9968df3a899010bba0f 2022-01-12T01:03:36.9762632Z lightningd-2: 2022-01-12T00:33:46.418Z DEBUG lightningd: Adding block 107: 7f24f2d7d3e83fe3256298bd661e57cdf92b058440738fd4d7e1c8ef4a4ca073 2022-01-12T01:03:36.9773411Z lightningd-2: 2022-01-12T00:33:46.429Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: peer_in WIRE_REVOKE_AND_ACK 2022-01-12T01:03:36.9794707Z lightningd-2: 2022-01-12T00:33:46.437Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: Commits outstanding after recv revoke_and_ack 2022-01-12T01:03:36.9788197Z lightningd-2: 2022-01-12T00:33:46.433Z DEBUG lightningd: Adding block 108: 283b371fb5d1ef42980ea10ab9f5965a179af8e91ddf31c8176e79820e1ec54d 2022-01-12T01:03:36.9799347Z lightningd-2: 2022-01-12T00:33:46.439Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: HTLC 0[REMOTE] => RCVD_REMOVE_REVOCATION 2022-01-12T01:03:36.9808057Z lightningd-2: 2022-01-12T00:33:46.447Z UNUSUAL 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-chan#1: Peer permanent failure in CHANNELD_NORMAL: Fulfilled HTLC 0 RCVD_REMOVE_REVOCATION cltv 109 hit deadline ``` This is because `pay` returns from l1 when it has the preimage, not when the HTLC is fully resolved. Add a helper for this, and call it at the end of the pay test helper. We might need this elsewhere though! Signed-off-by: Rusty Russell <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 886bede - Browse repository at this point
Copy the full SHA 886bedeView commit details