Peer channel split, so we can sanely allow a peer to reopen a channel while others are closed #975

rustyrussell · 2018-02-11T11:18:20Z

This moves the internal structure closer to the db structure, where we separate peers and channels. It is conceptually closer to the final goal of having peers and channels be the actual interface to the db (perhaps with a simple cache).

Much of the heavy lifting during the transition is done by channel2peer and peer2channel, to avoid a giant megapatch. These assume a 1:1 relationship between the two, which is true until the very final patch.

Signed-off-by: Rusty Russell <[email protected]>

ZmnSCPxj · 2018-02-11T23:06:01Z

Concept ACK

Much like the database: peer contains the id and address, while channel contains per-channel information. Where we create a channel, we always create the peer too. For the moment, peer->log and channel->log coexist side by side, to reduce some of the churn. Note that this changes the API of dev-forget-channel: if the peer has more than one channel, we insist that the caller specify the short-channel-id. Signed-off-by: Rusty Russell <[email protected]>
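A minimal sketch of the split described in this commit message, assuming simplified, hypothetical structs rather than the real lightningd definitions. The field names `scid` and `next` and the helper `channel_to_forget()` are illustrative only; the sketch shows the dev-forget-channel rule that the short-channel-id may be omitted only when the peer has exactly one channel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types: not the real lightningd structs. */
struct peer;

struct channel {
	struct peer *peer;      /* back-pointer to the owning peer */
	uint64_t dbid;          /* database ID: 0 == not in db yet */
	uint64_t scid;          /* stand-in for the short-channel-id */
	struct channel *next;   /* next channel of the same peer */
};

struct peer {
	const char *id;         /* stands in for the node pubkey */
	const char *addr;       /* last known address */
	struct channel *channels;
};

/* Pick the channel to forget.  With exactly one channel the scid may be
 * omitted (NULL); with several, the caller must name one. */
static struct channel *channel_to_forget(struct peer *peer,
					 const uint64_t *scid)
{
	struct channel *c, *found = NULL;
	size_t matches = 0;

	for (c = peer->channels; c; c = c->next) {
		if (scid && c->scid != *scid)
			continue;
		found = c;
		matches++;
	}
	/* Zero matches, or an ambiguous request, is an error. */
	return matches == 1 ? found : NULL;
}
```

Returning NULL for the ambiguous case leaves it to the caller to report an error asking for the short-channel-id.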

cdecker

You really weren't joking when you said this is a large PR, but it's an excellent read 😉

I really don't like having DB operations hidden in destructors. They should always be explicit at the point where we free the memory, and should happen only if we really want to forget the channel.

NACK fefb026

cdecker · 2018-02-12T17:23:04Z

lightningd/peer_control.c

@@ -181,6 +181,7 @@ static void free_peer(struct peer *peer, const char *why)
 		command_fail(peer->opening_cmd, "%s", why);
 		peer->opening_cmd = NULL;
 	}
+	wallet_channel_delete(peer->ld->wallet, peer->channel->id);


Doesn't this also mean we delete all channels from the DB on a clean shutdown? That'd definitely be undesirable.

I prefer having an explicit call to wallet_channel_delete when we're sure we want to forget about the channel, rather than have this hidden DB change in a destructor anyway.

cdecker · 2018-02-12T17:23:59Z

wallet/wallet.c

+	sqlite3_stmt *stmt;
+
+	/* Get rid of OPENINGD entries; they don't last across reconnects */
+	stmt = db_prepare(w->db, "DELETE FROM channels WHERE state=?");


Should really be a migration and OPENINGD channels should never be added again after the migration is applied.

Yes, that's in a future patch. For the moment we still put opening channels in the DB, so this is needed.

cdecker · 2018-02-12T17:25:56Z

wallet/wallet.c

 {
 	sqlite3_stmt *stmt;
-	stmt = db_prepare(w->db, "DELETE FROM channels WHERE id=?");
+	/* FIXME: The line to clean up if we're last channel for peer would


Please no 😉 Let's not distribute business logic across more components.

Well, I see it the other way: eventually channel.c moves into wallet, and we use accessors. That makes it an internal in-memory cache.

But the comment was more about how we want ON DELETE CASCADE in reverse, and whether we could do it better than this.

I somewhat agree with @rustyrussell here. I think we should accept SQLITE3 as our backend database and build subsystems that directly manipulate the DB while exposing a small number of functions that perform parts of the business logic on the DB. xref. wallet/invoices.

Having both triggers and caches is a bad idea: a trigger may delete a row in the DB while the cache still looks fine, so you'd end up replicating the logic in C anyway to keep the cache coherent. Alternatively you can remove caches altogether, at which point I'd be reluctantly willing to move some business logic over to the DB.

I have yet to see a project that uses triggers and doesn't add complexity as a result. Maybe this is the first, but I highly doubt it 😉

cdecker · 2018-02-12T17:37:07Z

lightningd/channel.h

+	struct peer *peer;
+
+	/* Database ID: 0 == not in db yet */
+	u64 dbid;


Any reason we aren't just calling this id?

id is used everywhere for node pubkey. This is much more explicit as to what it is.

cdecker · 2018-02-12T18:17:23Z

lightningd/channel.c

+		command_fail(channel->opening_cmd, "%s", why);
+		channel->opening_cmd = NULL;
+	}
+	wallet_channel_delete(channel->peer->ld->wallet, channel->dbid,


Here we again use a destructor to actually delete stuff in the DB. This is bad IMHO since the lifetime in memory is not the same as the lifetime in the DB. If we do a clean shutdown we may end up deleting all channels from the DB...

I have a strong preference for explicit DB operations rather than hiding them in destructors.

That's not the destructor. The destructors are all called destroy_xxx for that reason.

TBH I wanted the destructor-deletes semantic, but as you point out, shutdown would clean our db:)

> The destructors are all called destroy_xxx for that reason.

This surprises me. There are a few destructors that are not named by this convention (remove_timer, invoice_waiter_dtor, free_subd_req, remove_unstored_payment). How strict is this rule, and should we have a cleanup issue for this?

Ok, then I propose renaming free_channel to delete_channel, since free_* suggests a memory operation to me :-)

Agreed, will push on top.

And will push destroy renaming too...

@cdecker

free_channel() sounds like a destructor. Suggested-by: @cdecker Signed-off-by: Rusty Russell <[email protected]>

rustyrussell · 2018-02-13T23:53:36Z

OK, all feedback should be addressed now! Thanks!

@cdecker

This provides a sanity check that we are in sync, and also keeps the logic in the program and out of the SQL. Since the destructor now doesn't clean up the peer, there are some wider changes to be made when cleaning up. Most notably, we create lots of channels in run-wallet.c, and freeing them previously freed the peer too: now we need to free the peer explicitly, so we need to free the channels first. Suggested-by: @cdecker Signed-off-by: Rusty Russell <[email protected]>
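To illustrate the distinction this thread converged on, here is a rough sketch that assumes a hypothetical `struct wallet` which merely counts rows; it is not the real wallet API. `destroy_channel()` stands for a memory-only cleanup that is safe at shutdown, while `delete_channel()` is the explicit "forget" path that also removes the DB row and applies the no-channels-left rule in C rather than in an SQL trigger.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the wallet: just counts stored rows. */
struct wallet {
	int channel_rows;
	int peer_rows;
};

struct peer {
	struct wallet *w;
	int num_channels;
};

struct channel {
	struct peer *peer;
};

/* Memory-only cleanup: safe at shutdown, never touches the DB. */
static void destroy_channel(struct channel *channel)
{
	channel->peer->num_channels--;
	free(channel);
}

/* Explicit "forget": delete the row, then release the memory. */
static void delete_channel(struct channel *channel)
{
	struct peer *peer = channel->peer;
	struct wallet *w = peer->w;

	w->channel_rows--;
	destroy_channel(channel);

	/* Program-side rule instead of an SQL trigger: a peer with no
	 * channels left is removed from the DB as well.  The peer's own
	 * memory is left for the caller to free explicitly. */
	if (peer->num_channels == 0)
		w->peer_rows--;
}
```

With this shape, a clean shutdown that only calls the memory-only path leaves every row in place.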

@ZmnSCPxj

We usually did this, but sometimes they were named after what they did, rather than what they cleaned up. There are still a few exceptions:
1. I didn't bother creating destroy_xxx wrappers for htable routines which already existed.
2. Sometimes destructors really are used for side-effects (eg. to simply mark that something was freed): these are clearer with boutique names.
3. Generally destructors are static, but they don't need to be: in some cases we attach a destructor then remove it later, or only attach to *some* cases. These are best with qualifiers in the destroy_<type> name.
Suggested-by: @ZmnSCPxj Signed-off-by: Rusty Russell <[email protected]>

Changed as requested

cdecker · 2018-02-14T10:31:53Z

ACK 383afe8

rustyrussell added 3 commits February 11, 2018 21:32

gossipd/test: update mocks.

102a591

Signed-off-by: Rusty Russell <[email protected]>

db: don't allow newer db versions.

677d2a0

Clearly we could do more damage if we continue. Signed-off-by: Rusty Russell <[email protected]>
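The check this commit describes can be sketched as follows. This is an assumption-laden illustration: `KNOWN_DB_VERSION` and `db_version_ok()` are hypothetical names, and the value 25 is arbitrary, not the project's actual migration count.

```c
#include <stdbool.h>

/* Hypothetical: the newest schema version this binary knows about. */
#define KNOWN_DB_VERSION 25

/* Decide whether it is safe to open a database at `db_version`. */
static bool db_version_ok(int db_version)
{
	/* Older or equal: we can migrate forward, or there is nothing
	 * to do. */
	if (db_version <= KNOWN_DB_VERSION)
		return true;

	/* Newer: written by a later release whose schema we do not
	 * understand, so refuse rather than risk damaging it. */
	return false;
}
```

A caller would be expected to abort startup when this returns false instead of continuing with an unknown schema.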

channeld: rename new_channel to new_full_channel.

f937b71

This avoids clashing with the new_channel we're about to add to lightningd, and also matches its counterpart new_initial_channel. Signed-off-by: Rusty Russell <[email protected]>

rustyrussell added the Work in Progress label Feb 11, 2018

rustyrussell force-pushed the peer-channel-split branch from a99313e to 26c4342 Compare February 12, 2018 00:29

rustyrussell removed the Work in Progress label Feb 12, 2018

rustyrussell force-pushed the peer-channel-split branch 3 times, most recently from ee37f2b to 4b79feb Compare February 12, 2018 02:57

rustyrussell added 15 commits February 12, 2018 20:40

wallet: add ld pointer.

5e0123c

This will be required to give it direct access to the ld->peers list. Signed-off-by: Rusty Russell <[email protected]>

wallet: delete channels in state OPENINGD.

b10cf34

Both when we forget about an opening peer, and at startup. We're going to be relying on this, and the next patch, as we refactor peer/channel handling to mirror the db. Signed-off-by: Rusty Russell <[email protected]>

wallet: delete peers with no channels.

2d959d2

ON DELETE CASCADE goes the other way: we should clean up peers with no channels from db. Signed-off-by: Rusty Russell <[email protected]>

wallet: properly handle case where peer has no address when saving channel.

8e25301

In practice, it currently always does, so we've never hit an error. Signed-off-by: Rusty Russell <[email protected]>

lightningd: create new structure channel to hold per-channel info.

b41087d

This is not connected yet; during the transition, there will be a 1:1 mapping from channel to peer, so we can use channel2peer and peer2channel to shim between them. Signed-off-by: Rusty Russell <[email protected]>
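A sketch of how such shims might look while the 1:1 mapping holds. The names channel2peer, peer2channel, channel_set_owner and peer_set_owner come from the commit messages; the struct layouts and function bodies here are assumptions, and the owner is reduced to a plain string for illustration.

```c
#include <stddef.h>

/* Simplified, hypothetical layouts: one channel per peer for now. */
struct peer;

struct channel {
	struct peer *peer;
	const char *owner;      /* stand-in for the owning subdaemon */
};

struct peer {
	struct channel *channel;
};

/* Transitional shims: valid only while the mapping is 1:1. */
static inline struct peer *channel2peer(const struct channel *channel)
{
	return channel->peer;
}

static inline struct channel *peer2channel(const struct peer *peer)
{
	return peer->channel;
}

/* New-style function that takes a channel. */
static void channel_set_owner(struct channel *channel, const char *owner)
{
	channel->owner = owner;
}

/* Temporary wrapper so unconverted callers keep compiling. */
static void peer_set_owner(struct peer *peer, const char *owner)
{
	channel_set_owner(peer2channel(peer), owner);
}
```

Each later patch can then convert one group of callers and drop the corresponding wrapper, which keeps the individual diffs small.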

lightningd: channels own the peer.

018c93a

Channels are within the peer structure, but the peer is freed only when the last channel is freed. We also implement channel_set_owner() and make peer_set_owner() a temporary wrapper. Signed-off-by: Rusty Russell <[email protected]>
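The ownership rule in this commit (the peer goes away only with its last channel) can be sketched with a plain counter. This is a simplified illustration using malloc/free; the helper names and the `freed` flag are hypothetical and exist only to make the behaviour observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct peer {
	size_t num_channels;
	bool *freed;            /* demonstration only: set when freed */
};

struct channel {
	struct peer *peer;
};

static struct peer *new_peer(bool *freed)
{
	struct peer *peer = malloc(sizeof(*peer));

	if (!peer)
		return NULL;
	peer->num_channels = 0;
	peer->freed = freed;
	*freed = false;
	return peer;
}

static struct channel *new_channel(struct peer *peer)
{
	struct channel *channel = malloc(sizeof(*channel));

	if (!channel)
		return NULL;
	channel->peer = peer;
	peer->num_channels++;
	return channel;
}

/* Free one channel; the peer is freed along with its last channel. */
static void release_channel(struct channel *channel)
{
	struct peer *peer = channel->peer;

	assert(peer->num_channels > 0);
	free(channel);
	if (--peer->num_channels == 0) {
		*peer->freed = true;
		free(peer);
	}
}
```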

lightningd: rename peer_fail functions to channel_fail.

fefb026

And move them into channel.c. Signed-off-by: Rusty Russell <[email protected]>

subd: keep pointer to channel, not peer.

544f676

This rolls through many other functions, making them take channel not peer. Signed-off-by: Rusty Russell <[email protected]>

htlc: keep channel pointer, not peer pointer.

6d63093

And move the no-remaining-htlcs check from the peer destructor to the channel destructor. Signed-off-by: Rusty Russell <[email protected]>

lightningd/peer_htlcs: remove remaining peer_ shims.

8704434

Signed-off-by: Rusty Russell <[email protected]>

lightningd: bitcoind and topology routines take channel, not peer.

e0eb5f0

Signed-off-by: Rusty Russell <[email protected]>

lightningd: remove almost all other peer2channel / channel2peer shims.

32293ed

This final sweep only keeps peer2channel within peer_control.c for the reconnect case. Signed-off-by: Rusty Russell <[email protected]>

lightningd: remove peer->log in favor of channel->log.

9d16fa5

Signed-off-by: Rusty Russell <[email protected]>

lightningd: allow a new channel open from peer if no *active* channels.

ea40ebe

And return the correct error message for the channel they give, if they try to re-establish on an error channel. Signed-off-by: Rusty Russell <[email protected]>
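A rough sketch of the "no *active* channels" test, under stated assumptions: the state names below are a reduced, hypothetical set rather than the real channel state machine, and `find_active_channel()` and `may_open_new_channel()` are illustrative helpers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced state set for illustration. */
enum chan_state {
	CHAN_NORMAL,
	CHAN_SHUTTING_DOWN,
	CHAN_CLOSED,
	CHAN_ONCHAIN
};

struct channel {
	enum chan_state state;
	struct channel *next;
};

struct peer {
	struct channel *channels;
};

/* Assumption: a channel counts as active until it is closed or
 * being resolved on-chain. */
static bool channel_is_active(const struct channel *channel)
{
	return channel->state == CHAN_NORMAL
		|| channel->state == CHAN_SHUTTING_DOWN;
}

/* Return the peer's active channel, or NULL if it has none. */
static struct channel *find_active_channel(const struct peer *peer)
{
	struct channel *c;

	for (c = peer->channels; c; c = c->next)
		if (channel_is_active(c))
			return c;
	return NULL;
}

/* A new open is acceptable only when nothing is still active. */
static bool may_open_new_channel(const struct peer *peer)
{
	return find_active_channel(peer) == NULL;
}
```

Old closed or on-chain channels stay attached to the peer, which is what makes it possible to answer a re-establish attempt with the error for that specific channel.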

rustyrussell force-pushed the peer-channel-split branch from 4b79feb to ea40ebe Compare February 12, 2018 10:13

cdecker previously requested changes Feb 12, 2018

cdecker mentioned this pull request Feb 12, 2018

Fix crash when trying to forget a channel with HTLCs #987

Closed

channel: rename free_channel to delete_channel.

afa19c5

free_channel() sounds like a destructor. Suggested-by: @cdecker Signed-off-by: Rusty Russell <[email protected]>

rustyrussell force-pushed the peer-channel-split branch 3 times, most recently from 12c60a7 to 25753aa Compare February 14, 2018 01:24

rustyrussell added 2 commits February 14, 2018 12:23

rustyrussell force-pushed the peer-channel-split branch from 25753aa to 383afe8 Compare February 14, 2018 01:53

rustyrussell mentioned this pull request Feb 14, 2018

Opening cleanups #996

Merged

cdecker merged commit 55d9620 into ElementsProject:master Feb 14, 2018

ZmnSCPxj mentioned this pull request Feb 15, 2018

unable to reopen uncooperative channel close between 2 c-lightning. #720

Closed

ZmnSCPxj mentioned this pull request Feb 26, 2018

Peers not being forgotten after disconnect #522

Closed
