Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Onchaind replay different txs #1171

Merged

Conversation

rustyrussell
Copy link
Contributor

This fixes #1114 with some other incidental cleanups along the way.

Copy link
Contributor

@ZmnSCPxj ZmnSCPxj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments...

@@ -701,6 +705,9 @@ u32 wallet_first_blocknum(struct wallet *w, u32 first_possible)
else
first_utxo = UINT32_MAX;
}
#else
first_utxo = db_get_intvar(w->db, "last_processed_block", UINT32_MAX);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this not potentially lose onchain funds if onchain funds were added long before any channels were created? Maybe safer to use first_possible here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, first_possible is the first theoretical possibility for the blockchain.

first_utxo (could be called "first_unseen_utxo") is simply the last processed_block (plus one, but that would make this statement more complex).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will have to trust you on that, as the tx scanning is beyond my capability for now.

* comparison (ignoring signatures) works pretty well.
*
* FIXME: Better would be to compare outputs, but they weren't
* saved to db correctly until now. (COMPAT_V052)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment suggests we should have a #ifdef COMPAT_V052 or similar in the vicinity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in the FIXME as grep-fodder for COMPAT_V052. I'm not sure we'll actually do this, but at least we'll have the chance to do so when we decided to delete all those....

{
u8 *script = tal_arr(ctx, u8, 0);

add_op(&script, OP_RETURN);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not standard OP_RETURN require an additional push after the OP_RETURN? Otherwise the tx may not get broadcast across the Bitcoin network to reach a miner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bitcoind source doesn't seem to, but I added a test to be sure!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is test? I think regtest and testnet skip over IsStandardTx check?

The bitcoind source doesn't seem to indeed; it allows a plain OP_RETURN: https://github.com/bitcoin/bitcoin/blob/d889c036cd6f683116e6a27e404be2809d1deb76/src/script/standard.cpp#L88-L96

All the guides I found on the Internet indicate a single data push after OP_RETURN but the actual code seems to allow 0 to any number of data pushes, as long as the total length is below the limit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, found some other issues, took me much longer to push than I hoped!

{
u8 *script = tal_arr(ctx, u8, 0);

add_op(&script, OP_RETURN);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is test? I think regtest and testnet skip over IsStandardTx check?

The bitcoind source doesn't seem to indeed; it allows a plain OP_RETURN: https://github.com/bitcoin/bitcoin/blob/d889c036cd6f683116e6a27e404be2809d1deb76/src/script/standard.cpp#L88-L96

All the guides I found on the Internet indicate a single data push after OP_RETURN but the actual code seems to allow 0 to any number of data pushes, as long as the total length is below the limit.

@rustyrussell rustyrussell force-pushed the onchaind-replay-different-txs branch from 6dada6c to cce272f Compare March 6, 2018 05:48
@rustyrussell
Copy link
Contributor Author

OK, fixed some CI-found errors, and added 4 new commits....

Copy link
Contributor

@ZmnSCPxj ZmnSCPxj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me, though CI seems to disagree...

@rustyrussell rustyrussell force-pushed the onchaind-replay-different-txs branch from cce272f to aa2074d Compare March 6, 2018 11:06
@rustyrussell
Copy link
Contributor Author

More test errors: I had to fix the wallet COMPAT patch, as we weren't setting the last_processed_block variable on all cases.

@rustyrussell rustyrussell force-pushed the onchaind-replay-different-txs branch from aa2074d to 9f7363e Compare March 6, 2018 20:29
tal_resize(&tx->output, 0);
else
/* Result is trivial? Spent to OP_RETURN to avoid leaving dust. */
if (tx->output[0].amount < dust_limit_satoshis + fee) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solves the to_us respend issue when the value is below the fee I guess. Wouldn't it be better to not create that to_us output at all? But that'd require signaling to the peer that we're ok omitting our output in the commit, which is a spec change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is onchain-only: if fees rise and we can't spent, it's best to remove the UTXO.

The alternative would be to drop the feerate and cross our fingers (or consider it closed?).

For some outputs we're not in a hurry, it's just tidying up: they could potentially use a lower fee rate.

@rustyrussell rustyrussell force-pushed the onchaind-replay-different-txs branch from 9f7363e to 7b35410 Compare March 7, 2018 00:07
@rustyrussell
Copy link
Contributor Author

Rebased to fix conflicts.

@cdecker
Copy link
Member

cdecker commented Mar 7, 2018

Looks like we have just one include out of order, otherwise we're good to go :-)

In general, it is true that accessors should take const and discard it,
but chainparams is *always* const.

Signed-off-by: Rusty Russell <[email protected]>
This lets us clearly mark transition features, in a way that they can
be removed after 0.6 is released.

Signed-off-by: Rusty Russell <[email protected]>
Our testing also reveals a bug: we start lightningd and shut it down
before fully processing the blockchain, so we don't set
last_processed_block.  Fix that by setting it immediately once we have
a block: worst case it goes backwards a little.

Signed-off-by: Rusty Russell <[email protected]>
Use const, add TAKES to declaration.

Signed-off-by: Rusty Russell <[email protected]>
And fix up resulting breakage.

Signed-off-by: Rusty Russell <[email protected]>
This simplifies things, and means it's always in the database.  Our
previous approach to creating it on the fly had holes when it was
created for onchaind, causing us to use another every time we
restarted.

Signed-off-by: Rusty Russell <[email protected]>
It would be better to give them unique values, but we don't fully support
db migrate anyway and this is simple (though they will end up using the
same key for multiple channel closes if created before this commit).

Note that even if bip32_max_index is currently unset, it defaults to 0
so it will be found.

Signed-off-by: Rusty Russell <[email protected]>
The root cause of ElementsProject#1114 was that the restarted onchaind created a
different proposal to the one which had previously been mined:

	2018-03-01T09:41:08.884Z lightningd(1): lightning_onchaind-020d3d5995a973c878e3f6e5f59da54078304c537f981d7dcef73367ecbea0e90e chan #1: STATUS_FAIL_INTERNAL_ERROR: THEIR_UNILATERAL/OUR_HTLC spent with weird witness 3

After the previous patches which fixed the output address difference,
we could identify proposals by their outputs, but during the
transition (onchaind started with old buggy version, restarted now)
that wouldn't be right, so we match the inputs, discarding signatures
which will be different.  This works for all current cases.

Closes: ElementsProject#1114
Signed-off-by: Rusty Russell <[email protected]>
…fees.

This was revealed in ElementsProject#1114; onchaind isn't actually completely idempotent
due to fee changes (and the now-fixed change in keys used).

This triggers the bug by restarting with different fees, resulting in
onchaind not recognizing its own proposal:

	2018-03-05T09:38:15.550Z lightningd(23076): lightning_onchaind-022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59 chan #1: STATUS_FAIL_INTERNAL_ERROR: THEIR_UNILATERAL/OUR_HTLC spent with weird witness 3

Signed-off-by: Rusty Russell <[email protected]>
They're illegal.  Instead do OP_RETURN so we don't pollute the UTXO.

Signed-off-by: Rusty Russell <[email protected]>
DONATING_TO_MINERS is pretty clear, I think.

Signed-off-by: Rusty Russell <[email protected]>
With the following patch applied, we could clearly see onchaind try to
broadcast the timeout tx one block too early:

	sendrawtx exit 26, gave error code: -26?error message:?non-final (code 64)?

This is because of an out-by-one error in calculating the relative
depth required, since the out->tx_blockheight is already 1 before the
current block.

Signed-off-by: Rusty Russell <[email protected]>
rustyrussell and others added 2 commits March 7, 2018 16:33
We say "in N blocks" but we actually mean "N blocks after this tx" which is
actually N-1 or less.  Change wording and tighten tests which misunderstood
this.

Also, the 'assert not l1.daemon.is_in_log('onchaind complete, forgetting peer')'
are unlikely to work until the daemon has actually seen the block, so add
sync_blockheight before all of those.

These changes reveal some sloppy testing, which we fix.
  
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Christian Decker <[email protected]>
@cdecker cdecker force-pushed the onchaind-replay-different-txs branch from 7b35410 to 419d95a Compare March 7, 2018 15:41
@cdecker
Copy link
Member

cdecker commented Mar 7, 2018

Rebased, fixed a minor conflict in the mocks of run-wallet and fixed header ordering.

@ZmnSCPxj if you could ack that there hasn't been a major change then I can merge this :-)

@ZmnSCPxj
Copy link
Contributor

ZmnSCPxj commented Mar 7, 2018

ACK 419d95a

@cdecker cdecker merged commit d363a68 into ElementsProject:master Mar 7, 2018
@cdecker cdecker mentioned this pull request Mar 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consistent crash "FATAL SIGNAL 11 RECEIVED" after catch-up
3 participants