Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testing/ - Fix chain.LastHeader callsite #1668

Closed
wants to merge 1 commit into from
Closed

testing/ - Fix chain.LastHeader callsite #1668

wants to merge 1 commit into from

Conversation

danwt
Copy link

@danwt danwt commented Jul 6, 2022

Bug

The way the light client works is:

When an UpdateClient operation completes and a header H0 with height h0, valset v0 and nextValset nV0 becomes trusted the client stores nV0 as the trustedValset for h0. A future UpdateClient operation might occur with header H1, height h1 (h0 < h1) and citing h0 as the trusted height. For this operation to succeed the trustedValset included in H1 must equal nV0.

The bug is in the code: suppose an attempted execution following the above pattern. Suppose the header H0 is trusted by the light client. This header was created from the field chain.LastHeader which is updated too early and thus has an incorrect nextValSet. Thus the field nextValset is set to v0 and not nV0. This is trusted by the client so the client will expect a future header H1 to have trustedValset equal to v0. But this trustedValset is correctly queried here

tmTrustedVals, ok = counterparty.GetValsAtHeight(int64(trustedHeight.RevisionHeight + 1))

using historical info, set by the staking module based on the correct notion of nextVals. Thus there is a mismatch. It will trigger this

if !bytes.Equal(consState.NextValidatorsHash, tvalHash) {

I can replicate this bug but it is a bit involved. I will update this PR with more info if wanted when I have time.

Extra info

See also ValidatorHash and NextValidatorHash here

https://github.com/tendermint/tendermint/blob/master/spec/core/data_structures.md#header=

as well as here in tendermint

	// hashes from the app output from the prev block
	ValidatorsHash     []byte `protobuf:"bytes,8,opt,name=validators_hash,json=validatorsHash,proto3" json:"validators_hash,omitempty"`
	NextValidatorsHash []byte `protobuf:"bytes,9,opt,name=next_validators_hash,json=nextValidatorsHash,proto3" json:"next_validators_hash,omitempty"`

thus Vals and NextVals should be the same for both fields here

ibc-go/testing/chain.go

Lines 56 to 57 in 527a11a

LastHeader *ibctmtypes.Header // header for last block height committed
CurrentHeader tmproto.Header // header for current block height

which they are not.

*PS: I wonder if there is a 'two wrongs make a right' thing going on somewhere in testing/, generally

@codecov-commenter
Copy link

codecov-commenter commented Jul 6, 2022

Codecov Report

Merging #1668 (9a48dec) into main (7d971ec) will increase coverage by 0.28%.
The diff coverage is 74.28%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1668      +/-   ##
==========================================
+ Coverage   80.06%   80.35%   +0.28%     
==========================================
  Files         165      207      +42     
  Lines       12381    16880    +4499     
==========================================
+ Hits         9913    13564    +3651     
- Misses       2000     2755     +755     
- Partials      468      561      +93     
Impacted Files Coverage Δ
.github/scripts/build_test_matrix.go 74.28% <74.28%> (ø)
modules/core/02-client/keeper/migrations.go 37.50% <0.00%> (-12.50%) ⬇️
modules/core/04-channel/types/keys.go 75.00% <0.00%> (-10.72%) ⬇️
modules/core/02-client/keeper/events.go 92.15% <0.00%> (-7.85%) ⬇️
modules/core/03-connection/keeper/verify.go 91.58% <0.00%> (-6.69%) ⬇️
modules/apps/transfer/module.go 55.76% <0.00%> (-2.86%) ⬇️
modules/core/02-client/genesis.go 14.63% <0.00%> (-2.51%) ⬇️
...7-interchain-accounts/controller/ibc_middleware.go 82.11% <0.00%> (-2.37%) ⬇️
modules/core/02-client/legacy/v100/solomachine.go 4.21% <0.00%> (-2.35%) ⬇️
modules/apps/29-fee/module.go 54.25% <0.00%> (-1.31%) ⬇️
... and 103 more

@danwt danwt changed the title Fix chain.LastHeader callsite testing/ - Fix chain.LastHeader callsite Jul 6, 2022
@AdityaSripal
Copy link
Member

Hi Dan, thanks for the detailed explanation!! Since the bug is quite subtle, I think it would be very helpful to see a test case written that would fail with the current test library. That would make it a lot easier to verify the existence of bug and fix.

Thanks!

@colin-axner
Copy link
Contributor

Many thanks @danwt 🙏

I agree with @AdityaSripal that'd be nice to have a test case which shows the existence of the bug. I guess the general idea for the function is that it should:

// do application updates

// apply validator set changes

// create header

whereas currently we

// do application updates

// create header

// apply validator set changes

but the bug lies in the fact that the application changes should change the value of the next validator set

The changes make sense to me

@danwt
Copy link
Author

danwt commented Jul 12, 2022

Hey, I will try to provide a small replication at some point but it will take a little bit. Thanks.

danwt added a commit to informalsystems/ibc-go that referenced this pull request Jul 14, 2022
@danwt danwt closed this by deleting the head repository Dec 3, 2022
@colin-axner
Copy link
Contributor

Just curious why this got closed. Is it still an issue?

@danwt
Copy link
Author

danwt commented Dec 5, 2022

Oh no, I had a PR from my clone and I deleted the clone by accident (I was doing house cleaning).

I'll see if I can retrieve it.

@danwt danwt reopened this Dec 5, 2022
@colin-axner
Copy link
Contributor

Hi @danwt, just spent a little time revisiting this issue. Any luck on providing a small replication?

I have a hard time verifying the bug via logical reasoning. From my understanding, begin block in the staking module sets the current validator set for the block, while end block obtains a list of changes, applies them and returns the changes. I believe once a header begin block is being called, the header has already been committed to by the validator set (they agreed on the included txs for the next blocks execution). See my #1169 (comment) here. But something doesn't add up for me here. If end block returns already applied changes, then as you say, the validator set of h + 1 should be the validator set + applied changes. But as this comment says, it appears to be a n + 2 situation.

Ah, maybe it is this:

block h
// validators commit to block h, sign the header
// begin block
// deliver, deliver, deliver
// end block
// app.Commit()

block h + 1
// validators are set to block h.nextValidators
// next validators have block h.nextValidators + block h val set changes
// commit to state changes performed by executing last block (block h + 1 has historical info tracking of block h changes which != block h validators

Is historical info correct? Is there something I am missing, cc @AdityaSripal

@colin-axner
Copy link
Contributor

colin-axner commented May 18, 2023

Oh yea, here we can see the historical info has the header and val set. I'd be interested to know if hash(info.valset) == hash(info.header.valset) or if it is the hash(info.header.nextVals)?

I'm still not sure about the returned end block changes already being reflected in the next block or not

@colin-axner
Copy link
Contributor

I think this might be related to #3900, if state changes are being applied in EndBlock, I think the testing package isn't creating a header with those state changes until the block after

@colin-axner
Copy link
Contributor

colin-axner commented Jul 3, 2023

Hi @danwt, I am going to close this issue. Thanks for your report. I have opened #3997 which explicitly mimics the flow of ABCI calls. I believe that code has the correct handling of the validator set + applying the validator set changes. I don't think there is an issue though in the existing code, as validator set updates must occur after a block application logic. That is, the header should be committed first with the validator set + next validator set, then execute app logic which returns set changes applied in the next block (when proposing the block)

The testing pkg I think in its current state is somewhat hard to follow as it does things mostly correct but not perfect. I have proposed naming changes (LatestCommittedHeader, CurrentUncommittedHeader), which hopefully should clear things up

@colin-axner colin-axner closed this Jul 3, 2023
@danwt
Copy link
Author

danwt commented Jul 4, 2023

Sounds good 👍
Sorry it's been a while since I was working on Cosmos!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants