Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: priority nonce mempool iteration is not thread safe #675

Closed
1 task done
mmsqe opened this issue Aug 21, 2024 · 3 comments
Closed
1 task done

[Bug]: priority nonce mempool iteration is not thread safe #675

mmsqe opened this issue Aug 21, 2024 · 3 comments

Comments

@mmsqe
Copy link
Collaborator

mmsqe commented Aug 21, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

fatal during concurrent call

Cosmos SDK Version

0.50

How to reproduce?

fatal error: concurrent map read and map write

goroutine 511 [running]:
github.com/cosmos/cosmos-sdk/types/mempool.(*PriorityNonceIterator[...]).Next(0x741ef80)
	github.com/cosmos/cosmos-sdk/types/mempool/priority_nonce.go:341 +0x188
github.com/cosmos/cosmos-sdk/types/mempool.(*PriorityNonceIterator[...]).iteratePriority(0x401a533ef0?)
	github.com/cosmos/cosmos-sdk/types/mempool/priority_nonce.go:310 +0x134
github.com/cosmos/cosmos-sdk/types/mempool.(*PriorityNonceIterator[...]).Next(0x741ef80)
	github.com/cosmos/cosmos-sdk/types/mempool/priority_nonce.go:329 +0x288
github.com/crypto-org-chain/cronos/v2/app.New.func1.(*DefaultProposalHandler).PrepareProposalHandler.3({{0x740f028, 0x98e71e0}, {0xffff3cdc8da0, 0x401af5e640}, {{0x0, 0x0}, {0x4000ac19b0, 0xc}, 0xb7, {0x571bf59, ...}, ...}, ...}, ...)
	github.com/cosmos/cosmos-sdk/baseapp/abci_utils.go:356 +0x658
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).PrepareProposal(0x4001cfc6c8, 0x4002d8e160)
	github.com/cosmos/cosmos-sdk/baseapp/abci.go:431 +0x94c
github.com/cosmos/cosmos-sdk/server.cometABCIWrapper.PrepareProposal(...)
	github.com/cosmos/cosmos-sdk/server/cmt_abci.go:36
github.com/cometbft/cometbft/abci/client.(*localClient).PrepareProposal(0x401dc16000?, {0x740f450?, 0x98e71e0?}, 0xffff3d680288?)
	github.com/cometbft/cometbft/abci/client/local_client.go:157 +0xe8
github.com/cometbft/cometbft/proxy.(*appConnConsensus).PrepareProposal(0x4002d8c0f0, {0x740f450, 0x98e71e0}, 0x4002d8e160)
	github.com/cometbft/cometbft/proxy/app_conn.go:84 +0x130
github.com/cometbft/cometbft/state.(*BlockExecutor).CreateProposalBlock(_, {_, _}, _, {{{0xb, 0x0}, {0x40016ea7c0, 0x7}}, {0x40016ea7d0, 0xc}, ...}, ...)
	github.com/cometbft/cometbft/state/execution.go:129 +0x5d8
github.com/cometbft/cometbft/consensus.(*State).createProposalBlock(0x4001889888, {0x740f450, 0x98e71e0})
	github.com/cometbft/cometbft/consensus/state.go:1307 +0x190
github.com/cometbft/cometbft/consensus.(*State).defaultDecideProposal(0x4001889888, 0xb7, 0x1)
	github.com/cometbft/cometbft/consensus/state.go:1214 +0x50
github.com/cometbft/cometbft/consensus.(*State).enterPropose(0x4001889888, 0xb7, 0x1)
	github.com/cometbft/cometbft/consensus/state.go:1193 +0x794
github.com/cometbft/cometbft/consensus.(*State).enterNewRound(0x4001889888, 0xb7, 0x1)
	github.com/cometbft/cometbft/consensus/state.go:1112 +0x9a4
github.com/cometbft/cometbft/consensus.(*State).handleTimeout(0x4001889888, {0x0?, 0x0?, 0x0?, 0x0?}, {0xb7, 0x0, 0x6, {0x2cbdf087, 0xede573d07, ...}, ...})
	github.com/cometbft/cometbft/consensus/state.go:1008 +0x838
github.com/cometbft/cometbft/consensus.(*State).receiveRoutine(0x4001889888, 0x0)
	github.com/cometbft/cometbft/consensus/state.go:865 +0x4b4
created by github.com/cometbft/cometbft/consensus.(*State).OnStart in goroutine 326
	github.com/cometbft/cometbft/consensus/state.go:398 +0xf0

@yihuang yihuang changed the title [Bug]: concurrent map read and map write from priority nonce mempool [Bug]: priority nonce mempool iteration is not thread safe Aug 27, 2024
@yihuang
Copy link
Collaborator

yihuang commented Aug 27, 2024

there are general data races other than map read/write:

WARNING: DATA RACE
Write at 0x00c000af6d70 by goroutine 31:
  github.com/huandu/skiplist.(*SkipList).Set()
      /Users/huangyi/go/pkg/mod/github.com/huandu/[email protected]/skiplist.go:201 +0x794
  github.com/cosmos/cosmos-sdk/types/mempool.(*PriorityNonceMempool[go.shape.int64]).InsertWithGasWanted()
      /Users/huangyi/src/cosmos-sdk/types/mempool/priority_nonce.go:274 +0xa44
  github.com/cosmos/cosmos-sdk/types/mempool.(*PriorityNonceMempool[go.shape.int64]).Insert()
      /Users/huangyi/src/cosmos-sdk/types/mempool/priority_nonce.go:285 +0x98
  github.com/cosmos/cosmos-sdk/types/mempool_test.(*MempoolTestSuite).TestIteratorConcurrency.func1.1()
      /Users/huangyi/src/cosmos-sdk/types/mempool/priority_nonce_test.go:452 +0x1bc

Previous read at 0x00c000af6d70 by goroutine 30:
  github.com/huandu/skiplist.(*Element).Next()
      /Users/huangyi/go/pkg/mod/github.com/huandu/[email protected]/element.go:51 +0x2b0
  github.com/cosmos/cosmos-sdk/types/mempool.(*PriorityNonceIterator[go.shape.int64]).Next()
      /Users/huangyi/src/cosmos-sdk/types/mempool/priority_nonce.go:338 +0x264
  github.com/cosmos/cosmos-sdk/types/mempool.(*PriorityNonceIterator[int64]).Next()
      /Users/huangyi/src/cosmos-sdk/types/mempool/priority_nonce.go:313 +0x38
  github.com/cosmos/cosmos-sdk/types/mempool_test.(*MempoolTestSuite).TestIteratorConcurrency.func1()
      /Users/huangyi/src/cosmos-sdk/types/mempool/priority_nonce_test.go:463 +0x3f0
  testing.tRunner()
      /nix/store/z6p2j8shdwi74kbm86jwdh03vxq91l0q-go-1.21.5/share/go/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /nix/store/z6p2j8shdwi74kbm86jwdh03vxq91l0q-go-1.21.5/share/go/src/testing/testing.go:1648 +0x40

@yihuang
Copy link
Collaborator

yihuang commented Aug 27, 2024

Actually, I only get the above error with the new unit test, I don't see the concurrent map access error.

@mmsqe
Copy link
Collaborator Author

mmsqe commented Aug 27, 2024

Actually, I only get the above error with the new unit test, I don't see the concurrent map access error.

error log was from some failed testground test

yihuang added a commit that referenced this issue Sep 2, 2024
* Problem: mempool iteration is not thread safe

Closes: #675

Solution:
- hold the lock during iteration

* Update CHANGELOG.md

Signed-off-by: yihuang <[email protected]>

* cleanup

* cleanup

* commit

* remove invalid txs out of the loop

* fix error handling

* Update baseapp/abci_utils.go

Signed-off-by: yihuang <[email protected]>

---------

Signed-off-by: yihuang <[email protected]>
@mmsqe mmsqe closed this as completed Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants