[R4R] Upgrade from 0301 to 0315 #87

rickyyangz · 2019-05-13T06:14:10Z

Updated all relevant documentation in docs
Updated all code comments where relevant
Wrote tests
Updated CHANGELOG_PENDING.md

As per #3043, this adds a ticker to sync the WAL every 2s while the WAL is running.

* Flush WAL every 2s

  This adds a ticker that flushes the WAL every 2s while the WAL is running. This is related to #3043.
* Fix spelling
* Increase timeout to 2mins for slower build environments
* Make WAL sync interval configurable
* Add TODO to replace testChan with more comprehensive testBus
* Remove extraneous debug statement
* Remove testChan in favour of using system time

  As per tendermint/tendermint#3300 (comment), this removes the `testChan` WAL member and replaces the approach with a system time-oriented one. In this new approach, we keep track of the system time at which each flush and periodic flush successfully occurred. The naming of the various functions is also updated here to be more consistent with "flushing" as opposed to "sync'ing".
* Update naming convention and ensure lock for timestamp update
* Add Flush method as part of WAL interface

  Adds a `Flush` method as part of the WAL interface to enforce the idea that we can manually trigger a WAL flush from outside of the WAL. This is employed in the consensus state management to flush the WAL prior to signing votes/proposals, as per tendermint/tendermint#3043 (comment)
* Update CHANGELOG_PENDING
* Remove mutex approach and replace with DI

  The dependency injection approach to dealing with testing concerns could allow similar effects to some kind of "testing bus"-based approach. This commit introduces an example of this, where instead of relying on (potentially fragile) timing of things between the code and the test, we inject code into the function under test that can signal the test through a channel. This allows us to avoid the `time.Sleep()`-based approach previously employed.
* Update comment on WAL flushing during vote signing

  Co-Authored-By: thanethomson <[email protected]>
* Simplify flush interval definition

  Co-Authored-By: thanethomson <[email protected]>
* Expand commentary on WAL disk flushing

  Co-Authored-By: thanethomson <[email protected]>
* Add broken test to illustrate WAL sync test problem

  Removes test-related state (dependency injection code) from the WAL data structure and adds test code to illustrate the problem with using `WALGenerateNBlocks` and `wal.SearchForEndHeight` to test periodic sync'ing.
* Fix test error messages
* Use WAL group buffer size to check for flush

  A function is added to `libs/autofile/group.go#Group` in order to return the size of the buffered data (i.e. data that has not yet been flushed to disk). The test now checks that, prior to a `time.Sleep`, the group buffer has data in it. After the `time.Sleep` (during which time the periodic flush should have been called), the buffer should be empty.
* Remove config root dir removal from #3291
* Add godoc for NewWAL mentioning periodic sync

Merge master back to develop

* cs: update wal comments Follow-up to tendermint/tendermint#3300 * Update consensus/wal.go Co-Authored-By: melekes <[email protected]>

The test was sometimes failing due to processFlushTicks being called too early. The solution is to call wal#Start later in the test.

* green pubsub tests :OK:
* get rid of clientToQueryMap
* Subscribe and SubscribeUnbuffered
* start adapting other pkgs to new pubsub
* nope
* rename MsgAndTags to Message
* remove TagMap

  it does not bring any additional benefits
* bring back EventSubscriber
* fix test
* fix data race in TestStartNextHeightCorrectly

```
Write at 0x00c0001c7418 by goroutine 796:
  github.com/tendermint/tendermint/consensus.TestStartNextHeightCorrectly()
      /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1296 +0xad
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Previous read at 0x00c0001c7418 by goroutine 858:
  github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1631 +0x1366
  github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1476 +0x8f
  github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:667 +0xa1e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:628 +0x794

Goroutine 796 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:878 +0x659
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1119 +0xa8
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1117 +0x4ee
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1034 +0x2ee
  main.main()
      _testmain.go:214 +0x332

Goroutine 858 (running) created at:
  github.com/tendermint/tendermint/consensus.(*ConsensusState).startRoutines()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:334 +0x221
  github.com/tendermint/tendermint/consensus.startTestRound()
      /go/src/github.com/tendermint/tendermint/consensus/common_test.go:122 +0x63
  github.com/tendermint/tendermint/consensus.TestStateFullRound1()
      /go/src/github.com/tendermint/tendermint/consensus/state_test.go:255 +0x397
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
```

* fixes after my own review
* fix formatting
* wait 100ms before kicking a subscriber out + a test for indexer_service
* fixes after my second review
* no timeout
* add changelog entries
* fix merge conflicts
* fix typos after Thane's review

  Co-Authored-By: melekes <[email protected]>
* reformat code
* rewrite indexer service in the attempt to fix failing test tendermint/tendermint#3227
* Revert "rewrite indexer service in the attempt to fix failing test"

  This reverts commit 0d9107a098230de7138abb1c201877c246e89ed1.
* another attempt to fix indexer
* fixes after Ethan's review
* use unbuffered channel when indexing transactions

  Refs tendermint/tendermint#3227 (comment)
* add a comment for EventBus#SubscribeUnbuffered
* format code

* reject the shared secret if is all zeros in case the blacklist was not sufficient * Add test that verifies lower order pub-keys are rejected at the DH step * Update changelog * fix typo in test-comment
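As an illustration of the all-zero check in the first bullet, here is a minimal sketch. It assumes a 32-byte Diffie-Hellman shared secret (as produced by X25519, where low-order public keys lead to an all-zero result); `checkSharedSecret` and `errZeroSharedSecret` are hypothetical names and not necessarily what the secret connection code uses.

```go
package main

import (
	"crypto/subtle"
	"errors"
)

// errZeroSharedSecret is a hypothetical error for a degenerate DH result.
var errZeroSharedSecret = errors.New("shared secret is all zeroes")

// checkSharedSecret rejects an all-zero shared secret. The comparison is
// constant-time so the check does not leak how many leading bytes are zero.
func checkSharedSecret(secret *[32]byte) error {
	var zero [32]byte
	if subtle.ConstantTimeCompare(secret[:], zero[:]) == 1 {
		return errZeroSharedSecret
	}
	return nil
}
```

The check is a second line of defence: it still catches a low-order public key that a blacklist of known bad points happened to miss.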

* bound mempool memory usage

  Closes #3079
* rename SizeBytes to TxsTotalBytes and other small fixes after Zarko's review
* rename MaxBytes to MaxTxsTotalBytes
* make ErrMempoolIsFull more informative
* expose mempool's txs_total_bytes via RPC
* test full response
* fixes after Ethan's review
* config: rename mempool.size to mempool.max_txs

  tendermint/tendermint#3248 (comment)
* test more cases

  tendermint/tendermint#3248 (comment)
* simplify test
* Revert "config: rename mempool.size to mempool.max_txs"

  This reverts commit 39bfa3696177aa46195000b90655419a975d6ff7.
* rename count back to n_txs to make a change non-breaking
* rename max_txs_total_bytes to max_txs_bytes
* format code
* fix TestWALPeriodicSync

  The test was sometimes failing due to processFlushTicks being called too early. The solution is to call wal#Start later in the test.
* Apply suggestions from code review
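The idea of bounding the mempool by both transaction count and total bytes can be sketched as below. This is an illustrative toy, not the Tendermint mempool: `boundedPool`, `errPoolIsFull` and the field names are hypothetical, loosely modelled on the `max_txs_bytes` setting and the "more informative" full error mentioned above.

```go
package main

import "fmt"

// boundedPool is a hypothetical pool limited by tx count and total tx bytes.
type boundedPool struct {
	maxTxs      int
	maxTxsBytes int64
	txs         [][]byte
	txsBytes    int64
}

// errPoolIsFull carries the current usage and the limits so that callers can
// see which bound was hit.
type errPoolIsFull struct {
	numTxs, maxTxs        int
	txsBytes, maxTxsBytes int64
}

func (e errPoolIsFull) Error() string {
	return fmt.Sprintf(
		"pool is full: number of txs %d (max: %d), total txs bytes %d (max: %d)",
		e.numTxs, e.maxTxs, e.txsBytes, e.maxTxsBytes)
}

// Add accepts tx only if neither the count limit nor the byte limit would be
// exceeded; otherwise it returns errPoolIsFull and leaves the pool unchanged.
func (p *boundedPool) Add(tx []byte) error {
	if len(p.txs) >= p.maxTxs || p.txsBytes+int64(len(tx)) > p.maxTxsBytes {
		return errPoolIsFull{len(p.txs), p.maxTxs, p.txsBytes, p.maxTxsBytes}
	}
	p.txs = append(p.txs, tx)
	p.txsBytes += int64(len(tx))
	return nil
}
```

Tracking the running byte total alongside the slice keeps the admission check O(1) and makes it cheap to expose the figure to clients.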

* libs/common: TrapSignal accepts logger as a first parameter and does not block anymore * previously it was dumping "captured ..." msg to os.Stdout * TrapSignal should not be responsible for blocking thread of execution Refs #3238 * exit with zero (0) code upon receiving SIGTERM/SIGINT Refs #3238 * fix formatting in docs/app-dev/abci-cli.md Co-Authored-By: melekes <[email protected]> * fix formatting in docs/app-dev/abci-cli.md Co-Authored-By: melekes <[email protected]>

Just a minor follow-up on the review of #3347: fixes a comment. [#3347 (comment)](tendermint/tendermint#3347 (comment))

* rename WAL#Flush to WAL#FlushAndSync - rename auto#Flush to auto#FlushAndSync - cleanup WAL interface to not leak implementation details! * remove Group() * add WALReader interface and return it in SearchForEndHeight() - add interface assertions Refs #3337 * replace WALReader with io.ReadCloser

* docs: explain create_empty_blocks configurations Closes #3307 * Vagrantfile: install nodejs for docs * update docs instructions npm install does not make sense since there's no packages.json file * explain broadcast_tx_* tx format Closes #536 * docs: explain how transaction ordering works Closes #2904 * bring in consensus parameters explained * example for create_empty_blocks_interval * bring in explanation from tendermint/tendermint#2487 (comment) * link to formatting instead of duplicating info

@liamsi

This issue is related to #3107.

This is a first renaming/refactoring step before reworking and removing heartbeats. As discussed with @liamsi, we preferred to go for a couple of independent and separate PRs to simplify review work.

The changes:

* Help to clarify the relation between the validator and remote signer endpoints
* Differentiate between timeouts and deadlines
* Prepare to encapsulate networking related code behind RemoteSigner in the next PR

My intention is to separate and encapsulate the "network related" code from the actual signer.

SignerRemote ---(uses/contains)--> SignerValidatorEndpoint <--(connects to)--> SignerServiceEndpoint ---> SignerService (future.. not here yet but would like to decouple too)

All reconnection/heartbeat/whatever code goes in the endpoints. Signer[Remote/Service] do not need to know about that.

I agree Endpoint may not be the perfect name. I tried to find something "Go-ish" enough. It is a common name in go-kit, kubernetes, etc.

Right now:

* SignerValidatorEndpoint:
  - handles the listener
  - contains SignerRemote
  - implements the PrivValidator interface
  - connects and sets a connection object in a contained SignerRemote
  - delegates some PrivValidator calls to SignerRemote, which in turn uses the conn object that was set externally
* SignerRemote:
  - implements the PrivValidator interface
  - reads/writes from a connection object directly
  - handles heartbeats
* SignerServiceEndpoint:
  - does most things in a single place
  - delegates to a PrivValidator IIRC

Commits:

* cleanup
* Refactoring step 1
* Refactoring step 2
* move messages to another file
* mark for future work / next steps
* mark deprecated classes in docs
* Fix linter problems
* additional linter fixes

* fix dirty data in peerset

  startInitPeer ran before PeerSet added the peer. Once the MConnection starts and Receive of one Reactor fails, we try to remove the peer from PeerSet while PeerSet does not yet contain it. Fix this by changing the order.
* fix test FilterDuplicate
* fix start/stop race
* fix err

When removing a peer, the block pool simply removed the bpPeer but did not stop its timer, which caused a stopError for reconnected peers. Stop the timer when the peer is removed from the pool.

1. "abci_query": rpcserver.NewRPCFunc(c.ABCIQuery, "path,data,prove") and "validators": rpcserver.NewRPCFunc(c.Validators, "height"): the declared parameters do not match the functions, which causes an index out of range error.
2. The prove option of the query was forced to true, while the default is false.
3. Fix the wrong Merkle key.

…om 1.1.0 to 1.3.0 (#3357) * deps: update gogo/proto from 1.1.1 to 1.2.1 - verified changes manually git diff 636bf030~ ba06b47c --stat -- ':!*.pb.go' ':!test' * deps: update golang/protobuf from 1.1.0 to 1.3.0 - verified changes manually git diff b4deda0~ c823c79 -- ':!*.pb.go' ':!test'

Before, we were using it to get a round state in tests. Now this can be done by calling csX.GetRoundState. We will need to rewrite TestStateSlashingPrevotes and TestStateSlashingPrecommits, which are commented out right now, so that they do not rely on the EventDataRoundState#RoundState field. Refs #1527

…… (#3048) * make BlockTimeIota a consensus parameter, not a locally configurable option Refs #2920 * make TimeIota int64 ms Refs #2920 * update Gopkg.toml * fixes after Ethan's review * fix TestRemoteSignerProposalSigningFailed * update changelog

Fixes: #3378

* Add stats to cleveldb implementation
* update changelog
* remove TODO

  also
  - sort keys
  - preallocate memory
* fix const initializer []string literal is not a constant
* add test

- update docker image on circleci - remove GOCACHE=off from Makefile (see: https://tip.golang.org/doc/go1.12#gocache) - update badge in readme - update in scripts/install - update Vagrantfile - update in networks/remote/integration.sh - tools/build/Makefile

… (#3371) downstream Signed-off-by: Silas Davis <[email protected]>

* fix failure in TestProposerFrequency * Add test to check priority order after updates * Changed applyRemovals() and removed Remove() Changed applyRemovals() similar to applyUpdates() Removed function Remove() Updated comments * review comments * simplify applyRemovals and add more comments * small correction in comment * Fix check in test * Fix priority check for centering, address review comments * fix assert for priority centering * review comments * review comments * cleanup and review comments added upper limit check for validator voting power moved check for empty validator set earlier moved panic on potential negative set length in verifyRemovals added more tests * review comments

Refs #3262

https://tools.ietf.org/html/rfc6962#section-2.1 "The largest power of two less than the number of items" is actually correct! For n > 1, let k be the largest power of two smaller than n (i.e., k < n <= 2k).
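The split point quoted from RFC 6962 can be computed directly. A minimal sketch, where `splitPoint` is a hypothetical helper name rather than a function from the codebase:

```go
package main

import "math/bits"

// splitPoint returns k, the largest power of two strictly less than n, so
// that k < n <= 2k. It is only defined for n > 1.
func splitPoint(n int) int {
	if n < 2 {
		panic("splitPoint requires n > 1")
	}
	// bits.Len(n-1) is the number of bits needed to represent n-1; the
	// highest set bit of n-1 is the largest power of two below n.
	return 1 << (bits.Len(uint(n-1)) - 1)
}
```

For example, a tree with 5 leaves splits into 4 and 1, and a tree with exactly 8 leaves splits into 4 and 4, because the split point must be strictly smaller than n.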

We're pinning repos without releases because it's very easy to upgrade all the dependencies by executing dep ensure --upgrade. Instead, we should just never run this command directly, only dep ensure --upgrade <some repo>. And we can defend that in PRs. Refs #3374 The problem with pinning to exact revisions: people who import Tendermint as a library (e.g. abci/types) are stuck with these revisions even though the code they import may not even use them.

Although the version we were pinning to is from Nov. 2016 there were no substantial changes: jmhodges/levigo@2b8c778 added go-modules support (no code changes) jmhodges/levigo@853d788 added a badge to the readme closes #3381

Update Gopkg.lock via dep ensure --update golang.org/x/crypto see #3391 (comment) (nothing to review here really).

V0.31

Merge master to develop

* [adr] Peer behaviour adr updates * [docs] fix Behaved function signature * [adr] typo fix in code example

@sunboshan

What happened: New code was supposed to fall back to the last height changed when/if it failed to find validators at the checkpoint height (to make the release non-breaking). But because we did not check whether the validator set is empty, the fallback logic was never executed, resulting in LoadValidators returning an empty validator set for cases where `lastStoredHeight` is a checkpoint height (i.e. almost all heights if the application does not change the validator set often).

How it was found: one of our users, @sunboshan, reported a bug here: tendermint/tendermint#3537 (comment)

* use last height changed if validator set is empty
* add a changelog entry
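The fallback described above can be illustrated with a simplified sketch. Everything here is hypothetical (`valStore`, `validatorInfo`, the string-slice validator set, the checkpoint arithmetic) and only mirrors the described logic: look at the checkpoint height first and, if the entry there is missing or has an empty set, fall back to the last height at which the set changed.

```go
package main

import "errors"

// validatorInfo is a hypothetical stored record. validators is left empty at
// heights where the set is unchanged and must be looked up elsewhere.
type validatorInfo struct {
	lastHeightChanged int64
	validators        []string
}

// valStore maps a height to its stored record.
type valStore map[int64]validatorInfo

var errNoValidators = errors.New("no validators found")

// load returns the validator set for height. The emptiness checks are the
// point of the fix: a record that exists but holds an empty set must not be
// treated as a successful lookup.
func (s valStore) load(height, checkpointInterval int64) ([]string, error) {
	info, ok := s[height]
	if !ok {
		return nil, errNoValidators
	}
	if len(info.validators) > 0 {
		return info.validators, nil
	}
	// Try the most recent checkpoint, but never go below the last change.
	cp := height - height%checkpointInterval
	if cp < info.lastHeightChanged {
		cp = info.lastHeightChanged
	}
	if found, ok := s[cp]; ok && len(found.validators) > 0 {
		return found.validators, nil
	}
	// Fall back to the height at which the set last changed.
	if found, ok := s[info.lastHeightChanged]; ok && len(found.validators) > 0 {
		return found.validators, nil
	}
	return nil, errNoValidators
}
```

In this toy model, a store written before checkpoints existed has an empty record at height 100; the lookup then falls through to height 1, where the set was last changed.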

Fixes #3457

The issue: writing a BlockRequest into the requestsCh channel also creates a timer that stops the peer 15s later if no block has been received. But popping a BlockRequest from requestsCh and sending it out may be delayed by more than 15s, so the peer gets stopped with the error "send nothing to us". Extracting the requestsCh handling into its own goroutine makes sure that every BlockRequest is handled in a timely manner.

Instead of the requestsCh handling, we should probably pull the didProcessCh handling into a separate goroutine, since this is the one "starving" the other channel handlers. I believe the way it is right now, we still have issues with high delays in errorsCh handling that might cause sending requests to invalid/disconnected peers.

Co-Authored-By: melekes <[email protected]>

v0.31.5 changelog and version updates

[R4R] revert genesis change

[R4R] fix lightd client

[R4R] update dependency

thanethomson and others added 30 commits February 20, 2019 09:45

docs: fix rpc Tx() method docs (#3331)

db5d760

Merge pull request #3339 from tendermint/master

4f83eec

Merge master back to develop

cs: update wal comments (#3334)

ed1de13

* cs: update wal comments Follow-up to tendermint/tendermint#3300 * Update consensus/wal.go Co-Authored-By: melekes <[email protected]>

refactor decideProposal in common_test (#3343)

f22ada4

fix TestWALPeriodicSync (#3342)

67fd428

The test was sometimes failing due to processFlushTicks being called too early. The solution is to call wal#Start later in the test.

secret connection check all zeroes (#3347)

e0adc5e

* reject the shared secret if is all zeros in case the blacklist was not sufficient * Add test that verifies lower order pub-keys are rejected at the DH step * Update changelog * fix typo in test-comment

p2p: fix comment in secret connection (#3348)

6797d85

Just a minor follow-up on the review of #3347: fixes a comment. [#3347 (comment)](tendermint/tendermint#3347 (comment))

docs: fix typo (#3373)

37a5484

fix pool timer leak bug, resolve#3353 (#3358)

976b1c2

When removing a peer, the block pool simply removed the bpPeer but did not stop its timer, which caused a stopError for reconnected peers. Stop the timer when the peer is removed from the pool.

libs/db: Add cleveldb.Stats() (#3379)

8c9df30

Fixes: #3378 * Add stats to cleveldb implementation * update changelog * remote TODO also - sort keys - preallocate memory * fix const initializer []string literal is not a constant * add test

golang 1.12.0 (#3376)

1eaa42c

- update docker image on circleci - remove GOCACHE=off from Makefile (see: https://tip.golang.org/doc/go1.12#gocache) - update badge in readme - update in scripts/install - update Vagrantfile - update in networks/remote/integration.sh - tools/build/Makefile

Copy secp256k1 code from go-ethereum to avoid GPL vendoring issues in…

858875f

… (#3371) downstream Signed-off-by: Silas Davis <[email protected]>

make dupl linter pass (#3385)

f25d727

Refs #3262

docs: fix the reverse of meaning in spec (#3387)

91b488f

https://tools.ietf.org/html/rfc6962#section-2.1 "The largest power of two less than the number of items" is actually correct! For n > 1, let k be the largest power of two smaller than n (i.e., k < n <= 2k).

update levigo to 1.0.0 (#3389)

28e9e9e

Although the version we were pinning to is from Nov. 2016 there were no substantial changes: jmhodges/levigo@2b8c778 added go-modules support (no code changes) jmhodges/levigo@853d788 added a badge to the readme closes #3381

update golang.org/x/crypto (#3392)

e415c32

Update Gopkg.lock via dep ensure --update golang.org/x/crypto see #3391 (comment) (nothing to review here really).

ebuchman and others added 16 commits April 15, 2019 08:16

Merge pull request #3553 from tendermint/v0.31

1c6d9d2

V0.31

Merge pull request #3563 from tendermint/master

d35c087

Merge master to develop

adr: PeerBehaviour updates (#3558)

f2119c3

* [adr] Peer behaviour adr updates * [docs] fix Behaved function signature * [adr] typo fix in code example

gitignore: add .vendor-new (#3566)

f1cf101

common: CMap: slight optimization in Keys() and Values(). (#3567)

5b8888b

update changelog

3cb7013

bump version

4474a5e

Apply suggestions from code review

18bd5b6

Co-Authored-By: melekes <[email protected]>

Merge pull request #3568 from tendermint/anton/release-v0.31.5

d2eab53

v0.31.5 changelog and version updates

Merge tag 'v0.31.5' into upgrade_0.30.1

9e40207

revert pr

506a075

update

c115442

update default iota

f740d6d

Merge pull request #88 from binance-chain/revert_iota

78d8d62

[R4R] revert genesis change

yutianwu requested a review from unclezoro May 17, 2019 13:16

yutianwu changed the title ~~[WIP] Upgrade from 0301 to 0315~~ [R4R] Upgrade from 0301 to 0315 May 17, 2019

yutianwu requested review from ackratos, darren-liu and abelliumnt May 17, 2019 13:17

yutianwu approved these changes May 20, 2019

yutianwu added 4 commits May 21, 2019 13:04

fix lightd

d2806d6

fix format

8960662

revert default block size config

3f4bf06

Merge pull request #89 from binance-chain/fix_lightd

3139879

[R4R] fix lightd client

abelliumnt approved these changes May 21, 2019

yutianwu added 2 commits May 21, 2019 16:55

update dependency

ab9b1e2

Merge pull request #90 from binance-chain/fix_lightd

a582441

[R4R] update dependency

yutianwu merged commit 500b85f into develop May 21, 2019
