State sync support #5803

Closed
wants to merge 79 commits into from
Changes from 42 commits
Commits
e75f206
Use local iavl module
erikgrinaker Mar 4, 2020
c6201ba
Added initial snapshot settings
erikgrinaker Mar 4, 2020
bd2af17
Initial functional snapshot/restore API
erikgrinaker Mar 5, 2020
2424a7d
Added compression and chunking
erikgrinaker Mar 5, 2020
383cf94
Code cleanups
erikgrinaker Mar 5, 2020
aabc422
Added benchmarks
erikgrinaker Mar 5, 2020
dcc7673
More benchmarks
erikgrinaker Mar 5, 2020
780dae5
Buffer snapshot writers
erikgrinaker Mar 9, 2020
e91f273
Minor tweaks
erikgrinaker Mar 11, 2020
6bcf763
Type fix
erikgrinaker Mar 11, 2020
ef46535
Ignore caches during export
erikgrinaker Mar 13, 2020
a939897
Use local tm-db as well
erikgrinaker Mar 13, 2020
4585f8e
Initial snapshot store
erikgrinaker Mar 13, 2020
88e5355
Simplified Snapshotter interface
erikgrinaker Mar 13, 2020
6aed0b1
Split chunk writer and reader to separate file
erikgrinaker Mar 13, 2020
d7a6dbe
Cleaned up multistore snapshot/restore
erikgrinaker Mar 13, 2020
672db15
Improved snapshotting
erikgrinaker Mar 13, 2020
cd1b2d8
Properly close exporters and importers
erikgrinaker Mar 13, 2020
11de242
Added snapshot loading
erikgrinaker Mar 13, 2020
11fb197
Added snapshot pruning
erikgrinaker Mar 13, 2020
ad61eb1
Added snapshot listing
erikgrinaker Mar 13, 2020
6eb2ffe
Use prefix db for snapshot store
erikgrinaker Mar 13, 2020
dd76cb7
Added auxiliary snapshot function for BaseApp
erikgrinaker Mar 17, 2020
bf8acad
Merge branch 'master' into erik/snapshot
erikgrinaker Mar 17, 2020
a257ff3
go.mod: remove local tm-db.
erikgrinaker Mar 17, 2020
4251e06
Moved rootmulti snapshot contents to separate store/types
erikgrinaker Mar 17, 2020
31f8eee
Moved snapshot store to separate package
erikgrinaker Mar 17, 2020
46a3941
Added format parameter for Snapshotter interface
erikgrinaker Mar 18, 2020
c7a7513
Removed unused snapshotFormat variable
erikgrinaker Mar 18, 2020
bf54794
Don't set up a snapshot store automatically
erikgrinaker Mar 18, 2020
08713b1
Added tests for snapshots.Store
erikgrinaker Mar 18, 2020
56fc9d9
Minor tweaks
erikgrinaker Mar 18, 2020
394e097
go.mod: use iavl 0.13.2
erikgrinaker Mar 19, 2020
316dace
Updated changelog
erikgrinaker Mar 19, 2020
a39974f
Appease linter
erikgrinaker Mar 19, 2020
e984eee
Merge branch 'master' into erik/snapshot
erikgrinaker Mar 19, 2020
d8d455d
Added snapshot options
erikgrinaker Mar 19, 2020
bef3c86
Fix nil dereferencing in chunkWriter.CloseWithError()
erikgrinaker Mar 19, 2020
c3528dc
Merge branch 'master' into erik/snapshot
erikgrinaker Mar 26, 2020
076865a
Protobuf formatting fix
erikgrinaker Mar 26, 2020
fe8c88b
Return chunk metadata as well in Store.LoadChunk()
erikgrinaker Mar 26, 2020
33fa307
Add snapshots.Restorer()
erikgrinaker Mar 26, 2020
a34e12c
Typo
erikgrinaker Mar 27, 2020
ba031f6
Use table-driven tests for rootmulti.Store snapshot/restore error tests
erikgrinaker Mar 28, 2020
e507801
use zlib compression for snapshots
erikgrinaker Mar 28, 2020
3df5b25
add checksum test for snapshot format stability
erikgrinaker Mar 28, 2020
09ff5c1
use larger generated dataset for snapshot checksum test
erikgrinaker Mar 28, 2020
2f7d969
simplify snapshot management for new ABCI interface
erikgrinaker Mar 29, 2020
67c0faa
bump snapshot chunk size to 10 MB
erikgrinaker Mar 29, 2020
828ae14
use sha256 hashes for snapshot chunks
erikgrinaker Mar 29, 2020
5714fed
Merge branch 'master' into erik/snapshot
erikgrinaker Apr 1, 2020
624c4f5
snapshots: added Store.GetLatest()
erikgrinaker Apr 1, 2020
d3afa8b
baseapp: check for newer snapshots, to avoid snapshotting during replay
erikgrinaker Apr 1, 2020
c46438e
baseapp: fix nil dereferencing
erikgrinaker Apr 1, 2020
5cac251
Implemented ABCI snapshot skeleton
erikgrinaker Mar 26, 2020
857b3dd
Ported Tendermint API changes
erikgrinaker Mar 26, 2020
7d0009c
Implement ABCI snapshot interface
erikgrinaker Mar 26, 2020
994b731
don't limit ListSnapshots to 100, caller can do this
erikgrinaker Mar 28, 2020
ab71bcb
updated with simplified ABCI interface
erikgrinaker Mar 29, 2020
a753a0e
update with new chunk size
erikgrinaker Mar 29, 2020
76b79b1
use sha-256 chunk hashes
erikgrinaker Mar 29, 2020
5afe4f4
update with TM rpc/client changes
erikgrinaker Apr 3, 2020
836fa5e
add snapshots.Manager, restructure code, and write tests
erikgrinaker Apr 3, 2020
8efccd5
reduce timeout
erikgrinaker Apr 4, 2020
679c2f0
add test for restoring empty IAVL stores
erikgrinaker Apr 4, 2020
7a81f0a
go.mod: bump iavl
erikgrinaker Apr 5, 2020
fb6580c
check for error when importing IAVL nodes
erikgrinaker Apr 14, 2020
f895ca1
handle empty keys and values via Protobuf
erikgrinaker Apr 14, 2020
94f1e44
Merge branch 'master' into erik/snapshot
erikgrinaker Apr 24, 2020
d2fb90e
change tmkv.Pair to abci.EventAttribute
erikgrinaker Apr 24, 2020
432ff4a
initial port to new state sync ABCI interface
erikgrinaker Apr 24, 2020
3c52b05
don't snapshot mem.Store stores
erikgrinaker Apr 24, 2020
e7cb0be
Merge branch 'master' into erik/snapshot
erikgrinaker Apr 29, 2020
5a9def7
minor tweaks
erikgrinaker May 6, 2020
0888800
handle chunk hash verification
erikgrinaker May 6, 2020
683bd35
use pruning options for config
erikgrinaker May 6, 2020
bf04df1
improve error handling
erikgrinaker May 6, 2020
963432f
remove snapshot flags
erikgrinaker May 6, 2020
4c3cdac
use new ABCI enums
erikgrinaker May 7, 2020
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -74,10 +74,13 @@ to now accept a `codec.JSONMarshaler` for modular serialization of genesis state
* (baseapp) [\#5837](https://github.com/cosmos/cosmos-sdk/issues/5837) Transaction simulation now returns a `SimulationResponse` which contains the `GasInfo` and
`Result` from the execution.
* (crypto/keys) [\#5866](https://github.com/cosmos/cosmos-sdk/pull/5866) Move `Keyring` and `Keybase` implementations and their associated types from `crypto/keys/` to `crypto/keybase/`.
* (store) [\#5803](https://github.com/cosmos/cosmos-sdk/pull/5803) The `store.CommitMultiStore` interface now includes the new `store.Snapshotter` interface as well.

### Features

* (x/ibc) [\#5588](https://github.com/cosmos/cosmos-sdk/pull/5588) Add [ICS 024 - Host State Machine Requirements](https://github.com/cosmos/ics/tree/master/spec/ics-024-host-requirements) subpackage to `x/ibc` module.
* (baseapp) [\#5803](https://github.com/cosmos/cosmos-sdk/pull/5803) Added support for taking state snapshots at regular height intervals, via options `snapshot-interval` and `snapshot-retention`.
* (store) [\#5803](https://github.com/cosmos/cosmos-sdk/pull/5803) Added `rootmulti.Store` methods for taking and restoring snapshots, based on `iavl.Store` export/import.

### Bug Fixes

42 changes: 42 additions & 0 deletions baseapp/abci.go
@@ -10,6 +10,7 @@ import (
abci "github.com/tendermint/tendermint/abci/types"

"github.com/cosmos/cosmos-sdk/codec"
store "github.com/cosmos/cosmos-sdk/store/types"
sdk "github.com/cosmos/cosmos-sdk/types"
sdkerrors "github.com/cosmos/cosmos-sdk/types/errors"
)
@@ -262,6 +263,10 @@ func (app *BaseApp) Commit() (res abci.ResponseCommit) {
app.halt()
}

if app.snapshotInterval > 0 && uint64(header.Height)%app.snapshotInterval == 0 {
go app.snapshot(uint64(header.Height))
}

return abci.ResponseCommit{
Data: commitID.Hash,
}
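
The interval check added to `Commit()` above can be isolated as a small predicate. This is a minimal sketch, not code from the PR: `shouldSnapshot` is a hypothetical name, and the `height > 0` guard is an extra assumption (the original only checks the interval) that avoids converting a non-positive height to `uint64`.

```go
package main

import "fmt"

// shouldSnapshot reports whether a snapshot is due at the given height.
// An interval of 0 disables snapshotting, as in the Commit() check.
// The height > 0 guard is an addition for this sketch, not part of the PR.
func shouldSnapshot(height int64, interval uint64) bool {
	return interval > 0 && height > 0 && uint64(height)%interval == 0
}

func main() {
	for _, h := range []int64{1, 999, 1000, 2000} {
		fmt.Printf("height=%d snapshot=%v\n", h, shouldSnapshot(h, 1000))
	}
}
```

Keeping the predicate separate from the `go app.snapshot(...)` call would make the scheduling rule easy to unit test without starting a goroutine.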
@@ -289,6 +294,43 @@ func (app *BaseApp) halt() {
os.Exit(0)
}

// snapshot takes a snapshot of the current state and prunes any old snapshots
func (app *BaseApp) snapshot(height uint64) {
format := store.SnapshotFormat
app.logger.Info("Taking state snapshot", "height", height, "format", format)
if app.snapshotStore == nil {
app.logger.Error("No snapshot store configured")
return
}
if app.snapshotStore.Active() {
app.logger.Error("A state snapshot is already in progress")
return
}
chunks, err := app.cms.Snapshot(height, format)
if err != nil {
app.logger.Error("Failed to take state snapshot", "height", height, "format", format,
"err", err.Error())
return
}
err = app.snapshotStore.Save(height, format, chunks)
if err != nil {
app.logger.Error("Failed to save state snapshot", "height", height, "format", format,
"err", err.Error())
return
}
app.logger.Info("Completed state snapshot", "height", height, "format", format)

if app.snapshotRetention > 0 {
app.logger.Debug("Pruning state snapshots")
pruned, err := app.snapshotStore.Prune(app.snapshotRetention)
if err != nil {
app.logger.Error("Failed to prune state snapshots", "err", err.Error())
return
}
app.logger.Debug("Pruned state snapshots", "pruned", pruned)
}
}
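
The retention rule used by `snapshot()` (keep the N most recent snapshots, 0 keeps all) can be sketched on a plain list of heights. `pruneHeights` is a hypothetical helper for illustration only; it is not the `snapshots.Store.Prune` implementation, which is not shown in this diff.

```go
package main

import (
	"fmt"
	"sort"
)

// pruneHeights is a hypothetical helper, not the snapshots.Store API.
// It keeps the `retention` most recent heights and returns the kept and
// pruned heights, newest first. A retention of 0 keeps everything.
func pruneHeights(heights []uint64, retention uint32) (kept, pruned []uint64) {
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]uint64(nil), heights...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] > sorted[j] })
	if retention == 0 || len(sorted) <= int(retention) {
		return sorted, nil
	}
	return sorted[:retention], sorted[retention:]
}

func main() {
	kept, pruned := pruneHeights([]uint64{100, 300, 200, 400}, 3)
	fmt.Println("kept:", kept, "pruned:", pruned)
}
```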

// Query implements the ABCI interface. It delegates to CommitMultiStore if it
// implements Queryable.
func (app *BaseApp) Query(req abci.RequestQuery) abci.ResponseQuery {
6 changes: 6 additions & 0 deletions baseapp/baseapp.go
@@ -13,6 +13,7 @@ import (
"github.com/tendermint/tendermint/libs/log"
dbm "github.com/tendermint/tm-db"

"github.com/cosmos/cosmos-sdk/snapshots"
"github.com/cosmos/cosmos-sdk/store"
sdk "github.com/cosmos/cosmos-sdk/types"
sdkerrors "github.com/cosmos/cosmos-sdk/types/errors"
@@ -70,6 +71,11 @@ type BaseApp struct { // nolint: maligned
idPeerFilter sdk.PeerFilter // filter peers by node ID
fauxMerkleMode bool // if true, IAVL MountStores uses MountStoresDB for simulation speed.

// snapshot storage, i.e. dumps of app state at certain intervals
snapshotStore *snapshots.Store
snapshotInterval uint64 // interval (in blocks) between snapshots (0 to disable)
snapshotRetention uint32 // number of snapshots to keep (0 for all)

// volatile states:
//
// checkState is set on InitChain and reset on Commit
29 changes: 29 additions & 0 deletions baseapp/options.go
@@ -6,6 +6,7 @@ import (

dbm "github.com/tendermint/tm-db"

"github.com/cosmos/cosmos-sdk/snapshots"
"github.com/cosmos/cosmos-sdk/store"
sdk "github.com/cosmos/cosmos-sdk/types"
)
@@ -44,6 +45,16 @@ func SetInterBlockCache(cache sdk.MultiStorePersistentCache) func(*BaseApp) {
return func(app *BaseApp) { app.setInterBlockCache(cache) }
}

// SetSnapshotStore sets the snapshot store.
func SetSnapshotStore(snapshotStore *snapshots.Store) func(*BaseApp) {
return func(app *BaseApp) { app.SetSnapshotStore(snapshotStore) }
}

// SetSnapshotPolicy sets the snapshot policy.
func SetSnapshotPolicy(interval uint64, retention uint32) func(*BaseApp) {
return func(app *BaseApp) { app.SetSnapshotPolicy(interval, retention) }
}

func (app *BaseApp) SetName(name string) {
if app.sealed {
panic("SetName() on sealed BaseApp")
@@ -143,3 +154,21 @@ func (app *BaseApp) SetRouter(router sdk.Router) {
}
app.router = router
}

// SetSnapshotStore sets the snapshot store.
func (app *BaseApp) SetSnapshotStore(snapshotStore *snapshots.Store) {
if app.sealed {
panic("SetSnapshotStore() on sealed BaseApp")
}
app.snapshotStore = snapshotStore
}

// SetSnapshotPolicy sets the snapshotting policy. 0 interval disables snapshotting, and 0 retention
// keeps all snapshots.
func (app *BaseApp) SetSnapshotPolicy(interval uint64, retention uint32) {
if app.sealed {
panic("SetSnapshotPolicy() on sealed BaseApp")
}
app.snapshotInterval = interval
app.snapshotRetention = retention
}
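
The two layers in this file (a package-level option constructor that wraps a setter, and a setter that panics once the app is sealed) can be sketched with a stand-in type. Everything here is hypothetical: `app`, `newApp`, and `withSnapshotPolicy` are illustrative names, and sealing at the end of the constructor is a simplification rather than how `BaseApp` is actually sealed.

```go
package main

import "fmt"

// app is a stand-in for BaseApp with only the fields needed here.
type app struct {
	sealed            bool
	snapshotInterval  uint64
	snapshotRetention uint32
}

// setSnapshotPolicy mirrors the shape of BaseApp.SetSnapshotPolicy:
// it refuses to change configuration once the app is sealed.
func (a *app) setSnapshotPolicy(interval uint64, retention uint32) {
	if a.sealed {
		panic("setSnapshotPolicy() on sealed app")
	}
	a.snapshotInterval = interval
	a.snapshotRetention = retention
}

// withSnapshotPolicy returns an option that applies the policy when called.
func withSnapshotPolicy(interval uint64, retention uint32) func(*app) {
	return func(a *app) { a.setSnapshotPolicy(interval, retention) }
}

// newApp applies all options and then seals the app (a simplification).
func newApp(opts ...func(*app)) *app {
	a := &app{}
	for _, opt := range opts {
		opt(a)
	}
	a.sealed = true
	return a
}

func main() {
	a := newApp(withSnapshotPolicy(1000, 3))
	fmt.Println(a.snapshotInterval, a.snapshotRetention)
}
```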
1 change: 1 addition & 0 deletions go.sum
@@ -137,6 +137,7 @@ github.com/golang/protobuf v1.3.0/go.mod h1:Qd/q+1AKNOZr9uGQzbzCmRO6sUih6GTPZv6a
github.com/golang/protobuf v1.3.1/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/golang/protobuf v1.3.2/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/golang/protobuf v1.3.3/go.mod h1:vzj43D7+SQXF/4pzW/hwtAqwc6iTitCiVSaWz5lYuqw=
github.com/golang/protobuf v1.3.4 h1:87PNWwrRvUSnqS4dlcBU/ftvOIBep4sYuBLlh6rX2wk=
github.com/golang/protobuf v1.3.4/go.mod h1:vzj43D7+SQXF/4pzW/hwtAqwc6iTitCiVSaWz5lYuqw=
github.com/golang/protobuf v1.4.0-rc.1/go.mod h1:ceaxUfeHdC40wWswd/P6IGgMaK3YpKi5j83Wpe3EHw8=
github.com/golang/protobuf v1.4.0-rc.1.0.20200221234624-67d41d38c208/go.mod h1:xKAWHe0F5eneWXFV3EuXVDTCmh+JuBKY0li0aMyXATA=
14 changes: 11 additions & 3 deletions server/config/config.go
@@ -32,6 +32,12 @@ type BaseConfig struct {
// Note: Commitment of state will be attempted on the corresponding block.
HaltTime uint64 `mapstructure:"halt-time"`

// SnapshotInterval sets the interval between state snapshots (in blocks). 0 to disable.
SnapshotInterval uint64 `mapstructure:"snapshot-interval"`

// SnapshotRetention sets the number of recent snapshots to keep. 0 keeps all snapshots.
SnapshotRetention uint32 `mapstructure:"snapshot-retention"`

// InterBlockCache enables inter-block caching.
InterBlockCache bool `mapstructure:"inter-block-cache"`

@@ -74,9 +80,11 @@ func (c *Config) GetMinGasPrices() sdk.DecCoins {
func DefaultConfig() *Config {
return &Config{
BaseConfig{
- MinGasPrices: defaultMinGasPrices,
- InterBlockCache: true,
- Pruning: store.PruningStrategySyncable,
+ MinGasPrices: defaultMinGasPrices,
+ InterBlockCache: true,
+ Pruning: store.PruningStrategySyncable,
+ SnapshotInterval: 0,
+ SnapshotRetention: 3,
},
}
}
6 changes: 6 additions & 0 deletions server/config/toml.go
@@ -39,6 +39,12 @@ inter-block-cache = {{ .BaseConfig.InterBlockCache }}
# nothing: all historic states will be saved, nothing will be deleted (i.e. archiving node)
# everything: all saved states will be deleted, storing only the current state
pruning = "{{ .BaseConfig.Pruning }}"

# State snapshots can be taken at regular height intervals, given by snapshot-interval (0 to
# disable). Old snapshots can be removed by setting snapshot-retention, giving the number
# of recent snapshots to keep (0 to keep all).
snapshot-interval = {{ .BaseConfig.SnapshotInterval }}
snapshot-retention = {{ .BaseConfig.SnapshotRetention }}
`

var configTemplate *template.Template
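
To see how the two new template lines render, the following sketch feeds a config value through `text/template`. The `baseConfig` and `config` types and the `render` function are simplified stand-ins written for this example, not the real `server/config` types; only the two template lines are taken from the diff.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// baseConfig and config are simplified stand-ins for the server config types.
type baseConfig struct {
	SnapshotInterval  uint64
	SnapshotRetention uint32
}

type config struct {
	BaseConfig baseConfig
}

// snapshotTemplate holds only the two lines added to the TOML template.
const snapshotTemplate = `snapshot-interval = {{ .BaseConfig.SnapshotInterval }}
snapshot-retention = {{ .BaseConfig.SnapshotRetention }}
`

// render executes the template against cfg and returns the TOML fragment.
func render(cfg config) (string, error) {
	tmpl, err := template.New("snapshot").Parse(snapshotTemplate)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, cfg); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Same values as the defaults in DefaultConfig(): interval 0, retention 3.
	out, err := render(config{BaseConfig: baseConfig{SnapshotInterval: 0, SnapshotRetention: 3}})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```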
8 changes: 8 additions & 0 deletions server/mock/store.go
@@ -99,6 +99,14 @@ func (ms multiStore) SetInterBlockCache(_ sdk.MultiStorePersistentCache) {
panic("not implemented")
}

func (ms multiStore) Snapshot(height uint64, format uint32) (<-chan io.ReadCloser, error) {
panic("not implemented")
}

func (ms multiStore) Restore(height uint64, format uint32, chunks <-chan io.ReadCloser) error {
panic("not implemented")
}

var _ sdk.KVStore = kvStore{}

type kvStore struct {
4 changes: 4 additions & 0 deletions server/start.go
@@ -32,6 +32,8 @@ const (
FlagHaltTime = "halt-time"
FlagInterBlockCache = "inter-block-cache"
FlagUnsafeSkipUpgrades = "unsafe-skip-upgrades"
FlagSnapshotInterval = "snapshot-interval"
FlagSnapshotRetention = "snapshot-retention"
)

var (
@@ -102,6 +104,8 @@ which accepts a path for the resulting pprof file.
cmd.Flags().Uint64(FlagHaltHeight, 0, "Block height at which to gracefully halt the chain and shutdown the node")
cmd.Flags().Uint64(FlagHaltTime, 0, "Minimum block time (in Unix seconds) at which to gracefully halt the chain and shutdown the node")
cmd.Flags().Bool(FlagInterBlockCache, true, "Enable inter-block caching")
cmd.Flags().Uint64(FlagSnapshotInterval, 0, "Interval between state snapshots, in blocks (0 to disable)")
cmd.Flags().Uint32(FlagSnapshotRetention, 0, "Number of recent state snapshots to keep (0 to keep all)")
cmd.Flags().String(flagCPUProfile, "", "Enable CPU profiling and write to the provided file")

// add support for all Tendermint-specific command line options
104 changes: 104 additions & 0 deletions snapshots/restorer.go
@@ -0,0 +1,104 @@
package snapshots

import (
"errors"
"fmt"
"io"
"time"

store "github.com/cosmos/cosmos-sdk/store/types"
)

// Restorer is a helper that manages an asynchronous snapshot restoration process
type Restorer struct {
height uint64
format uint32
chunks uint32
nextChunk uint32
chChunks chan<- io.ReadCloser
chDone <-chan error
}

// NewRestorer starts a snapshot restoration for the given target. The caller must call Close()
// when done; Add() finalizes the restore once the final chunk has been given.
func NewRestorer(target store.Snapshotter, height uint64, format uint32, chunks uint32) (*Restorer, error) {
chChunks := make(chan io.ReadCloser, 4)
chDone := make(chan error, 1)
go func() {
chDone <- target.Restore(height, format, chChunks)
close(chDone)
}()

// Check for any initial errors from the restore. This is a bit of a code smell.
select {
case err := <-chDone:
close(chChunks)
if err == nil {
err = errors.New("restore ended unexpectedly")
}
return nil, err
case <-time.After(10 * time.Millisecond):
return &Restorer{
height: height,
format: format,
chunks: chunks,
nextChunk: 1,
chChunks: chChunks,
chDone: chDone,
}, nil
}
}
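
The startup check in `NewRestorer` (start the work in a goroutine, then wait briefly for an early failure) can be isolated as below. This is a sketch with hypothetical names (`startAsync`, `failNow`, `waitFor`), not code from the PR. The grace period is a heuristic: a failure that arrives after the window is not reported by the constructor and only shows up later on the returned channel.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// startAsync runs work in a goroutine and waits up to grace for an early
// result. If work finishes within the window it is treated as a failed start;
// otherwise the caller gets a channel that delivers the final result.
func startAsync(work func() error, grace time.Duration) (<-chan error, error) {
	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() {
		done <- work()
		close(done)
	}()
	select {
	case err := <-done:
		if err == nil {
			err = errors.New("work ended unexpectedly")
		}
		return nil, err
	case <-time.After(grace):
		return done, nil
	}
}

// failNow is a worker that fails immediately.
func failNow() error { return errors.New("boom") }

// waitFor returns a worker that blocks until release is closed.
func waitFor(release <-chan struct{}) func() error {
	return func() error {
		<-release
		return nil
	}
}

func main() {
	_, err := startAsync(failNow, time.Second)
	fmt.Println("early failure:", err)

	release := make(chan struct{})
	done, err := startAsync(waitFor(release), 10*time.Millisecond)
	fmt.Println("started without error:", err == nil)
	close(release)
	fmt.Println("final result:", <-done)
}
```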

// Add adds a chunk to be restored. It will finalize the import when the final chunk is given,
// returning true. The returned error may not be caused by the given chunk, since the
// restore is asynchronous and since data records may span multiple chunks.
func (r *Restorer) Add(chunk io.ReadCloser) (bool, error) {
if r.chChunks == nil {
return false, errors.New("no restore in progress")
}

// check if any errors have occurred so far
select {
case err := <-r.chDone:
r.Close()
if err == nil {
err = errors.New("restore ended unexpectedly")
}
return false, err
default:
}

// pass the chunk, and wait for completion if it was the final one
r.chChunks <- chunk
r.nextChunk++
if r.nextChunk > r.chunks {
r.Close()
return true, <-r.chDone
}
return false, nil
}
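
`Add()` starts with a non-blocking receive on the done channel. The sketch below isolates that `select`/`default` idiom; `poll` and `errBoom` are hypothetical names. It also shows why the original maps a nil result to "restore ended unexpectedly": once the worker has sent its result and closed the channel, further receives succeed immediately with a nil error.

```go
package main

import (
	"errors"
	"fmt"
)

var errBoom = errors.New("boom")

// poll checks, without blocking, whether a result is available on done.
// It returns finished=true for both a delivered result and a closed channel.
func poll(done <-chan error) (finished bool, err error) {
	select {
	case err := <-done:
		return true, err
	default:
		return false, nil
	}
}

func main() {
	done := make(chan error, 1)
	fmt.Println(poll(done)) // nothing sent yet

	done <- errBoom
	fmt.Println(poll(done)) // the worker's error

	close(done)
	fmt.Println(poll(done)) // closed channel: finished with a nil error
}
```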

// Close closes the restore, aborting it if not completed.
func (r *Restorer) Close() {
if r != nil && r.chChunks != nil {
close(r.chChunks)
r.chChunks = nil
}
}

// Expects checks if a chunk is the next expected one
func (r *Restorer) Expects(height uint64, format uint32, chunk uint32) error {
if r == nil || r.chChunks == nil {
return errors.New("no restore in progress")
}
if height != r.height {
return fmt.Errorf("unexpected height %v, expected %v", height, r.height)
}
if format != r.format {
return fmt.Errorf("unexpected format %v, expected %v", format, r.format)
}
if chunk != r.nextChunk {
return fmt.Errorf("unexpected chunk %v, expected %v", chunk, r.nextChunk)
}
return nil
}
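
The chunk bookkeeping shared by `Add()` and `Expects()` (chunks are numbered from 1, and the restore is final once the counter passes the total) can be sketched without any channels. `chunkTracker` and its methods are hypothetical stand-ins for the corresponding `Restorer` fields, written only to illustrate the ordering rule.

```go
package main

import "fmt"

// chunkTracker is a hypothetical stand-in for the bookkeeping in Restorer.
type chunkTracker struct {
	height uint64
	format uint32
	chunks uint32 // total number of chunks in the snapshot
	next   uint32 // next expected chunk, starting at 1
}

func newChunkTracker(height uint64, format, chunks uint32) *chunkTracker {
	return &chunkTracker{height: height, format: format, chunks: chunks, next: 1}
}

// expects returns an error unless (height, format, chunk) identifies the
// next chunk of the snapshot being restored.
func (c *chunkTracker) expects(height uint64, format, chunk uint32) error {
	switch {
	case height != c.height:
		return fmt.Errorf("unexpected height %v, expected %v", height, c.height)
	case format != c.format:
		return fmt.Errorf("unexpected format %v, expected %v", format, c.format)
	case chunk != c.next:
		return fmt.Errorf("unexpected chunk %v, expected %v", chunk, c.next)
	}
	return nil
}

// accept records that the expected chunk was applied and reports whether
// it was the final one.
func (c *chunkTracker) accept() bool {
	c.next++
	return c.next > c.chunks
}

func main() {
	tracker := newChunkTracker(10, 1, 2)
	fmt.Println(tracker.expects(10, 1, 2)) // rejected: chunk 1 must come first
	fmt.Println(tracker.expects(10, 1, 1)) // accepted
	fmt.Println(tracker.accept())          // one chunk still outstanding
	fmt.Println(tracker.accept())          // final chunk applied
}
```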