Introduce Pruning to IAVL #158
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -61,4 +61,4 @@ linters-settings: | |
# enabled-tags: | ||
# - performance | ||
# - style | ||
# - experimental | ||
# - experimental |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# Pruning | ||
|
||
|
||
Setting pruning fields in the IAVL tree can improve performance by writing a version to disk only if it is meant to be persisted indefinitely. Versions that will eventually be deleted are held in memory until they are ready to be pruned, which greatly reduces the I/O load of IAVL. | |
|
||
|
||
Custom pruning fields can be set in IAVL using `NewMutableTreePruningOpts`. | |
|
||
|
||
|
||
## Current design | ||
|
||
### NodeDB | ||
NodeDB has extra fields: | ||
|
||
```go | ||
recentDB dbm.DB // Memory node storage. | ||
recentBatch dbm.Batch // Batched writing buffer for memDB. | ||
|
||
// Pruning fields | ||
keepEvery int64 // Saves version to disk periodically | |
keepRecent int64 // Saves recent versions in memory | ||
``` | ||
|
||
If a version is not going to be persisted to disk, it is saved only in `recentDB` (typically a `memDB`). | |
If a version is persisted to disk, it is written to both `recentDB` **and** `snapshotDB` (typically `levelDB`). | |
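
The routing rule above can be sketched as follows. This is a minimal illustration, not the code in this PR. It assumes a version counts as a snapshot when it is a multiple of `keepEvery`, and that `keepEvery == 0` means nothing is ever written to disk. The `PruningOptions` type and its methods are hypothetical names.

```go
package main

import "fmt"

// PruningOptions mirrors the two pruning fields described above.
// The type name is hypothetical and used only for this sketch.
type PruningOptions struct {
	KeepEvery  int64 // persist every KeepEvery-th version to disk
	KeepRecent int64 // number of recent versions held in memory
}

// IsSnapshotVersion reports whether a version would be flushed to snapshotDB.
// Assumption: KeepEvery == 0 disables disk writes entirely; otherwise every
// version that is a multiple of KeepEvery is persisted.
func (o PruningOptions) IsSnapshotVersion(version int64) bool {
	return o.KeepEvery != 0 && version%o.KeepEvery == 0
}

// Destinations lists the stores a version is written to under the rule above:
// always recentDB, plus snapshotDB for snapshot versions.
func (o PruningOptions) Destinations(version int64) []string {
	if o.IsSnapshotVersion(version) {
		return []string{"recentDB", "snapshotDB"}
	}
	return []string{"recentDB"}
}

func main() {
	opts := PruningOptions{KeepEvery: 100, KeepRecent: 5}
	for _, v := range []int64{99, 100, 101} {
		fmt.Println(v, opts.Destinations(v))
	}
}
```

With `KeepEvery: 100`, only version 100 in this example is routed to both stores; versions 99 and 101 stay in memory only.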
|
||
#### Orphans: | ||
|
||
Save each orphan to `memDB` under the key `o|toVersion|fromVersion`. | |
|
||
If there exists a snapshot version `snapVersion` such that `fromVersion < snapVersion < toVersion`, save the orphan to disk as well, under `o|snapVersion|fromVersion`. | |
NOTE: In the unlikely event that two snapshot versions exist between `fromVersion` and `toVersion`, we use the closest snapshot version that is less than `toVersion`. | |
|
||
The old delete algorithm can then be used with some minor simplifications and optimizations. | |
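
The orphan key rule can be illustrated with a small sketch. It assumes snapshot versions fall on multiples of `keepEvery`, and it uses plain strings for keys purely for readability; the actual on-disk key encoding is not shown in this document. The helper names `closestSnapshot` and `orphanKeys` are hypothetical.

```go
package main

import "fmt"

// closestSnapshot returns the largest snapshot version s such that
// fromVersion < s < toVersion. Assumption: snapshot versions are the
// positive multiples of keepEvery.
func closestSnapshot(fromVersion, toVersion, keepEvery int64) (int64, bool) {
	if keepEvery <= 0 || toVersion <= 1 {
		return 0, false
	}
	// Largest multiple of keepEvery strictly below toVersion.
	snap := ((toVersion - 1) / keepEvery) * keepEvery
	if snap > 0 && snap > fromVersion {
		return snap, true
	}
	return 0, false
}

// orphanKeys builds the in-memory key and, when a snapshot version lies
// strictly between fromVersion and toVersion, the additional disk key.
// diskKey is empty when the orphan is not saved to disk.
func orphanKeys(fromVersion, toVersion, keepEvery int64) (memKey, diskKey string) {
	memKey = fmt.Sprintf("o|%d|%d", toVersion, fromVersion)
	if snap, ok := closestSnapshot(fromVersion, toVersion, keepEvery); ok {
		diskKey = fmt.Sprintf("o|%d|%d", snap, fromVersion)
	}
	return memKey, diskKey
}

func main() {
	// Snapshots 5 and 10 both lie between versions 3 and 12;
	// the closest one below toVersion (10) is chosen.
	mem, disk := orphanKeys(3, 12, 5)
	fmt.Println(mem, disk) // o|12|3 o|10|3
}
```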
|
||
### MutableTree | ||
|
||
MutableTree can be instantiated with a pruning-aware NodeDB. | ||
|
||
When `MutableTree` saves a new version, it also calls `PruneRecentVersions` on the NodeDB, which prunes the oldest version in `recentDB` (`latestVersion - keepRecent`). |
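
The sliding window implied by `latestVersion - keepRecent` can be sketched with a toy in-memory store. This is only an illustration of the arithmetic; `recentStore` and its methods are hypothetical and do not reflect how `PruneRecentVersions` is implemented.

```go
package main

import (
	"fmt"
	"sort"
)

// recentStore is a toy stand-in for recentDB that tracks only version numbers.
type recentStore struct {
	keepRecent int64
	versions   map[int64]struct{}
}

// saveVersion records the new latest version, then prunes the version that
// has just fallen out of the window: latestVersion - keepRecent.
func (s *recentStore) saveVersion(latestVersion int64) {
	s.versions[latestVersion] = struct{}{}
	if oldest := latestVersion - s.keepRecent; oldest > 0 {
		delete(s.versions, oldest)
	}
}

// held returns the versions currently in memory, in ascending order.
func (s *recentStore) held() []int64 {
	out := make([]int64, 0, len(s.versions))
	for v := range s.versions {
		out = append(out, v)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	s := &recentStore{keepRecent: 3, versions: map[int64]struct{}{}}
	for v := int64(1); v <= 10; v++ {
		s.saveVersion(v)
	}
	fmt.Println(s.held()) // [8 9 10]
}
```

Because one version is pruned on every save, the store never holds more than `keepRecent` versions once the window has filled.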
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
package benchmarks | ||
|
||
import ( | ||
"fmt" | ||
"math/rand" | ||
"os" | ||
"runtime" | ||
"testing" | ||
|
||
db "github.com/tendermint/tm-db" | ||
) | ||
|
||
type pruningstrat struct { | ||
keepEvery, keepRecent int64 | ||
} | ||
|
||
// To test the effect of a pruning strategy, we must measure the time to execute many blocks. | |
// Execute 5000 blocks with the given IAVL tree's pruning strategy. | |
func runBlockChain(b *testing.B, prefix string, keepEvery int64, keepRecent int64, keyLen, dataLen int) { | ||
// prepare a dir for the db and cleanup afterwards | ||
dirName := fmt.Sprintf("./%s-db", prefix) | ||
defer func() { | ||
err := os.RemoveAll(dirName) | ||
if err != nil { | ||
b.Errorf("%+v\n", err) | ||
} | ||
}() | ||
|
||
runtime.GC() | ||
|
||
// always initialize tree with goleveldb as snapshotDB and memDB as recentDB | ||
snapDB := db.NewDB("test", "goleveldb", dirName) | ||
defer snapDB.Close() | ||
|
||
// var mem runtime.MemStats | ||
Review comment: leftover comments
Reply: currently in the process of getting this to work correctly. Will remove once I've fixed this.
||
// runtime.ReadMemStats(&mem) | ||
// memSize := mem.Alloc | ||
// maxVersion := 0 | ||
var keys [][]byte | ||
for i := 0; i < 100; i++ { | ||
keys = append(keys, randBytes(keyLen)) | ||
} | ||
|
||
// reset timer after initialization logic | ||
b.ResetTimer() | ||
t, _ := prepareTree(b, snapDB, db.NewMemDB(), keepEvery, keepRecent, 5, keyLen, dataLen) | ||
|
||
// create 5000 versions | |
for i := 0; i < 5000; i++ { | ||
// set 5 keys per version | ||
for j := 0; j < 5; j++ { | ||
index := rand.Int63n(100) | ||
t.Set(keys[index], randBytes(dataLen)) | ||
} | ||
_, _, err := t.SaveVersion() | ||
if err != nil { | ||
b.Errorf("Can't save version %d: %v", i, err) | ||
} | ||
// // Pause timer to garbage-collect and remeasure memory usage | ||
// b.StopTimer() | ||
Review comment: ditto
||
// runtime.GC() | ||
// runtime.ReadMemStats(&mem) | ||
// // update memSize if it has increased after saveVersion | ||
// if memSize < mem.Alloc { | ||
// memSize = mem.Alloc | ||
// maxVersion = i | ||
// } | ||
// b.StartTimer() | ||
b.StopTimer() | ||
runtime.GC() | ||
b.StartTimer() | ||
} | ||
//fmt.Printf("Maximum Memory usage was %0.2f MB at height %d\n", float64(memSize)/1000000, maxVersion) | |
b.StopTimer() | ||
} | ||
|
||
func BenchmarkPruningStrategies(b *testing.B) { | ||
ps := []pruningstrat{ | ||
{1, 0}, // default pruning strategy | ||
//{1, 1}, | ||
{0, 1}, // keep single recent version | ||
{100, 1}, | ||
{100, 5}, // simple pruning | ||
{5, 1}, | ||
{5, 2}, | ||
{10, 2}, | ||
// {1000, 10}, // average pruning | ||
Review comment: ditto
||
// {1000, 1}, // extreme pruning | ||
// {10000, 100}, // SDK pruning | ||
Review comment: Are these commented out on purpose?
||
} | ||
for _, ps := range ps { | ||
ps := ps | ||
prefix := fmt.Sprintf("PruningStrategy{%d-%d}-KeyLen:%d-DataLen:%d", ps.keepEvery, ps.keepRecent, 16, 40) | ||
|
||
b.Run(prefix, func(sub *testing.B) { | ||
runBlockChain(sub, prefix, ps.keepEvery, ps.keepRecent, 16, 40) | ||
}) | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
goos: darwin | ||
goarch: amd64 | ||
pkg: github.com/tendermint/iavl/benchmarks | ||
BenchmarkPruningStrategies/PruningStrategy{1-0}-KeyLen:16-DataLen:40-8 1 2837806322 ns/op | ||
BenchmarkPruningStrategies/PruningStrategy{0-1}-KeyLen:16-DataLen:40-8 1 1124373981 ns/op | ||
BenchmarkPruningStrategies/PruningStrategy{100-1}-KeyLen:16-DataLen:40-8 1 1255040658 ns/op | ||
BenchmarkPruningStrategies/PruningStrategy{100-5}-KeyLen:16-DataLen:40-8 1 1459752743 ns/op | ||
PASS | ||
ok github.com/tendermint/iavl/benchmarks 12.375s |
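
To make the numbers above easier to compare, the relative speedups over the default `{1-0}` strategy can be computed directly from the reported `ns/op` values. The sketch below only does that arithmetic; the `result` type and `speedup` helper are hypothetical. Each benchmark ran for a single iteration, so the ratios (roughly 2.5x, 2.3x, and 1.9x) should be read as indicative rather than precise.

```go
package main

import "fmt"

// result pairs a pruning strategy label with its reported ns/op.
type result struct {
	name    string
	nsPerOp float64
}

// speedup returns how many times faster `other` is than `baseline`.
func speedup(baseline, other float64) float64 {
	return baseline / other
}

func main() {
	// Values copied from the benchmark output above.
	baseline := result{"{1-0}", 2837806322}
	others := []result{
		{"{0-1}", 1124373981},
		{"{100-1}", 1255040658},
		{"{100-5}", 1459752743},
	}
	for _, r := range others {
		fmt.Printf("%s: %.2fx faster than %s\n",
			r.name, speedup(baseline.nsPerOp, r.nsPerOp), baseline.name)
	}
}
```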
Review comment: Maybe add a line somewhere on how to reproduce the benchmarks?