Introduce Pruning to IAVL #158

Merged
merged 69 commits into from
Jan 16, 2020

AdityaSripal
Member

@AdityaSripal AdityaSripal commented Jul 23, 2019

Addresses issue #144 with spec here: #144 (comment)

  • Only persist versions to disk if they are snapshot versions, and keep recent versions in memDB
  • NodeDB is responsible for handling pruning logic. Application developers simply specify pruning parameters
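
A minimal sketch of the policy described above, not the code in this PR. The field names `KeepEvery` and `KeepRecent` are borrowed from a later review comment in this thread; the `pruningOptions` type and its helper methods are hypothetical, and the exact semantics (a snapshot every `KeepEvery` versions, plus the latest `KeepRecent` versions held in memory) are an assumption.

```go
package main

import "fmt"

// pruningOptions is a hypothetical stand-in for the pruning parameters an
// application would specify; it is not the type defined in this PR.
type pruningOptions struct {
	KeepEvery  int64 // assumed: every KeepEvery-th version is a snapshot on disk
	KeepRecent int64 // assumed: number of latest versions held in memory
}

// isSnapshot reports whether a version would be persisted to disk.
// KeepEvery == 0 is treated here as "never snapshot".
func (o pruningOptions) isSnapshot(version int64) bool {
	return o.KeepEvery > 0 && version%o.KeepEvery == 0
}

// isRecent reports whether a version is still inside the in-memory window
// once the tree has reached version latest.
func (o pruningOptions) isRecent(version, latest int64) bool {
	return version <= latest && version > latest-o.KeepRecent
}

// retained lists the versions in [1, latest] that survive pruning.
func (o pruningOptions) retained(latest int64) []int64 {
	var kept []int64
	for v := int64(1); v <= latest; v++ {
		if o.isSnapshot(v) || o.isRecent(v, latest) {
			kept = append(kept, v)
		}
	}
	return kept
}

func main() {
	opts := pruningOptions{KeepEvery: 5, KeepRecent: 2}
	// Snapshots 5 and 10 stay on disk; 11 and 12 are the recent window.
	fmt.Println(opts.retained(12)) // [5 10 11 12]
}
```

Under these assumed semantics, `{KeepEvery: 1, KeepRecent: 0}` retains every version, which is why that setting is a natural baseline for comparing against the behavior before pruning.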

mattkanwisher and others added 27 commits July 2, 2019 16:22
Storing intermediate IAVL versions in memory and not on disk

Motivation: Both Cosmos and Loom Network save an IAVL version per block, then go back and delete these versions. This causes constant churn on the IAVL tree and the underlying LevelDB database, when realistically you only want to store every X blocks.

At the Berlin Tendermint conference, Zaki and I came up with a plan in which new versions live in memory while still pointing back to nodes on disk, so that the entire IAVL tree does not have to be loaded into main memory. The Loom IAVL tree is around 256 GB, so loading all of it is not feasible.


Usage

The old caller code looks like this:

```go
hash, version, err := s.tree.SaveVersion()
```

The new caller code would look like this:

```go
oldVersion := s.Version()
var version int64
var hash []byte
switch {
case s.flushInterval == 0:
	// Flushing is disabled: persist every version to disk, as before.
	hash, version, err = s.tree.SaveVersion()
case (oldVersion+1)%s.flushInterval == 0:
	// Every flushInterval versions, persist the in-memory version to disk.
	log.Error(fmt.Sprintf("Flushing mem to disk at version %d\n", oldVersion+1))
	hash, version, err = s.tree.FlushMemVersionDisk()
default:
	// In between, keep the new version in memory only.
	hash, version, err = s.tree.SaveVersionMem()
}
```

FlushMemVersionDisk:
Flushes the current memory version to disk

SaveVersionMem:
Saves the current tree to memory instead of disk and gives you back an apphash

This is an opt-in feature: you have to call the new APIs to use it.
We also have a PR that demonstrates its usage: https://github.com/loomnetwork/loomchain/pull/1232/files

We are now committing every 1000 blocks, so we store 1000x less. We also see significant I/O improvements, at least double, from not having to prune old versions of the IAVL tree.
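
To make the arithmetic concrete, the branch logic in the caller code above can be reduced to a pure function and counted. This is a sketch only: `saveAction` and `countDiskWrites` are hypothetical helpers, and the returned strings merely name which of the tree methods mentioned above the caller would invoke.

```go
package main

import "fmt"

// saveAction mirrors the caller-side branch: it names the method that would
// be invoked when moving from oldVersion to oldVersion+1.
func saveAction(oldVersion, flushInterval int64) string {
	switch {
	case flushInterval == 0:
		return "SaveVersion" // feature disabled: persist every version
	case (oldVersion+1)%flushInterval == 0:
		return "FlushMemVersionDisk" // interval boundary: write to disk
	default:
		return "SaveVersionMem" // in between: keep the version in memory
	}
}

// countDiskWrites counts how many of the first n versions reach disk.
func countDiskWrites(n, flushInterval int64) int64 {
	var writes int64
	for old := int64(0); old < n; old++ {
		if saveAction(old, flushInterval) != "SaveVersionMem" {
			writes++
		}
	}
	return writes
}

func main() {
	fmt.Println(countDiskWrites(10000, 0))    // 10000: every version is persisted
	fmt.Println(countDiskWrites(10000, 1000)) // 10: one write per 1000 versions
}
```
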
@AdityaSripal AdityaSripal changed the title Aditya/pruning Introduce Pruning to IAVL Jul 23, 2019
@AdityaSripal
Member Author

AdityaSripal commented Jul 23, 2019

Ready for initial review!! Ended up finding some pretty subtle bugs while testing this. Will need to add some more tests but it would be very helpful to get reviews both on the overall design of Pruning and also someone to look over my current tests and suggest any additional cases I should test for.

TODO:

  • Run Benchmarks

  • Write more tests

@tac0turtle tac0turtle added the R4R label Jul 23, 2019
@zmanian
Member

zmanian commented Dec 10, 2019

@tnachen is taking this over I believe.

@tnachen
Contributor

tnachen commented Dec 10, 2019 via email

@tnachen
Contributor

tnachen commented Dec 20, 2019

I just did some benchmarking with the Cosmos SDK test-sim-benchmark under different configurations, measuring the benchmark completion time and also looking at CPU and memory usage. It's not an exhaustive benchmark, but overall it shows big performance improvements with pruning.

https://docs.google.com/spreadsheets/d/1p9eZ4LIq5wSjogb03_l59AzOMK_Gfu8ncWT6Ujua4nU/edit?usp=sharing

Overall, CPU processing time and memory don't seem to differ much regardless of which recent/every setting is chosen. The biggest difference is time (compaction, etc.).

@tac0turtle tac0turtle mentioned this pull request Jan 6, 2020
6 tasks
@tac0turtle
Member

@tnachen is there a recommended config for users? We should document this.

b.StartTimer()
}

Member Author

Add comment to pause timer so we don't measure db-closing time

Also, remove print statement comment
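
For illustration, pausing the timer around teardown might look like the sketch below. It is not the benchmark from this PR: `fakeDB`, `openFakeDB`, and `benchmarkWrites` are hypothetical stand-ins, and only `b.StopTimer`, `b.StartTimer`, and `testing.Benchmark` are real standard library calls.

```go
package main

import (
	"fmt"
	"testing"
)

// fakeDB is a hypothetical stand-in for a database handle whose Close
// should not count toward the measured time.
type fakeDB struct{ data map[int]int }

func openFakeDB() *fakeDB { return &fakeDB{data: make(map[int]int)} }

// Close simulates teardown work.
func (db *fakeDB) Close() { db.data = nil }

// benchmarkWrites times only the write loop; setup and teardown are
// excluded by pausing the timer around them.
func benchmarkWrites(b *testing.B) {
	for i := 0; i < b.N; i++ {
		b.StopTimer() // pause: opening the DB is setup, not the work under test
		db := openFakeDB()
		b.StartTimer()

		for k := 0; k < 10000; k++ {
			db.data[k] = k
		}

		b.StopTimer() // pause timer so we don't measure db-closing time
		db.Close()
		b.StartTimer()
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	res := testing.Benchmark(benchmarkWrites)
	fmt.Println(res)
}
```

Stopping and restarting the timer inside the loop adds some overhead of its own, so this pattern fits best when each iteration does a meaningful amount of timed work.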

Member Author

@AdityaSripal AdityaSripal left a comment

Approved after Tim includes the latest benchmarking results he got from the SDK.
Also think my old benchmark results should be deleted since they weren't benchmarking very useful tasks.

Should be placed in the folder in format explained by Marko above

I believe {KeepEvery: 1, KeepRecent: 0} should also be benchmarked, since this will test whether the new changes cause a major regression from the current master branch behavior.

* errors: add some error handling

tm-db 0.4.0 introduced more functions that return errors, so more errors are now bubbled up.

Signed-off-by: Marko Baricevic <[email protected]>

* wrap errors
Member

@tac0turtle tac0turtle left a comment

@tac0turtle tac0turtle merged commit be60f69 into master Jan 16, 2020
@tac0turtle tac0turtle deleted the aditya/pruning branch January 16, 2020 11:35
@erikgrinaker erikgrinaker mentioned this pull request Jun 23, 2020
8 participants