Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alpha: Enable bloom filter caching #5552

Merged
merged 1 commit into from
Jun 1, 2020
Merged

Conversation

jarifibrahim
Copy link
Contributor

@jarifibrahim jarifibrahim commented Jun 1, 2020

Badger supports caching of SST bloom filters, this PR enables caching
bloom filters in ristretto.
The bloom filter cache is enabled only for the p directory. The bloom filters for the w directory will be kept in memory. We can add those later if necessary.

This PR also removes old state todos.


This change is Reviewable

Badger supports caching of SST bloom filters, this PR enables caching
bloom filters in ristretto.
Copy link
Contributor

@ashish-goswami ashish-goswami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comment. :lgtm:

Reviewed 3 of 3 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jarifibrahim, @manishrjain, and @vvbalaji-dgraph)


worker/server_state.go, line 157 at r1 (raw file):

			WithKeepBlockIndicesInCache(true).
			WithKeepBlocksInCache(true).
			WithMaxBfCacheSize(500 << 20) // 500 MB of bloom filter cache.

Any reasoning/benchmarking around having default of 500 MB would be nice. Assuming max size of bloom filter as 1MB for each table. Cache can have bloom filtes for 500 SSTs.

Copy link
Contributor Author

@jarifibrahim jarifibrahim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @manishrjain, and @vvbalaji-dgraph)


worker/server_state.go, line 157 at r1 (raw file):

Previously, ashish-goswami (Ashish Goswami) wrote…

Any reasoning/benchmarking around having default of 500 MB would be nice. Assuming max size of bloom filter as 1MB for each table. Cache can have bloom filtes for 500 SSTs.

No. This is just an estimated size which I think should be enough to store a decent amount of bloom filters.. I did not do any benchmarks.

Copy link
Contributor

@ashish-goswami ashish-goswami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @manishrjain, and @vvbalaji-dgraph)

@jarifibrahim jarifibrahim merged commit 92328a7 into master Jun 1, 2020
@jarifibrahim jarifibrahim deleted the ibrahim/bloomfilter-cache branch June 1, 2020 15:11
jarifibrahim pushed a commit that referenced this pull request Jun 1, 2020
Badger supports the caching of SST bloom filters. This PR enables
caching bloom filters in ristretto. The bloom filter cache is enabled
only for the p directory. The bloom filters for the w directory will be
kept in memory. We can add caching for it later if necessary.

This PR also removes old stale todos.

(cherry picked from commit 92328a7)
jarifibrahim pushed a commit that referenced this pull request Jun 1, 2020
Badger supports the caching of SST bloom filters. This PR enables
caching bloom filters in ristretto. The bloom filter cache is enabled
only for the p directory. The bloom filters for the w directory will be
kept in memory. We can add caching for it later if necessary.

This PR also removes old stale todos.

(cherry picked from commit 92328a7)
jarifibrahim pushed a commit that referenced this pull request Jun 8, 2020
Badger supports the caching of SST bloom filters. This PR enables
caching bloom filters in ristretto. The bloom filter cache is enabled
only for the p directory. The bloom filters for the w directory will be
kept in memory. We can add caching for it later if necessary.

This PR also removes old stale todos.

(cherry picked from commit 92328a7)
jarifibrahim pushed a commit that referenced this pull request Jun 8, 2020
Badger supports the caching of SST bloom filters. This PR enables
caching bloom filters in ristretto. The bloom filter cache is enabled
only for the p directory. The bloom filters for the w directory will be
kept in memory. We can add caching for it later if necessary.

This PR also removes old stale todos.

(cherry picked from commit 92328a7)
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 18, 2020
Badger supports the caching of SST bloom filters. This PR enables
caching bloom filters in ristretto. The bloom filter cache is enabled
only for the p directory. The bloom filters for the w directory will be
kept in memory. We can add caching for it later if necessary.

This PR also removes old stale todos.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants