increase blockstore 'writecache' functionality #2850
Comments
@whyrusleeping if you are going to use a bloom filter, I would think you would want to implement it on the 'flatfs' datastore so that it can be complete and saved to disk. If the flatfs datastore has a bloom filter, then I imagine the write cache can probably be eliminated.
We can use a bloom filter for smart cache management, not for caching. Scrap that. (Note to self: read the weighted bloom filter paper in the future.)
@whyrusleeping @Kubuxu do we have any stats on the number of Has() requests on the gateway nodes that return false? With bitswap, negative Has() requests would seem like a common thing. What I am proposing that might help is to cache Has() itself at the datastore level with a bloom filter, which can be kept complete by maintaining a copy on disk. I actually know very little about bloom filters, so I have no idea if this is practical; consider this a question. Is this something worth considering? I also have no idea whether the cost of the bloom filter will outweigh the cost of the negative stat system calls that check if a file exists in the flatfs datastore. Thoughts?
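For reference, since the question of how bloom filters behave came up: below is a minimal, hypothetical sketch of one in Go (double hashing over FNV). This is not the implementation referenced elsewhere in this thread. The property that matters for caching Has() is that a bloom filter can return false positives but never false negatives, so a negative check is authoritative and can safely skip the disk. The on-disk copy suggested above could simply be the `bits` slice flushed to a file periodically.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// BloomFilter is a minimal bloom filter: a bit set plus k hash
// positions derived from one FNV-128a digest via double hashing.
// It may report false positives but never false negatives.
type BloomFilter struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of hash positions per key
}

func NewBloomFilter(m, k uint64) *BloomFilter {
	return &BloomFilter{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes splits one 128-bit FNV digest into two 64-bit values,
// combined later as h1 + i*h2 (the standard double-hashing trick).
func (b *BloomFilter) hashes(key []byte) (uint64, uint64) {
	h := fnv.New128a()
	h.Write(key)
	sum := h.Sum(nil)
	return binary.BigEndian.Uint64(sum[:8]), binary.BigEndian.Uint64(sum[8:])
}

func (b *BloomFilter) Insert(key []byte) {
	h1, h2 := b.hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= 1 << (pos % 64)
	}
}

func (b *BloomFilter) Check(key []byte) bool {
	h1, h2 := b.hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false // definitely not present
		}
	}
	return true // possibly present (false positives are possible)
}

func main() {
	bf := NewBloomFilter(1<<20, 7) // ~1M bits, 7 hash positions
	bf.Insert([]byte("QmSomeBlockKey"))
	fmt.Println(bf.Check([]byte("QmSomeBlockKey")))  // true
	fmt.Println(bf.Check([]byte("QmMissingBlock"))) // almost certainly false
}
```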
Yes, we plan on making a bloom filter with an LRU cache behind it. The bloom filter would have to be rebuilt on GC, but that isn't a problem, as the implementation of the bloom filter I am working on now requires <100ns per insertion and <80ns per check. The bloom filter itself should be quite small.
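A rough sketch of that layering, with simplified stand-in interfaces (the real go-ipfs blockstore API and a proper LRU are assumed, not shown; the `bloom` interface could be satisfied by the sketch in the earlier comment):

```go
package blockstore

import "sync"

// bloom abstracts whatever bloom filter implementation ends up being
// used; only these two methods matter here.
type bloom interface {
	Insert(key []byte)
	Check(key []byte) bool
}

// Blockstore is a simplified stand-in for the real blockstore
// interface, which uses multihash keys rather than strings.
type Blockstore interface {
	Has(key string) (bool, error)
}

// cachedBlockstore layers a bloom filter plus a small exact cache in
// front of a backing blockstore. A bloom-filter miss is a definite
// "no" and never touches the disk; only filter hits that also miss
// the exact cache fall through to the backing store.
type cachedBlockstore struct {
	mu     sync.Mutex
	filter bloom
	recent map[string]bool // stand-in for a real LRU with eviction
	max    int
	base   Blockstore
}

// newCachedBlockstore assumes the caller has already seeded the
// filter with every key currently in the datastore (and re-seeds it
// after GC); without that, the definite-negative short-circuit in
// Has would be wrong.
func newCachedBlockstore(f bloom, base Blockstore, max int) *cachedBlockstore {
	return &cachedBlockstore{filter: f, recent: make(map[string]bool, max), max: max, base: base}
}

func (c *cachedBlockstore) Has(key string) (bool, error) {
	c.mu.Lock()
	if !c.filter.Check([]byte(key)) {
		c.mu.Unlock()
		return false, nil // definite negative: no disk access at all
	}
	if has, ok := c.recent[key]; ok {
		c.mu.Unlock()
		return has, nil // recent exact answer: no disk access
	}
	c.mu.Unlock()

	has, err := c.base.Has(key) // possible false positive: ask the disk
	if err != nil {
		return false, err
	}
	c.mu.Lock()
	if len(c.recent) >= c.max {
		c.recent = make(map[string]bool, c.max) // crude eviction stand-in
	}
	c.recent[key] = has
	if has {
		c.filter.Insert([]byte(key))
	}
	c.mu.Unlock()
	return has, nil
}
```

Since bloom filters do not support deletion, a GC run invalidates the filter: rebuilding means allocating a fresh one and re-inserting every surviving key, which at <100ns per insertion is cheap.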
We can check the total number of Has requests; we don't know whether they return true or false, though.
we currently have a cache in `blocks/blockstore/writecache.go` that caches blocks we've written, so later checks can avoid duplicating writes.

The most common datastore call by far is `Has`; it causes a lot of disk contention. You can view the number of calls per method in the expvars that ipfs exports (`localhost:5001/debug/vars`). On all `Get`, `Has`, and `PutMany` calls we should be updating the cache accordingly. We should also investigate increasing this cache size, and take a look at what using a different data structure (such as a bloom filter) might look like.