-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hash: bound the hashtable size #431
Conversation
b805df9
to
9f15df5
Compare
We could possibly make this a flag to specify at hashtable creation time if we don't always want to limit the size. |
thanks for raising this issue and solution. I think we should additional/configurable eviction mechanisms, the current one is useful for filter_kubernetes but for other users of the implementation might get problems since data will be unavailable. (still thinking...) |
Signed-off-by: Eduardo Silva <[email protected]>
Signed-off-by: Eduardo Silva <[email protected]>
Signed-off-by: Eduardo Silva <[email protected]>
Signed-off-by: Eduardo Silva <[email protected]>
After thinking I realized that the hashtable implementation lacks of two features:
I've pushed a branch called hashtable_eviction where I placed the proper structure to hook the eviction modes. Would you please review the changes and merge your eviction mode in that new schema ? |
@edsiper It looks good to me. I think ideally for the kubernetes plugin it would use an LRU policy. But I think we'd need a priority queue to support that efficiently. Same with the other policies in there (FLB_HASH_EVICT_OLDER and FLB_HASH_EVICT_LESS_USED). Does |
mk_list does not support priority queues. It's a generic type for nested node-lists. As a priority setting up the basic eviction modes should be enough, no complexity, since this needs to be merged into 0.12, then for 0.13 priorities and other stuff can be added without problems |
9f15df5
to
60c68c8
Compare
The hashtable is being used as a cache for kubernetes metadata and environment variables. The drawback is the hashtable is unbounded and doesn't resize itself. Thus over time it uses more memory and performance becomes worse as the chains get longer. For Kubernetes this can be particularly bad in busy environments as the cache key uses podname, which will change every time a deployment occurs. By bounding the hashtable we ensure memory doesn't become unbounded and performance remains constant. For a cache it should be okay to evict old entries. This implementation tries to keep things simple by performing eviction on adding a key to the table. As long as the hashtable size isn't exceeded, the behaviour remains the same as before - the table is looked up via hash function, and the table chain is extended. When the hashtable size is exceeded, the housekeeping will remove all prior keys in a random chain to bring down the hashtable size. The advantage of a random eviction is it's simple and fast versus constructing an LRU list (or something else).
@edsiper Okay I pushed my changes on top of your branch. Note that you accidentally flipped the argument of evict mode + size on flb_hash_create so I fixed that. |
thanks! |
The hashtable is being used as a cache for kubernetes metadata and
environment variables. The drawback is the hashtable is unbounded and
doesn't resize itself. Thus over time it uses more memory and
performance becomes worse as the chains get longer.
For Kubernetes this can be particularly bad in busy environments as the
cache key uses podname, which will change every time a deployment
occurs.
By bounding the hashtable we ensure memory doesn't become unbounded and
performance remains constant. For a cache it should be okay to evict
old entries.
This implementation tries to keep things simple by performing eviction
on adding a key to the table. As long as the hashtable size isn't
exceeded, the behaviour remains the same as before - the table is looked
up via hash function, and the table chain is extended.
When the hashtable size is exceeded, the housekeeping will remove all
prior keys in a random chain to bring down the hashtable size. The
advantage of a random eviction is it's simple and fast versus
constructing an LRU list (or something else).