Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hash: bound the hashtable size #431

Merged
merged 5 commits into from
Nov 22, 2017
Merged

Conversation

jsravn
Copy link
Contributor

@jsravn jsravn commented Nov 14, 2017

The hashtable is being used as a cache for kubernetes metadata and
environment variables. The drawback is the hashtable is unbounded and
doesn't resize itself. Thus over time it uses more memory and
performance becomes worse as the chains get longer.

For Kubernetes this can be particularly bad in busy environments as the
cache key uses podname, which will change every time a deployment
occurs.

By bounding the hashtable we ensure memory doesn't become unbounded and
performance remains constant. For a cache it should be okay to evict
old entries.

This implementation tries to keep things simple by performing eviction
on adding a key to the table. As long as the hashtable size isn't
exceeded, the behaviour remains the same as before - the table is looked
up via hash function, and the table chain is extended.

When the hashtable size is exceeded, the housekeeping will remove all
prior keys in a random chain to bring down the hashtable size. The
advantage of a random eviction is it's simple and fast versus
constructing an LRU list (or something else).

@jsravn
Copy link
Contributor Author

jsravn commented Nov 15, 2017

We could possibly make this a flag to specify at hashtable creation time if we don't always want to limit the size.

@edsiper
Copy link
Member

edsiper commented Nov 15, 2017

thanks for raising this issue and solution.

I think we should additional/configurable eviction mechanisms, the current one is useful for filter_kubernetes but for other users of the implementation might get problems since data will be unavailable.

(still thinking...)

@edsiper
Copy link
Member

edsiper commented Nov 17, 2017

@jsravn

After thinking I realized that the hashtable implementation lacks of two features:

  • define a maximum of entries supported
  • define an eviction mode

I've pushed a branch called hashtable_eviction where I placed the proper structure to hook the eviction modes. Would you please review the changes and merge your eviction mode in that new schema ?

@jsravn
Copy link
Contributor Author

jsravn commented Nov 20, 2017

@edsiper It looks good to me. I think ideally for the kubernetes plugin it would use an LRU policy. But I think we'd need a priority queue to support that efficiently. Same with the other policies in there (FLB_HASH_EVICT_OLDER and FLB_HASH_EVICT_LESS_USED). Does mk_ have a priority queue? What do you think about adding one?

@edsiper
Copy link
Member

edsiper commented Nov 20, 2017

mk_list does not support priority queues. It's a generic type for nested node-lists.

As a priority setting up the basic eviction modes should be enough, no complexity, since this needs to be merged into 0.12, then for 0.13 priorities and other stuff can be added without problems

The hashtable is being used as a cache for kubernetes metadata and
environment variables. The drawback is the hashtable is unbounded and
doesn't resize itself. Thus over time it uses more memory and
performance becomes worse as the chains get longer.

For Kubernetes this can be particularly bad in busy environments as the
cache key uses podname, which will change every time a deployment
occurs.

By bounding the hashtable we ensure memory doesn't become unbounded and
performance remains constant. For a cache it should be okay to evict
old entries.

This implementation tries to keep things simple by performing eviction
on adding a key to the table. As long as the hashtable size isn't
exceeded, the behaviour remains the same as before - the table is looked
up via hash function, and the table chain is extended.

When the hashtable size is exceeded, the housekeeping will remove all
prior keys in a random chain to bring down the hashtable size.  The
advantage of a random eviction is it's simple and fast versus
constructing an LRU list (or something else).
@jsravn
Copy link
Contributor Author

jsravn commented Nov 20, 2017

@edsiper Okay I pushed my changes on top of your branch.

Note that you accidentally flipped the argument of evict mode + size on flb_hash_create so I fixed that.

@edsiper edsiper merged commit fad0707 into fluent:master Nov 22, 2017
@edsiper
Copy link
Member

edsiper commented Nov 22, 2017

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants