Rollover AD result index less frequently #168
Currently, we roll over the result index every 30 days or every 300,000 docs, whichever comes first. Assuming each doc is about 1 KB and the result index has five shards, each shard holds only about 60 MB, which is too small. Small shards go against Elasticsearch best practices. This PR raises the rollover threshold to 9,000,000 docs, which increases the maximum shard size to roughly 1.8 GB.
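The shard sizes quoted above follow from simple arithmetic. Here is a minimal sketch of that calculation, assuming decimal units (1 KB = 1,000 bytes), an average doc size of 1 KB, and five primary shards as stated in the description. The class and method names are hypothetical and are not part of the plugin.

```java
// Back-of-the-envelope estimate only; the doc size and shard count are
// assumptions taken from the PR description, not measured values.
final class ShardSizeEstimate {
    static final long BYTES_PER_DOC = 1_000L; // assumed average doc size (1 KB, decimal)
    static final int PRIMARY_SHARDS = 5;      // assumed number of primary shards

    /** Approximate bytes per primary shard once the index holds maxDocs documents. */
    static long bytesPerShard(long maxDocs) {
        return maxDocs * BYTES_PER_DOC / PRIMARY_SHARDS;
    }

    public static void main(String[] args) {
        // Old threshold: 300,000 docs. New threshold: 9,000,000 docs.
        System.out.println("old threshold, bytes per shard: " + bytesPerShard(300_000L));
        System.out.println("new threshold, bytes per shard: " + bytesPerShard(9_000_000L));
    }
}
```

Under these assumptions the old threshold yields 60,000,000 bytes (60 MB) per shard and the new one 1,800,000,000 bytes (1.8 GB). Real shard sizes depend on the actual doc size and on compression.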
// Suppose generally per cluster has 200 detectors and all run with 1 minute interval.
// We will get 288,000 AD result docs. So set it as 300k to avoid multiple roll overs
// per day.
Minor: the comments are outdated.
updated
As Lai said, please update the comments.
  // per day.
- 300 * 1000L,
+ 9_000_000L,
Is this a problem for 200 detectors? According to the comment, 300k already covers the 288k docs per day and avoids multiple rollovers per day, right?
It is not a problem for 200 detectors. The problem is small shards. Please see my PR description.
Right, I get that. The code comment is misleading here. Why not put the explanation from the PR description into the comment? The comment still talks about 200 detectors and rollovers. At least mention the PR?
One more thing you might want to confirm: does a larger index hurt performance compared to the previous, smaller index?
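For reference, the 200-detector scenario discussed in this thread can be checked with a short calculation. This is a rough sketch, assuming each detector writes one result doc per interval (as the original code comment does) and that the index rolls over as soon as either the doc-count or the 30-day age condition is met. The class and method names are hypothetical.

```java
// Rough estimate only; assumes one result doc per detector per interval and
// ignores how often the rollover conditions are actually evaluated.
final class RolloverTiming {
    static final long MINUTES_PER_DAY = 24L * 60L;

    /** Result docs written per day by the given number of detectors. */
    static long docsPerDay(int detectors, int intervalMinutes) {
        return detectors * (MINUTES_PER_DAY / intervalMinutes);
    }

    /** Days until rollover: whichever of the doc-count or age condition is met first. */
    static double daysUntilRollover(long maxDocs, int maxAgeDays, long docsPerDay) {
        double daysToMaxDocs = (double) maxDocs / docsPerDay;
        return Math.min(daysToMaxDocs, maxAgeDays);
    }

    public static void main(String[] args) {
        long perDay = docsPerDay(200, 1); // 200 detectors, 1-minute interval
        System.out.println("docs per day: " + perDay);
        System.out.println("days until rollover at 300k docs: " + daysUntilRollover(300_000L, 30, perDay));
        System.out.println("days until rollover at 9M docs: " + daysUntilRollover(9_000_000L, 30, perDay));
    }
}
```

Under these assumptions, 200 detectors produce 288,000 docs per day, so the old threshold rolls over roughly once a day. The new threshold would take about 31 days to reach, so the 30-day age condition is met first.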
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.