-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support index cleaner for rollover indices and add integration tests #1689
Support index cleaner for rollover indices and add integration tests #1689
Conversation
Signed-off-by: Pavol Loffay <[email protected]>
7f5b45b
to
83e2621
Compare
Codecov Report
@@ Coverage Diff @@
## master #1689 +/- ##
=======================================
Coverage 98.49% 98.49%
=======================================
Files 193 193
Lines 9286 9286
=======================================
Hits 9146 9146
Misses 111 111
Partials 29 29 Continue to review full report at Codecov.
|
@objectiser could you please have a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - just a couple of questions/comments.
@@ -15,7 +15,8 @@ def main(): | |||
print('HOSTNAME ... specifies which Elasticsearch hosts URL to search and delete indices from.') | |||
print('TIMEOUT ... number of seconds to wait for master node response.'.format(TIMEOUT)) | |||
print('INDEX_PREFIX ... specifies index prefix.') | |||
print('ARCHIVE ... specifies whether to remove archive indices (default false).') | |||
print('ARCHIVE ... specifies whether to remove archive indices (only works for rollover) (default false).') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a change in behaviour, or did it never work for archive (with daily indices)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just a comment change.
It worked and works for archive - but it is only supported for archive managed by rollover. If there is one archive we do not touch it.
plugin/storage/es/esCleaner.py
Outdated
@@ -60,13 +64,36 @@ def main(): | |||
def filter_main_indices(ilo, prefix): | |||
ilo.filter_by_regex(kind='prefix', value=prefix + "jaeger") | |||
empty_list(ilo, "No indices to delete") | |||
ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-read'], exclude=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these included, aren't they related to rollover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are excluded, see the last parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry poor use of words, I mean why are these lines included/added - as they seem to relate to aliases related to the rollover approach. From looking at the tests, it seems like it is so that existing aliases (from rollover) are not touched - but that would imply the user has gone from a rollover environment back to using daily indices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have simplified this in the last commit. Now there is a regex which matches exactly the daily inidices so no need to remove rollover indices afterwards.
# This excludes archive index as we use source='name' | ||
# source `creation_date` would include archive index | ||
ilo.filter_by_age(source='name', direction='older', timestring='%Y-%m-%d', unit='days', unit_count=int(sys.argv[1])) | ||
|
||
|
||
def filter_main_indices_rollover(ilo, prefix): | ||
ilo.filter_by_alias(aliases=[prefix + 'jaeger-*-read']) | ||
empty_list(ilo, "No indices to delete") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these required after each filter, or just at the end? Does it cause problems calling the filter_by_alias
on an empty list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's required otherwise it fails on empty cluster
# This excludes archive index as we use source='name' | ||
# source `creation_date` would include archive index | ||
# TODO it might be useful to allow filter_by_space | ||
ilo.filter_by_age(source='creation_date', direction='older', unit='days', unit_count=int(sys.argv[1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering about a particular scenario - if rollover criteria is based on index size (for example) in low activity environment, it might be possible that the same index is in use for more days that the unit_count
, in which case it would remove the index - even though it is in current use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right removing by size does not really make sense.
Signed-off-by: Pavol Loffay <[email protected]>
ilo.filter_by_regex(kind='regex', value=prefix + "jaeger-(span|service)-\d{6}") | ||
empty_list(ilo, "No indices to delete") | ||
# do not remove active write indices | ||
ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-write'], exclude=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If filtering by the regex now, are these two aliases
filters required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment above. We need to exclude current active write indices.
|
||
|
||
def filter_archive_indices_rollover(ilo, prefix): | ||
# Remove only rollover archive indices | ||
# Do not remove active write archive index | ||
ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-archive-write'], exclude=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to filter by regex here aswell to be consistent with the change in filter_main_indices_rollover
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunatelly this is not filter_by_regex
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what happened - the comment should have been associated with two lines down - so changing the ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-archive-read'])
to be ilo.filter_by_regex(kind='regex', value=prefix + "jaeger-(span|service)-\d{6}")
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I do not understand what you mean.
- we use regex to find indices which should be removed
- but we also need to exclude indices which are associated with write alias - to do that we use filter for an alias.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just mean that filter_main_indices
and filter_main_indices_rollover
are both using filter_by_regex
to identify the indices to be removed - whereas filter_archive_indices_rollover
uses ilo.filter_by_alias(aliases=[prefix + 'jaeger-span-archive-read'])
- just suggesting that it would be better if filter_archive_indices_rollover
also uses filter_by_regex
for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that sounds better #1693
Resolves #1681
Resolves #1682