Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support copy alias in rollover #892

Merged
merged 5 commits into from
Aug 17, 2023

Conversation

bowenlan-amzn
Copy link
Member

@bowenlan-amzn bowenlan-amzn commented Aug 15, 2023

Issue #, if available:
#734

Description of changes:

  • Introduce a new parameter copy_alias under rollover action, that can let user choose whether to copy the alias from old index to new index during rollover.
    • This param is default to false because rollover API by default doesn't copy over alias.
  • If enabled, copy alias will only be performed after rollover succeeded.
  • If copy alias failed, it can be retried base on the rolled_over_index_name saved in metadata. rolled_over_index_name is saved after rollover succeeded
  • New fields copy_alias, rolled_over_index_name are added to the ISM config index mapping, making sure it won't be added by dynamic mapping and causes mapping conflict regression
    • copy_alias in policy.action will be handled during create/update policy path. The policy field in managed_index is of object type so no problem there.
    • rolled_over_index_name in metadata will be handled during updateMetadata in Runner, the mapping will first be updated then the metadata document will be saved.
  • Add BWC to make sure it won't cause rollover job from old versions to fail.
  1. Start cluster using the old ISM, create a rollover job
  2. Upgrade the cluster using the new ISM, trigger the rollover, double check the managed_index and metadata content

CheckList:

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@bowenlan-amzn bowenlan-amzn force-pushed the copyalias branch 4 times, most recently from ce71221 to 181c359 Compare August 15, 2023 04:45
@codecov
Copy link

codecov bot commented Aug 15, 2023

Codecov Report

Merging #892 (f763453) into 2.9 (2d87a75) will increase coverage by 0.01%.
Report is 1 commits behind head on 2.9.
The diff coverage is 90.27%.

@@             Coverage Diff              @@
##                2.9     #892      +/-   ##
============================================
+ Coverage     75.95%   75.97%   +0.01%     
- Complexity     2891     2897       +6     
============================================
  Files           366      366              
  Lines         16558    16624      +66     
  Branches       2397     2411      +14     
============================================
+ Hits          12577    12630      +53     
- Misses         2601     2608       +7     
- Partials       1380     1386       +6     
Files Changed Coverage Δ
...atemanagement/step/rollover/AttemptRolloverStep.kt 74.07% <88.88%> (+9.74%) ⬆️
...ment/indexstatemanagement/action/RolloverAction.kt 93.10% <100.00%> (+0.79%) ⬆️
...ndexstatemanagement/action/RolloverActionParser.kt 92.85% <100.00%> (+0.85%) ⬆️
...ent/indexstatemanagement/util/ManagedIndexUtils.kt 77.97% <100.00%> (+0.09%) ⬆️

... and 6 files with indirect coverage changes

Signed-off-by: bowenlan-amzn <[email protected]>
@eirsep
Copy link
Member

eirsep commented Aug 15, 2023

why is copyAlias defaulting to false. it seems like a very useful feature.

@eirsep
Copy link
Member

eirsep commented Aug 15, 2023

have we done any upgrade testing to ensure we are not introducing regression in mapping to verify ISM config index mapping change works as expected?

@@ -632,4 +630,102 @@ class RolloverActionIT : IndexStateManagementRestTestCase() {
}
assertTrue("New rollover index does not exist.", indexExists("$indexNameBase-000002"))
}

@Suppress("UNCHECKED_CAST")
fun `test rollover with copy alias`() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also add a test case where copy alias fails.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the failed cases where alias API throw exception, we don't have an way to mock that in IT in ISM code base. Should be sth we can look into later.

Signed-off-by: bowenlan-amzn <[email protected]>
Signed-off-by: bowenlan-amzn <[email protected]>
if (conditions != null) "conditions" to conditions else null
).toMap()
} else {
stepStatus = StepStatus.FAILED
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we mark this failed after rollover is completed it would send a wrong message right.
what failed here is the copyalias step which should be logged and somehow notified to user.
Or can we have a different message that
"rollover completed but copy alias failed"

How will user handle this by himself?

Copy link
Member Author

@bowenlan-amzn bowenlan-amzn Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cx can copy by themselves, uses retry API to move to next state, or let this rollover step auto retry.

We don't support running 2 steps continuously right now, so copy alias is not implemented as a separate step. If separate, it will only be done in the next job run but the requirement is to do it right after rollover or with rollover.

@@ -288,5 +385,13 @@ class AttemptRolloverStep(private val action: RolloverAction) : Step(name) {
fun getSkipRolloverMessage(index: String) = "Skipped rollover action for [index=$index]"
fun getAlreadyRolledOverMessage(index: String, alias: String) =
"This index has already been rolled over using this alias, treating as a success [index=$index, alias=$alias]"
fun getSuccessCopyAliasMessage(index: String, newIndex: String) =
"Successfully rolled over and copied alias from [index=$index] to [index=$newIndex]"
fun getFailedCopyAliasMessage(index: String, newIndex: String) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when there is failure we should also append the exception cause message.

this is a current problem in ISM that we would have to deep dive logs for the cause.
let's add it here.

Copy link
Member Author

@bowenlan-amzn bowenlan-amzn Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General exception message will be put into the cause field of info. Specific exception message is rewritten in customer friendly way in message field.

@eirsep
Copy link
Member

eirsep commented Aug 16, 2023

Shouldn't we make this change in core and add option to copy aliases in rollover api?

Signed-off-by: bowenlan-amzn <[email protected]>
@bowenlan-amzn
Copy link
Member Author

Shouldn't we make this change in core and add option to copy aliases in rollover api?

Of course it's best to support this in core and the open issue is here #849
Now we are delivering this feature ASAP for the ISM customer

Copy link
Member

@eirsep eirsep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
thanks for the changes, @bowenlan-amzn

@eirsep
Copy link
Member

eirsep commented Aug 16, 2023

Hope users can update existing ISM policies and set copyAlias=true

@bowenlan-amzn bowenlan-amzn merged commit 1923003 into opensearch-project:2.9 Aug 17, 2023
@bowenlan-amzn bowenlan-amzn deleted the copyalias branch August 17, 2023 00:59
bowenlan-amzn added a commit to bowenlan-amzn/index-management that referenced this pull request Aug 26, 2023
* Support copy alias in rollover

Signed-off-by: bowenlan-amzn <[email protected]>

* Add more tests

Signed-off-by: bowenlan-amzn <[email protected]>

* Add bwc test

Signed-off-by: bowenlan-amzn <[email protected]>

* Enhance tests

Signed-off-by: bowenlan-amzn <[email protected]>

* Improve not acknowledge call scenario

Signed-off-by: bowenlan-amzn <[email protected]>

---------

Signed-off-by: bowenlan-amzn <[email protected]>
bowenlan-amzn added a commit that referenced this pull request Sep 1, 2023
* Support copy alias in rollover



* Add more tests



* Add bwc test



* Enhance tests



* Improve not acknowledge call scenario



---------

Signed-off-by: bowenlan-amzn <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants