-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement framework for migrating system indices #78951
Conversation
This PR adds a framework for migrating system indices as necessary prior to Elasticsearch upgrades. This framework uses REST APIs added in another commit: - GET _migration/system_features This API, which gets the status of "features" (plugins which own system indices) with regards to whether they need to be upgraded or not. As of this PR, this API also reports errors encountered while migrating system indices alongside the index that was being processed when this occurred. As an example of this error reporting: ```json { "feature_name": "logstash_management", "minimum_index_version": "8.0.0", "upgrade_status": "ERROR", "indices": [ { "index": ".logstash", "version": "8.0.0", "failure_cause": { "error": { "root_cause": [ { "type": "runtime_exception", "reason": "whoopsie", "stack_trace": "<omitted for brevity>" } ], "type": "runtime_exception", "reason": "whoopsie", "stack_trace": "<omitted for brevity>" } } } ] } ``` - POST _migration/system_features This API starts the migration process. The API for this has no changes, but when called, any system indices which need to be migrated will be migrated, with status information stored in the cluster state for later use by the GET _migration/system_features API.
Pinging @elastic/es-core-infra (Team:Core/Infra) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've read through all the code so far and I like it. I have a few comments and requests, and I will be happy to review again once we have some good tests. But overall: very impressive!
GetFeatureUpgradeStatusResponse.UpgradeStatus initialStatus; | ||
if (featureName.equals(currentFeature)) { | ||
initialStatus = IN_PROGRESS; | ||
} else if (minimumVersion.before(Version.V_7_0_0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably pull this version number out into a constant to make it more visible and to use it in tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I half-heartedly introduced one while backporting to 7.x in order to get bwc tests working: 3b43555
@@ -593,6 +593,7 @@ public static void validateFeatureName(String name, String plugin) { | |||
* Class holding a description of a stateful feature. | |||
*/ | |||
public static class Feature { | |||
private final String name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I see this field, it's obvious that it always should have been here.
private final String currentFeature; | ||
private final Map<String, Object> featureCallbackMetadata; | ||
|
||
public SystemIndexMigrationTaskState(String currentIndex, String currentFeature, Map<String, Object> featureCallbackMetadata) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: A little javadoc explaining what the feature callback metadata contains and where to find more information about it
SystemIndexMigrationResult -> FeatureMigrationResults FeatureMigrationStatus -> SingleFeatureMigrationResult
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really, really good. I have some nitpicks and testing requests, but nothing substantial. However, I'm going to mark it as "request changes" anyway so I can get a notification when commits from end-to-end testing make it here.
One thing I want to look at is making sure that, at the end of the process, all the original aliases are pointing at the migrated index.
Great work!
...n/java/org/elasticsearch/action/admin/cluster/migration/GetFeatureUpgradeStatusResponse.java
Show resolved
Hide resolved
assertIndexHasCorrectProperties(finalMetadata, ".ext-unman-old-reindexed-for-8", EXTERNAL_UNMANAGED_FLAG_VALUE, false, false); | ||
} | ||
|
||
public void assertIndexHasCorrectProperties( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some assertions about aliases to this method? If I'm understanding things right, the original index name should now be an alias pointing to the migrated index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! And yes, that should be the case. I've also updated the code to copy any aliases the old index had to the new index, which was my original intent.
static final int EXTERNAL_UNMANAGED_FLAG_VALUE = 4; | ||
static final SystemIndexDescriptor INTERNAL_MANAGED = SystemIndexDescriptor.builder() | ||
.setIndexPattern(".int-man-*") | ||
.setAliasName(".internal-managed-alias") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we create the aliases for the managed indices in these tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The aliases for the managed indices should be created automatically at index creation time. The tests now verify that the final, post-migration managed indices have the alias set in the descriptor.
SetOnce<Boolean> secondPluginPostMigrationHookCalled = new SetOnce<>(); | ||
|
||
TestPlugin.preMigrationHook.set(clusterState -> { | ||
// None of the other hooks should have been called yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this way of verifying the order in which these hooks are called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I can't take credit for it, I picked this style up from uh... somewhere else in Elasticsearch, not sure where exactly.
@@ -199,34 +211,45 @@ public String toString() { | |||
"featureName='" + featureName + '\'' + | |||
", minimumIndexVersion='" + minimumIndexVersion + '\'' + | |||
", upgradeStatus='" + upgradeStatus + '\'' + | |||
", indexVersions=" + indexVersions + | |||
", indexVersions=" + indexInfos + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A very tiny nitpick: indexVersions=
could be changed to indexInfos=
.
} | ||
|
||
// visible for testing | ||
static List<GetFeatureUpgradeStatusResponse.IndexVersion> getIndexVersions(ClusterState state, SystemIndices.Feature feature) { | ||
static List<GetFeatureUpgradeStatusResponse.IndexInfo> getIndexVersions(ClusterState state, SystemIndices.Feature feature) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We could rename this method as well
|
||
assertThat(status.getUpgradeStatus(), equalTo(NO_UPGRADE_NEEDED)); | ||
assertThat(status.getUpgradeStatus(), equalTo(UPGRADE_NEEDED)); | ||
assertThat(status.getFeatureName(), equalTo("test-feature")); | ||
assertThat(status.getMinimumIndexVersion(), equalTo(Version.V_7_0_0)); | ||
assertThat(status.getIndexVersions().size(), equalTo(2)); // additional testing below | ||
} | ||
|
||
public void testGetIndexVersion() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiny, tiny nit: rename to testGetIndexInfos
if we rename the getIndexVersions
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code since my last review is very straightforward, and testing a build of the branch worked like a charm.
This is looking really great. Thank you so much for all the good work here.
Also re-enables BWC following the backport of elastic#78951.
* upstream/master: (24 commits) Implement framework for migrating system indices (elastic#78951) Improve transient settings deprecation message (elastic#79504) Remove getValue and getValues from Field (elastic#79516) Store Template's mappings as bytes for disk serialization (elastic#78746) [ML] Add queue_capacity setting to start deployment API (elastic#79433) [ML] muting rest compat test issue elastic#79518 (elastic#79519) Avoid redundant available indices check (elastic#76540) Re-enable BWC tests TEST Ensure password 14 chars length on Kerberos FIPS tests (elastic#79496) [DOCS] Temporarily remove APM links (elastic#79411) Fix CCSDuelIT for skipped shards (elastic#79490) Add other time accounting in HotThreads (elastic#79392) Add deprecation info API entries for deprecated monitoring settings (elastic#78799) Add note in breaking changes for nameid_format (elastic#77785) Use 'migration' instead of 'upgrade' in GET system feature migration status responses (elastic#79302) Upgrade lucene version 8b68bf60c98 (elastic#79461) Use Strings#EMPTY_ARRAY (elastic#79452) Quicker shared cache file preallocation (elastic#79447) [ML] Removing some code that's obsolete for 8.0 (elastic#79444) Ensure indexing_data CCR requests are compressed (elastic#79413) ...
This PR adds a framework for migrating system indices as necessary prior to Elasticsearch upgrades. This framework uses REST APIs added in another commit: - GET _migration/system_features This API, which gets the status of "features" (plugins which own system indices) with regards to whether they need to be upgraded or not. As of this PR, this API also reports errors encountered while migrating system indices alongside the index that was being processed when this occurred. As an example of this error reporting: ```json { "feature_name": "logstash_management", "minimum_index_version": "8.0.0", "upgrade_status": "ERROR", "indices": [ { "index": ".logstash", "version": "8.0.0", "failure_cause": { "error": { "root_cause": [ { "type": "runtime_exception", "reason": "whoopsie", "stack_trace": "<omitted for brevity>" } ], "type": "runtime_exception", "reason": "whoopsie", "stack_trace": "<omitted for brevity>" } } } ] } ``` - POST _migration/system_features This API starts the migration process. The API for this has no changes, but when called, any system indices which need to be migrated will be migrated, with status information stored in the cluster state for later use by the GET _migration/system_features API.
Also re-enables BWC following the backport of #78951.
This PR adds a framework for migrating system indices as necessary prior
to Elasticsearch upgrades. This framework uses REST APIs added in
another commit:
This API, which gets the status of "features" (plugins which own system
indices) with regards to whether they need to be upgraded or not. As of
this PR, this API also reports errors encountered while migrating system
indices alongside the index that was being processed when this occurred.
As an example of this error reporting:
This API starts the migration process. The API for this has no changes,
but when called, any system indices which need to be migrated will be
migrated, with status information stored in the cluster state for later
use by the GET _migration/system_features API.