Add API for resetting state of a SystemIndexPlugin #69469

Merged Mar 17, 2021 (33 commits)

williamrandolph
Contributor

When we disable access to system indices, plugins will still need a way to erase their state. The obvious and most pressing use case for this is in tests, which need to be able to clean up the state of a cluster in between groups of tests.

In this first draft of the feature state reset API, a user with snapshot admin privileges POSTs a request to /_reset_feature_state. The node handling the request then iterates through the list of features and executes each feature's cleanup callback. By default, a callback uses a client and the cluster state to delete all system indices and associated indices for its feature. The endpoint then returns a list of objects, each containing a feature name and a report of whether the reset operation succeeded or failed.
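
As a rough illustration of the flow described above, here is a minimal sketch in plain Java. Every name in it (FeatureResetSketch, CleanupCallback, ResetStatus, resetAll) is hypothetical and chosen only for this example; it does not reproduce the classes in this PR, and it runs the callbacks synchronously, where the real action works through a client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the Elasticsearch classes from this PR.
class FeatureResetSketch {

    /** Cleanup callback a feature registers; it throws if the reset fails. */
    interface CleanupCallback {
        void cleanUp() throws Exception;
    }

    /** One entry of the response: a feature name plus the outcome of its reset. */
    static final class ResetStatus {
        final String featureName;
        final String status; // "SUCCESS" or "FAILURE"

        ResetStatus(String featureName, String status) {
            this.featureName = featureName;
            this.status = status;
        }

        @Override
        public String toString() {
            return featureName + ": " + status;
        }
    }

    /** Runs every callback in order; a failing feature is reported and does not stop the others. */
    static List<ResetStatus> resetAll(Map<String, CleanupCallback> features) {
        List<ResetStatus> results = new ArrayList<>();
        for (Map.Entry<String, CleanupCallback> entry : features.entrySet()) {
            try {
                entry.getValue().cleanUp();
                results.add(new ResetStatus(entry.getKey(), "SUCCESS"));
            } catch (Exception e) {
                results.add(new ResetStatus(entry.getKey(), "FAILURE"));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, CleanupCallback> features = new LinkedHashMap<>();
        features.put("feature-a", () -> { });
        features.put("feature-b", () -> { throw new IllegalStateException("simulated failure"); });
        resetAll(features).forEach(System.out::println);
    }
}
```

Reporting a per-feature status, instead of failing the whole request on the first error, matches the "list of objects" response shape described above.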

My initial cut used a TransportMasterNodeAction, which requires code
that carefully manipulates cluster state. At least for the first cut and
testing, it seems like it will be much easier to use a client within a
HandledTransportAction, which effectively makes the
TransportResetFeatureStateAction a class that dispatches other transport
actions to do the real work. This could leave cleanup vulnerable to
interference from other actions, e.g., indices deleted between lookup
and deletion actions, but I can try to tighten that up at a later stage
if I need to.
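
To make the "dispatches other transport actions" idea concrete, here is a small sketch of the fan-out-and-collect pattern. It is an assumption-laden stand-in: it uses java.util.concurrent.CompletableFuture, not the Elasticsearch HandledTransportAction or listener classes, and DispatchSketch and dispatchAll are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustration only: CompletableFuture stands in for Elasticsearch's listener machinery.
class DispatchSketch {

    /** Dispatches one asynchronous sub-action per feature and completes once all have responded. */
    static CompletableFuture<List<String>> dispatchAll(
            List<String> featureNames,
            Function<String, CompletableFuture<String>> subAction) {
        List<CompletableFuture<String>> pending = featureNames.stream()
            .map(name -> subAction.apply(name)
                // Convert a failed sub-action into a result so the group still completes.
                .exceptionally(t -> name + ": FAILURE"))
            .collect(Collectors.toList());

        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]))
            // All futures are complete here, so join() does not block.
            .thenApply(ignored -> pending.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> results = dispatchAll(
            List.of("feature-a", "feature-b"),
            name -> CompletableFuture.supplyAsync(() -> name + ": SUCCESS")).join();
        System.out.println(results);
    }
}
```

Nothing in this pattern holds a lock on cluster state between sub-actions, which is the source of the interference risk mentioned above.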
@williamrandolph
Contributor Author

I still have some TODOs scattered in the code. I'll keep working on those, but I'm interested in feedback on the overall approach.

Contributor

@gwbrown gwbrown left a comment

Left a couple comments, approach looks good though. Nice work!

@jaymode jaymode linked an issue Mar 2, 2021 that may be closed by this pull request
Comment on lines 338 to 346
List<String> systemIndices = indexDescriptors.stream()
    .map(sid -> sid.getMatchingIndices(clusterService.state().getMetadata()))
    .flatMap(List::stream)
    .collect(Collectors.toList());

List<String> associatedIndices = new ArrayList<>(associatedIndexPatterns);

List<String> allIndices = Stream.concat(systemIndices.stream(), associatedIndices.stream())
    .collect(Collectors.toList());
Member

Suggested change
- List<String> systemIndices = indexDescriptors.stream()
-     .map(sid -> sid.getMatchingIndices(clusterService.state().getMetadata()))
-     .flatMap(List::stream)
-     .collect(Collectors.toList());
- List<String> associatedIndices = new ArrayList<>(associatedIndexPatterns);
- List<String> allIndices = Stream.concat(systemIndices.stream(), associatedIndices.stream())
-     .collect(Collectors.toList());
+ Stream<String> systemIndices = indexDescriptors.stream()
+     .map(sid -> sid.getMatchingIndices(clusterService.state().getMetadata()))
+     .flatMap(List::stream);
+ List<String> allIndices = Stream.concat(systemIndices, associatedIndexPatterns.stream())
+     .collect(Collectors.toUnmodifiableList());

We're unnecessarily creating intermediate list objects here, so I provided a suggestion to clean it up. Also, should associatedIndexPatterns be resolved from patterns to indices?
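
For readers who want to try the suggested shape in isolation, here is a self-contained sketch. ConcatSketch, allIndices, and the sample index names are hypothetical; a plain list of lists stands in for the per-descriptor results of getMatchingIndices.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustration only: plain lists replace the descriptor and cluster-state lookups.
class ConcatSketch {

    /** Concatenates the two sources lazily and materializes a single unmodifiable list at the end. */
    static List<String> allIndices(List<List<String>> matchesPerDescriptor,
                                   List<String> associatedIndexPatterns) {
        // No intermediate list is built here; the stream is consumed once by concat.
        Stream<String> systemIndices = matchesPerDescriptor.stream().flatMap(List::stream);
        return Stream.concat(systemIndices, associatedIndexPatterns.stream())
            .collect(Collectors.toUnmodifiableList());
    }

    public static void main(String[] args) {
        List<String> all = allIndices(
            List.of(List.of(".sys-1", ".sys-2"), List.of(".other-1")),
            List.of("assoc-*"));
        System.out.println(all);
    }
}
```

Note that Collectors.toUnmodifiableList() requires Java 10 or later, and that the returned list rejects later modification.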

Contributor Author

I made the suggested change, and you're probably right about associatedIndexPatterns. I'll take a look and report back.

Contributor Author

This works as written, because we can pass a valid index pattern to the delete request, which will then handle the resolution logic when executed. We can't do the same thing with system index descriptor patterns because they can use character-class regexes, for example .test-[ab]*.
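
To show why a descriptor pattern such as .test-[ab]* has to be resolved against existing index names first, here is a sketch that filters names with java.util.regex. The translation rule (treat dots as literals, expand * to .*) is an assumption made for this example, and DescriptorPatternSketch, toRegex, and matchingIndices are hypothetical names, not the PR's code.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustration only: the pattern translation below is an assumed rule, not the PR's implementation.
class DescriptorPatternSketch {

    /** Escapes literal dots and expands '*' to '.*', leaving character classes such as [ab] intact. */
    static Pattern toRegex(String descriptorPattern) {
        String regex = descriptorPattern.replace(".", "\\.").replace("*", ".*");
        return Pattern.compile(regex);
    }

    /** Returns the existing index names that fully match the descriptor pattern. */
    static List<String> matchingIndices(String descriptorPattern, List<String> existingIndexNames) {
        Pattern pattern = toRegex(descriptorPattern);
        return existingIndexNames.stream()
            .filter(name -> pattern.matcher(name).matches())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> existing = List.of(".test-a1", ".test-b", ".test-c", "test-a");
        System.out.println(matchingIndices(".test-[ab]*", existing));
    }
}
```

A plain wildcard pattern has no character classes, which is why it can be handed to the delete request as is while a descriptor pattern cannot.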

It might be worth resolving the associated indices here so that we can skip submitting the delete request when no indices exist. Do you think this is important to do?

Member

I think it is fine as it is unless we discover issues.

Out of an abundance of caution, I think the "reset" part of this path
should have a leading underscore, so that if there's ever a reason to
implement "GET _features/<feature_id>" we won't have to worry about
distinguishing "reset" from a feature name.
@williamrandolph williamrandolph requested a review from jaymode March 16, 2021 19:44
Member

@jaymode jaymode left a comment

LGTM

@williamrandolph
Contributor Author

@elasticmachine update branch

@williamrandolph
Contributor Author

@elasticmachine run elasticsearch-ci/1

@williamrandolph williamrandolph merged commit 624ee45 into elastic:master Mar 17, 2021
williamrandolph added a commit to williamrandolph/elasticsearch that referenced this pull request Mar 17, 2021
When we disable access to system indices, plugins will still need
a way to erase their state. The obvious and most pressing use
case for this is in tests, which need to be able to clean up the
state of a cluster in between groups of tests.

* Use a HandledTransportAction for reset action

My initial cut used a TransportMasterNodeAction, which requires code
that carefully manipulates cluster state. At least for the first cut and
testing, it seems like it will be much easier to use a client within a
HandledTransportAction, which effectively makes the
TransportResetFeatureStateAction a class that dispatches other transport
actions to do the real work.

* Clean up code by using a GroupedActionListener

* ML feature state cleaner

* Implement Transform feature state reset

* Change _features/reset path to _features/_reset

Out of an abundance of caution, I think the "reset" part of this path
should have a leading underscore, so that if there's ever a reason to
implement "GET _features/<feature_id>" we won't have to worry about
distinguishing "reset" from a feature name.

Co-authored-by: Gordon Brown <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
@elasticmachine elasticmachine added the Team:Core/Infra label Mar 17, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

williamrandolph added a commit that referenced this pull request Mar 18, 2021
* Add API for resetting state of a `SystemIndexPlugin` (#69469)

@williamrandolph
Contributor Author

Backport to 7.x: #70524

Labels: :Core/Infra/Core, >feature, Team:Core/Infra, v7.13.0, v8.0.0-alpha1
Linked issue that may be closed by this pull request: Add an API to remove System Indices
6 participants