[Transform] finalize feature reset integration #71133

Merged

benwtrent
Member

@benwtrent benwtrent commented Mar 31, 2021

This commit updates transform feature reset to:

  • wait for transform tasks to complete
  • wait for all indexing actions to transform indices to complete
  • prevent transform audit messages from being written while the reset is in progress

related to #70008 & #69581
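
The ordering described above can be sketched in plain Java. This is only an illustration under stated assumptions: `TransformResetSketch`, its step names, and the `audit` method are hypothetical stand-ins, not Elasticsearch APIs, and the assumption that reset mode is enabled before transforms are stopped is inferred from the diff context rather than confirmed by it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Plain-Java stand-in for the reset flow; none of these names are Elasticsearch APIs.
final class TransformResetSketch {
    // Hypothetical flag standing in for the reset mode kept in cluster state.
    final AtomicBoolean resetMode = new AtomicBoolean(false);
    // Records the order in which the steps actually ran.
    final List<String> steps = new ArrayList<>();
    final List<String> auditLog = new ArrayList<>();

    // Audit messages are dropped while reset mode is enabled.
    boolean audit(String message) {
        if (resetMode.get()) {
            return false;
        }
        auditLog.add(message);
        return true;
    }

    // Runs one step asynchronously and records its name once it has finished.
    private CompletableFuture<Void> step(String name, Runnable action) {
        return CompletableFuture.runAsync(() -> {
            action.run();
            steps.add(name);
        });
    }

    // Each step starts only after the previous one has completed.
    CompletableFuture<Void> reset() {
        return step("enable-reset-mode", () -> resetMode.set(true))
            .thenCompose(v -> step("stop-transforms", () -> {}))
            .thenCompose(v -> step("wait-for-transform-tasks", () -> {}))
            .thenCompose(v -> step("wait-for-bulk-indexing", () -> audit("written during reset")))
            .thenCompose(v -> step("disable-reset-mode", () -> resetMode.set(false)));
    }

    public static void main(String[] args) {
        TransformResetSketch sketch = new TransformResetSketch();
        sketch.reset().join();
        System.out.println(sketch.steps);
        System.out.println("audit log: " + sketch.auditLog);
    }
}
```

Chaining with `thenCompose` plays the role that nested `ActionListener` callbacks play in the actual change: a step is only started from the completion of the one before it.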

@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@benwtrent benwtrent force-pushed the feature/transform-feature-reset-integration branch from 546646d to 75ad1c5 March 31, 2021 18:07
@@ -213,6 +220,7 @@ protected XPackLicenseState getLicenseState() {
new ActionHandler<>(GetTransformStatsAction.INSTANCE, TransportGetTransformStatsAction.class),
new ActionHandler<>(PreviewTransformAction.INSTANCE, TransportPreviewTransformAction.class),
new ActionHandler<>(UpdateTransformAction.INSTANCE, TransportUpdateTransformAction.class),
new ActionHandler<>(SetResetModeAction.INSTANCE, TransportSetResetModeAction.class),

Makes me wonder: all the other actions have Transform in the name, but this one does not.

Member Author

No particular reason. I can rename it

.setActions(TransformField.TASK_NAME)
.setWaitForCompletion(true)
.execute(ActionListener.wrap(
listMlTasks -> {

nit: listMlTasks -> listTransformTasks


StopTransformAction.Request stopTransformsRequest = new StopTransformAction.Request(Metadata.ALL, true, true, null, true, false);
client.execute(StopTransformAction.INSTANCE, stopTransformsRequest, afterStoppingTransforms);
client.execute(SetResetModeAction.INSTANCE, SetResetModeActionRequest.enabled(), afterResetModeSet);

What happens if something goes wrong between set and reset, e.g. a node crashes between the two? Will reset mode stay in cluster state in this case? What is the mitigation? Do I have to call reset again?

Member Author

> Will reset mode stay in cluster state in this case?

Yes

> What is the mitigation? Do I have to call reset again?

Yes
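
The failure scenario discussed here can be illustrated with a small stand-in. This is a hedged sketch, not the actual implementation: `ResetRetrySketch` and its `crashMidway` parameter are hypothetical, and the boolean field only models a flag that survives the failure the way reset mode would stay in cluster state.

```java
// Hypothetical model of the "crash between set and reset" case; not Elasticsearch code.
final class ResetRetrySketch {
    // Stand-in for the reset mode flag that persists in cluster state.
    private boolean resetMode;

    boolean isResetMode() {
        return resetMode;
    }

    // Enabling reset mode is idempotent, so a second call is safe after a failed first one.
    void reset(boolean crashMidway) {
        resetMode = true;
        if (crashMidway) {
            // Simulates a failure before reset mode could be disabled again.
            throw new IllegalStateException("simulated failure during reset");
        }
        // Cleanup of transforms and their indices would happen here.
        resetMode = false;
    }

    public static void main(String[] args) {
        ResetRetrySketch cluster = new ResetRetrySketch();
        try {
            cluster.reset(true);
        } catch (IllegalStateException e) {
            System.out.println("reset mode still set: " + cluster.isResetMode());
        }
        cluster.reset(false); // Calling reset again clears the leftover flag.
        System.out.println("reset mode after retry: " + cluster.isResetMode());
    }
}
```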

}

/**
* Merge the diff with the ML metadata.

nit: talks about ML not transform

ClusterState state,
ActionListener<AcknowledgedResponse> listener) throws Exception {

final boolean isResetModelEnabled = isResetMode(state);

One l too many: isResetModelEnabled -> isResetModeEnabled

ActionListener<AcknowledgedResponse> clusterStateUpdateListener = ActionListener.wrap(
acknowledgedResponse -> {
if (acknowledgedResponse.isAcknowledged() == false) {
logger.info(new ParameterizedMessage("Cluster state update is NOT acknowledged for [{}]", featureName()));

^ looks like a debug leftover to me

success -> client.execute(SetResetModeAction.INSTANCE, SetResetModeActionRequest.disabled(), ActionListener.wrap(
resetSuccess -> finalListener.onResponse(success),
resetFailure -> {
logger.error("failed to disable reset mode after otherwise successful transform reset", resetFailure);

It would be good to make this actionable. If I see this error as a user, I don't know what I should do. Should I call reset again?

resetFailure -> {
logger.error("failed to disable reset mode after otherwise successful transform reset", resetFailure);
finalListener.onFailure(
ExceptionsHelper.serverError(

I am not sure about this one. It lacks a proper status code. If I get it right, it does not set one, which means it is based on the one from resetFailure, which could be anything. I think a status exception with a proper code would be better. My gut feeling would be a 503 - but that depends on the failure. Related to my comment above: what should a user do? A 503 would indicate a retry.

Member Author

serverError should throw an internal server error. If it doesn't I am misusing it and should throw an internal server error.
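
The point under discussion, an explicit status code plus a message that tells the user what to do, can be sketched as follows. Everything here is an assumption for illustration: `SketchStatusException` and `resetModeNotDisabled` are hypothetical names, not Elasticsearch classes, and the status is left as a parameter because the thread mentions both a 503 and an internal server error.

```java
// Hypothetical exception carrying an explicit HTTP status; not an Elasticsearch class.
final class SketchStatusException extends RuntimeException {
    final int status;

    SketchStatusException(int status, String message, Throwable cause) {
        super(message, cause);
        this.status = status;
    }
}

final class ResetFailureSketch {
    // Wraps the underlying failure with a fixed status and a message that says what to do next.
    static SketchStatusException resetModeNotDisabled(int status, Throwable cause) {
        return new SketchStatusException(
            status,
            "transform reset succeeded but reset mode could not be disabled; "
                + "run the feature reset again to clear reset mode",
            cause
        );
    }

    public static void main(String[] args) {
        Throwable cause = new IllegalStateException("simulated cluster state update failure");
        SketchStatusException e = resetModeNotDisabled(503, cause);
        System.out.println(e.status + ": " + e.getMessage());
    }
}
```

Fixing the status at the wrapping site means the response code no longer depends on whatever status the underlying failure happened to carry.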

.setActions("indices:data/write/bulk")
.setDetailed(true)
.setWaitForCompletion(true)
.setDescriptions("*.transform-*", "*.data-frame-*")

Consider using constants from TransformInternalIndexConstants

@benwtrent benwtrent requested a review from hendrikmuhs April 7, 2021 11:53
@hendrikmuhs hendrikmuhs left a comment

LGTM

@benwtrent benwtrent merged commit 55c7cbc into elastic:master Apr 7, 2021
@benwtrent benwtrent deleted the feature/transform-feature-reset-integration branch April 7, 2021 15:00
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Apr 7, 2021
benwtrent added a commit that referenced this pull request Apr 7, 2021
* [Transform] finalize feature reset integration (#71133)