Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make feature reset API response more informative #71240

Merged
merged 23 commits into from
Apr 27, 2021

Conversation

williamrandolph
Copy link
Contributor

@williamrandolph williamrandolph commented Apr 2, 2021

Previously, the Reset API returned a list of which feature states it tried to reset, along with success and failure messages. However, the overall return status was always success (200 OK), even if an individual feature state failed. (For example, if wildcard deletions were disabled for associated indices #71150.)

This PR changes the Reset API so that it uses different response codes for different situations: 200 OK if all reset operations succeed, 207 MULTI STATUS if some operations succeed and others fail, and 500 INTERNAL SERVER ERROR if all of the reset operations fail. Clients should check the response code from this API.

The API response now includes formatted stack traces when reset operation fails.

  • add documentation about the possible 207 response

Fixes #71178

@williamrandolph williamrandolph requested a review from jaymode April 2, 2021 15:54
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Apr 2, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@jaymode
Copy link
Member

jaymode commented Apr 2, 2021

Hmm, I wonder if this is a prime example for using an HTTP 207 response code as discussed in #60442? It seems like it to me where the behavior of the API is desirable so we know all of the errors rather than the first and have to play whack a mole. Maybe we should instead fix the consumers of the API to check for failures? What do you think?

@williamrandolph
Copy link
Contributor Author

That seems like a good approach here, and I'll see if I can get it working quickly. Otherwise it might be worth making a new issue for it so that we can get some kind of fail-fast behavior into the tests.

Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.

I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. Instead of throwing the
first exception that comes back from the resets, we now throw an
exception that shows a complete response with all successes and
failures. We still "fail fast" when we can't reset the features, but now
the user won't have to iterate in order to find all the errors.

The response message could probably be improved from here. I've got some
todos in the code for reminders.
@williamrandolph williamrandolph requested review from jaymode and removed request for jaymode April 21, 2021 20:39
new RestToXContentListener<>(restChannel) {
@Override
protected RestStatus getStatus(ResetFeatureStateResponse response) {
if (response.hasAllFailures()) {
Copy link
Contributor Author

@williamrandolph williamrandolph Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I've got ES returning a 500 if all resets fail, a 207 for mixed responses, and a 200 if all goes well. Does this make sense?

@williamrandolph
Copy link
Contributor Author

I think these test failures have to do with this branch, so I'm looking into them.

@williamrandolph williamrandolph changed the title Reset API should fail fast when there is an error Make feature reset API response more informative Apr 23, 2021
Copy link
Member

@jaymode jaymode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good. I left some comments.

Copy link
Member

@jaymode jaymode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please make sure to backport to 7.x and the 7.13 branches so this makes it into the next 7.13 BC.

Co-authored-by: Jay Modi <[email protected]>
@williamrandolph williamrandolph merged commit fc7c06d into elastic:master Apr 27, 2021
williamrandolph added a commit to williamrandolph/elasticsearch that referenced this pull request Apr 27, 2021
Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.

I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. We return a 207 if some but not all
reset actions fail, and for every failure, we also return information about the
exception or error that caused it.

Co-authored-by: Jay Modi <[email protected]>
williamrandolph added a commit that referenced this pull request Apr 27, 2021
* Make feature reset API response more informative (#71240)

Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.

I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. We return a 207 if some but not all
reset actions fail, and for every failure, we also return information about the
exception or error that caused it.

* Fix 7.x backport compilation issues

* Feature reset integration test should tolerate failed resets (#72326)

Co-authored-by: Jay Modi <[email protected]>
williamrandolph added a commit to williamrandolph/elasticsearch that referenced this pull request Apr 27, 2021
…stic#72332)

* Make feature reset API response more informative (elastic#71240)

Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.

I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. We return a 207 if some but not all
reset actions fail, and for every failure, we also return information about the
exception or error that caused it.

* Fix 7.x backport compilation issues

* Feature reset integration test should tolerate failed resets (elastic#72326)

Co-authored-by: Jay Modi <[email protected]>
@williamrandolph
Copy link
Contributor Author

Backport to 7.x: #72332
Backport to 7.13: #72341

williamrandolph added a commit that referenced this pull request Apr 28, 2021
* Add docs for feature reset API (#71759)

* Make feature reset API response more informative (#71240)

Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.

I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. We return a 207 if some but not all
reset actions fail, and for every failure, we also return information about the
exception or error that caused it.

* Fix 7.x backport compilation issues

* Feature reset integration test should tolerate failed resets (#72326)

Co-authored-by: debadair <[email protected]>
Co-authored-by: Jay Modi <[email protected]>
@williamrandolph williamrandolph deleted the reset-api-fail-fast branch May 23, 2022 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label >non-issue Team:Core/Infra Meta label for core/infra team v7.13.0 v7.14.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature reset API should fail fast when action.destructive_requires_name is true
4 participants