-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Upgrade Assistant] Update logic for handling reindex failures #124571
[Upgrade Assistant] Update logic for handling reindex failures #124571
Conversation
body: { count }, | ||
} = await esClient.count({ index: reindexOp.attributes.indexName }); | ||
|
||
if (taskResponse.task.status!.created < count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a reliable check, as the documents could end up in an updated
state. This could occur if Kibana is restarted and a user resumes the reindex process in Upgrade Assistant, which actually kicks off the reindex again. As the code is here, we would actually throw an error when there was not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More info on the task response body here: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-api-response-body
…dex_error_handling
@elasticmachine merge upstream |
Pinging @elastic/platform-deployment-management (Team:Deployment Management) |
@elasticmachine merge upstream |
@elasticmachine merge upstream |
💚 Build Succeeded
Metrics [docs]
History
To update your PR or re-run it, just comment with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM! I tried to reproduce locally the issue reverting your changes but could not get the error to be thrown. I tried both stopping Kibana at the beginning of the reindexing and in the middle of the 3rd step ("Reindex documents"). In both cases UA resumed and completed the reindex.
The code change make sense to me though. But it would be great to get to the bottom as exactly when this issue occurs. Could it be ES related?
Thanks for the review @sebelga!
The issue is likely occurring because we are actually kicking off another reindex operation when the user clicks "resume" in the UI so the documents may have already been created, and end up in an |
Fixes #124153
The Upgrade Assistant was incorrectly marking a reindexing process as a failure. This PR adjusts the logic for determining a reindex failure to only check for the existence of
failures
in the task API response.It also fixes a minor spacing issue when rendering an error callout.
How to test
This is a hard one to add automated tests for (open to suggestions!), as it involves restarting Kibana to reproduce.
I was able to reproduce the bug (pre change) fairly consistently using the manual steps documented in #123817 (comment). With this change, the reindexing process should no longer fail.
To trigger an actual reindexing failure (post change):
text
type. For example:Alternatively, you can use this zip file, which already contains some indices: 6.8.16-data.zip
yarn es snapshot --license=trial -E path.data=./path_to_6.x
Screenshots
Before/after of spacing issue.
Before:
After: