Don't halt policy execution on policy trigger exception #49128

Merged: dakrone merged 1 commit into master from ilm-dont-halt-on-policy-error on Nov 15, 2019


dakrone
Member

@dakrone dakrone commented Nov 15, 2019

When ILM policy execution is triggered, whether by the node becoming
master, by a new cluster state, or by the periodic schedule, an exception
thrown from `maybeRunAsyncAction`, `runPolicyAfterStateChange`, or
`runPeriodicStep` causes the loop to terminate. As a result, any indices
that would have been processed after the index where the exception was
thrown are not processed by ILM.

For most executions this is not a problem, because the actual running of
steps is protected by a try/catch that moves the index to the ERROR step
in the event of a problem. If an exception occurs before step execution,
however (for example, while fetching and parsing the current
policy/step), it causes the loop termination described above.

This commit wraps the invocation of the methods listed above in a
try/catch block that provides better logging and does not bubble the
exception up.
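
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the code from this PR: `PolicyLoopSketch`, `runForAll`, and the index names are hypothetical, and a plain `Consumer<String>` stands in for calls such as `runPeriodicStep`. The point is only that the try/catch sits inside the loop, so a failure for one index is logged and the remaining indices are still visited.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; none of these names come from Elasticsearch.
final class PolicyLoopSketch {

    /**
     * Runs the given action once per index name. An exception thrown for one
     * index is logged and recorded, and the loop continues with the next index.
     * Returns the names of the indices whose action threw.
     */
    static List<String> runForAll(List<String> indices, Consumer<String> action) {
        List<String> failed = new ArrayList<>();
        for (String index : indices) {
            try {
                action.accept(index);
            } catch (Exception e) {
                // stderr stands in for a real logger in this sketch
                System.err.println("failed to run policy for index [" + index + "]: " + e);
                failed.add(index);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> processed = new ArrayList<>();
        List<String> failed = runForAll(List.of("logs-1", "logs-2", "logs-3"), index -> {
            if (index.equals("logs-2")) {
                // Simulates a failure before step execution, e.g. while parsing the policy
                throw new IllegalStateException("unable to parse policy");
            }
            processed.add(index);
        });
        System.out.println("processed=" + processed + " failed=" + failed);
    }
}
```

Without the try/catch, the exception for `logs-2` would propagate out of the loop and `logs-3` would never be visited. With it, `logs-3` is still processed and the failure is reported separately.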
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

@dakrone
Member Author

dakrone commented Nov 15, 2019

I re-opened #37581 (comment) for the failure (it is unrelated to this PR)

@elasticmachine run elasticsearch-ci/1

Contributor

@andreidan andreidan left a comment

LGTM - great catch Lee!

@dakrone dakrone merged commit 74a9407 into master Nov 15, 2019
@dakrone dakrone deleted the ilm-dont-halt-on-policy-error branch November 15, 2019 14:57
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Nov 15, 2019
When ILM policy execution is triggered, whether by the node becoming
master, by a new cluster state, or by the periodic schedule, an exception
thrown from `maybeRunAsyncAction`, `runPolicyAfterStateChange`, or
`runPeriodicStep` causes the loop to terminate. As a result, any indices
that would have been processed after the index where the exception was
thrown are not processed by ILM.

For most executions this is not a problem, because the actual running of
steps is protected by a try/catch that moves the index to the ERROR step
in the event of a problem. If an exception occurs before step execution,
however (for example, while fetching and parsing the current
policy/step), it causes the loop termination described above.

This commit wraps the invocation of the methods listed above in a
try/catch block that provides better logging and does not bubble the
exception up.
dakrone added a commit that referenced this pull request Nov 15, 2019
This commit wraps the calls to retrieve the current step in a try/catch
so that the exception does not bubble up. Instead, step info is added
containing the exception to the existing step.

Semi-related to #49128
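
The follow-up commit message above can be illustrated with a rough sketch. This is an assumption-laden stand-in, not the actual change: `StepLookupSketch`, `IndexState`, `refresh`, and the `Function<String, String>` lookup are all hypothetical names. It shows one way a failed step lookup can leave the existing step in place and attach the exception as step info instead of letting it propagate.

```java
import java.util.function.Function;

// Hypothetical illustration; none of these names come from Elasticsearch.
final class StepLookupSketch {

    /** Minimal stand-in for an index's lifecycle state. */
    static final class IndexState {
        final String index;
        final String currentStep;
        final String stepInfo; // null when no problem has been recorded

        IndexState(String index, String currentStep, String stepInfo) {
            this.index = index;
            this.currentStep = currentStep;
            this.stepInfo = stepInfo;
        }
    }

    /**
     * Looks up the step for the index. If the lookup throws, the existing step
     * is kept and the exception is stored as step info rather than rethrown.
     */
    static IndexState refresh(IndexState state, Function<String, String> stepLookup) {
        try {
            String step = stepLookup.apply(state.index);
            return new IndexState(state.index, step, null);
        } catch (Exception e) {
            return new IndexState(state.index, state.currentStep, e.toString());
        }
    }

    public static void main(String[] args) {
        IndexState before = new IndexState("logs-1", "rollover", null);
        IndexState after = refresh(before, name -> {
            throw new IllegalArgumentException("bad step");
        });
        System.out.println(after.currentStep + " / " + after.stepInfo);
    }
}
```

Returning a new state object rather than throwing keeps the caller's loop simple: it can inspect `stepInfo` to see that something went wrong for this index and carry on with the others.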
andreidan pushed a commit to andreidan/elasticsearch that referenced this pull request Nov 19, 2019
This commit wraps the calls to retrieve the current step in a try/catch
so that the exception does not bubble up. Instead, step info is added
containing the exception to the existing step.

Semi-related to elastic#49128

(cherry picked from commit 72530f8)
Signed-off-by: Andrei Dan