Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logging in GlobalCheckpointSyncIT #89185

Merged
merged 2 commits into from
Aug 16, 2022

Conversation

Tim-Brooks
Copy link
Contributor

The test GlobalCheckpointSyncIT#testBackgroundGlobalCheckpointSync
failed once recently due to an engine already close exception. It has
not occurred again and the reasoning is unclear.

This commit adds a log line to indicate exactly when it happens, which
shard it is, and what the current state of the index shard is.

Closes #88428.

The test GlobalCheckpointSyncIT#testBackgroundGlobalCheckpointSync
failed once recently due to an engine already close exception. It has
not occurred again and the reasoning is unclear.

This commit adds a log line to indicate exactly when it happens, which
shard it is, and what the current state of the index shard is.

Closes elastic#88428.
@Tim-Brooks Tim-Brooks added >test Issues or PRs that are addressing/adding tests :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. v8.5.0 :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. labels Aug 8, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Aug 8, 2022
Copy link
Contributor

@kingherc kingherc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Could also just add a logger.info/debug that always happens for all shard IDs/states, instead of try/catching alternatively. But I assume you'd like to log the specific shard ID/state, right @tbrooks8 ?

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM2 but note that we're not including the AlreadyClosedException (plus stack trace) in the log message. Not sure of the context but maybe that'd be helpful?

@Tim-Brooks
Copy link
Contributor Author

LGTM2 but note that we're not including the AlreadyClosedException (plus stack trace) in the log message. Not sure of the context but maybe that'd be helpful?

@DaveCTurner

The exception gets thrown and ends up as the test failure. So that is why I left it out so it did not get printed twice? I mostly wanted to propagate the shard id and state if this happens again. Does this make sense?

@DaveCTurner
Copy link
Contributor

Does this make sense?

Yep, perfectly, just thought it worth checking.

@Tim-Brooks Tim-Brooks merged commit ac9f12f into elastic:main Aug 16, 2022
@Tim-Brooks
Copy link
Contributor Author

But I assume you'd like to log the specific shard ID/state, right @tbrooks8 ?

Yes pretty much just trying to catch in a face of an unexpected failure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. >test Issues or PRs that are addressing/adding tests v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] GlobalCheckpointSyncIT testBackgroundGlobalCheckpointSync failing
4 participants