-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v22.3.x] archival: remove timeouts from archival_metadata_stm #12690
Merged
piyushredpanda
merged 4 commits into
redpanda-data:v22.3.x
from
andrwng:v22.3.x-archival-remove-timeout
Aug 10, 2023
Merged
[v22.3.x] archival: remove timeouts from archival_metadata_stm #12690
piyushredpanda
merged 4 commits into
redpanda-data:v22.3.x
from
andrwng:v22.3.x-archival-remove-timeout
Aug 10, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The underlying wait already takes one, this is just a pass-through. (cherry picked from commit 7be3f80)
There's an upcoming change to remove the timeout from the archival_metadata_stm. Making that change will be much simpler if the signature of persisted_stm::wait_no_throw() is a timeout instead of a duration, since the underlying raft::state_machine::wait() call uses the model::no_timeout deadline value to determine whether to set a timer. Along the way I added an optional abort source reference that is currently unused. (cherry picked from commit cac4ac9)
At a high level, the archival loop is implemented with the following steps: ``` while not aborted: wait until node is leader sync that offsets from previous terms are applied while node is leader: upload segments based on what is in the manifest replicate changes to manifest via archival stm wait for archival stm update to be applied (updates manifest) ``` The archival loop currently expects that updates to the manifest are either driven by it (or the housekeeping service, though ignoring this for the sake of example). One case where this invariant is broken is the case where the replication/apply of archival batches times out -- the actual apply of the op may proceed as a part of the raft::state_machine background fiber, and the archival loop itself may proceed without waiting, meaning the updates to the manifest may race with a subsequent iteration of the archival loop. This commit remediates this by removing the timeout from the archival_metadata_stm::do_replicate_commands() call, forcing the caller to wait indefinitely for the op to complete. (cherry picked from commit 0dd1dc8)
CONFLICT: - ntp_archiver_service has changed a bunch since 22.3, but the effective change is the same here: catch broken semaphore exceptions Replication may throw a ss::broken_named_semaphore when met with a shutting down error: ``` DEBUG 2023-03-24 22:54:01,854 [shard 0] archival - [fiber4 kafka/panda-topic/7] - ntp_archiver_service.cc:494 - Updating the archival_meta_stm in read-replica mode, in-sync offset: -9223372036854775808, last uploaded offset: 0, last compacted offset: -9223372036854775808 DEBUG 2023-03-24 22:54:01,854 [shard 0] cluster - ntp: {kafka/panda-topic/7} - archival_metadata_stm.cc:188 - command_batch_builder::replicate called WARN 2023-03-24 22:54:01,854 [shard 0] cluster - ntp: {kafka/panda-topic/7} - archival_metadata_stm.cc:333 - error on replicating remote segment metadata: raft::errc:19 ERROR 2023-03-24 22:54:01,854 [shard 0] archival - [fiber4 kafka/panda-topic/7] - ntp_archiver_service.cc:252 - upload loop error: seastar::broken_named_semaphore (Semaphore broken: mutex) ``` This commit handles the exception the same we handle other shutdown exceptions: by doing nothing. (cherry picked from commit cdb2b39)
andrwng
force-pushed
the
v22.3.x-archival-remove-timeout
branch
from
August 10, 2023 01:40
f9ff57c
to
c94ef84
Compare
Only failure seems to be #10140 |
VladLazar
approved these changes
Aug 10, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport of #9633
Fixes #10777
CONFLICTS:
At a high level, the archival loop is implemented with the following steps:
The archival loop currently expects that updates to the manifest are either driven by it (or the housekeeping service, though ignoring this for the sake of example). One case where this invariant is broken is the case where the replication/apply of archival batches times out -- the actual apply of the op may proceed as a part of the raft::state_machine background fiber, and the archival loop itself may proceed without waiting, meaning the updates to the manifest may race with a subsequent iteration of the archival loop.
This commit remediates this by removing the timeout from the archival_metadata_stm::do_replicate_commands() call, forcing the caller to wait indefinitely for the op to complete. In the event of an unsuccessful wait, if the persisted_stm is still leader, it will explicitly step down to force the archival loop to re-sync in a new term.
Backports Required
Release Notes
Bug Fixes