Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v22.3.x] cloud_storage: move eviction under remote_partition #9933

Merged
merged 2 commits into from
Apr 27, 2023

Conversation

andrwng
Copy link
Contributor

@andrwng andrwng commented Apr 10, 2023

Backport of #9590

CONFLICT:

  • required adding an abort source to remote_partition
  • the move induced a UAF without pulling in 580c8ac and c10e266

Previously each remote_partition would wait for an eviction barrier to pass through the eviction loop, ensuring all segments are destructed before stopping the partition. Each segment references members of the remote_partition, so it's important the shutdown sequence stops the segments before destructing the remote_partition. At the same time, having each partition wait for another set of partitions to finish flushing can result in a slow shutdown.

This commit moves the eviction loop into the remote_partition, allowing partition shutdown to entirely avoid waiting for any other partition to shut down, while still ensuring that each underlying segment is destructed after the remote_partition.

Without this commit, I witnessed the period of partition shutdown in a heavily loaded server take 30 minutes. With this commit I see a similarly shaped shutdown taking 10 seconds.

Related #9569

(cherry picked from commit 03587d8)

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Improvements

  • The shutdown sequence for partitions that use tiered storage is now faster in clusters with heavy read traffic that hydrates readers from object storage.

@andrwng andrwng changed the title cloud_storage: move eviction under remote_partition [v22.3.x] cloud_storage: move eviction under remote_partition Apr 10, 2023
@vshtokman vshtokman modified the milestones: v22.3.x-next, v22.3.16 Apr 11, 2023
@BenPope BenPope added the kind/backport PRs targeting a stable branch label Apr 13, 2023
@BenPope
Copy link
Member

BenPope commented Apr 18, 2023

@andrwng anything blocking triage of the failures, making this ready for review, and assigning reviewers?

@andrwng
Copy link
Contributor Author

andrwng commented Apr 18, 2023

I need to spend a bit more time on this. The failures I'm seeing in CI indicate a real race/unsafe memory access with shutdown.

@piyushredpanda piyushredpanda modified the milestones: v22.3.x-next, v22.3.17 Apr 22, 2023
andrwng and others added 2 commits April 25, 2023 01:13
CONFLICT:
- required adding an abort source to remote_partition

Previously each remote_partition would wait for an eviction barrier to
pass through the eviction loop, ensuring all segments are destructed
before stopping the partition. Each segment references members of the
remote_partition, so it's important the shutdown sequence stops the
segments before destructing the remote_partition. At the same time,
having each partition wait for another set of partitions to finish
flushing can result in a slow shutdown.

This commit moves the eviction loop into the remote_partition, allowing
partition shutdown to entirely avoid waiting for any other partition to
shut down, while still ensuring that each underlying segment is
destructed after the remote_partition.

Without this commit, I witnessed the period of partition shutdown in a
heavily loaded server take 30 minutes. With this commit I see a
similarly shaped shutdown taking 10 seconds.

Related redpanda-data#9569

(cherry picked from commit 03587d8)
...and use it in remote_partition::erase.  This is necessary
because we now require a usable abourt source in all cloud
storage paths, and the partition's abort source is already
fired once we get to removing the persistent state.
@andrwng andrwng force-pushed the v22.3.x-evict-from-partition branch from 9de1010 to c09f0c7 Compare April 25, 2023 17:03
@andrwng andrwng marked this pull request as ready for review April 25, 2023 17:03
@andrwng andrwng requested a review from jcsp April 25, 2023 17:04
@VladLazar
Copy link
Contributor

I don't see anything wrong with the code, but the test_many_partitions_shutdown failure in the release build makes me a bit suspicious.

@VladLazar
Copy link
Contributor

/ci-repeat 3
skip-units
dt-repeat=5
tests/rptest/tests/e2e_shadow_indexing_test.py::ShadowIndexingManyPartitionsTest.test_many_partitions_shutdown

@@ -186,11 +192,40 @@ class remote_partition
retry_chain_node _rtc;
retry_chain_logger _ctxlog;
ss::gate _gate;
ss::abort_source _as;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not have been possible to backport 632676c instead of adding the abort source manually?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would have been possible. At the time I hadn't considered cherry-picking individual commits from the backport, but that is a better approach

@andrwng
Copy link
Contributor Author

andrwng commented Apr 26, 2023

The CI failures were because #10342 was missing. I triggered a rebuild to rebase, now that that's merged.

@andrwng andrwng merged commit 2099e16 into redpanda-data:v22.3.x Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/redpanda kind/backport PRs targeting a stable branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants