Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rm_stm: replace std::vector usage with fragmented_vector #9484

Merged
merged 3 commits into from
Mar 20, 2023

Conversation

bharathv
Copy link
Contributor

@bharathv bharathv commented Mar 15, 2023

A lot of state in rm_stm is backed by std::vector. With frequent snapshotting + non trivial pid counts large contiguous allocations cannot be satisfied if the memory is fragmented. This patch switches to using fragmented_vector for this usecase.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

  • none

@bharathv bharathv force-pushed the switch_to_frag branch 2 times, most recently from d249ea0 to 690b50c Compare March 15, 2023 22:47
@emaxerrno
Copy link
Contributor

Awesome. ! Should we do this for the Kafka api layer as well.

@bharathv bharathv changed the title rm_stm: switch all vectors to frag vectors rm_stm: replace std::vector usage with fragmented_vector Mar 16, 2023
@bharathv
Copy link
Contributor Author

Awesome. ! Should we do this for the Kafka api layer as well.

+1. We had some recent work in #8469

@bharathv
Copy link
Contributor Author

/ci-repeat 10

@bharathv bharathv marked this pull request as ready for review March 16, 2023 04:52
rystsov
rystsov previously approved these changes Mar 16, 2023
@@ -33,6 +34,14 @@ template<typename T, template<typename...> class C>
inline constexpr bool is_specialization_of_v
= is_specialization_of<T, C>::value;

template<class T, template<class, size_t> class C>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serde.h depends on reflection/type_traits.h and defines the same constexprs so you may remove a definition there

mmaslankaprv
mmaslankaprv previously approved these changes Mar 16, 2023
@bharathv
Copy link
Contributor Author

Fixed another vector compat issue in fragmented_vector and tests passing locally. Force pushed.

@piyushredpanda
Copy link
Contributor

Awesome. ! Should we do this for the Kafka api layer as well.

@dotnwat : can we have a tracking ticket for this?

- Adds additional typedefs for vector compatibility.
- Implement front()
- Make push_back() accept a forwarding reference.
- Defines a custom move c-tor / assignment that guarantees empty()
after move similar to std::vector.
Uses the same underlying serialization as std::vector for compatibility.
Adds a bounds check to limit vector size to int32_t_max which was an
implicit assumption but a vector can be much bigger.
@bharathv
Copy link
Contributor Author

Most failures are this.. seems to be fixed.. retrying CI.

403 Client Error: Forbidden for url: https://vectorized-public.s3.us-west-2.amazonaws.com/releases/redpanda/22.2.11/redpanda-22.2.11-amd64.tar.gz

@bharathv
Copy link
Contributor Author

/ci-repeat 2

@bharathv
Copy link
Contributor Author

CI Failure: #8496 (unrelated known issue)

@piyushredpanda piyushredpanda merged commit 0bbfaaa into redpanda-data:dev Mar 20, 2023
@@ -907,7 +907,7 @@ ss::future<cloud_storage::upload_result> ntp_archiver::upload_tx(

auto path = segment_path_for_candidate(candidate);

cloud_storage::tx_range_manifest manifest(path, tx_range);
cloud_storage::tx_range_manifest manifest(path, std::move(tx_range));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent. and since we have copies deleted we can also catch a bunch of inefficiencies!

}
_ranges.shrink_to_fit();
}
remote_segment_path spath, fragmented_vector<model::tx_range>&& range)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: prefer pass-by-value

@@ -1922,7 +1922,7 @@ rm_stm::do_aborted_transactions(model::offset from, model::offset to) {
continue;
}
if (_log_state.last_abort_snapshot.match(idx)) {
auto opt = _log_state.last_abort_snapshot;
auto& opt = _log_state.last_abort_snapshot;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably be const auto&? before it was a value, so without going to look at filter_interesting having the reference be const here would be helpful.

@@ -2825,11 +2825,13 @@ uint64_t rm_stm::get_snapshot_size() const {
return persisted_stm::get_snapshot_size() + abort_snapshots_size;
}

ss::future<> rm_stm::save_abort_snapshot(abort_snapshot snapshot) {
auto filename = abort_idx_name(snapshot.first, snapshot.last);
ss::future<> rm_stm::save_abort_snapshot(abort_snapshot&& snapshot) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please prefer pass-by-value, especially for coroutines.

@bharathv
Copy link
Contributor Author

/backport v23.1.x

@vbotbuildovich
Copy link
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x c7dea50eb89f222e7b0498e29dbb44f207d2247c 1a5dde709997332822f8407d53b299fd7c0992c5 89e2b17eb4e4aad5f346316dd2bcdcc15ad2c4e1

Workflow run logs.

@bharathv bharathv deleted the switch_to_frag branch March 22, 2023 18:11
@piyushredpanda
Copy link
Contributor

@bharathv : Let's not forget backport to v22.3.x, please?

@bharathv
Copy link
Contributor Author

Just to close the loop here, as discussed offline, I already backported it manually via #9632 due to conflicts.

@vshtokman
Copy link
Contributor

v23.1.x backport: #9626

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants