Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed issue with offset translation that may lead to offset gaps #10232

Merged
merged 3 commits into from
Apr 21, 2023

Conversation

mmaslankaprv
Copy link
Member

@mmaslankaprv mmaslankaprv commented Apr 20, 2023

When hydrate_snapshot happens in Raft the offset_translator_state in offset_translator is reset with the prefix_truncate_reset. The underlying data structure wasn't updated and that lead the situation in which the rm_stm accessed the old not updated version of offset_translator_state.

Fixes: #8289

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Bug Fixes

  • Fixed issue with offset translation in rm_stm on snapshot hydrate

When `raft::offset_translator` updated state during
`prefix_truncate_reset` all the users which cached the `_state` pointer
wouldn't see the state updates as the assignment didn't change the
underlying data structure but the pointer itself. Fixed the issue by
using an assignment operator that changes pointee not the pointer.

Signed-off-by: Michal Maslanka <[email protected]>
An `rm_stm` can always access the `consensus` to query the offset
translator state there is no need to cache the translator state in
partition.

Signed-off-by: Michal Maslanka <[email protected]>
rystsov
rystsov previously approved these changes Apr 20, 2023
src/v/cluster/partition.cc Show resolved Hide resolved
bharathv
bharathv previously approved these changes Apr 20, 2023
Copy link
Contributor

@bharathv bharathv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what a find 🔥

@piyushredpanda
Copy link
Contributor

@@ -104,7 +104,7 @@ offset_translator::start(must_reset reset, bootstrap_state&& bootstrap) {
if (reset) {
vlog(_logger.info, "resetting offset translation state");

_state = storage::offset_translator_state(
*_state = storage::offset_translator_state(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When raft::offset_translator updated state during
prefix_truncate_reset all the users which cached the _state pointer

nice catch. why would they have operator=(T&&) defined for the shared pointer? that seems really odd.

would the offset translator api be easier to use if any wanting to hold a reference to the state held a reference to the translator instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah the operator(T&&) is indeed confusing. I was thinking about changing the interface to return a reference but i do not know all the lifecycle implications. There may be a user that is relaying on the shared pointer semantics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmaslankaprv

There may be a user that is relaying on the shared pointer semantics.

Isn't this untrue now that you removed all cases of users keeping a copy of the shared pointer?

@@ -399,7 +400,7 @@ kafka_stages partition::replicate_in_stages(
}
auto old_offset = r.value().last_offset;
auto new_offset = kafka::offset(
_translator->from_log_offset(old_offset)());
get_offset_translator_state()->from_log_offset(old_offset)());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we aren't going to cache any pointers to the state going forward, should we remove that interface and drop the first commit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are cahced pointers in different places

Copy link
Contributor

@ztlpn ztlpn Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I remember the whole point of introducing these shared ptrs was to allow caching it for the duration of the fetch request (and the offset_translator_state object could come from two places - for a local read we got it from raft and for a cloud read we reconstructed it on the fly while reading the segments). And just translating offsets on the fly was not enough because just before returning data to the client we had to go back and translate aborted tx ranges. Ah, implementing the first version of shadow indexing, good times.

Probably this can all be simplified somehow :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I was confused because the wording on one of the commits sounded like all cases in which the pointer were cached were being removed.

@piyushredpanda piyushredpanda merged commit a2561b6 into redpanda-data:dev Apr 21, 2023
@piyushredpanda
Copy link
Contributor

/backport v23.1.x

@piyushredpanda
Copy link
Contributor

/backport v22.3.x

@piyushredpanda
Copy link
Contributor

/backport v22.2.x

@vbotbuildovich
Copy link
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x bdf06f64d67bdc04845e122cfba6d34d72eb82d9 b9906c7c8f3ade61be71ef5c8da9fce10738539e fad8797ef477b74565d865365ea26256768c7d0e

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x bdf06f64d67bdc04845e122cfba6d34d72eb82d9 b9906c7c8f3ade61be71ef5c8da9fce10738539e fad8797ef477b74565d865365ea26256768c7d0e

Workflow run logs.

jcsp added a commit to jcsp/redpanda that referenced this pull request Apr 26, 2023
This check acts as a reproducer for redpanda-data#10232
and more generally as a pervasive sanity check of our offset
handling and idempotency.
jcsp added a commit to jcsp/redpanda that referenced this pull request Apr 26, 2023
This check acts as a reproducer for redpanda-data#10232
and more generally as a pervasive sanity check of our offset
handling and idempotency.
mmaslankaprv added a commit that referenced this pull request Apr 27, 2023
bharathv added a commit that referenced this pull request Apr 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants