rm_stm returning invalid_lso as last stable offset after node restart #11130
graphcareful pushed a commit to graphcareful/redpanda that referenced this issue (Jun 1, 2023)
graphcareful pushed a commit to graphcareful/redpanda that referenced this issue (Jun 7, 2023)
- This stm has a conditional in its last_stable_offset() method that returns an invalid offset if it hasn't completed bootstrapping.
- The issue is that the bootstrap phase isn't considered finished after bootstrapping from apply_snapshot(). This would cause other stms to pause, thinking the rm_stm still had work to do at offset 0, causing the other stm to time out and fail to process the event.
- The solution is simple: set `_bootstrap_committed_offset` within the `apply_snapshot()` method.
- Fixes: redpanda-data#11131
- Fixes: redpanda-data#11130
graphcareful pushed 15 further commits with the same message to graphcareful/redpanda that referenced this issue: Jun 8 (2), Jun 9, Jun 13, Jun 14, Jun 16 (2), Jun 20 (7), and Jun 21, 2023.
Version & Environment
Found on dev while testing locally.
What went wrong?
After a restart I noticed that the `rm_stm` is returning 0 for `max_collectible_offset`. When I look closer, I see that the early-return condition in `last_stable_offset()` is getting hit. It seems this condition only occurs if the stm has consumed to the true end of the log before the node restart. In that condition, `bootstrap_committed_offset` is null because `apply()` hasn't been called yet: `raft::state_machine_next` is already at the end of the log, because the stm successfully snapshotted the final offset before it crashed. Therefore, when `last_stable_offset` is queried, 0 is returned instead of the actual value.
What should happen?
The actual `last_stable_offset` the stm has processed should be returned instead of 0.
How to reproduce the issue?
Shut down a node after the `rm_stm` is up to date, then query `last_stable_offset`.
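To make the failure mode concrete, here is a minimal sketch of the restart scenario. It is not Redpanda code: `toy_stm`, `offset_t`, `toy_invalid_lso` (with its placeholder value of -1), `lso_after_restart`, and the `fix_enabled` switch are all hypothetical names invented for illustration. It only assumes what the report states: the bootstrap marker is set on `apply()`, and the proposed fix also sets it in `apply_snapshot()`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins; none of these are Redpanda's real types or values.
using offset_t = std::int64_t;
constexpr offset_t toy_invalid_lso = -1; // placeholder "not ready" sentinel

// Toy state machine that gates last_stable_offset() on bootstrap completion.
class toy_stm {
public:
    explicit toy_stm(bool fix_enabled) : _fix_enabled(fix_enabled) {}

    // Replaying a log entry marks bootstrap as complete as a side effect.
    void apply(offset_t o) {
        _last_applied = o;
        if (!_bootstrap_committed_offset) {
            _bootstrap_committed_offset = o;
        }
    }

    // Hydrating from a snapshot. Without the fix the bootstrap marker stays
    // unset, so the stm keeps reporting that it is not ready.
    void apply_snapshot(offset_t snapshot_offset) {
        _last_applied = snapshot_offset;
        if (_fix_enabled) {
            _bootstrap_committed_offset = snapshot_offset;
        }
    }

    offset_t last_stable_offset() const {
        if (!_bootstrap_committed_offset) {
            return toy_invalid_lso; // the early return described in the report
        }
        return _last_applied;
    }

private:
    bool _fix_enabled;
    offset_t _last_applied{0};
    std::optional<offset_t> _bootstrap_committed_offset;
};

// Simulates a restart where the snapshot already covers the whole log, so
// apply() is never called afterwards.
offset_t lso_after_restart(bool fix_enabled, offset_t snapshot_offset) {
    toy_stm stm(fix_enabled);
    stm.apply_snapshot(snapshot_offset);
    return stm.last_stable_offset();
}

// Small self-check showing the behaviour with and without the fix.
inline void sketch_self_check() {
    assert(lso_after_restart(false, 41) == toy_invalid_lso);
    assert(lso_after_restart(true, 41) == 41);
}
```

In this sketch, calling `lso_after_restart(false, 41)` reproduces the reported symptom (the sentinel comes back even though the snapshot is current), while `lso_after_restart(true, 41)` returns the snapshot offset. The real stm tracks considerably more state, so treat this only as an illustration of why the marker has to be set on the snapshot path as well.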