Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

s/segment: fixed tracking offsets of empty segments #11450

Merged
merged 3 commits into from
Jun 19, 2023

Commits on Jun 15, 2023

  1. s/segment: fixed tracking offsets of empty segments

    When a segment is rolled due to the `segment.ms` property its base
    offset is set to the committed offset of last segment plus one. In a
    segment constructor all segment offset tracker offsets were set to the
    base offset passed as the constructor argument. This way an empty
    segment had exactly the same set of offsets as the segment containing
    single batch containing one record and base offset equal to the segment
    base offset.
    
    Incorrect offset accounting in empty segments lead to the situation in
    which eviction driven `log::truncate_prefix` was called with an
    incorrect offset, leading to the situation in which a log wasn't
    truncated at the segment boundary. This in the end resulted in the
    disconnection between the offset translator truncation point and first
    log segment start offset.
    
    For example:
    
    (we represent segment as `[base_offset,end_offset]`)
    
    In the case of log with segments:
    
    ```
    [0,10][11,15]
    ```
    
    After segment ms a new segment is created like this:
    
    ```
    [0,10][11,15][16,16]
    ```
    
    When eviction point is established the last offset is checked in this
    case if all the segments are to be evicted it will be equal to 16.
    `16` is a last offset that is going to be evicted (last included in raft
    snapshot). Hence the `log::truncate_prefix` is called with offset `17`.
    
    If some batches are appended to the rolled segment then the log will
    contain f.e.
    
    ```
     [0,10][11,15][16,25]
    ```
    
    Now if the log is prefix truncated at offset `17` it starts offset is
    updated to `17` but the underlying segment is kept.
    
    ```
    [16,25]
    ```
    
    Now when the reader start reading the log it will request to start from
    the log start offset which is `17` but it will have to skip over the
    batch starting at `16`. If a batch at `16` has more than one record it
    will still be returned to the reader and it will have to translate its
    base offset to kafka offset space. This is impossible as the
    offset_translator was already truncated to `16`. i.e. there is no
    information in the translator to correctly translate the offset.
    
    A fix is setting the empty segment dirty, committed and stable offset to
    its base_offset minus one. This way all the semantics of log operation
    holds as the offset returned is equal to previous segment last offset
    which would be the case if there would be no empty segment at the head
    of the log.
    
    Fixes: redpanda-data#11403
    
    Signed-off-by: Michal Maslanka <[email protected]>
    mmaslankaprv committed Jun 15, 2023
    Configuration menu
    Copy the full SHA
    76a9e6f View commit details
    Browse the repository at this point in the history
  2. s/e2e_tests: flush before validating offsets

    Validating offsets in the compaction test uses committed offset but it
    didn't flush the log before validation.
    
    Signed-off-by: Michal Maslanka <[email protected]>
    mmaslankaprv committed Jun 15, 2023
    Configuration menu
    Copy the full SHA
    ee3843b View commit details
    Browse the repository at this point in the history

Commits on Jun 16, 2023

  1. Configuration menu
    Copy the full SHA
    a9ec0ba View commit details
    Browse the repository at this point in the history