Coarse index generation in chunk read path does not account for all data in index #11703

Closed
abhijat opened this issue Jun 27, 2023 · 0 comments · Fixed by #11705
Labels
area/cloud-storage Shadow indexing subsystem kind/bug Something isn't working

Comments

Contributor

abhijat commented Jun 27, 2023

Coarse index generation takes the remote segment index as input and produces a mapping from kafka offsets to file offsets, with a step size equal to the chunk size. There are two issues with the coarse index generation code:

  1. It only examines the index fields for file offsets and kafka offsets, but there is additional data in the _offsets fields. When a segment is small enough that no write to the index takes place, the index fields may end up empty and all of the data is contained in the offsets fields. Coarse index generation should also examine the offsets fields after it has gone through the index fields.
  2. The write to the coarse index uses a mod comparison to decide which entries to write. This can run into a bug where consecutive entries have the same mod value. A simpler approach is to keep a running sum and use it to decide when to write to the coarse index.
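
To make the two proposed changes concrete, here is a minimal Python sketch, not the project's actual code. The function name `build_coarse_index`, the `(kafka_offset, file_offset)` pair layout, and the `index_entries` / `buffered_entries` parameters are all assumptions made for illustration. It walks the index fields first, then the offsets fields, and tracks the bytes covered since the last coarse entry instead of using a mod comparison.

```python
from itertools import chain


def build_coarse_index(index_entries, buffered_entries, chunk_size):
    """Map kafka offsets to file offsets, roughly one entry per chunk_size bytes.

    Hypothetical inputs, both lists of (kafka_offset, file_offset) pairs
    sorted by file offset:
      index_entries    - data already written to the index fields
      buffered_entries - data still held in the offsets fields
    """
    coarse = {}
    span_start = 0  # file offset where the current span begins
    # Issue 1: scan the index fields first, then the offsets fields,
    # so that small segments with an empty index are still covered.
    for kafka_offset, file_offset in chain(index_entries, buffered_entries):
        # Issue 2: compare the bytes accumulated since the last coarse
        # entry against chunk_size, rather than a mod of the file offset.
        if file_offset - span_start >= chunk_size:
            coarse[kafka_offset] = file_offset
            span_start = file_offset
    return coarse


# Data split across both sources.
mixed = build_coarse_index(
    [(10, 40), (20, 90), (30, 130)], [(40, 180), (50, 260)], chunk_size=100
)

# Small segment: the index fields are empty, everything is in the offsets fields.
small = build_coarse_index([], [(5, 120), (9, 250)], chunk_size=100)
```

In this sketch an entry is emitted only once at least `chunk_size` bytes separate it from the previous entry, so consecutive input entries cannot both be written unless they are a full chunk apart.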