cloud_storage: Use columnar projection to store spillover manifests #11294
Commits on Jun 15, 2023
cluster: Fix spillover in archival_metadata_stm
The spillover command was serialized with the wrong key, so the spillover was never applied.
Commit: a72c0e3
cloud_storage: Update the partition_manifest
Add the list of spillover manifests to the partition manifest. The list is meant to be used instead of the S3 ListObjectsV2 API. The 'segment_meta' structure represents individual manifests. The compressed column store that holds segments is also used to hold spillover manifests.
Commit: e734c3f
cloud_storage: Use list of spillover manifests
Use the list of spillover manifests from the partition manifest in async_manifest_view.
Commit: fbb0174
cloud_storage: Move materialized_manifest_cache
Move materialized_manifest_cache into a dedicated .cc file to avoid a cyclic dependency.
Commit: 42cfa29
cloud_storage: Construct materialized_manifest_cache...
... per shard and not per partition. The object is moved to materialized_segments. Some cache methods now accept the retry_chain_logger of the caller (async_manifest_view). Previously, materialized_manifest_cache received a retry_chain_logger reference through its constructor.
Commit: 5579d1c
cloud_storage: Rename materialized_segments...
... to materialized_resources since it manages not only segments but also spillover manifests.
Commit: 111e301
cloud_storage: Add accessors for individual columns
Add column accessors for the c-store. The columns can be used to search by individual fields without materializing whole rows of data. This speeds up individual operations on metadata.
Commit: b4d49b3
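
The column-accessor idea from this commit can be illustrated with a minimal sketch. Everything here is hypothetical: `toy_segment_meta`, `toy_cstore`, and `find_row_by_offset` are illustration names, not the real `segment_meta_cstore` API, and plain `std::vector` columns stand in for the compressed columns. It assumes rows are sorted by offset and their ranges do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical row type; the real segment_meta carries more fields.
struct toy_segment_meta {
    std::int64_t base_offset;
    std::int64_t committed_offset;
    std::int64_t segment_term;
};

// Hypothetical column store: one vector per field instead of a vector of rows.
class toy_cstore {
public:
    void append(const toy_segment_meta& m) {
        _base_offset.push_back(m.base_offset);
        _committed_offset.push_back(m.committed_offset);
        _segment_term.push_back(m.segment_term);
    }

    // Column accessors expose one field across all rows.
    const std::vector<std::int64_t>& base_offset_column() const { return _base_offset; }
    const std::vector<std::int64_t>& committed_offset_column() const { return _committed_offset; }

    // Rebuild a full row only when the caller actually needs one.
    toy_segment_meta materialize(std::size_t i) const {
        assert(i < _base_offset.size());
        return toy_segment_meta{_base_offset[i], _committed_offset[i], _segment_term[i]};
    }

private:
    std::vector<std::int64_t> _base_offset;
    std::vector<std::int64_t> _committed_offset;
    std::vector<std::int64_t> _segment_term;
};

// Find the row whose [base_offset, committed_offset] range contains `offset`.
// Only two columns are read; no row is materialized during the search.
std::optional<std::size_t> find_row_by_offset(const toy_cstore& s, std::int64_t offset) {
    const auto& committed = s.committed_offset_column();
    // First row whose committed_offset is >= offset (column assumed sorted).
    auto it = std::lower_bound(committed.begin(), committed.end(), offset);
    if (it == committed.end()) {
        return std::nullopt;
    }
    const auto ix = static_cast<std::size_t>(it - committed.begin());
    if (s.base_offset_column()[ix] > offset) {
        return std::nullopt; // offset falls into a gap before this row
    }
    return ix;
}
```

A caller would typically run the search first and call `materialize` only for the single row it found.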
cloud_storage: Use individual columns to search...
... for spillover manifests.
Commit: 4370429
cloud_storage: Update timequery
Use the column store to perform a timequery. The search is a linear scan over only the selected columns (base_timestamp and max_timestamp).
Commit: 6b48931
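
A rough sketch of a linear timequery over the two timestamp columns follows. The function name, the result type, and the exact matching rule (first entry whose max_timestamp is at or after the requested time) are assumptions for illustration; the commit message only states which columns are scanned and that the scan is linear.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type for this sketch.
struct timequery_hit {
    std::size_t index;       // position of the matching entry
    bool contains_timestamp; // true if base_timestamp <= t <= max_timestamp
};

// Scan the max_timestamp column front to back and return the first entry
// that can hold a record with timestamp >= t. The base_timestamp column is
// read only for the matching entry, to tell whether t lies inside it.
std::optional<timequery_hit> timequery(
  const std::vector<std::int64_t>& base_timestamp,
  const std::vector<std::int64_t>& max_timestamp,
  std::int64_t t) {
    assert(base_timestamp.size() == max_timestamp.size());
    for (std::size_t i = 0; i < max_timestamp.size(); ++i) {
        if (max_timestamp[i] >= t) {
            return timequery_hit{i, base_timestamp[i] <= t};
        }
    }
    return std::nullopt; // every entry ends before t
}
```

A linear scan makes no ordering assumption about the timestamp columns, which is one plausible reason to prefer it over a binary search here.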
cloud_storage: Add metric for spillover manifests
Add a metric for uploads and downloads. Add a new manifest_type variant.
Commit: ab43ecd
cloud_storage: Put spillover manifest hydration
This commit fixes a bug that causes async_manifest_view to put a manifest into the cache twice and return an empty manifest to the caller. This triggers an assertion because async_manifest_view works only with non-empty manifests. It also fixes the manifest download code path, which interpreted spillover manifests as JSON.
Commit: 80e00ab
archival: Add new spillover configuration parameter
Add the 'cloud_storage_spillover_manifest_max_segments' parameter. It is similar to 'cloud_storage_spillover_manifest_size', but instead of forcing manifest spillover based on the byte size of the manifest, it uses the number of segments in the manifest.
Commit: 4a98478
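
The difference between the two triggers can be sketched as below. The struct and function are hypothetical, and two details are assumptions rather than something the commit message states: that the segment-count limit takes precedence when it is set, and that reaching the limit exactly counts as a trigger.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for the two configuration parameters.
struct toy_spillover_config {
    // Mirrors 'cloud_storage_spillover_manifest_size' (bytes).
    std::optional<std::size_t> manifest_size_bytes;
    // Mirrors 'cloud_storage_spillover_manifest_max_segments' (segment count).
    std::optional<std::size_t> manifest_max_segments;
};

// Decide whether the manifest should spill over.
// Assumption: the count-based limit wins when it is configured.
bool should_spill(
  const toy_spillover_config& cfg,
  std::size_t manifest_bytes,
  std::size_t segment_count) {
    if (cfg.manifest_max_segments.has_value()) {
        return segment_count >= *cfg.manifest_max_segments;
    }
    if (cfg.manifest_size_bytes.has_value()) {
        return manifest_bytes >= *cfg.manifest_size_bytes;
    }
    return false; // spillover disabled
}
```

A count-based limit is easy to reason about in tests, because the number of segments uploaded is known exactly while the encoded manifest size is not.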
cloud_storage: Async manifest view fixes
Various fixes from the previous code review. Many methods and variables are renamed. The semaphore in the materialized_manifest_cache is now named.
Commit: b704011
cloud_storage: Add spillover manifests to snapshot
Persist the list of spillover manifests in the archival STM snapshot.
Commit: 590cc77
Commit: 302a70f
cloud_storage: Rename get_archive_term_column method
In the 'segment_meta_cstore' change 'get_archive_term_column' to 'get_archiver_term_column' to match the name of the field.
Commit: b60e202
cloud_storage: Optimize 'get_term_last_offset'
Avoid a full metadata scan by using 'get_segment_term_column' to locate the manifest that contains the required term ID.
Commit: 8961f90
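
One way to picture a term lookup that reads only the term column is the sketch below. The function name and the two plain vectors are hypothetical stand-ins for the c-store columns, and the rule used (the last offset of a term is the offset just before the first entry with a higher term) is an assumption of this sketch, not a description of the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Return the last offset of `term`, or nullopt when the stored metadata
// cannot answer: the term is still open (no later term recorded) or the
// term precedes everything that is stored.
std::optional<std::int64_t> term_last_offset(
  const std::vector<std::int64_t>& segment_term,
  const std::vector<std::int64_t>& base_offset,
  std::int64_t term) {
    assert(segment_term.size() == base_offset.size());
    // Terms never decrease along the log, so the column is sorted and a
    // binary search over it is valid.
    auto it = std::upper_bound(segment_term.begin(), segment_term.end(), term);
    if (it == segment_term.end() || it == segment_term.begin()) {
        return std::nullopt;
    }
    const auto ix = static_cast<std::size_t>(it - segment_term.begin());
    // The entry at `ix` starts the next term; the requested term ends just before it.
    return base_offset[ix] - 1;
}
```

Only the entry at the located index needs to be inspected further, so the rest of the metadata is never scanned.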
cloud_storage: Include NTP into the key
...used by the materialized_manifest_cache. The cache is used by several partitions simultaneously, so it has to be able to store manifests with the same base offsets. Update the ducktape test to use more than one partition. Previously, this test passed because it used only one partition.
Commit: 64a666d
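
The key collision described in this commit can be shown with a small sketch. `toy_ntp`, `toy_manifest_cache`, and the string payload are hypothetical illustration types; the real cache stores materialized manifests and has eviction logic that is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical namespace/topic/partition identifier.
struct toy_ntp {
    std::string ns;
    std::string topic;
    std::int32_t partition;

    bool operator<(const toy_ntp& o) const {
        return std::tie(ns, topic, partition) < std::tie(o.ns, o.topic, o.partition);
    }
};

// Hypothetical shard-wide cache keyed by (ntp, base offset).
// Keying by base offset alone would let two partitions overwrite each other.
class toy_manifest_cache {
public:
    void put(const toy_ntp& ntp, std::int64_t base_offset, std::string manifest) {
        assert(!manifest.empty()); // only non-empty manifests are cached
        _entries[std::make_pair(ntp, base_offset)] = std::move(manifest);
    }

    // Returns nullptr when the manifest is not cached.
    const std::string* get(const toy_ntp& ntp, std::int64_t base_offset) const {
        auto it = _entries.find(std::make_pair(ntp, base_offset));
        return it == _entries.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return _entries.size(); }

private:
    std::map<std::pair<toy_ntp, std::int64_t>, std::string> _entries;
};
```

With a single partition the two key schemes behave the same, which is consistent with the earlier test passing when it used only one partition.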
cloud_storage: Extract materialized manifest cache tests
Move the cache tests into a separate translation unit. Rename cloud_storage_basic to cloud_storage and move the cache tests there.
Commit: c32888d
cloud_storage: Remove retries from init_cursor
Do not retry in remote_partition::init_cursor because async_manifest_view::get_cursor already retries internally.
Commit: 51eaa66