-
Notifications
You must be signed in to change notification settings - Fork 408
PDP 43A (Large Events Simplified)
Derek Moore edited this page Jul 12, 2021
·
1 revision
Status: UNDER REVIEW.
NOTE: this is an alternative to PDP-43. PDP-43 is not discarded or otherwise abandoned - this page (PDP-43A) presents an alternate approach to getting this done.
Discussion and comments in Issue 6052.
Table of Contents:
- Motivation
- Requirements
- Other Considered Approaches
- Design
- Segment Store Changes
- Client and Wire Protocol Changes
See PDP-43.
See PDP-43.
Changes:
- Requirement 2 is relaxed (ingestion speed may not match that of classical appends, as long as it stays within a reasonable range).
- Requirement 3 is dropped.
See PDP-43.
See PDP-43. The entire design stays the same, except for the changes below.
Changes:
- Transient Segments are created via a new wire command
CreateTransientSegment
. TheCreateTransientSegment
will name the parent segment. The Segment Store will generate a unique name and enter it into the metadata. This name is not returned to the Client. - Transient Segments will not allow any Extended Attributes.
- Unmerged Transient Segments will be cleaned up (eventually) after a container recovery.
- If a connection is dropped, the Append Processor will lose its reference to the Transient Segment.
- Transient Segments are cleaned up periodically (as PDP-43 explains), most likely triggered by
MetadataCleanupService
running in each container. - There is no "Merge With Header" operation like PDP-43 asks for. The Transient Segment will be merged as-is using the regular merge command. It is up to the Client/Append Processor to add appropriate Event envelopes for the large event.
- The section Writer Side is dropped.
- The section Reader Side is dropped.
- The section Event Serialization is dropped.
- NameUtils.java
- Create API to properly name transient segments. To make this a lot easier, we should borrow from the scheme we use to name transactions, and we should make sure those names cannot possibly clash.
- New Segment Type:
Transient
- This must be added as a "Role" to
SegmentType
. - A Segment may not be a
TableSegment
andTransient
at the same time.
- This must be added as a "Role" to
-
MetadataStore.java
- If
SegmentType.isTransient()
, then verify the name of the segment matches the transient pattern (as defined in NameUtils.java) and DO NOT insert it into the Metadata Table. Instead, create aStreamSegmentMapOperation
(withPinned==true
) and queue it up in the DurableLog. This ensures the segment is pinned to memory (will never be evicted) and we won't ever bother the Metadata Table with it.
- If
-
SegmentMetadataUpdateTransaction.preProcessAttributes
- Disallow extended Attributes (
Attributes.isCoreAttribute(...) == false
) for transient segments.
- Disallow extended Attributes (
- AppendProcessor
- When receiving a
CreateTransientSegment
, generate a transient segment name using NameUtils, and create that segment on the store, making sure theSegmentType
is set totransient
(see above). - Add a
close()
method on AppendProcessor which is invoked when the connection closes and will in turn delete any open Transient segments. - TODO: we need to figure out how to handle the merging of this segment. Merge is currently handled by PravegaRequestProcessor, which has no connection to AppendProcessor. Since the name of the transient segment is not sent back to the client, the client may not know which segment to seal.
- When receiving a
- MetadataCleanupService
- Every time it runs (even in force-runs due to memory pressure), collect all Transient Segments which haven't been used in a while and are not merged. Delete those segments.
- The Metadata Cleanup service already has a way to track segments to evict. This may or may not be useful as-is; for Transient Segments we must ensure that a certain amount of time has elapsed since they were last used. We currently track if they are currently used by anyone and execute every 60 seconds - this may not be appropriate. To handle this, we may need another field on
SegmentMetadata
to keep track of last access time (in clock time). This time should be set to Long.MIN_VALUE after recoveries (to ensure speedy deletions of transient segments that haven't been merged).
Wire Protocol
- A new wire command for
CreateTransientSegment
needs to be added. Additionally there should be a corresponding replyTransientSegmentCreated
- Either the created would contrin the segment ID so that the subsequent SetupAppend could write data to it. Or:
- Alternatively the Client may generate a unique name for the segment and send that name along. This would solve the PravegaRequestProcessor.merge problem discussed above.
Client
- The Writer needs to detect if an incoming Event is greater than the allowed limit of 8MB. If so, it must follow the Large Event protocol
- CreateTransientSegment
- SetupAppend
- Append to transient segment
- Call flush on the target segment (to preserve order)
- Merge transient segment into the target segment
- The writer need to ensure order. When a large event is detected, it must, under lock:
- create a new transient segment
- write the large event to the transient segment
- Flush target segment
- flush the transient segment
- merge the transient segment
- Wait for ack. (If a SegmentIsSealed error is instead returned 1-6 are repeated).
- Reader
- The reader will need to allocate enough memory to load the entire large Event.
- Contributing
- Guidelines for committers
- Testing
-
Pravega Design Documents (PDPs)
- PDP-19: Retention
- PDP-20: Txn Timeouts
- PDP-21: Protocol Revisioning
- PDP-22: Bookkeeper Based Tier-2
- PDP-23: Pravega Security
- PDP-24: Rolling Transactions
- PDP-25: Read-Only Segment Store
- PDP-26: Ingestion Watermarks
- PDP-27: Admin Tools
- PDP-28: Cross Routing Key Ordering
- PDP-29: Tables
- PDP-30: Byte Stream API
- PDP-31: End-to-End Request Tags
- PDP-32: Controller Metadata Scalability
- PDP-33: Watermarking
- PDP-34: Simplified-Tier-2
- PDP-35: Move Controller Metadata to KVS
- PDP-36: Connection Pooling
- PDP-37: Server-Side Compression
- PDP-38: Schema Registry
- PDP-39: Key-Value Tables Beta 1
- PDP-40: Consistent Order Guarantees for Storage Flushes
- PDP-41: Enabling Transport Layer Security (TLS) for External Clients
- PDP-42: New Resource String Format for Authorization
- PDP-43: Large Events
- PDP-44: Lightweight Transactions
- PDP-45: Health Check
- PDP-46: Read Only Permissions For Reading Data
- PDP-47: Pravega Consumption Based Retention
- PDP-48: Key-Value Tables Beta 2
- PDP-49: Segment Store Admin Gateway
- PDP-50: Stream Tags
- PDP-51: Segment Container Event Processor
- PDP-53: Robust Garbage Collection for SLTS
- PDP-54: Tier-1 Repair Tool
- PDP-55: New Reader API on segment level