# Terminology

The glossary of terms related to Pravega is given below:

| Term | Definition |
|------|------------|
| Pravega | Pravega is an open source storage system that exposes the Stream as its main primitive for continuous and unbounded data. |
| Stream | A durable, elastic, append-only, unbounded sequence of bytes that has good performance and strong consistency.<br>A Stream is identified by a Stream name and a Scope.<br>A Stream is composed of one or more Stream Segments. |
| Stream Segment | A shard of a Stream.<br>The number of Stream Segments in a Stream might vary over time according to load and Scaling Policy.<br>In the absence of a Scale Event, Events written to a Stream with the same Routing Key are stored in the same Stream Segment and are ordered.<br>When a Scale Event occurs, the set of Stream Segments of a Stream changes, and Events written with a given Routing Key K before the Scale Event are stored in a different Stream Segment from Events written with the same Routing Key K after it.<br>In conjunction with Reader Groups, the number of Stream Segments is the maximum degree of read parallelism of a Stream. |
| Scope | A namespace for Stream names.<br>A Stream name must be unique within a Scope. |
| Event | A collection of bytes within a Stream.<br>An Event is associated with a Routing Key. |
| Routing Key | A property of an Event used to route Events to Readers.<br>Two Events with the same Routing Key will be read by Readers in exactly the same order they were written. |
| Reader | A software application that reads data from one or more Streams. |
| Writer | A software application that writes data to one or more Streams. |
| Pravega Java Client Library | A Java library used by applications to interface with Pravega. |
| Reader Group | A named collection of one or more Readers that read from a Stream in parallel.<br>Pravega assigns Stream Segments to the Readers, ensuring that all Stream Segments are assigned to at least one Reader and that they are balanced across the Readers. |
| Position | An offset within a Stream, representing a type of recovery point for a Reader.<br>If a Reader crashes, a Position can be used to initialize the failed Reader's replacement so that the replacement resumes processing the Stream from where the failed Reader left off. |
| Tier 1 Storage | Short-term, low-latency data storage that guarantees the durability of data written to Streams.<br>The current implementation of Tier 1 uses Apache BookKeeper.<br>Tier 1 storage keeps the most recent appends to Streams in Pravega.<br>As data in Tier 1 ages, it is moved out of Tier 1 into Tier 2. |
| Tier 2 Storage | A portion of Pravega storage based on cheap and deep persistent storage technology such as HDFS, Dell EMC's Isilon, or Dell EMC's Elastic Cloud Storage. |
| Pravega Server | A component of Pravega that implements the Pravega data plane API for operations such as reading from and writing to Streams.<br>The data plane of Pravega, also called the Segment Store, is composed of one or more Pravega Server instances. |
| Segment Store | A collection of Pravega Servers that in aggregate form the data plane of a Pravega cluster. |
| Controller | A component of Pravega that implements the Pravega control plane API for operations such as creating and retrieving information about Streams.<br>The control plane of Pravega is composed of one or more Controller instances coordinated by ZooKeeper. |
| Auto Scaling | A Pravega concept that allows the number of Stream Segments in a Stream to change over time, based on Scaling Policy. |
| Scaling Policy | A configuration item of a Stream that determines how the number of Stream Segments in the Stream should change over time.<br>There are three kinds of Scaling Policy; a Stream has exactly one of the following at any given time:<br>- Fixed number of Stream Segments<br>- Change the number of Stream Segments based on the number of bytes per second written to the Stream (Size-based)<br>- Change the number of Stream Segments based on the number of Events per second written to the Stream (Event-based) |
| Scale Event | There are two types of Scale Event: Scale-Up Event and Scale-Down Event. A Scale Event triggers Auto Scaling.<br>A Scale-Up Event occurs when there is an increase in load: the number of Stream Segments is increased by splitting one or more Stream Segments in the Stream.<br>A Scale-Down Event occurs when there is a decrease in load: the number of Stream Segments is reduced by merging two or more Stream Segments in the Stream. |
| Transaction | A collection of Stream write operations that are applied atomically to the Stream.<br>Either all of the bytes in a Transaction are written to the Stream or none of them are. |
| State Synchronizer | An abstraction built on top of Pravega that enables the implementation of replicated state, using a Pravega Segment to back up the state transformations.<br>A State Synchronizer allows a piece of data to be shared between multiple processes with strong consistency and optimistic concurrency. |
| Checkpoint | A kind of Event that signals all Readers within a Reader Group to persist their state. |
| StreamCut | A StreamCut represents a consistent position in the Stream. It contains a set of Segment and offset pairs for a single Stream that together cover the complete keyspace at a given point in time. Each offset always points to an Event boundary, so no offset points into the middle of an Event. |
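
To make the Stream Segment, Routing Key, and Scale Event definitions more concrete, the following is a minimal Python sketch of a toy in-memory model. It is not the Pravega client API. The names `Segment`, `ToyStream`, and `key_to_unit_interval` are hypothetical, and the choices to hash Routing Keys with SHA-256 onto the interval [0, 1) and to split a segment at its midpoint are assumptions made purely for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy Stream Segment that owns the routing-key range [low, high)."""
    segment_id: int
    low: float
    high: float
    events: list = field(default_factory=list)


def key_to_unit_interval(routing_key: str) -> float:
    """Map a routing key to a stable number in [0, 1) (assumed scheme)."""
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    # 48 bits fit exactly in a float, so the result is always below 1.0.
    return int.from_bytes(digest[:6], "big") / 2**48


class ToyStream:
    """Toy Stream: a set of active segments that tile the keyspace [0, 1)."""

    def __init__(self, num_segments: int):
        self.segments = [
            Segment(i, i / num_segments, (i + 1) / num_segments)
            for i in range(num_segments)
        ]
        self.sealed = []  # segments replaced by a scale event
        self.next_id = num_segments

    def segment_for(self, routing_key: str) -> Segment:
        point = key_to_unit_interval(routing_key)
        for seg in self.segments:
            if seg.low <= point < seg.high:
                return seg
        raise AssertionError("active segments must cover [0, 1)")

    def write(self, routing_key: str, payload: str) -> int:
        """Append an event and return the id of the segment that stored it."""
        seg = self.segment_for(routing_key)
        seg.events.append((routing_key, payload))
        return seg.segment_id

    def scale_up(self, segment_id: int):
        """Toy Scale-Up Event: split one segment into two successors."""
        old = next(s for s in self.segments if s.segment_id == segment_id)
        mid = (old.low + old.high) / 2
        left = Segment(self.next_id, old.low, mid)
        right = Segment(self.next_id + 1, mid, old.high)
        self.next_id += 2
        self.segments.remove(old)
        self.segments += [left, right]
        self.sealed.append(old)
        return left, right

    def covers_keyspace(self) -> bool:
        """Check that the active segments tile [0, 1) with no gaps."""
        bounds = sorted((s.low, s.high) for s in self.segments)
        joined = all(a[1] == b[0] for a, b in zip(bounds, bounds[1:]))
        return bounds[0][0] == 0.0 and bounds[-1][1] == 1.0 and joined


stream = ToyStream(num_segments=2)

# Without a scale event, one routing key always lands in the same segment.
before = [stream.write("sensor-42", f"reading-{i}") for i in range(3)]

# Split the segment that owns the key, then write with the same key again.
stream.scale_up(before[0])
after = stream.write("sensor-42", "reading-3")
```

In this sketch, the three writes before the split all return the same segment id, while the write after the split returns the id of one of the two successor segments. `covers_keyspace()` stays true throughout, which mirrors the requirement that a StreamCut's segment and offset pairs represent the complete keyspace.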