PDP 36: Connection Pooling
A Stream in Pravega is split into a set of shards or partitions, generally referred to as Segments. When an event is written to the Stream by the Pravega Client, it is written into one of these Segments based on the event routing key. These Segments, which are part of SegmentContainers, are managed by the different Segment Store Service instances.
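For illustration, a typical write in the Pravega client API looks roughly like the snippet below; events sharing a routing key are placed in the same Segment. The scope, stream and controller endpoint names are placeholders, and the snippet assumes the scope and stream already exist.

```java
import java.net.URI;

import io.pravega.client.ClientConfig;
import io.pravega.client.EventStreamClientFactory;
import io.pravega.client.stream.EventStreamWriter;
import io.pravega.client.stream.EventWriterConfig;
import io.pravega.client.stream.impl.UTF8StringSerializer;

public class RoutingKeyWriteExample {
    public static void main(String[] args) throws Exception {
        ClientConfig config = ClientConfig.builder()
                .controllerURI(URI.create("tcp://controller.example.com:9090")) // placeholder endpoint
                .build();
        try (EventStreamClientFactory clientFactory =
                     EventStreamClientFactory.withScope("myScope", config);
             EventStreamWriter<String> writer = clientFactory.createEventWriter(
                     "myStream", new UTF8StringSerializer(), EventWriterConfig.builder().build())) {
            // All events written with the same routing key land in the same Segment,
            // and hence go to the segment store instance that currently owns it.
            writer.writeEvent("device-42", "sensor reading").join();
        }
    }
}
```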
At present, the Pravega `EventStreamWriter` creates new connections to different segment stores for every Segment it is writing to. A new connection to a segment store is created even when multiple Segments are owned by the same segment store. In the case of `EventStreamReader`, every Segment being read by the Pravega client maps to a new connection. `RevisionedStreamClient`, which enables reading from and writing to a Stream with strong consistency, also creates a new connection to a segment store for every Segment it is reading from and a new connection for every Segment it is writing to.
The number of connections created increases further if the user is writing to and reading from multiple Streams.
The goal of connection pooling is to maintain a common pool of connections between the client process and the segment stores, so that the number of connections does not grow linearly with the number of Segments.
The requirement can be broken down into the following:
- WireCommand changes to enable connection pooling on the Append path.
- WireCommand changes to enable connection pooling for Segment Reads.
- WireCommand changes to enable connection pooling for the `RevisionedStreamClient`.
- A connection pool for Netty channels (`io.netty.channel.Channel`), with the ability to match 1:1 the replies that a request generated.
No changes to the API are expected.
A Netty channel (`io.netty.channel.Channel`) provides a way to interact with a network socket or a component capable of I/O operations such as read, write, connect and bind. As part of connection pooling we need to ensure that:
1. The number of channels/connections between a particular client process and a given segment store is configurable.
2. Given a pool of connections, we have the ability to match 1:1 the reply that a request generated. This implies that the `WireCommands.SegmentRead` response received for a `WireCommands.ReadSegment` request sent by a Segment Reader should be delivered to that same reader. Similarly, the `WireCommands.DataAppended` response to an `Append` request sent by a Segment Writer should be delivered to that same Segment Writer. Since writers batch multiple `Append`s into a `WireCommands.AppendBlock`, we should ensure that connections from different Segment Writers are shared based on the load.
3. Closing a Segment Writer or Segment Reader does not imply that the underlying connection should be closed.
Every Segment Writer or Segment Reader connects to the underlying `io.netty.channel.Channel` using an `io.pravega.client.netty.impl.ClientConnection`, which is obtained by passing the URI of the segment store and registering its `io.pravega.shared.protocol.netty.ReplyProcessor`. The ability to match 1:1 the replies that a request generated can be implemented by using a request id field which is returned by the segment store with every reply. Based on the `requestId` field, the client invokes the correct `ReplyProcessor` of the client (Segment Writer/Reader) sharing the same underlying connection. The `requestId` is present in most of the `WireCommands`.
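To make this concrete, the following is a minimal sketch of how a shared connection could dispatch replies to the right `ReplyProcessor` based on the `requestId`. The class and method names are illustrative stand-ins, not the actual Pravega classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative sketch: one shared connection that demultiplexes replies to the
 * registered reply processors based on the requestId echoed back by the server.
 * The nested types are simplified stand-ins, not the actual Pravega classes.
 */
public class FlowDemultiplexer {

    /** Stand-in for a reply that carries the requestId of the request it answers. */
    interface Reply {
        long getRequestId();
    }

    /** Stand-in for io.pravega.shared.protocol.netty.ReplyProcessor. */
    interface ReplyProcessor {
        void process(Reply reply);
    }

    // One entry per Segment Writer/Reader sharing this underlying channel.
    private final Map<Long, ReplyProcessor> processors = new ConcurrentHashMap<>();

    /** Called when a writer/reader obtains a ClientConnection over this channel. */
    public void register(long requestId, ReplyProcessor processor) {
        processors.put(requestId, processor);
    }

    /** Called when a writer/reader closes; the channel itself stays open. */
    public void deregister(long requestId) {
        processors.remove(requestId);
    }

    /** Invoked by the Netty pipeline for every reply read off the channel. */
    public void onReply(Reply reply) {
        ReplyProcessor processor = processors.get(reply.getRequestId());
        if (processor != null) {
            processor.process(reply);
        }
        // Replies for a deregistered requestId are dropped here (or could be logged).
    }
}
```

In this scheme, closing a Segment Writer/Reader only deregisters its processor; the underlying channel stays open for the other users sharing it.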
To manage multiple underlying connections, Netty 4 already provides several `io.netty.channel.pool.ChannelPool` implementations:
- `io.netty.channel.pool.SimpleChannelPool`: creates a new connection if there is no channel in the pool; there is no limit on the number of connections that can be created.
- `io.netty.channel.pool.FixedChannelPool`: an implementation of `ChannelPool` which enforces a maximum limit on the number of connections that can be created.

These options are not suitable, however, and `ChannelPool` has also been removed in Netty 5 (https://github.com/netty/netty/pull/8681).
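For reference, a minimal sketch of how Netty 4's `FixedChannelPool` is typically wired up is shown below (the endpoint is a placeholder). It hands out whole channels on `acquire()`/`release()`, which illustrates why it does not, by itself, provide the per-`requestId` reply matching described above.

```java
import io.netty.bootstrap.Bootstrap;
import io.netty.channel.Channel;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.pool.AbstractChannelPoolHandler;
import io.netty.channel.pool.FixedChannelPool;
import io.netty.channel.socket.nio.NioSocketChannel;

public class FixedChannelPoolExample {
    public static void main(String[] args) {
        NioEventLoopGroup group = new NioEventLoopGroup();
        // Bootstrap pointing at a placeholder segment store endpoint.
        Bootstrap bootstrap = new Bootstrap()
                .group(group)
                .channel(NioSocketChannel.class)
                .remoteAddress("segmentstore.example.com", 12345);

        // FixedChannelPool caps the number of channels created for this endpoint.
        FixedChannelPool pool = new FixedChannelPool(bootstrap,
                new AbstractChannelPoolHandler() {
                    @Override
                    public void channelCreated(Channel ch) {
                        // Encoder/decoder handlers would be added to the pipeline here.
                    }
                },
                2 /* maxConnections to this endpoint */);

        // A caller acquires a whole channel, uses it exclusively, then releases it.
        Channel ch = pool.acquire().syncUninterruptibly().getNow();
        // ... write requests on ch ...
        pool.release(ch).syncUninterruptibly();

        // Nothing in this model multiplexes replies from several writers/readers
        // sharing one channel, which is why an internal pool is proposed instead.
        group.shutdownGracefully();
    }
}
```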
The pooling mechanism for Pravega will be implemented internally, since it also needs to be flexible enough to accommodate other heuristics, for example not sharing an appending connection with other Segment Writers/Readers if it is already sufficiently busy. (This applies when the connection pool per segment store holds more than one connection.)
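A minimal sketch of what such a selection heuristic could look like, assuming the pool tracks a per-connection count of in-flight requests; the class names and the busy threshold are purely illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/**
 * Illustrative sketch of a per-segment-store pool that prefers the least-loaded
 * existing connection and only opens a new one when every existing connection is
 * "busy" and the configured maximum has not yet been reached.
 */
public class SegmentStoreConnectionPool {

    /** Stand-in for a pooled connection that tracks its in-flight requests. */
    static class PooledConnection {
        int inFlightRequests; // incremented on send, decremented when the reply arrives
    }

    private final int maxConnections; // configurable pool size per segment store
    private final int busyThreshold;  // in-flight requests beyond which we avoid sharing
    private final List<PooledConnection> connections = new ArrayList<>();

    SegmentStoreConnectionPool(int maxConnections, int busyThreshold) {
        this.maxConnections = maxConnections;
        this.busyThreshold = busyThreshold;
    }

    synchronized PooledConnection getConnection() {
        // Pick the existing connection with the fewest in-flight requests.
        PooledConnection leastLoaded = connections.stream()
                .min(Comparator.comparingInt((PooledConnection c) -> c.inFlightRequests))
                .orElse(null);

        // Share it unless it is busy; otherwise grow the pool up to the maximum.
        if (leastLoaded != null
                && (leastLoaded.inFlightRequests < busyThreshold || connections.size() >= maxConnections)) {
            return leastLoaded;
        }
        PooledConnection fresh = new PooledConnection(); // would establish a new Channel here
        connections.add(fresh);
        return fresh;
    }
}
```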
Also, since we cannot close the underlying connection when a given `SegmentWriter` closes, a `ReferenceCountedResource` needs to be implemented in commons: it keeps track of the users of a resource and closes the resource once all of them have invoked close. This will be used to close the `io.netty.channel.Channel`(s); the number of Channels is a configurable value which can be >= 1.
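A minimal sketch of what such a `ReferenceCountedResource` could look like, assuming a simple acquire/close counting scheme; the exact API is illustrative and not the final design.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Illustrative sketch: wraps a closeable resource (e.g. a wrapper around a Netty
 * Channel) and closes it only after every user that acquired it has called close().
 */
public class ReferenceCountedResource<T extends AutoCloseable> implements AutoCloseable {

    private final T resource;
    private final AtomicInteger refCount = new AtomicInteger(1); // the creator holds one reference

    public ReferenceCountedResource(T resource) {
        this.resource = resource;
    }

    /** Registers a new user of the resource (e.g. a new Segment Writer/Reader). */
    public T acquire() {
        refCount.incrementAndGet();
        return resource;
    }

    /** Releases one reference; the underlying resource is closed on the last release. */
    @Override
    public void close() throws Exception {
        if (refCount.decrementAndGet() == 0) {
            resource.close();
        }
    }
}
```

A production version would additionally guard against acquiring the resource after the count has already dropped to zero.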