PDP 36: Connection Pooling
A Stream in Pravega is split into a set of shards or partitions, generally referred to as Segments. When an event is written to the Stream by the Pravega Client, it is written into one of these Segments based on the event routing key. These Segments, which are part of SegmentContainers, are managed by the different Segment Store Service instances.
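For illustration, a typical write in the Pravega client API looks roughly like the snippet below; events sharing a routing key are placed in the same Segment. The scope, stream and controller endpoint names are placeholders, and the snippet assumes the scope and stream already exist.

```java
import java.net.URI;

import io.pravega.client.ClientConfig;
import io.pravega.client.EventStreamClientFactory;
import io.pravega.client.stream.EventStreamWriter;
import io.pravega.client.stream.EventWriterConfig;
import io.pravega.client.stream.impl.UTF8StringSerializer;

public class RoutingKeyWriteExample {
    public static void main(String[] args) throws Exception {
        ClientConfig config = ClientConfig.builder()
                .controllerURI(URI.create("tcp://controller.example.com:9090")) // placeholder endpoint
                .build();
        try (EventStreamClientFactory clientFactory =
                     EventStreamClientFactory.withScope("myScope", config);
             EventStreamWriter<String> writer = clientFactory.createEventWriter(
                     "myStream", new UTF8StringSerializer(), EventWriterConfig.builder().build())) {
            // All events written with the same routing key land in the same Segment,
            // and hence go to the segment store instance that currently owns it.
            writer.writeEvent("device-42", "sensor reading").join();
        }
    }
}
```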
At present, the Pravega `EventStreamWriter` creates new connections to different segment stores for every Segment it is writing to. A new connection to a segment store is created even when multiple Segments are owned by the same segment store. In the case of `EventStreamReader`, every Segment being read by the Pravega client maps to a new connection. `RevisionedStreamClient`, which enables reading from and writing to a Stream with strong consistency, also creates a new connection to a segment store for every Segment it is reading from and a new connection for every Segment it is writing to.
The number of connections created increases further if the user is writing to and reading from multiple Streams.
The goal of connection pooling is to maintain a common pool of connections between the client process and the segment stores, so that the number of connections does not grow linearly with the number of Segments.
The requirement can be broken down into the following:
- WireCommand changes to enable connection pooling on the Append path.
- WireCommand changes to enable connection pooling for Segment Reads.
- WireCommand changes to enable connection pooling for the `RevisionedStreamClient`.
- A connection pool for Netty channels (`io.netty.channel.Channel`), with the ability to match 1:1 the replies that a request generated.
No changes to the API are expected.
A Netty channel (`io.netty.channel.Channel`) provides a way to interact with a network socket or a component capable of I/O operations such as read, write, connect and bind. As part of connection pooling we need to ensure that:
1. The number of channels/connections between a particular client process and a given segment store is configurable.
2. Given a pool of connections, we have the ability to match 1:1 the reply that a request generated. This implies that the `WireCommands.SegmentRead` response received for a `WireCommands.ReadSegment` request sent by a Segment Reader should be delivered to that same reader. Similarly, the `WireCommands.DataAppended` response to an `Append` request sent by a Segment Writer should be delivered to that same Segment Writer. Since writers batch multiple `Append`s into a `WireCommands.AppendBlock`, we should ensure that connections from different Segment Writers are shared based on the load.
3. Closing a Segment Writer or Segment Reader does not imply that the underlying connection should be closed.
Every Segment Writer or Segment Reader connects to the underlying `io.netty.channel.Channel` using an `io.pravega.client.netty.impl.ClientConnection`, which is obtained by passing the URI of the segment store and registering its `io.pravega.shared.protocol.netty.ReplyProcessor`. The ability to match 1:1 the replies that a request generated can be implemented by using a request id field which is returned by the segment store with every reply. Based on the `requestId` field, the client invokes the correct `ReplyProcessor` of the client (Segment Writer/Reader) sharing the same underlying connection. The `requestId` is present in most of the `WireCommands`.
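To make this concrete, the following is a minimal sketch of how a shared connection could dispatch replies to the right `ReplyProcessor` based on the `requestId`. The class and method names are illustrative stand-ins, not the actual Pravega classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative sketch: one shared connection that demultiplexes replies to the
 * registered reply processors based on the requestId echoed back by the server.
 * The nested types are simplified stand-ins, not the actual Pravega classes.
 */
public class FlowDemultiplexer {

    /** Stand-in for a reply that carries the requestId of the request it answers. */
    interface Reply {
        long getRequestId();
    }

    /** Stand-in for io.pravega.shared.protocol.netty.ReplyProcessor. */
    interface ReplyProcessor {
        void process(Reply reply);
    }

    // One entry per Segment Writer/Reader sharing this underlying channel.
    private final Map<Long, ReplyProcessor> processors = new ConcurrentHashMap<>();

    /** Called when a writer/reader obtains a ClientConnection over this channel. */
    public void register(long requestId, ReplyProcessor processor) {
        processors.put(requestId, processor);
    }

    /** Called when a writer/reader closes; the channel itself stays open. */
    public void deregister(long requestId) {
        processors.remove(requestId);
    }

    /** Invoked by the Netty pipeline for every reply read off the channel. */
    public void onReply(Reply reply) {
        ReplyProcessor processor = processors.get(reply.getRequestId());
        if (processor != null) {
            processor.process(reply);
        }
        // Replies for a deregistered requestId are dropped here (or could be logged).
    }
}
```

In this scheme, closing a Segment Writer/Reader only deregisters its processor; the underlying channel stays open for the other users sharing it.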
To manage multiple underlying connections, Netty 4 already provides several `io.netty.channel.pool.ChannelPool` implementations:
- `io.netty.channel.pool.SimpleChannelPool`: creates a new connection if there is no channel in the pool; there is no limit on the number of connections that can be created.
- `io.netty.channel.pool.FixedChannelPool`: an implementation of `ChannelPool` which enforces a maximum limit on the number of connections that can be created.

These options are not suitable, however, and `ChannelPool` has also been removed in Netty 5 (https://github.com/netty/netty/pull/8681).
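For reference, a minimal sketch of how Netty 4's `FixedChannelPool` is typically wired up is shown below (the endpoint is a placeholder). It hands out whole channels on `acquire()`/`release()`, which illustrates why it does not, by itself, provide the per-`requestId` reply matching described above.

```java
import io.netty.bootstrap.Bootstrap;
import io.netty.channel.Channel;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.pool.AbstractChannelPoolHandler;
import io.netty.channel.pool.FixedChannelPool;
import io.netty.channel.socket.nio.NioSocketChannel;

public class FixedChannelPoolExample {
    public static void main(String[] args) {
        NioEventLoopGroup group = new NioEventLoopGroup();
        // Bootstrap pointing at a placeholder segment store endpoint.
        Bootstrap bootstrap = new Bootstrap()
                .group(group)
                .channel(NioSocketChannel.class)
                .remoteAddress("segmentstore.example.com", 12345);

        // FixedChannelPool caps the number of channels created for this endpoint.
        FixedChannelPool pool = new FixedChannelPool(bootstrap,
                new AbstractChannelPoolHandler() {
                    @Override
                    public void channelCreated(Channel ch) {
                        // Encoder/decoder handlers would be added to the pipeline here.
                    }
                },
                2 /* maxConnections to this endpoint */);

        // A caller acquires a whole channel, uses it exclusively, then releases it.
        Channel ch = pool.acquire().syncUninterruptibly().getNow();
        // ... write requests on ch ...
        pool.release(ch).syncUninterruptibly();

        // Nothing in this model multiplexes replies from several writers/readers
        // sharing one channel, which is why an internal pool is proposed instead.
        group.shutdownGracefully();
    }
}
```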
The pooling mechanism for Pravega will be implemented internally, since it also needs to be flexible enough to accommodate other heuristics, for example not sharing an appending connection with other Segment Writers/Readers if it is already sufficiently busy. (This applies when the connection pool per segment store holds more than one connection.)
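A minimal sketch of what such a selection heuristic could look like, assuming the pool tracks a per-connection count of in-flight requests; the class names and the busy threshold are purely illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/**
 * Illustrative sketch of a per-segment-store pool that prefers the least-loaded
 * existing connection and only opens a new one when every existing connection is
 * "busy" and the configured maximum has not yet been reached.
 */
public class SegmentStoreConnectionPool {

    /** Stand-in for a pooled connection that tracks its in-flight requests. */
    static class PooledConnection {
        int inFlightRequests; // incremented on send, decremented when the reply arrives
    }

    private final int maxConnections; // configurable pool size per segment store
    private final int busyThreshold;  // in-flight requests beyond which we avoid sharing
    private final List<PooledConnection> connections = new ArrayList<>();

    SegmentStoreConnectionPool(int maxConnections, int busyThreshold) {
        this.maxConnections = maxConnections;
        this.busyThreshold = busyThreshold;
    }

    synchronized PooledConnection getConnection() {
        // Pick the existing connection with the fewest in-flight requests.
        PooledConnection leastLoaded = connections.stream()
                .min(Comparator.comparingInt((PooledConnection c) -> c.inFlightRequests))
                .orElse(null);

        // Share it unless it is busy; otherwise grow the pool up to the maximum.
        if (leastLoaded != null
                && (leastLoaded.inFlightRequests < busyThreshold || connections.size() >= maxConnections)) {
            return leastLoaded;
        }
        PooledConnection fresh = new PooledConnection(); // would establish a new Channel here
        connections.add(fresh);
        return fresh;
    }
}
```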
Also, since we cannot close the underlying connection when a given `SegmentWriter` closes, a `ReferenceCountedResource` needs to be implemented in commons: it keeps track of the users of a resource and closes the resource once all of them have invoked close. This will be used to close the `io.netty.channel.Channel`(s); the number of Channels is a configurable value which can be >= 1.
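A minimal sketch of what such a `ReferenceCountedResource` could look like, assuming a simple acquire/close counting scheme; the exact API is illustrative and not the final design.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Illustrative sketch: wraps a closeable resource (e.g. a wrapper around a Netty
 * Channel) and closes it only after every user that acquired it has called close().
 */
public class ReferenceCountedResource<T extends AutoCloseable> implements AutoCloseable {

    private final T resource;
    private final AtomicInteger refCount = new AtomicInteger(1); // the creator holds one reference

    public ReferenceCountedResource(T resource) {
        this.resource = resource;
    }

    /** Registers a new user of the resource (e.g. a new Segment Writer/Reader). */
    public T acquire() {
        refCount.incrementAndGet();
        return resource;
    }

    /** Releases one reference; the underlying resource is closed on the last release. */
    @Override
    public void close() throws Exception {
        if (refCount.decrementAndGet() == 0) {
            resource.close();
        }
    }
}
```

A production version would additionally guard against acquiring the resource after the count has already dropped to zero.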